Knowledge Graph Entity Alignment

The objective of the entity alignment (EA) task is to identify entities with identical real-world semantics across distinct knowledge graphs (KGs), a problem that has garnered extensive attention in both academia and industry.

knowledge graphs; entity alignment; character embeddings

1. Introduction

As a technology for storing complex structured and unstructured data, the knowledge base (KB) has been widely applied in various fields related to artificial intelligence. Knowledge graphs (KGs), the most common representation of knowledge bases, have made significant progress and have been extensively applied across diverse scenarios such as recommendation systems, information retrieval, and machine translation, drawing high levels of attention from both industry and academia [1]. However, different institutions and organizations construct knowledge graphs with different technologies and languages for their own purposes, resulting in heterogeneous structures and complementary contents. The same entity can therefore exist in various forms across distinct knowledge graphs. Efficiently organizing this redundant information into a more comprehensive knowledge graph for downstream tasks is an important challenge currently facing the field. The objective of entity alignment is to establish connections between identical entities present in two distinct knowledge graphs. This facilitates the transfer of valuable information from one KG to the corresponding entity in another KG, enriching the content of both and contributing significantly to the performance of downstream applications.
Traditional entity alignment methods [2] typically target structured data sources, such as relational databases, and use heuristic or data-mining techniques to calculate the similarity between entities in order to improve the effectiveness of entity matching. However, as data volumes grow, the efficiency of these traditional matching methods becomes a bottleneck. Moreover, knowledge graphs are semi-structured data, on which the accuracy of traditional entity alignment techniques is limited and heuristic algorithms are difficult to generalize. A toy example of such a heuristic similarity computation is sketched below.
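As a simple illustration (not any specific system from [2]), the sketch below matches entities by a string-similarity heuristic over their names; the entity names, the threshold, and the helper function are purely hypothetical.

```python
# Toy heuristic entity matching: compare entity names with a string-similarity
# measure and treat pairs above a threshold as candidate matches.
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    # Ratio of matching character blocks, case-insensitive (0.0 .. 1.0).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

candidates = [("New York City", "New York"), ("New York City", "Los Angeles")]
for a, b in candidates:
    score = name_similarity(a, b)
    print(f"{a!r} <-> {b!r}: {score:.2f}", "match" if score > 0.6 else "no match")
```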
With the emergence of Word2Vec [3], entity alignment approaches have gradually split into two families: those based on translation and those based on graph neural networks (GNNs) [4]. Both share the same fundamental idea: learn an effective vector representation of the KG in a low-dimensional space and then perform entity alignment on the learned representations. This technique is collectively known as representation learning. Extensive experiments have shown that translation-based methods are better suited to link prediction, whereas GNNs excel at incorporating the neighborhood features of nodes; both capabilities can be harnessed to design feature-learning techniques for entity alignment that achieve better accuracy and generalization [5].
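Regardless of how the embeddings are learned, the alignment step itself typically reduces to a nearest-neighbour search in a shared embedding space. The following minimal sketch assumes the two KGs' embeddings have already been mapped into one space; the array shapes and the function name are illustrative.

```python
# Given entity embeddings from two KGs that already live in a shared vector
# space, candidate alignments are the nearest neighbours under cosine similarity.
import numpy as np

def align_by_similarity(emb_kg1: np.ndarray, emb_kg2: np.ndarray, top_k: int = 1) -> np.ndarray:
    """emb_kg1: (n1, d) KG1 entity embeddings; emb_kg2: (n2, d) KG2 embeddings."""
    # L2-normalise so the dot product equals cosine similarity.
    a = emb_kg1 / np.linalg.norm(emb_kg1, axis=1, keepdims=True)
    b = emb_kg2 / np.linalg.norm(emb_kg2, axis=1, keepdims=True)
    sim = a @ b.T                               # (n1, n2) similarity matrix
    return np.argsort(-sim, axis=1)[:, :top_k]  # top-k KG2 candidates per KG1 entity

# Usage with toy 4-dimensional embeddings for 5 entities per KG.
rng = np.random.default_rng(0)
kg1, kg2 = rng.normal(size=(5, 4)), rng.normal(size=(5, 4))
print(align_by_similarity(kg1, kg2))
```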
Although GNN-based methods have achieved remarkable results, three limitations remain. Firstly, most methods [6][7][8] treat KGs as homogeneous graphs and do not consider the heterogeneity of the edges between entities, even though heterogeneous information can improve a model's accuracy and robustness. Secondly, although many methods incorporate semantic information beyond the relational structure, such as entity properties [9], entity descriptions [10], and entity names [11], the more semantic information a method integrates, the more data it requires, which is difficult to satisfy in many practical scenarios where seed entities are often scarce. Thirdly, some other works [12] use only the relational structures of different KGs and extract inter-graph information with graph matching networks (GMNs) [13] to discover more analogous features between aligned entities; however, introducing the matching module throughout training increases the space and time complexity of the model and thereby reduces its efficiency.
The proposed heterogeneous graph transformer with relation awareness (HGTRA) addresses the first limitation by effectively extracting similar features of aligned entities from their respective heterogeneous structures. To address the latter two limitations, a new embedding model is introduced. The model first generates property embeddings from the knowledge graphs' property information and then shifts the entity embeddings of the two knowledge graphs into the same vector space by leveraging these property embeddings. The similarity of properties between the two knowledge graphs is therefore crucial for producing a unified embedding space, which is also a major challenge in knowledge graph alignment tasks. Using property embeddings, the PCE-HGTRA model can transform the entity embeddings of both knowledge graphs into an identical vector space, allowing the entity embeddings to capture the property similarity of both knowledge graphs.
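The intuition that shared property embeddings can anchor two KGs in one space can be illustrated as follows. This is only a minimal sketch of the general idea, not the actual PCE-HGTRA model; the property vocabulary, dimensionality, and function name are assumptions.

```python
# Because both KGs draw their properties (e.g. "population", "area") from an
# overlapping vocabulary, representing each entity by the mean of its property
# embeddings yields vectors that are directly comparable across the two graphs.
import numpy as np

def property_based_embeddings(entity_props: dict[str, list[str]],
                              prop_emb: dict[str, np.ndarray],
                              dim: int = 8) -> dict[str, np.ndarray]:
    """entity_props maps entity -> its property names; prop_emb is a shared lookup table."""
    out = {}
    for entity, props in entity_props.items():
        vecs = [prop_emb[p] for p in props if p in prop_emb]
        out[entity] = np.mean(vecs, axis=0) if vecs else np.zeros(dim)
    return out

# Usage: the shared property table makes the vectors of the two KGs comparable.
rng = np.random.default_rng(0)
prop_emb = {p: rng.normal(size=8) for p in ["population", "area", "founded"]}
kg1 = property_based_embeddings({"Paris@kg1": ["population", "area"]}, prop_emb)
kg2 = property_based_embeddings({"Paris@kg2": ["population", "area", "founded"]}, prop_emb)
print(kg1["Paris@kg1"].shape, kg2["Paris@kg2"].shape)  # (8,) (8,)
```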

2. Knowledge Graph Entity Alignment

Entity alignment based on translation. The methods in this line of research are chiefly rooted in TransE [14] and several of its variants. Their core idea is to represent the relation between two entities as a translation between their embedded representations, so that entities with analogous structures in different KGs lie close together in the embedding space, thereby preserving entity structural information (a minimal sketch of this scoring idea follows this paragraph). MTransE [15] was the first work to introduce embeddings in a multilingual setting. It models entities and relations with TransE, embedding each knowledge graph's entities and relations in its own embedding space, and learns the transformation between the two vector spaces from pre-aligned entities. The model consists of a knowledge module for encoding and an alignment module for learning, and it proposes three learning strategies, namely linear transformations, translation vectors, and distance-based axis alignment, with linear transformations performing best. JAPE [16] uses embeddings of relations and properties to optimize the embedding of the knowledge graph. Specifically, it jointly embeds two distinct knowledge graphs into a shared vector space and improves the result by embedding property information; in addition, customized data preprocessing encourages aligned entities in the seed alignment to share the same or similar embeddings, allowing the model to achieve cross-lingual entity alignment. IPTransE [17] adopts semi-supervised learning with a margin-based loss function and uses bootstrapping to add newly aligned entities to the seed set, expanding the available resources while ensuring quality. It improves on the underlying TransE method with PTransE, which captures indirectly connected entities by observing the paths between them and constructing transformations based on the relation predicates along those paths. The model relies on seed entities and considers three strategies for the transformation between the embedding spaces of the two knowledge graphs: translation, linear transformation, and parameter sharing, with parameter sharing being the most effective.
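For concreteness, here is a minimal sketch of the TransE-style scoring and margin loss that underlies this family of methods; it follows the general formulation of [14], while the function names and the choice of the L1 norm are illustrative.

```python
# TransE models a relation as a translation in embedding space: for a plausible
# triple (h, r, t), h + r should be close to t, and a margin loss pushes
# corrupted (negative) triples at least `margin` further away than positives.
import numpy as np

def transe_score(h: np.ndarray, r: np.ndarray, t: np.ndarray) -> float:
    # Lower score = more plausible triple (L1 distance between h + r and t).
    return float(np.linalg.norm(h + r - t, ord=1))

def margin_loss(pos: tuple, neg: tuple, margin: float = 1.0) -> float:
    # pos and neg are (h, r, t) tuples of embedding vectors.
    return max(0.0, margin + transe_score(*pos) - transe_score(*neg))

# Usage with toy 3-dimensional embeddings.
h, r, t = np.array([0.1, 0.2, 0.0]), np.array([0.3, 0.0, 0.1]), np.array([0.4, 0.2, 0.1])
t_corrupt = np.array([1.0, -1.0, 0.5])
print(margin_loss((h, r, t), (h, r, t_corrupt)))
```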
Entity alignment based on GNNs. The main approach to entity alignment with graph attention networks (GATs) and graph convolutional networks (GCNs) is to aggregate the neighborhood features of each entity so that corresponding aligned entities exhibit similar neighborhoods. GCN_align [6] was the first GNN-based EA study and achieved alignment through a margin-based loss function; it treated property triplets as relation triplets, learned entity embeddings from structural information, and used two GCNs with a shared weight matrix to embed the entities of the two knowledge graphs into a unified space. RDGCN [7], which also uses a margin-based loss function, integrated relation information through an attentive interaction mechanism and extended GCNs with relation information and a highway gating mechanism to capture neighborhood structural information, similar to HGCN [11]. SEA [18] achieved alignment by exploiting unaligned entities under a cycle-consistency constraint.
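The neighborhood aggregation these methods build on can be illustrated with a single GCN propagation step; this is a generic sketch of the standard GCN layer rather than the exact architecture of any cited model, and the matrix names are assumptions.

```python
# One GCN propagation step, H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W):
# after the step, each entity's embedding mixes in its neighbours' features, so
# two entities with similar neighbourhoods (in two KGs embedded with shared
# weights W) end up with similar representations.
import numpy as np

def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(0.0, d_inv_sqrt @ a_hat @ d_inv_sqrt @ features @ weight)

# Usage: 4 entities, 3-dimensional features, 2-dimensional output.
adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
h = np.random.default_rng(0).normal(size=(4, 3))
w = np.random.default_rng(1).normal(size=(3, 2))
print(gcn_layer(adj, h, w).shape)  # (4, 2)
```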
Entity alignment based on heterogeneous GNNs. Recently, numerous studies have attempted to apply graph neural networks (GNNs) to model heterogeneous graphs. Among them, RGCNs [19] and RGATs [20] describe heterogeneous graphs by using a separate weight matrix for each relation. HAN [21] proposed a hierarchical attention mechanism that learns the weights of nodes and meta-paths at both the node level and the semantic level. Meanwhile, HetGNN [22] employs several recurrent neural networks (RNNs) to integrate multimodal features and handle different types of nodes. However, because knowledge graphs (KGs) contain a large number of relations, applying these methods to KGs results in high training complexity. Recently, HGT [23] and RHGT [24] have attempted to describe heterogeneity using heterogeneous graph transformers. Nevertheless, these methods are not designed to capture neighbor similarities and are therefore not directly applicable to entity alignment tasks. Consequently, the researchers propose an enhanced heterogeneous graph transformer that takes the heterogeneity of knowledge graphs into account and provides high-quality entity embeddings for entity alignment.
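The per-relation weighting that distinguishes these models from plain GCNs can be sketched as follows; this is an illustrative, simplified node update in the spirit of RGCN [19] (real implementations add basis decomposition and other regularization to keep parameters manageable), and all names and shapes are assumptions.

```python
# Relation-aware node update: each relation type has its own weight matrix, so
# edges of different types are no longer collapsed into a single adjacency.
import numpy as np

def rgcn_node_update(node: int,
                     neighbours_by_rel: dict[int, list[int]],
                     features: np.ndarray,
                     rel_weights: dict[int, np.ndarray],
                     self_weight: np.ndarray) -> np.ndarray:
    out = features[node] @ self_weight
    for rel, nbrs in neighbours_by_rel.items():
        if nbrs:
            # Mean of neighbour messages transformed by this relation's weights.
            out = out + np.mean(features[nbrs] @ rel_weights[rel], axis=0)
    return np.maximum(0.0, out)

# Usage: 3 entities, 2 relation types, 4-dimensional features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4))
rel_w = {0: rng.normal(size=(4, 4)), 1: rng.normal(size=(4, 4))}
print(rgcn_node_update(0, {0: [1], 1: [2]}, feats, rel_w, np.eye(4)).shape)  # (4,)
```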
Self-supervised learning models for knowledge graphs. To capture the semantic discrepancies between entities and relations in knowledge graphs, contrastive learning has emerged as a viable technique. Recent research has merged knowledge graph representation with contrastive learning, giving rise to a general knowledge graph contrastive learning framework, KGCL [25], which aims to reduce noise in the data underlying recommendation systems and to provide stronger knowledge representation capabilities. The CKGC [26] method distinguishes descriptive attributes from conventional relations in the knowledge graph and connects the remaining parts as structure, broadening the scope of the knowledge graph's descriptive information.
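A common ingredient of such contrastive frameworks is an InfoNCE-style objective over two views of the same entity; the sketch below shows that generic loss (not the exact objective of KGCL or CKGC), with the temperature value and view construction left as assumptions.

```python
# Generic InfoNCE contrastive loss: two augmented views of the same batch of
# entities form the positive pairs (the diagonal); all other in-batch pairs
# act as negatives.
import numpy as np

def info_nce(view_a: np.ndarray, view_b: np.ndarray, temperature: float = 0.2) -> float:
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                        # (n, n) similarity logits
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))            # cross-entropy on the diagonal

# Usage: two noisy views of the same 8 entity embeddings.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 16))
print(info_nce(base + 0.01 * rng.normal(size=base.shape), base))
```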

References

  1. Zhang, F.; Yang, L.; Li, J.; Cheng, J. A survey of entity alignment research. J. Comput. Sci. 2022, 45, 1195–1225.
  2. Zhuang, Y.; Li, G.; Feng, J. A survey on knowledge base entity alignment techniques. J. Comput. Res. Dev. 2016, 53, 165–192.
  3. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781.
  4. Meng, P. A survey on entity alignment based on graph neural networks. Mod. Comput. 2020, 2020, 37–40.
  5. Xu, Y.; Zhang, H.; Cheng, K.; Liao, X.; Zhang, Z.; Li, L. A survey of knowledge graph embedding. J. Comput. Eng. Appl. 2022, 58, 30–50.
  6. Wang, Z.; Lv, Q.; Lan, X.; Zhang, Y. Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks. In Proceedings of the EMNLP, Brussels, Belgium, 31 October–4 November 2018.
  7. Wu, Y.; Liu, X.; Feng, Y.; Wang, Z.; Yan, R.; Zhao, D. Relation-Aware Entity Alignment for Heterogeneous Knowledge Graphs. In Proceedings of the IJCAI, Macao, China, 10–16 August 2019.
  8. Sun, Z.; Wang, C.; Hu, W.; Chen, M.; Dai, J.; Zhang, W.; Qu, Y. Knowledge Graph Alignment Network with Gated Multi-Hop Neighborhood Aggregation. In Proceedings of the AAAI, New York, NY, USA, 7–12 February 2020.
  9. Teong, K.S.; Soon, L.K.; Su, T.T. Schema-Agnostic Entity Matching using pre-trained Language Models. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, Ireland, 19–23 October 2020; pp. 2241–2244.
  10. Kang, S.; Ji, L.; Liu, S.; Ding, Y. A Cross-lingual Entity Alignment Model Based on Entity Descriptions and Knowledge Vector Similarity. J. Electron. 2019, 47, 1841–1847.
  11. Wu, Y.; Liu, X.; Feng, Y.; Wang, Z.; Zhao, D. Jointly Learning Entity and Relation Representations for Entity Alignment. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 240–249.
  12. Wu, Y.; Liu, X.; Feng, Y.; Wang, Z.; Zhao, D. Neighborhood Matching Network for Entity Alignment. In Proceedings of the ACL, Online, 5–10 July 2020.
  13. Li, Y.; Gu, C.; Dullien, T.; Vinyals, O.; Kohli, P. Graph Matching Networks for Learning the Similarity of Graph Structured Objects. In Proceedings of the ICML, Long Beach, CA, USA, 9–15 June 2019.
  14. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-relational Data. In Proceedings of the NeurIPS, Lake Tahoe, NV, USA, 5–10 December 2013.
  15. Chen, M.; Tian, Y.; Yang, M.; Zaniolo, C. Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment. In Proceedings of the IJCAI, Melbourne, Australia, 19–25 August 2017.
  16. Sun, Z.; Hu, W.; Li, C. Cross-lingual entity alignment via joint property-preserving embedding. In Proceedings of the International Semantic Web Conference (ISWC), Vienna, Austria, 21–25 October 2017; pp. 628–644.
  17. Zhu, H.; Xie, R.; Liu, Z.; Sun, M. Iterative Entity Alignment via Joint Knowledge Embeddings. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017.
  18. Pei, S.; Yu, L.; Hoehndorf, R.; Zhang, X. Semi-supervised entity alignment via knowledge graph embedding with awareness of degree difference. In Proceedings of the WWW, San Francisco, CA, USA, 13–17 May 2019; pp. 3130–3136.
  19. Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; Van Den Berg, R.; Titov, I.; Welling, M. Modeling relational data with graph convolutional networks. In Proceedings of the ESWC, Heraklion, Crete, Greece, 3–7 June 2018.
  20. Busbridge, D.; Sherburn, D.; Cavallo, P.; Hammerla, N.Y. Relational Graph Attention Networks. In Proceedings of the ICLR, New Orleans, LA, USA, 6–9 May 2019.
  21. Wang, X.; Ji, H.; Shi, C.; Wang, B.; Ye, Y.; Cui, P.; Yu, P.S. Heterogeneous Graph Attention Network. In Proceedings of the WWW, San Francisco, CA, USA, 13–17 May 2019.
  22. Zhang, C.; Song, D.; Huang, C.; Swami, A.; Chawla, N.V. Heterogeneous Graph Neural Network. In Proceedings of the SIGKDD, Anchorage, AK, USA, 4–8 August 2019.
  23. Hu, Z.; Dong, Y.; Wang, K.; Sun, Y. Heterogeneous Graph Transformer. In Proceedings of the WWW, Taiwan, China, 20–24 April 2020.
  24. Mei, X.; Cai, X.; Yang, L.; Wang, N. Relation-aware Heterogeneous Graph Transformer based drug repurposing. Expert Syst. Appl. 2022, 190, 116165.
  25. Yang, Y.; Huang, C.; Xia, L.; Li, C. Knowledge Graph Contrastive Learning for Recommendation. In Proceedings of the SIGIR, Madrid, Spain, 11–15 July 2022.
  26. Cao, X.; Shi, Y.; Wang, J.; Yu, H.; Wang, X.; Yan, Z. Cross-modal Knowledge Graph Contrastive Learning for Machine Learning Method Recommendation. In Proceedings of the ACM MM, Lisboa, Portugal, 10–14 October 2022.