Community-Specific Overview of Knowledge Graph Research

Contributor: Mayank Kejriwal

Knowledge graphs (KGs) have rapidly emerged as an important area in AI over the last ten years. Building on a storied tradition of graphs in the AI community, a KG may be simply defined as a directed, labeled, multi-relational graph with some form of semantics. The rise of KGs has been fueled in part by increased publication of structured datasets on the Web, and by well-publicized successes of large-scale projects such as the Google Knowledge Graph and the Amazon Product Graph. However, another factor that is less discussed, but which has been equally instrumental in the success of KGs, is the cross-disciplinary nature of academic KG research. Arguably, because of the diversity of this research, a synthesis of how the different KG research strands tie together could serve a useful role in enabling more ‘moonshot’ research and large-scale collaborations.

knowledge graphs
applications
natural language processing
semantic web
data mining
knowledge representation
graph databases

1. Introduction

With accelerating growth of the Web over the 2000s, and the rise of both e-commerce and social media, knowledge graphs (KGs) have emerged as important models for representing, storing and querying heterogeneous pieces of data that have some relational structure between them, and that typically have real-world semantics ^[1]. The semantics are closely associated with the domain for which the KG has been designed ^[2]. A formal way to define such a domain, favored in the Semantic Web (SW) community, is through an ontology ^[3].

The most common definition of a KG is that it is a directed graph where both edges and nodes have labels. Nodes are usually entities, ranging from everyday entities such as people, organizations and locations to highly domain-specific entities such as proteins and viruses (assuming the domain is a biological one). Edges, also known as properties or predicates, represent either a relation between entities (e.g., an ‘employed_at’ relation between a person and an organization entity) or an attribute of an entity (e.g., the ‘date_of_birth’ of a person entity). In the latter case, the edge points to a node that holds the attribute’s value (e.g., ‘1970-01-01’), typically represented as a literal. Even definitionally, diversity is observed in KG research. For example, the SW community makes formal distinctions between these two uses of nodes and edges, while other communities, such as NLP, are less formal. (Within SW, nodes representing entities and attribute values are generally referred to as ‘resources’ and ‘literals’, respectively. Similarly, edges representing entity-relations and attributes are, respectively, referred to as ‘object properties’ and ‘datatype properties’.)
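The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a standard API: the `Literal` class, the entity names and the two helper functions are hypothetical, and only the ‘employed_at’ and ‘date_of_birth’ labels come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """An attribute value (a 'literal' in SW terms), as opposed to an entity."""
    value: str


# Each triple is one directed, labeled edge: (subject, predicate, object).
# Entities are plain strings here; attribute values are wrapped in Literal.
triples = [
    ("alice", "employed_at", "acme_corp"),              # relation between two entities
    ("alice", "date_of_birth", Literal("1970-01-01")),  # attribute with a literal value
]


def outgoing(node, triples):
    """Return (edge label, target) pairs for all edges leaving `node`."""
    return [(p, o) for s, p, o in triples if s == node]


def is_attribute_edge(triple):
    """True when the edge points at a literal (a 'datatype property' in SW terms)."""
    return isinstance(triple[2], Literal)
```

Keeping literals as a separate type mirrors the SW distinction between object properties and datatype properties; a less formal representation could store both kinds of value as plain strings.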

An illustrative KG fragment from the tourism domain is visualized in Figure 1. The figure contains both the instance-level KG (called the A-Box) and the concepts (nodes shaded in orange) that are part of the T-Box or ontology that models the domain of interest. Put differently, concepts are the types or classes of entities allowable in the domain. Another important aspect of the domain is the set of allowable edge-labels (called properties or predicates) and the constraints associated with them. For example, the ‘employed_at’ relation can be constrained to only map from an entity of type ‘Person’ to an entity of type ‘Organization’. Formally, ‘Person’ and ‘Organization’ would be declared as the allowable domain and range of the predicate ‘employed_at’, analogous to the domain and range of a function in mathematics. The ontology can also have other axioms and constraints. (An intuitive example is a cardinality constraint, e.g., the requirement can be imposed that a ‘married_to’ predicate links a given entity to at most one entity-object.) A special predicate called rdf:type serves as an explicit bridge between the A-Box and the T-Box by declaring an entity’s type (which, by definition, is in the T-Box).
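The domain, range and cardinality constraints described above can be illustrated with a small validity check. This is a sketch under stated assumptions: the `DOMAIN_RANGE` and `AT_MOST_ONE` tables, the entity names and the `violations` function are hypothetical, and the check takes the simple validation view used in the text (OWL reasoners typically treat domain and range as inference axioms rather than as validation rules).

```python
# Hypothetical T-Box: allowed (domain, range) per predicate, plus predicates
# restricted to at most one object per subject.
DOMAIN_RANGE = {
    "employed_at": ("Person", "Organization"),
    "married_to": ("Person", "Person"),
}
AT_MOST_ONE = {"married_to"}


def violations(abox, types):
    """List human-readable violations of the toy T-Box found in `abox`.

    `abox` is a list of (subject, predicate, object) triples and `types`
    maps each entity to its class, playing the role of rdf:type.
    """
    found = []
    seen = {}  # (subject, predicate) -> set of distinct objects
    for s, p, o in abox:
        if p in DOMAIN_RANGE:
            domain, range_ = DOMAIN_RANGE[p]
            if types.get(s) != domain:
                found.append(f"{p}: subject {s} must have type {domain}")
            if types.get(o) != range_:
                found.append(f"{p}: object {o} must have type {range_}")
        if p in AT_MOST_ONE:
            objects = seen.setdefault((s, p), set())
            objects.add(o)
            if len(objects) == 2:  # report once, when the limit is first exceeded
                found.append(f"{p}: {s} has more than one object")
    return found


types = {"alice": "Person", "bob": "Person", "carol": "Person", "acme": "Organization"}
abox = [
    ("alice", "employed_at", "acme"),
    ("acme", "employed_at", "alice"),   # domain and range both violated
    ("bob", "married_to", "alice"),
    ("bob", "married_to", "carol"),     # cardinality violated
]
```

Calling `violations(abox, types)` on this toy A-Box reports the reversed ‘employed_at’ edge twice (once for the subject, once for the object) and the second ‘married_to’ edge once.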

Figure 1. A knowledge graph (KG) fragment. Concepts (that typically belong in the T-Box) are shaded in orange. Links in the figure were accessed on 17 March 2022.

Per the brief formalism above, the semantics of the KG are provided for by the ontology itself, in conjunction with a reasoning engine that (in principle) can detect when the KG is violating the ontology in some way. However, while this formalism is among the most mature in the AI community for expressing, codifying and manipulating the semantics of domain knowledge, it is not the only way. The NLP, knowledge discovery and database communities have much more lightweight and implicit notions of an ontology (usually denoted a ‘schema’ in the academic work, if mentioned explicitly at all).

2. Community-Specific Overview of KG Research

Given that different aspects of KG research are prioritized in different communities, an important component of this entry is to first review the main research priorities (as pertinent to KGs) within those communities. The treatment herein does not imply exclusivity, e.g., information extraction (IE), which is predominantly researched in NLP, has also witnessed interesting research in knowledge discovery and SW ^[4]^[5]. However, an attempt is made to capture the norms and priorities of the overall community to a reasonable extent. One manner in which this attempt was made systematically was to consider the tutorials, workshops and demonstrations published in the top conferences covering these sub-fields over the last 5 years, including the International Semantic Web Conference (ISWC), the Knowledge Discovery and Data Mining (KDD) conference, the Association for Computational Linguistics (ACL), the Web Conference (WebConf; formerly known as the World Wide Web Conference) and core machine learning conferences, such as NeurIPS, International Conference on Learning Representations (ICLR) and International Conference on Machine Learning (ICML). In all of these conferences, there was at least one tutorial, and multiple workshops and demonstrations involving an important aspect of KG research. Some recent (non-exhaustive) examples of such workshops include Heterogeneous Graph Deep Learning and Applications (KDD 2021), Mining Knowledge Graph for Deep Insights (KDD 2020), International Workshop on Semantic Evaluation (ACL 2021) and Workshop on Deep Learning for Knowledge Graphs (ISWC 2021).

In short, only those communities where substantial KG-related research has been published, demonstrated or otherwise promoted (e.g., through tutorials and workshops) to date are considered. A good example of an important AI community that would not meet this condition is Computer Vision. Although some KG research has been published in Computer Vision ^[6], including the construction of multimodal KGs ^[7], the number of KG-related publications is still relatively small compared to the other communities that are covered in this section. Finally, it bears noting that, because KG research is rapidly advancing as a field, some of the areas discussed below may become less relevant for presenting advances in KG research, and others (not currently discussed in depth, such as computer vision) may gain in importance. Hence, this selection of areas should be interpreted as being only quasi-objective and subject to change even in the near future.

2.1. Natural Language Processing (NLP)

KG research can trace its origins to at least two different research areas (NLP and the Semantic Web, which is re-visited subsequently). Within NLP, KGs first emerged as a result of progress in the domain of information extraction (IE), starting from the 1990s with the institution of the Message Understanding Conferences ^[8]. The majority of IE research published over the last three decades has involved either named entity recognition (NER) or relation extraction (RE). Good surveys on the former include ^[9]^[10] (the second of which focuses on deep learning methods), while ^[11]^[12] provide recent, comprehensive surveys on the latter.

Since RE research has almost always involved 2-arity relations (where the relation is assumed to exist between a pair of entities), extracted relations and entities can be modeled as triples and placed into (what has been traditionally denoted as) a knowledge base (KB). Prior to the growth of the Web, there was no reason to model these KBs as graphs. Connections between entities became more apparent and important both when the same entity started getting extracted from multiple documents and (much later) when it was discovered that the structural properties of the KB, such as entity and relation co-occurrence features, could lead to improved performance on related tasks such as entity linking ^[13]. Entity linking is the problem of automatically linking an extracted entity to its equivalent in an agreed-upon ‘canonical’ KB like Wikipedia ^[14]. To take a simple example of the utility of a structural feature like co-occurrence, suppose that both ‘V. Williams’ and ‘Wimbledon’ were extracted from a single document. If the entity extraction system attempts to link these two extractions to Wikipedia independently, it becomes difficult to decide whether V. Williams refers to Venus Williams (the tennis player) or Vanessa Williams (the actress), and also whether Wimbledon refers to the tennis grand slam tournament of the same name or Wimbledon, London (where the championships are held, but which is technically different from the event itself). Co-occurrence helps resolve this ambiguity: when the two extractions are linked jointly rather than independently, the fact that they appear in the same document favors Venus Williams and the tennis tournament. More complex features help improve performance even further, and a similar philosophy would also apply to related tasks such as co-reference resolution ^[15], which is the problem of determining when words and phrases (including pronouns) refer to the same entity.
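The ‘V. Williams’ and ‘Wimbledon’ example can be made concrete with a toy joint-linking sketch. Everything here is an assumption for illustration: the candidate lists, the co-occurrence counts and the `link_jointly` function are made up, and real entity linkers combine many more features than a single pairwise count.

```python
from itertools import product

# Hypothetical candidate KB entries for each extracted mention.
CANDIDATES = {
    "V. Williams": ["Venus Williams", "Vanessa Williams"],
    "Wimbledon": ["Wimbledon Championships", "Wimbledon, London"],
}

# Made-up co-occurrence counts between candidate entries (for example, how
# often two entries are mentioned together). Missing pairs count as 0.
CO_OCCURRENCE = {
    frozenset(["Venus Williams", "Wimbledon Championships"]): 50,
    frozenset(["Venus Williams", "Wimbledon, London"]): 5,
    frozenset(["Vanessa Williams", "Wimbledon, London"]): 1,
}


def link_jointly(mentions):
    """Pick one candidate per mention so that total pairwise co-occurrence is maximal."""
    def score(assignment):
        return sum(
            CO_OCCURRENCE.get(frozenset([a, b]), 0)
            for i, a in enumerate(assignment)
            for b in assignment[i + 1:]
        )

    # Exhaustive search over all combinations; fine for a toy example,
    # but exponential in the number of mentions.
    best = max(product(*(CANDIDATES[m] for m in mentions)), key=score)
    return dict(zip(mentions, best))
```

With these counts, `link_jointly(["V. Williams", "Wimbledon"])` selects the tennis player and the tournament together, whereas linking each mention on its own would have no basis for preferring either candidate.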

From the perspective of KG research, IE, entity linking and other problems such as co-reference resolution, all play a vital role because they ultimately lead to a higher-quality initial KG. If two extractions, such as ‘V. Williams’ and ‘Venus Williams’, can indeed be linked to the same Wikipedia entry, for example, then they can be modeled as a single node in the KG. Good co-reference resolution can help add more data to the KG (e.g., more facts and relations). For these reasons, and also because of other applications that have arisen over the years (such as question answering ^[16]), improving performance through the design of more sophisticated algorithms and representation learning techniques has always been an important goal in the community. IE problems such as Open IE and event extraction continue to pose challenges ^[17]^[18].

2.2. Semantic Web

Earlier, the concepts of the A-Box and the T-Box were briefly introduced. These notions are primarily inspired by description logics, which have heavily influenced KG research in the SW community ^[19]. For example, ^[20] describe how description logics serve as ontology languages for the semantic web. However, in the broader community, modeling and representing KGs is only one part of the equation. An equally important goal is to devise better ways of publishing, linking and accessing this data on the Web. According to a seminal article by ^[21], the Semantic Web is fundamentally an effort to transform the Web by ‘augmenting Web pages with data targeted at computers’.

With the advent of a movement called Linked Data ^[22], KGs modeled in formal graph-friendly languages like Resource Description Framework (RDF) started becoming more common on the Web ^[23], although they are still dwarfed by the volume of natural language text. The KG fragment that was illustrated earlier in Figure 1 is an RDF graph. Data are represented as a set of triples of the form (subject, predicate, object), intuitively representing a directed edge in the graph, where the subject and predicate must be uniform resource identifiers or URIs (and are typically just uniform resource locators for actual datasets), while the object may be a URI or a literal. (Technically, they must be internationalized resource identifiers, which subsume URIs.)
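The shape rules for a triple can be sketched as a small serializer that writes one line in the style of the N-Triples format. This is an illustration only: the `URI` and `Literal` classes and the `to_ntriples_line` function are hypothetical names, the `http://example.org/` identifiers in the usage note are placeholders, literal escaping is deliberately minimal, and the sketch follows the simplified description in the text (blank nodes, language tags and datatypes are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


def to_ntriples_line(subject, predicate, obj):
    """Serialize one triple in N-Triples style, enforcing the shape rules above."""
    if not isinstance(subject, URI) or not isinstance(predicate, URI):
        raise TypeError("subject and predicate must be URIs")
    if isinstance(obj, URI):
        obj_text = f"<{obj.value}>"
    elif isinstance(obj, Literal):
        # Minimal escaping only (backslash and double quote); an assumption of this sketch.
        escaped = obj.value.replace("\\", "\\\\").replace('"', '\\"')
        obj_text = f'"{escaped}"'
    else:
        raise TypeError("object must be a URI or a literal")
    return f"<{subject.value}> <{predicate.value}> {obj_text} ."
```

For instance, `to_ntriples_line(URI("http://example.org/alice"), URI("http://example.org/date_of_birth"), Literal("1970-01-01"))` yields a line whose object is a quoted literal, while passing a `Literal` as the subject raises a `TypeError`.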

Linked Data are defined as a set of four best practices (https://www.w3.org/wiki/LinkedData, accessed on 17 March 2022) for publishing ‘structured data’ (that are, by and large, KGs) on the Web: (i) use URIs as names for things, (ii) use HTTP URIs to enable people to look up those names, (iii) provide useful information when a person looks up a URI and (iv) include links to other URIs to enable greater discoverability ^[22]. Linked Open Data started in 2007 with only a handful (<10) of datasets and has since grown to hundreds of datasets ^[24], spanning domains as varied as social media ^[25]^[26], biology and life sciences ^[27]^[28], and computational linguistics ^[29]^[30]. The fourth principle, in particular, has made this possible, since without it, different datasets obeying the other three Linked Data principles may still have been siloed. Both classic and recent research in the 50-year-old problem of entity resolution (ER) has made automatic linking of equivalent entities in independent datasets to one another (even at the Web scale, e.g., the author’s previous work on entity name systems ^[31]) much more feasible ^[32]^[33].

Other research priorities in SW include the development of efficient KG querying infrastructures, such as triplestores ^[34]. Recently, such triplestores (along with the related technology of graph databases, which has been a subject of heavy research in the core Database research community, as subsequently detailed) have also started gaining prominence, with at least one major cloud service (Amazon Neptune) available for them ^[35]. Another paradigm that has recently been proposed for data integration and access is the Virtual Knowledge Graph (VKG) paradigm. This paradigm is inspired by the literature on Ontology-Based Data Access (OBDA), which is a well-known problem in the Semantic Web community. The key difference between VKGs and OBDA is that the former replaces rigidly structured tables that are a key feature of the latter with flexible graphs. Similar to OBDA, however, the graphs do not have to be ‘materialized’ but can be maintained as a virtual layer and used to capture and represent domain knowledge. A comprehensive overview of systems and use-cases for VKGs is provided in ^[36].

2.3. Core Machine Learning: Representation Learning and Probabilistic Graphical Models

Representation learning and probabilistic graphical models (the best-known examples of the latter being Markov logic networks and Bayesian networks ^[37]^[38]^[39]^[40]) have played an equally important role in recent KG research. Representation learning is the more recent phenomenon, beginning with the structured embedding paper by ^[41], which was followed by influential architectures such as TransE, ConvE and the neural tensor network. Several surveys of such KG-embedding approaches have been published, examples being ^[42]^[43]^[44]. The basic purpose of these methods is to ‘embed’ each node and relation in the KG into a dense, continuous, real-valued vector space. As with word embeddings, operations such as link prediction can then be performed in the vector space. In recent years, KG representation learning and refinement have also become popular in other KG communities, such as SW ^[45], natural language processing ^[46] and broad AI topics such as commonsense reasoning ^[47]. Beyond surveys, in the SW, examples of KG applications and algorithms include ^[48]^[49]^[50]. Unsurprisingly, the success of these approaches closely mirrors the success of deep learning methods and architectures in related areas. Representation learning has been particularly successful in ‘refining’ KGs by predicting links, detecting incorrect triples and resolving entities.
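To make the idea of link prediction in vector space concrete, the following sketch scores triples in the style of TransE, which treats a relation as a translation from the head embedding to the tail embedding. The two-dimensional vectors, the entity names and the `predict_tail` helper are made up for illustration; real models learn the embeddings from data and use far higher dimensions.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance between head + relation and tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Made-up 2-d embeddings, purely for illustration (real models learn these).
entity = {
    "alice": [0.0, 0.0],
    "acme": [1.0, 1.0],
    "bob": [3.0, -2.0],
}
relation = {"employed_at": [1.0, 1.0]}


def predict_tail(head, rel, candidates):
    """Link prediction: rank candidate tail entities by score, best first."""
    return sorted(
        candidates,
        key=lambda c: transe_score(entity[head], relation[rel], entity[c]),
        reverse=True,
    )
```

Here `predict_tail("alice", "employed_at", ["bob", "acme"])` ranks ‘acme’ first, because adding the relation vector to the embedding of ‘alice’ lands exactly on the embedding of ‘acme’.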

As KG embeddings have become more advanced, several authors have sought to use other classes of interesting ‘information sets’ with which to obtain higher-quality embeddings. One such type of information is temporal information. Since KG facts can be time-sensitive in some domains (e.g., X co-authored a paper with Y in a given year), the goal is to use time-aware embedding models to further improve KG embeddings ^[51]^[52]. One way in which this can be accomplished is by imposing temporal order constraints on time-sensitive relation pairs. Another way is to model the temporal evolution of KGs by using quadruples rather than triples. This kind of representation is especially well suited for medical or sensor-based domains (e.g., Internet of Things). Other kinds of information sets that have inspired similar research in the machine learning community include relation paths, which are designed to help incorporate richer context into the relationship between a pair of entities ^[53]^[54], rather than ‘single-hop’ relations represented using an edge in the KG, and even logical rules. Although the use of such rules, once a staple of expert systems, is more common in communities such as Semantic Web, their use as regularizers when learning better KG embeddings shows the interdisciplinary connections between these fields. Examples of systems that use rules or rule-based constraints to refine KG embeddings include ^[55]^[56]^[57].
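The two temporal ideas mentioned above, quadruples and temporal order constraints on relation pairs, can be illustrated at the data level. This is only a sketch: the facts, the `MUST_PRECEDE` pair and both helper functions are hypothetical, and time-aware embedding models use such constraints as soft signals during training rather than as the hard check shown here.

```python
# Quadruples extend triples with a timestamp: (subject, predicate, object, year).
quads = [
    ("X", "co_authored_with", "Y", 2019),
    ("X", "born_in", "city_a", 1970),
    ("X", "died_in", "city_b", 1960),   # deliberately inconsistent with born_in
]

# Hypothetical ordered relation pairs: the first must not occur after the second.
MUST_PRECEDE = [("born_in", "died_in")]


def snapshot(quads, year):
    """Return the plain triples that hold in a given year."""
    return [(s, p, o) for s, p, o, t in quads if t == year]


def order_violations(quads):
    """Return (subject, earlier_rel, later_rel) wherever the temporal order is broken."""
    when = {(s, p): t for s, p, _, t in quads}
    return [
        (s, first, second)
        for first, second in MUST_PRECEDE
        for (s, p), t in when.items()
        if p == first and (s, second) in when and t > when[(s, second)]
    ]
```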

The application of probabilistic graphical models and probabilistic soft logic (PSL) to problems like link prediction predates representation learning by several years ^[58]^[59]. PSL is well suited for large-scale KGs because its optimization is convex. A particularly interesting use case is knowledge graph identification (KGI), wherein the confidence-annotated outputs of tasks like IE and ER (the ‘initial’ KG) are fed into a PSL program, along with ontological constraints ^[60]. The output of the program is a much cleaner KG. The advantage of PSL is that it is able to incorporate a combination of domain knowledge and probabilistic reasoning to ‘identify’ the true KG. Results have been promising. The possible synergy of such probabilistic models with representation learning is an interesting avenue for future research.
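A flavor of how PSL relaxes logical rules can be given with the Lukasiewicz operators it is commonly described as using: truth values lie in [0, 1], and a rule ‘body implies head’ incurs a penalty equal to its distance to satisfaction. The sketch below is illustrative only; the rule, the atom names and the truth values are hypothetical, and an actual PSL program minimizes a weighted sum of such distances over all ground rules rather than evaluating one rule in isolation.

```python
def soft_and(*values):
    """Lukasiewicz conjunction over soft truth values in [0, 1]."""
    return max(0.0, sum(values) - (len(values) - 1))


def distance_to_satisfaction(body, head):
    """Penalty for the rule body -> head; zero when the head is at least as true as the body."""
    return max(0.0, body - head)


# Hypothetical ground rule, in the spirit of an ontological domain constraint:
#   Extracted(alice, employed_at, acme) AND Domain(employed_at, Person) -> Type(alice, Person)
extracted_relation = 0.75   # confidence reported by an IE system (made up)
domain_constraint = 1.0     # ontological constraint, treated as certain
type_person = 0.25          # current soft belief that alice is a Person (made up)

body = soft_and(extracted_relation, domain_constraint)
penalty = distance_to_satisfaction(body, type_person)
```

With these values the body has truth 0.75 and the penalty is 0.5, so raising the belief in the head atom (or lowering the belief in the extraction) would reduce the overall objective. Because each penalty is a hinge function of linear terms, the resulting objective is convex, which is the property the text credits for PSL's scalability.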

2.4. Databases, Data Mining, and Knowledge Discovery in Databases (KDD)

Although distinct from the SW or NLP communities, the knowledge discovery in databases (KDD) and data mining communities have also had a significant influence on KG research in the last 5 years. KGs have been used in innovative applications, including recommender systems ^[61]^[62]. One reason that KGs can make a difference in recommender systems’ performance is their ability to provide useful external knowledge. Combined with deep learning, the external knowledge can make quite a difference. Gao et al. provide a survey on deep learning on KGs for recommender systems ^[63]. They cite the emergence of graph neural networks (GNNs) as an important recent advance in this space ^[64]. Using GNNs in tandem with KGs, recommender systems can be adapted to become more knowledge-aware, and in turn, this also helps such systems adapt to problems such as cold-start. In their survey ^[63], Gao et al. also cite publicly available open-source code and benchmark datasets (examples of which include ^[65]^[66]), showing that the ecosystem is starting to mature, making it more likely that these algorithms will be adopted and refined by independent developers (and possibly, smaller companies who may not have a significant research and development budget) in the near future. Although the use of external knowledge and even taxonomies is not novel in this space ^[67]^[68]^[69]^[70]^[71]^[72]^[73], KGs have historically been difficult to work with due to both scale and noise. GNNs present a robust solution to the problem ^[74].

KGs have also been studied under the umbrella of heterogeneous information networks or HINs ^[75]. The HIN model resembles a KG and it is also a directed graph, but the schema (called a network schema ^[76]) is less formal than the ontologies that are commonly found in the SW community. HINs have found applications in many of the domains that KGs have, including social media, healthcare and bibliographic domains ^[76]. To take healthcare as an example, Ding et al. ^[77] propose considering a biological system to be a ‘complex HIN’ that can be used to explore heterogeneous and complicated relationships between biological entities such as molecules to study distinct phenotypes. This treatment of HINs is reminiscent of domain-specific KGs, especially in biology and medicine (including recently proposed KGs for COVID-19) ^[78]^[27]. HINs have also been applied to recommender systems ^[79], as well as for tasks such as sentiment link prediction and learning structure-aware embeddings ^[80]^[81].

Last but not least, because efficient querying is an important problem in KG research ^[82], techniques developed by the database community, especially in query reformulation and graph databases, have also been influential ^[83]^[84]. Indeed, as argued in a synthesis lecture series on querying graphs ^[85], executing queries on modern graph database systems involves a ‘complete lifecycle’ of processing, with relevant topics of research including graph data models and query languages, graph constraints, query specification and formulation, and query processing. Many challenges remain open in the community, including defining schemas for property graphs, understanding graph representations in a comprehensive and comparative framework, understanding and formalizing advanced graph query optimization techniques, and efficiently evaluating certain classes of queries. These topics are directly relevant to building, maintaining and optimizing KG access (which is fundamentally a graph querying problem), and they continue to be explored in the database community (in particular), with recently published work including ^[86]^[87]^[88]^[89].
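The view of KG access as a graph querying problem can be illustrated with a naive evaluator for conjunctive triple patterns, the core building block of graph query languages. This is a sketch under stated assumptions: the `?`-prefixed variable convention, the data and both functions are hypothetical, all terms are assumed to be strings, and the nested-loop strategy ignores the indexing and query optimization that real triplestores and graph databases rely on.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check consistency with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns by nested-loop matching."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match(pattern, t, b)) is not None
        ]
    return bindings


triples = [
    ("alice", "employed_at", "acme"),
    ("bob", "employed_at", "acme"),
    ("carol", "employed_at", "initech"),
    ("acme", "located_in", "paris"),
]
```

For example, `query([("?p", "employed_at", "?org"), ("?org", "located_in", "paris")], triples)` returns one binding each for ‘alice’ and ‘bob’. The shared variable `?org` acts as a join between the two patterns, which is where query optimization effort is typically spent.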

This entry is adapted from the peer-reviewed paper 10.3390/info13040161

References

Ehrlinger, L.; Wöß, W. Towards a Definition of Knowledge Graphs. SEMANTiCS (Posters Demos SuCCESS) 2016, 48, 1–4.
Kejriwal, M. Domain-Specific Knowledge Graph Construction; Springer: Berlin/Heidelberg, Germany, 2019.
Hitzler, P. A review of the semantic web field. Commun. ACM 2021, 64, 76–83.
Kejriwal, M.; Szekely, P. Information extraction in illicit web domains. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 997–1006.
Lin, B.Y.; Sheng, Y.; Vo, N.; Tata, S. Freedom: A transferable neural architecture for structured information extraction on web documents. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, CA, USA, 6–10 July 2020; pp. 1092–1102.
Zhu, Y.; Zhang, C.; Ré, C.; Fei-Fei, L. Building a large-scale multimodal knowledge base system for answering visual queries. arXiv 2015, arXiv:1507.05670.
Zhu, X.; Li, Z.; Wang, X.; Jiang, X.; Sun, P.; Wang, X.; Xiao, Y.; Yuan, N.J. Multi-Modal Knowledge Graph Construction and Application: A Survey. arXiv 2022, arXiv:2202.05786.
Grishman, R.; Sundheim, B.M. Message understanding conference-6: A brief history. In Proceedings of the COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics, Copenhagen, Denmark, 10–14 August 1996.
Nadeau, D.; Sekine, S. A survey of named entity recognition and classification. Lingvist. Investig. 2007, 30, 3–26.
Li, J.; Sun, A.; Han, J.; Li, C. A survey on deep learning for named entity recognition. IEEE Trans. Knowl. Data Eng. 2020, 34, 50–70.
Smirnova, A.; Cudré-Mauroux, P. Relation extraction using distant supervision: A survey. ACM Comput. Surv. (CSUR) 2018, 51, 1–35.
Kumar, S. A survey of deep learning methods for relation extraction. arXiv 2017, arXiv:1705.03645.
Cai, Z.; Zhao, K.; Zhu, K.Q.; Wang, H. Wikification via link co-occurrence. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, San Francisco, CA, USA, 27 October–1 November 2013; pp. 1087–1096.
Shen, W.; Wang, J.; Han, J. Entity linking with a knowledge base: Issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 2014, 27, 443–460.
Sukthanker, R.; Poria, S.; Cambria, E.; Thirunavukarasu, R. Anaphora and coreference resolution: A review. Inf. Fusion 2020, 59, 139–162.
Clark, J.H.; Choi, E.; Collins, M.; Garrette, D.; Kwiatkowski, T.; Nikolaev, V.; Palomaki, J. TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages. Trans. Assoc. Comput. Linguist. 2020, 8, 454–470.
Niklaus, C.; Cetto, M.; Freitas, A.; Handschuh, S. A survey on open information extraction. arXiv 2018, arXiv:1806.05599.
Nguyen, T.H.; Cho, K.; Grishman, R. Joint event extraction via recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 300–309.
Nardi, D.; Brachman, R.J. An introduction to description logics. Descr. Log. Handb. 2003, 1, 40.
Baader, F.; Horrocks, I.; Sattler, U. Description logics as ontology languages for the semantic web. In Mechanizing Mathematical Reasoning; Springer: Berlin/Heidelberg, Germany, 2005; pp. 228–248.
Berners-Lee, T.; Hendler, J.; Lassila, O. The semantic web. Sci. Am. 2001, 284, 34–43.
Bizer, C.; Heath, T.; Berners-Lee, T. Linked data: The story so far. In Semantic Services, Interoperability and Web Applications: Emerging Concepts; IGI Global: Pennsylvania, PA, USA, 2011; pp. 205–227.
Miller, E. An introduction to the resource description framework. Bull. Am. Soc. Inf. Sci. Technol. 1998, 25, 15–19.
Bauer, F.; Kaltenböck, M. Linked open data: The essentials. Ed. Mono Monochrom Vienna 2011, 710. Available online: https://africa-toolkit.reeep.org/sites/default/files/LOD-the-Essentials_0.pdf (accessed on 17 March 2022).
Nechaev, Y.; Corcoglioniti, F.; Giuliano, C. Type prediction combining linked open data and social media. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Torino, Italy, 22–26 October 2018; pp. 1033–1042.
Sansonetti, G.; Gasparetti, F.; Micarelli, A.; Cena, F.; Gena, C. Enhancing cultural recommendations through social and linked open data. User Model. User-Adapt. Interact. 2019, 29, 121–159.
Jupp, S.; Malone, J.; Bolleman, J.; Brandizi, M.; Davies, M.; Garcia, L.; Gaulton, A.; Gehant, S.; Laibe, C.; Redaschi, N.; et al. The EBI RDF platform: Linked open data for the life sciences. Bioinformatics 2014, 30, 1338–1339.
Kamdar, M.R.; Fernández, J.D.; Polleres, A.; Tudorache, T.; Musen, M.A. Enabling web-scale data integration in biomedicine through linked open data. NPJ Digit. Med. 2019, 2, 1–14.
Chiarcos, C.; McCrae, J.; Cimiano, P.; Fellbaum, C. Towards open data for linguistics: Linguistic linked data. In New Trends of Research in Ontologies and Lexical Resources; Springer: Berlin/Heidelberg, Germany, 2013; pp. 7–25.
Bond, F.; Foster, R. Linking and extending an open multilingual wordnet. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Sofia, Bulgaria, 4–9 August 2013; pp. 1352–1362.
Kejriwal, M. Populating entity name systems for big data integration. In International Semantic Web Conference; Springer: Berlin/Heidelberg, Germany, 2014; pp. 521–528.
Getoor, L.; Machanavajjhala, A. Entity resolution for big data. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA, 11 August 2013; p. 1527.
Kejriwal, M.; Miranker, D.P. An unsupervised instance matcher for schema-free RDF data. J. Web Semant. 2015, 35, 102–123.
Sakr, S.; Al-Naymat, G. Relational processing of RDF queries: A survey. ACM SIGMOD Rec. 2010, 38, 23–28.
Bebee, B.R.; Choi, D.; Gupta, A.; Gutmans, A.; Khandelwal, A.; Kiran, Y.; Mallidi, S.; McGaughy, B.; Personick, M.; Rajan, K.; et al. Amazon Neptune: Graph Data Management in the Cloud. In Proceedings of the International Semantic Web Conference (P&D/Industry/BlueSky), Monterey, CA, USA, 8–12 October 2018.
Xiao, G.; Ding, L.; Cogrel, B.; Calvanese, D. Virtual knowledge graphs: An overview of systems and use cases. Data Intell. 2019, 1, 201–223.
Richardson, M.; Domingos, P. Markov logic networks. Mach. Learn. 2006, 62, 107–136.
Marra, G.; Kuželka, O. Neural markov logic networks. In Proceedings of the Uncertainty in Artificial Intelligence, Online, 27–30 July 2021; pp. 908–917.
Pearl, J. Bayesian Networks. 2011. Available online: https://ftp.cs.ucla.edu/pub/stat_ser/r277.pdf (accessed on 17 March 2022).
Koski, T.; Noble, J. Bayesian Networks: An Introduction; John Wiley & Sons: Hoboken, NJ, USA, 2011; Volume 924.
Bordes, A.; Weston, J.; Collobert, R.; Bengio, Y. Learning structured embeddings of knowledge bases. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 7–11 August 2011; Volume 25.
Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge graph embedding: A survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 2017, 29, 2724–2743.
Choudhary, S.; Luthra, T.; Mittal, A.; Singh, R. A survey of knowledge graph embedding and their applications. arXiv 2021, arXiv:2107.07842.
Dai, Y.; Wang, S.; Xiong, N.N.; Guo, W. A survey on knowledge graph embedding: Approaches, applications and benchmarks. Electronics 2020, 9, 750.
Paulheim, H. Knowledge graph refinement: A survey of approaches and evaluation methods. Semant. Web 2017, 8, 489–508.
Huang, X.; Zhang, J.; Li, D.; Li, P. Knowledge graph embedding based question answering. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, Melbourne, Australia, 11–15 February 2019; pp. 105–113.
Guan, N.; Song, D.; Liao, L. Knowledge graph embedding with concepts. Knowl.-Based Syst. 2019, 164, 38–44.
Kejriwal, M.; Szekely, P. Neural embeddings for populated geonames locations. In International Semantic Web Conference; Springer: Berlin/Heidelberg, Germany, 2017; pp. 139–146.
Ristoski, P.; Paulheim, H. Rdf2vec: Rdf graph embeddings for data mining. In International Semantic Web Conference; Springer: Berlin/Heidelberg, Germany, 2016; pp. 498–514.
Nayyeri, M.; Vahdati, S.; Zhou, X.; Shariat Yazdi, H.; Lehmann, J. Embedding-based recommendations on scholarly knowledge graphs. In European Semantic Web Conference; Springer: Berlin/Heidelberg, Germany, 2020; pp. 255–270.
Liu, Y.; Hua, W.; Xin, K.; Zhou, X. Context-aware temporal knowledge graph embedding. In International Conference on Web Information Systems Engineering; Springer: Berlin/Heidelberg, Germany, 2020; pp. 583–598.
Xu, C.; Nayyeri, M.; Alkhoury, F.; Yazdi, H.S.; Lehmann, J. TeRo: A time-aware knowledge graph embedding via temporal rotation. arXiv 2020, arXiv:2010.01029.
Jia, N.; Cheng, X.; Su, S. Improving knowledge graph embedding using locally and globally attentive relation paths. In European Conference on Information Retrieval; Springer: Berlin/Heidelberg, Germany, 2020; pp. 17–32.
Xiong, S.; Huang, W.; Duan, P. Knowledge graph embedding via relation paths and dynamic mapping matrix. In International Conference on Conceptual Modeling; Springer: Berlin/Heidelberg, Germany, 2018; pp. 106–118.
Guo, S.; Wang, Q.; Wang, L.; Wang, B.; Guo, L. Jointly embedding knowledge graphs and logical rules. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 192–202.
Wang, P.; Dou, D.; Wu, F.; de Silva, N.; Jin, L. Logic rules powered knowledge graph embedding. arXiv 2019, arXiv:1903.03772.
Guo, S.; Wang, Q.; Wang, L.; Wang, B.; Guo, L. Knowledge graph embedding with iterative guidance from soft rules. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
Koller, D.; Friedman, N. Probabilistic Graphical Models: Principles and Techniques; MIT Press: Cambridge, MA, USA, 2009.
Kimmig, A.; Bach, S.; Broecheler, M.; Huang, B.; Getoor, L. A short introduction to probabilistic soft logic. In Proceedings of the NIPS Workshop on Probabilistic Programming: Foundations and Applications, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1–4.
Pujara, J.; Miao, H.; Getoor, L.; Cohen, W. Knowledge graph identification. In International Semantic Web Conference; Springer: Berlin/Heidelberg, Germany, 2013; pp. 542–557.
Wang, H.; Zhang, F.; Zhang, M.; Leskovec, J.; Zhao, M.; Li, W.; Wang, Z. Knowledge-aware graph neural networks with label smoothness regularization for recommender systems. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 968–977.
Palumbo, E.; Rizzo, G.; Troncy, R. Entity2rec: Learning user-item relatedness from knowledge graphs for top-n item recommendation. In Proceedings of the Eleventh ACM Conference on Recommender Systems, Como, Italy, 27–31 August 2017; pp. 32–36.
Gao, Y.; Li, Y.F.; Lin, Y.; Gao, H.; Khan, L. Deep learning on knowledge graph for recommender system: A survey. arXiv 2020, arXiv:2004.00387.
Yin, R.; Li, K.; Zhang, G.; Lu, J. A deeper graph neural network for recommender systems. Knowl.-Based Syst. 2019, 185, 105020.
Zhang, J.; Shi, X.; Zhao, S.; King, I. STAR-GCN: Stacked and reconstructed graph convolutional networks for recommender systems. arXiv 2019, arXiv:1905.13129.
Song, W.; Xiao, Z.; Wang, Y.; Charlin, L.; Zhang, M.; Tang, J. Session-based social recommendation via dynamic graph attention networks. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, Melbourne, Australia, 11–15 February 2019; pp. 555–563.
Friedrich, G.; Zanker, M. A taxonomy for generating explanations in recommender systems. AI Mag. 2011, 32, 90–98.
Kejriwal, M.; Shen, K.; Ni, C.C.; Torzec, N. Transfer-based taxonomy induction over concept labels. Eng. Appl. Artif. Intell. 2022, 108, 104548.
Kejriwal, M.; Shen, K.; Ni, C.C.; Torzec, N. An evaluation and annotation methodology for product category matching in e-commerce. Comput. Ind. 2021, 131, 103497.
Kanagal, B.; Ahmed, A.; Pandey, S.; Josifovski, V.; Yuan, J.; Garcia-Pueyo, L. Supercharging recommender systems using taxonomies for learning user purchase behavior. arXiv 2012, arXiv:1207.0136.
Kejriwal, M.; Selvam, R.K.; Ni, C.C.; Torzec, N. Locally constructing product taxonomies from scratch using representation learning. In Proceedings of the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), The Hague, The Netherlands, 7–10 December 2020; pp. 507–514.
Kejriwal, M.; Shen, K. Unsupervised real-time induction and interactive visualization of taxonomies over domain-specific concepts. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Virtual, The Netherlands, 8–11 November 2021; pp. 301–304.
Liang, H.; Xu, Y.; Li, Y.; Nayak, R.; Weng, L.T. Personalized recommender systems integrating social tags and item taxonomy. In Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, Milan, Italy, 15–18 September 2009; Volume 1, pp. 540–547.
Wu, Z.; Pan, S.; Chen, F.; Long, G.; Zhang, C.; Yu, P.S. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4–24.
Shi, C.; Li, Y.; Zhang, J.; Sun, Y.; Yu, P.S. A survey of heterogeneous information network analysis. IEEE Trans. Knowl. Data Eng. 2016, 29, 17–37.
Sun, Y.; Han, J. Mining heterogeneous information networks: Principles and methodologies. Synth. Lect. Data Min. Knowl. Discov. 2012, 3, 1–159.
Ding, P.; Ouyang, W.; Luo, J.; Kwoh, C.K. Heterogeneous information network and its application to human health and disease. Brief. Bioinform. 2020, 21, 1327–1346.
Domingo-Fernández, D.; Baksi, S.; Schultz, B.; Gadiya, Y.; Karki, R.; Raschka, T.; Ebeling, C.; Hofmann-Apitius, M.; Kodamullil, A.T. COVID-19 Knowledge Graph: A computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology. Bioinformatics 2021, 37, 1332–1334.
Shi, C.; Hu, B.; Zhao, W.X.; Yu, P.S. Heterogeneous information network embedding for recommendation. IEEE Trans. Knowl. Data Eng. 2018, 31, 357–370.
Wang, H.; Zhang, F.; Hou, M.; Xie, X.; Guo, M.; Liu, Q. SHINE: Signed heterogeneous information network embedding for sentiment link prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, 5–9 February 2018; pp. 592–600.
Lu, Y.; Shi, C.; Hu, L.; Liu, Z. Relation structure-aware heterogeneous information network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 4456–4463.
Goasdoué, F.; Manolescu, I.; Roatiş, A. Efficient query answering against dynamic RDF databases. In Proceedings of the 16th International Conference on Extending Database Technology, Genoa, Italy, 18–22 March 2013; pp. 299–310.
Tatarinov, I.; Halevy, A. Efficient query reformulation in peer data management systems. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, Paris, France, 13–18 June 2004; pp. 539–550.
Angles, R.; Gutierrez, C. Survey of graph database models. ACM Comput. Surv. (CSUR) 2008, 40, 1–39.
Bonifati, A.; Fletcher, G.; Voigt, H.; Yakovets, N. Querying graphs. Synth. Lect. Data Manag. 2018, 10, 1–184.
Sakr, S.; Bonifati, A.; Voigt, H.; Iosup, A.; Ammar, K.; Angles, R.; Aref, W.; Arenas, M.; Besta, M.; Boncz, P.A.; et al. The future is big graphs: A community view on graph processing systems. Commun. ACM 2021, 64, 62–71.
Angles, R.; Bonifati, A.; Dumbrava, S.; Fletcher, G.; Hare, K.W.; Hidders, J.; Lee, V.E.; Li, B.; Libkin, L.; Martens, W.; et al. PG-Keys: Keys for property graphs. In Proceedings of the 2021 International Conference on Management of Data, Virtual Event, Xi’an, China, 20–25 June 2021; pp. 2423–2436.
Klijn, E.L.; Mannhardt, F.; Fahland, D. Classifying and detecting task executions and routines in processes using event graphs. In International Conference on Business Process Management; Springer: Berlin/Heidelberg, Germany, 2021; pp. 212–229.
Lbath, H.; Bonifati, A.; Harmer, R. Schema inference for property graphs. In Proceedings of EDBT 2021, the 24th International Conference on Extending Database Technology, Nicosia, Cyprus, 23–26 March 2021; pp. 499–504.

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.