Within the set of knowledge organization systems (KOS), the term “ontology” is paradigmatic of the terminological ambiguity found across different typologies. Contributing to this situation is the indiscriminate use of the term “ontology” to denote both a specific type of KOS and a process of categorization, a consequence of its interdisciplinary use with different meanings. We present a systematization of the perspectives of different authors on ontologies as representational artifacts, seeking to contribute to terminological clarification. Focusing the analysis on the intention, semantics and modeling of ontologies, we identify two broad perspectives on ontologies as artifacts that coexist in the KOS spectrum. Ontologies are viewed, on the one hand, as an evolution of traditional conceptual systems in terms of complexity and, on the other hand, as systems that organize ontological rather than epistemological knowledge. The focus of ontological analysis is the item to be modeled and not the intentions that motivate the construction of the system.
In the Knowledge Organization (KO) / Information Science (IS) community several authors, such as
, see the work on ontology coming from the computing field as a kind of reinvention of the wheel or an etymological issue, since it concerns classification and other well-known aspects of knowledge organization processes. This situation is reflected in the differing views on the typology of ontologies as representational artifacts of the kind known as knowledge organization systems (KOS). Considering the different technical, structural and functional characteristics of KOS, Mazzocchi
presents as a common denominator the function for which these “semantic tools” were designed: supporting the organization of knowledge and information in order to facilitate their management and retrieval.
Within the KOS set, the term ontology is paradigmatic of the terminological ambiguity found in different typologies. Pieterse and Kourie
(p. 227), e.g., state that the term ontology “refer to a KOS that can be classified as a relationship list in Hodge’s classification [and] which is classified as a thesaurus in our classification.” Unlike these authors, Biagetti
(sec. 3.1) considers that “ontologies are a kind of KOS that present the highest degree of semantic richness, as they allow to establish a great number of relations between terms, and provide attributes for each class.” Hjørland
(sec. 3.3) also sees ontologies as a different kind of KOS, “more general and more abstract” than other “traditional” KOS but, for the author, these “may just be understood as being restricted kinds of ontologies.” Contributing to the latter position is the indiscriminate use of the term ontology to denote both a specific type of KOS and a categorization process, the latter considered by Smiraglia
as one of the pillars in the development of any KOS. This lack of differentiation is advised against by Souza and others
(p. 187): “it might be asserted that all KOS are the products of some kind of ontological modeling, but using the term ‘ontologies’ arbitrarily can cause confusion.” In addition to the KO / IS and Computer Science areas, a third area, philosophy, must be brought into the debate to understand why the term ontology is also used for a categorization process.
Given the interdisciplinary nature of this topic, terminology issues are of vital importance for proper communication between different communities. In this context, we aim to present a systematization of the perspectives of different authors on ontologies as representational artifacts, seeking to contribute to terminological clarification. In addition to this introduction, the paper contains three more sections. In the second section, we present a brief historical contextualization of the term “ontology”, for a better understanding of the interdisciplinary issue. Then, in the third section, we present the systematization referred to above, focusing on the intention, semantics and modeling of ontologies, based on the two major approaches detected in the different perspectives. Finally, in the fourth section, we synthesize the insights of the present work.
The term ontology appeared in the 17th century and is attributed, in parallel but without known connection, to Rudolf Göckel and Jacob Lorhard. The term only truly began to spread, however, after the publication of Christian Wolff's Philosophia prima sive Ontologia in 1730 [12][13]. In that work, Wolff gave the name ontology to Aristotle's “first philosophy” [14]. The object of study of this discipline, which the philosopher also called “first science,” “wisdom” or “theology,” is now also called “Aristotle's Metaphysics” [15]. In short, this work of Aristotle consisted in the systematization and categorization of all the entities that exist in the world [16][17].
While in the context of Philosophy the term ontology is related to a process (study or analysis), in the context of areas linked to knowledge organization/information systems the same term is associated with a product, an artifact. In this context, the term ontology can designate either a concrete system or, in a more abstract sense, a theory, and either can be expressed in a formal (logic-based) or informal format [16].
It is in the area of Artificial Intelligence (AI) that the term ontology makes the transition from philosophy to the areas linked to information systems. Mealy was the first, in 1967, to use the term ontology in this new context [18]. He did so, however, in the philosophical sense of the term, referring to ontological analysis, because it would enable a better understanding of the structure of the world and thus facilitate modeling it, or part of it, in computational terms [19]. In this same context, another early milestone was the work of Hayes [20] in the following decades, seeking “an adequate theory of the world of common sense” [21] (p. 225) for application in robotics.
Despite those early uses of the term ontology to designate a “theory of a modeled world” [22] (p. 1964), the first formulation of a definition of ontology in the context of information systems appeared only in a study by Neches and others published in 1991: “The ontology of a system consists of its vocabulary and a set of constraints on the way terms can be combined to model a domain.” [23] (p. 40). Some authors [24] consider this study to be a pioneer in the area of IS, despite the connection of its authors to the field of AI. Others, e.g. [1][25], point instead to an author openly associated with IS, B.C. Vickery, and his 1997 work, in which he claims that the issues faced by “ontological engineers” are not new to the IS community [2] (p. 285).
Among the group of researchers who participated in Neches' study is T.R. Gruber, whose definition of ontology, an “explicit specification of a conceptualization” [26][27] (p. 199, p. 908), would become paradigmatic, particularly in areas directly related to computing [28]. This definition was published online in 1992; there the author recognizes the philosophical origin of the term but explicitly departs from its meaning in that context: “this definition is consistent with the usage of ontology as set-of-concept-definitions, but more general. And it is certainly a different sense of the word than its use in philosophy.” [29] (n/p.) In the formal publication of this definition, Gruber also points out the difference between his approach and the one taken in the original context of the term: “An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of Existence. For knowledge-based systems, what ‘exists’ is exactly that which can be represented.” [26][27] (p. 199, p. 908, italics in original) The key term of this definition is conceptualization, used in the sense given by Genesereth and Nilsson [30] (p. 9, italics in original): “The formalization of knowledge in declarative form begins with a conceptualization. This includes the objects presumed or hypothesized to exist in the world and their interrelationships. The notion of an object used here is quite broad.”
Gruber's definition created an almost synonymous association between the terms ontology and conceptualization. This association, potentially enhanced by the role of “recognized authoritative” that Gruber appears to play in his field [28] (pp. 206-207), contributed to the very broad use of the term ontology when applied to representational artifacts. From systems as simple as catalogs, or slightly more complex ones such as glossaries, through taxonomies and thesauri, to the most expressive ones, which use the axioms of full first order, higher order, or modal logic, “all these types of information systems satisfy Gruber’s definition, and all are now common bedfellows under the rubric of ‘ontology.’” [31] (p. vi) In the words of Grenon [32] (p. 69): “the term ‘ontology’ applies to virtually any structure resembling, to some extent, a set of terms hierarchically organized which may be put in a machine-processable format.”
Despite this scenario, it is possible to notice two broad perspectives regarding ontologies as artifacts. Definitions like the one proposed by Studer and others [33] (p. 184), “an ontology is a formal, explicit specification of a shared conceptualization,” exemplify one of these positions. Exemplifying the other, we have the proposal by Arp and others [34] (p. 1): “ontology = def. a representational artifact, comprising a taxonomy as proper part, whose representations are intended to designate some combination of universals, defined classes, and certain relations between them.” Several authors distinguish these two approaches in different ways, which can be seen as complementary aspects of a possible analysis of these representational artifacts.
This section presents three common dichotomies: (i) reference vs. application ontologies; (ii) description logics vs. resource description framework (RDF)-based semantics; and (iii) ontological vs. conceptual models. These dichotomies can be understood as the result of a faceted analysis with the following criteria: (i) general objective of the ontology; (ii) formal language used; and (iii) modeling approach applied.
Smith [35] and Jansen [36] distinguish two types of ontologies: reference ontologies, for use in scientific domains, and application ontologies, for practical and specific purposes. While ad-hoc development can be applied to the latter and is in some cases even unavoidable, this approach is ruled out entirely for the former [35][36]. Reference ontologies, designed to serve scientific purposes, should be developed according to the “principle of orthogonality,” which would not only address the data silos problem but would also bring additional benefits such as: mutual consistency of ontologies; no need for mappings between ontologies; reduced redundancy; easier findability of specific ontological resources; and optimized management of the division of ontological labor [35].
Essential to this approach to reference ontologies is its articulation with ontological study as practiced in the term's discipline of origin: “information-systems ontology is itself an enormous new field of practical application that is crying out to be explored by the methods of rigorous philosophy.” [12] This articulation is seen as necessary by several researchers, e.g. [16][37][38], for the development of artifact ontologies that seek ontological rigor in the representation of reality. In contrast, in Gruber's understanding, rigor seems less important than the usefulness of the ontology: “if ontologies are engineered things, then we don't have to worry so much about whether they are right and get on with the business of building them to do something useful.” [39] (p. 1) For Smith [35] (p. 33), this means that “it is as if all ontologies, both inside and outside science, are assigned by default the status of application ontologies.”
Linked to the “application” approach is the conception that all ontologies are the result of the common agreement of a community over a portion of the world. As Gruber [39] (p. 5) states: “I find it critical to remember that every ontology is a treaty—a social agreement—among people with some common motive in sharing.” This conception is questioned by Poli and Obrst since this result is usually obtained by the lowest common denominator whose utility will be quite doubtful; “because it is inconsistent, has uneven and wrong levels of granularity, and doesn’t capture real semantic variances that are crucial for adoption by members of a community.” [37] (p. 10)
Also, with regard to interoperability, these application ontologies have limitations that derive from the ad-hoc way in which they are usually constructed [36][40]. If the system is developed in this ad-hoc way, where its quality is essentially measured by the extent to which the needs of the various stakeholders are met [41], the ontological aspects lose their relevance in the modeling process. Alternatively, several application ontologies can be mapped to each other if they are developed “through a choice or combination of types from the reference ontology that are appropriate to the respective aim.” [36] (p. 171) In this situation the reference ontology will serve as a common benchmark for the application ontologies.
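The role of a reference ontology as a common benchmark can be illustrated with a minimal Python sketch. All names in it (the reference types, the two application vocabularies, and the helper functions) are hypothetical and chosen only for illustration; it assumes that each application term is anchored to exactly one reference type, which is a simplification of the "choice or combination of types" described in [36].

```python
# Hypothetical reference types, invented for this illustration.
REFERENCE_TYPES = {"Person", "Organization", "Document"}

# Two hypothetical application ontologies: local term -> reference type it is built from.
LIBRARY_APP = {"Author": "Person", "Publisher": "Organization", "Book": "Document"}
ARCHIVE_APP = {"Creator": "Person", "Record": "Document"}


def unanchored_terms(app, reference_types):
    """Return local terms whose anchor is not a known reference type."""
    return sorted(term for term, ref in app.items() if ref not in reference_types)


def derive_mapping(app_a, app_b):
    """Pair terms of two application ontologies that share a reference type."""
    by_reference = {}
    for term, ref in app_b.items():
        by_reference.setdefault(ref, []).append(term)
    return {
        term: sorted(by_reference[ref])
        for term, ref in app_a.items()
        if ref in by_reference
    }


mapping = derive_mapping(LIBRARY_APP, ARCHIVE_APP)
print(mapping)
```

In this sketch the mapping between the two application vocabularies is never written by hand: it falls out of the shared anchoring. A term such as "Publisher", whose reference type is not used by the other vocabulary, simply has no counterpart.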
Kless and others [42] make a similar distinction between ontologies associated with description logic (DL) semantics and ontologies associated with RDF (Resource Description Framework) based semantics. The latter type of semantics is commonly associated with what are called lightweight ontologies, a term that, according to Zhu and Madnick [43] (p. 9), is used in the literature in a very loose way: “data dictionaries, product catalogs, and topic maps are often considered to be lightweight ontologies.” The authors add that “generally speaking, a lightweight ontology refers to a set of concepts organized in a hierarchy with is_a relationships [and in opposition] are formal ontologies, which often use formal logic to specify constraints, relationships, and other rules that apply to the concepts.” [43] (p. 9)
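The contrast drawn by Zhu and Madnick can be sketched in a few lines of Python. The concepts, the single disjointness constraint, and the function names below are hypothetical; the sketch assumes an acyclic hierarchy with one parent per concept and is not meant as an implementation of any ontology language.

```python
# Hypothetical lightweight ontology: concept -> direct is_a parent (None for the root).
IS_A = {
    "Novel": "Book",
    "Book": "Document",
    "Report": "Document",
    "Document": None,
}

# A hypothetical formal constraint that the bare hierarchy cannot express.
DISJOINT = [{"Book", "Report"}]


def ancestors(concept, is_a):
    """Follow is_a links upward and return every broader concept (assumes no cycles)."""
    found = []
    parent = is_a.get(concept)
    while parent is not None:
        found.append(parent)
        parent = is_a.get(parent)
    return found


def subsumes(broader, narrower, is_a):
    """True if 'narrower' is the same as, or a descendant of, 'broader'."""
    return broader == narrower or broader in ancestors(narrower, is_a)


def violates_disjointness(asserted, is_a, disjoint):
    """True if the asserted concepts, with their ancestors, fall under two disjoint concepts."""
    expanded = set(asserted)
    for concept in asserted:
        expanded.update(ancestors(concept, is_a))
    return any(pair <= expanded for pair in disjoint)


print(subsumes("Document", "Novel", IS_A))
print(violates_disjointness({"Novel", "Report"}, IS_A, DISJOINT))
```

With the is_a hierarchy alone, only subsumption questions can be answered. Detecting that something cannot be both a novel and a report requires the additional constraint, which is the kind of rule the quoted passage attributes to formal ontologies.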
The proliferation of lightweight ontologies from the 1990s onward was due to the need to provide the Web with systems capable of automatic inference and interoperability as a means to achieve the so-called Semantic Web (SW). The ability to generate inferences is, however, constrained by the semantic expressiveness of the language used, which in the case of RDF is quite limited. Even with the increase provided by the RDFS extension (RDF-Schema), it is still a Web-oriented language and not a SW language [44]. RDF treats individuals (instances), classes (types) and properties (attributes or relations) alike, viewing them all as “resources.” In DL, the abstract description of domain knowledge, that is, the structural and intensional component (the terminology, known as the TBox), is kept separate from the description of facts about objects / individuals (the assertions, called the ABox). The TBox represents the schema or taxonomy of the domain of knowledge, and the ABox contains the assertions (attributes, roles, etc.) about instances, including their membership in the classes of the TBox [45]. Ontologies with RDF-based semantics do not make a clear metaphysical distinction between the different elements, which is why ontologies that “are the result of ontologically driven design processes and aim at reality representation” are likely to be published using DL semantics [42] (sec. 2). Without this ontological rigor there is no basis for avoiding false inferences when two ontologies are combined [42]. This potential lack of interoperability perpetuates the problem of “data silos,” to which ontologies should be a solution rather than a contributor.
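The difference between treating everything as a “resource” and separating terminology from assertions can be made concrete with a toy Python sketch. It uses plain tuples and dictionaries rather than any RDF or DL library, all names are hypothetical, and the strict rule that an individual may not reuse a class name is an assumption made here for illustration rather than a statement about any particular DL dialect.

```python
# RDF-style view: one flat set of triples in which every name is just a "resource".
TRIPLES = {
    ("Eagle", "subClassOf", "Bird"),
    ("Eagle", "type", "Species"),  # Eagle used as an individual
    ("Harry", "type", "Eagle"),    # Eagle used as a class
}


def dual_role_resources(triples):
    """Names that appear both as a class and as an instance in the triples."""
    classes, instances = set(), set()
    for subject, predicate, obj in triples:
        if predicate == "type":
            instances.add(subject)
            classes.add(obj)
        elif predicate == "subClassOf":
            classes.update((subject, obj))
    return classes & instances


# DL-style view: terminology (TBox) and assertions (ABox) are kept apart.
TBOX = {"Eagle": {"Bird"}, "Bird": {"Animal"}, "Animal": set()}  # class -> direct superclasses
ABOX = {"Harry": {"Eagle"}}                                       # individual -> asserted classes


def inferred_classes(individual, tbox, abox):
    """All classes an individual belongs to, following TBox subsumption."""
    pending = list(abox.get(individual, ()))
    result = set()
    while pending:
        cls = pending.pop()
        if cls not in result:
            result.add(cls)
            pending.extend(tbox.get(cls, ()))
    return result


def abox_is_well_formed(tbox, abox):
    """Check that no individual reuses a class name and every asserted class is declared."""
    return all(
        individual not in tbox and classes.issubset(tbox)
        for individual, classes in abox.items()
    )


print(dual_role_resources(TRIPLES))
print(sorted(inferred_classes("Harry", TBOX, ABOX)))
print(abox_is_well_formed(TBOX, ABOX))
```

In the flat triple set nothing prevents "Eagle" from acting as both a class and an instance, and the ambiguity can only be detected after the fact. In the separated form, the same check becomes a precondition on the ABox, and class membership is inferred only through the declared terminology.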
Despite the limitations described above it is these RDFS-based “light ontologies” that underlie what might be called the “web of linked data” which, although far behind the intended “Semantic Web,” is a non-negligible achievement [46]. It is also possible to see in this development of the Web a “democratization” of knowledge representation in this digital environment, meeting the original vision of Berners-Lee for it [47].
Yet another distinction can be made between the result of two different approaches in knowledge representation that Grenon [32] calls realist representationalism and pragmatist conceptualism. While in the former the purpose is to capture the categories of the actual world, in what we can call an ontological model, in the latter the intention is to represent our conceptualization of a real or imaginary world, resulting in what may be called a conceptual model. Although there are similarities, the processes differ: “while conceptual modeling seeks to establish relationships between the abstract concepts of a domain, ontological modeling aims to identify objects and understand their nature through the description of their properties” [48] (p. 243, original in Portuguese).
An ontological model will result from the application of a philosophically well-founded ontological theory in an information system following rigorous ontological principles, such as whole-part theory, types and instantiation, identity, and unity [16]. This endeavor is admittedly complex, bringing, however, advantages in terms of stability and coherence of the developed model [12]. The ontological principles can also be applied in the development of a conceptual model although more focused on the logical aspects that ensure the internal consistency of the model. The process of ontological analysis, in this case, does not necessarily seek an adaptation to reality in the same way as it is carried out in philosophical ontology.
Conceptual models are understood as explicit descriptions of mental models, which are considered to be “partial accounts of the external reality, filtered through the lens of a conceptualization, that people use to interact with the world around them.” [49] (p. 4) The interaction between the individual and the surroundings involves a system of concepts, the conceptualization, “in terms of which the corresponding universe of discourse is divided up into objects, processes, and relations in different sorts of ways,” which can be specified “to render explicit the underlying taxonomy” [12] (pp. 161-162). This can be a strictly pragmatic or epistemological enterprise when conceived as consisting only in that of representing others’ conceptualization, so that reality falls out of the picture almost entirely [12][32].
However, the conceptual model can be built up from a rigorous ontological analysis: “ontological modeling can constitute a basis for conceptual modeling in order to provide the developer, clearly and unambiguously, with the necessary knowledge about the domain to be modeled.” [48] (p. 243, original in Portuguese) This approach can be seen as a way to provide what Guarino and others [49] (p. 8) call a “grounding requirement” for conceptual models, which can be understood as a “sort of completeness requirement” for them. Ultimately, as Poli and Obrst [37] (p. 6) say: “without ontology, there is no firm basis for epistemology.”
As we have seen, the two approaches to ontologies differ in each of the three facets described. In the first facet, where the artifacts are analyzed according to the main intention for which they were developed, we have the contrast between reference and application ontologies, which reflects what appears to be the main dichotomy in relation to the intended functionality of these artifacts as they are currently developed [32]. In the second facet, what is contrasted is the semantic expressiveness associated with the two languages (description logics and RDF-Schema). These languages are two possible implementations of formal coding that make automatic processing by computers possible [44]. The choice between them is, therefore, a procedural stage in the development of these artifacts and should not be seen as an intrinsic characteristic of ontologies [8][37]. Finally, in the third facet, the question addressed is the difference between a truly ontological reading and an essentially epistemological one. Poli and Obrst [37] (p. 3) describe the two approaches as follows: “Ontology is primarily about the entities, relations, and properties of the world, the categories of things. Epistemology is about the perceived and belief-attributed entities, relations, and properties of the world, i.e., ways of knowing or ascertaining things.”
To summarize, we have ontologies viewed, on the one hand, as a more complex conceptual system, and on the other hand, as a system that intends to organize ontological rather than epistemological knowledge. While the latter perspective maintains a relationship with the meaning of the term “ontology” coming from philosophy, the former presents a sense that deviates from the original. Although the appropriation of a term from another area of knowledge is common practice in the scientific community, especially when it involves new technologies [50], the case of changing the meaning of the term “ontology” shows traces of the process of “metaphorizing” which can be understood as “the result of encoding at the concept level. The resulting name or term for the concept can be understood in its new meaning without understanding the basis for the naming” [51] (p. 125).
The focus of ontological analysis is the item to model and not the intentions that motivate the construction of the ontology. This process involves an analytical complexity that makes the development of such systems quite onerous. However, it is the quality of this analysis that determines its true usefulness.