Two perspectives on ontologies coexist across the spectrum of knowledge organization systems. On the one hand, ontologies are viewed as an evolution, in terms of complexity, of traditional conceptual systems such as the thesaurus; on the other, as systems that organize ontological rather than epistemological knowledge. The focus of ontological analysis is the item to be modeled and not the intentions that motivate the construction of the ontology.
In the Knowledge Organization (KO) / Information Science (IS) community, several authors see the work on ontology coming from the computing field as a kind of reinvention of the wheel or an etymological issue, as it concerns classification and other well-known aspects of knowledge organization processes. This situation was reflected in the different views regarding the typology of ontologies as representational artifacts known as knowledge organization systems (KOS). Considering the different technical, structural and functional characteristics of KOS, Mazzocchi presents as a common denominator the function for which these “semantic tools” were designed: supporting the organization of knowledge and information in order to facilitate their management and retrieval.
Within the KOS set, the term ontology is paradigmatic of the terminological ambiguity found in different typologies. Pieterse and Kourie (p. 227), e.g., state that the term ontology “refer to a KOS that can be classified as a relationship list in Hodge’s classification [and] which is classified as a thesaurus in our classification.” Unlike these authors, Biagetti (sec. 3.1) considers that “ontologies are a kind of KOS that present the highest degree of semantic richness, as they allow to establish a great number of relations between terms, and provide attributes for each class.” Hjørland (sec. 3.3) also sees ontologies as a different kind of KOS, “more general and more abstract” than other “traditional” KOS but, for the author, these “may just be understood as being restricted kinds of ontologies.”
Contributing to the latter position is the indiscriminate association of the term ontology, both as a specific type of KOS and as a categorization process, considered by Smiraglia as one of the pillars in the development of any KOS. This lack of differentiation is advised against by Souza and others (p. 187): “it might be asserted that all KOS are the products of some kind of ontological modeling, but using the term ‘ontologies’ arbitrarily can cause confusion.” In addition to KO / IS and Computer Science, another area, philosophy, must be brought into the debate to understand why the term ontology is also used for a categorization process.
The term ontology appeared in the 17th century, being attributed, in parallel but without known connection, to Rudolf Göckel and Jacob Lorhard, although it was only after the publication of Philosophia prima sive Ontologia by Christian Wolff in 1730 that the term truly began to spread. In that work, Wolff gave the name ontology to Aristotle's “first philosophy”. The object of study of this discipline, which the philosopher also called “first science,” “wisdom” or “theology,” is now also called “Aristotle's Metaphysics”. In short, this work of Aristotle consisted in the systematization and categorization of all the entities that exist in the world.
While in the context of philosophy the term ontology is related to a process (study or analysis), in the context of areas linked to knowledge organization and information systems the same term is associated with a product, an artifact. In this context, the term ontology can designate either a concrete system or, in a more abstract sense, a theory; both can be in a formal (logic-based) or informal format.
Despite early uses of the term ontology [1][2] to designate a “theory of a modeled world”,[3] the first formulation of a definition of ontology in the context of information systems appeared only in 1991: “the ontology of a system consists of its vocabulary and a set of constraints on the way terms can be combined to model a domain.”[4] Some authors, e.g. [5], consider the study of Neches and others to be pioneering in the area of Information Science (IS), despite the connection of its authors, particularly Gruber, to the area of Artificial Intelligence (AI). Other authors, e.g. [6][7], point to an author openly associated with IS [8] as the first in the area to address ontology.
Different views of IS, with a more or less computational focus [9], contribute to the lack of terminological clarity, given the use of the same term, ontology, to designate artifacts that are related but distinct in their objectives. Prima facie, while in IS the emphasis is placed on building structures for document content representation and retrieval, in Computer Science (CS) the intention is to create models of the world, emphasizing the process of automated inference [10]. Directly associated with the specificity of the objectives of the two areas, some authors, e.g. [11], place the fields of study Knowledge Organization (KO) and Knowledge Representation (KR) within IS and CS, respectively. Although the distinction between these areas of study may be necessary, in the development of knowledge organization systems (KOS) it does not seem possible to separate the two processes. The organization of knowledge is a condition for its representation, which in turn functions as an instrument for the organization to be effective [12].
The set of items called KOS is vast and unclear as to the meaning of much of the terminology used for its various types, and the term ontology is paradigmatic of this ambiguity [13][14][15]. The disregard of the multiple dimensions (intrinsic and extrinsic) of the various types of KOS and the lack of differentiation between ideal KOS types and particular instances are pointed out by Souza and others [14] as causes of this lack of clarity. Considering only the modeling process, as the authors point out: “[i]t might be asserted that all KOS are the products of some kind of ontological modeling, but using the term ‘ontologies’ arbitrarily can cause confusion.”[14] Applying a similar criterion, we could also designate all KOS as ‘classifications’, since they employ the process of classifying in their development, as Soergel [16] appears to do: “other fields, such as AI, natural language processing, and software engineering, have discovered the need for classification, leading to the rise of what these fields call ontologies.” Disagreeing with Soergel's position, Hjørland [17] considers ontologies “more general and more abstract forms of KOS” than traditional ones, such as classification systems and thesauri, which, for the author, can be understood as “restricted kinds of ontologies.”
The fact that many researchers, particularly in the CS-related community, as pointed out in [18], adopt Gruber’s definition, “[a]n ontology is an explicit specification of a conceptualization,” proposed in [19] and reinforced in [20], favors the indiscriminate use of the term ontology to designate different types of KOS. Criticizing this “mechanical definition” and the purely technical guidelines of the W3C RDFS and OWL standards, Kless and others [21] state: “In fact, it implies that any KOS could simply become an ontology by simply changing its representation format.” The authors distinguish two types of ontologies: “data modeling ontologies,” tied to the RDF-based semantics, and “reality representation ontologies,” associated with the description logic semantics. They add, in relation to the latter: “[i]t can be questioned, whether they should be called ontologies at all and whether they are truly interoperable in the sense of being combinable and reusable.” [21]
Since the 1990s, the proliferation of these systems has been driven by the effort to provide the Web with systems capable of automatic inference in the context of the so-called Semantic Web (SW). The ability to generate inferences is, however, restricted by the semantic expressiveness of the language used which, in the case of RDFS, is quite limited. RDFS is a Web-oriented language and not a true SW (or KR) language.[22] Despite these limitations, it is these systems that underlie what may be called the Web of Linked Data [23], and it is possible to see in this development of the Web a “democratization” of knowledge representation [24].
In terms of interoperability and reusability, these lightweight ontologies, as they are often called, also face several limitations due to the ad-hoc approach generally employed in their construction. One consequence of the ad-hoc approach is the continuation of the siloing of data, which ontologies should solve or at least minimize. That is one of the reasons Smith [26][27] considers this the wrong way to build ontologies. While conceding that a similar approach may be used in some cases of purpose-built application ontologies, the researcher totally dismisses it for the so-called reference ontologies. These, designed to serve scientific purposes, should be developed according to the “principle of orthogonality” [26]. This approach would not only address the data silos problem, but would bring additional benefits such as: mutual consistency of ontologies; no need for mapping between ontologies; reduced redundancy; easier findability of specific ontological resources; and optimized management of the division of ontological labor. Essential to Smith's approach is the articulation with ontological study from his disciplinary area of origin, philosophy: “information-systems ontology is itself an enormous new field of practical application that is crying out to be explored by the methods of rigorous philosophy.”[27] This articulation is seen as necessary for the correct development of ontologies by various researchers, e.g. [28].
It is in the area of Artificial Intelligence (AI) that the term ontology makes the transition from the field of philosophy to the area of information systems. It was Mealy who, in 1967, first used the term ontology in this new context, although he did so in the philosophical sense of the term, referring to ontological analysis because it would enable a better understanding of the structure of the world and thus facilitate its modeling, or the modeling of part of it, in computational terms. In this same context, another early milestone was the work of Hayes in the following decades, seeking “an adequate theory of the world of common sense” (p. 225) for application in robotics.
Despite those early uses of the term ontology to designate a “theory of a modeled world” (p. 1964), the first formulation of a definition of ontology in the context of information systems appeared only in a study by Neches and others published in 1991: “The ontology of a system consists of its vocabulary and a set of constraints on the way terms can be combined to model a domain.” (p. 40). Some authors consider this study by Neches and others to be pioneering in the area of IS, despite the connection of its authors to the field of AI. Others, e.g. [1][25], point to an author openly associated with IS, B.C. Vickery, and his 1997 work, where he claims that the issues faced by “ontological engineers” are not new to the IS community [2] (p. 285).
Among the group of researchers who participated in the study by Neches and others is T.R. Gruber, whose definition of ontology, an “explicit specification of a conceptualization” (p. 199, p. 908), would become paradigmatic, particularly in areas directly related to computing [28]. This definition was published online in 1992, where the author recognizes the philosophical origin of the term but explicitly departs from its meaning in that context: “this definition is consistent with the usage of ontology as set-of-concept-definitions, but more general. And it is certainly a different sense of the word than its use in philosophy.” [29] (n/p.) In the formal publication of this definition, Gruber also points out the difference between his approach and that of the original context of the term: “An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of Existence. For knowledge-based systems, what ‘exists’ is exactly that which can be represented.” (p. 199, p. 908, italics in original) The key term of this definition is conceptualization, used in the sense given by Genesereth and Nilsson (p. 9, italics in original): “The formalization of knowledge in declarative form begins with a conceptualization. This includes the objects presumed or hypothesized to exist in the world and their interrelationships. The notion of an object used here is quite broad.”
Gruber's definition created an almost synonymous association between the terms ontology and conceptualization. This association, potentially enhanced by the role of “recognized authoritative” that Gruber appears to play in his field (pp. 206-207), contributed to the very broad use of the term ontology when applied to representational artifacts. Systems as simple as catalogs, slightly more complex ones such as glossaries, taxonomies and thesauri, and the most expressive ones, using the axioms of full first-order, higher-order, or modal logic: “all these types of information systems satisfy Gruber’s definition, and all are now common bedfellows under the rubric of ‘ontology.’” (p. vi) In the words of Grenon [32] (p. 69): “the term ‘ontology’ applies to virtually any structure resembling, to some extent, a set of terms hierarchically organized which may be put in a machine-processable format.”
Despite this scenario, it is possible to notice two broad perspectives regarding ontologies as artifacts. Definitions like the one proposed by Studer and others [33] (p. 184), “an ontology is a formal, explicit specification of a shared conceptualization,” exemplify one of these positions. Exemplifying the other is the proposal by Arp and others [34] (p. 1): “ontology = def. a representational artifact, comprising a taxonomy as proper part, whose representations are intended to designate some combination of universals, defined classes, and certain relations between them.” Several authors distinguish these two approaches in different ways, which can be seen as complementary aspects of a possible analysis of these representational artifacts.
This section presents three common dichotomies that can be understood as the result of a faceted analysis with the following criteria: i) general objective of the ontology; ii) formal language used; and iii) modeling approach applied. In the first, the system is analyzed according to the main intention for which it was developed. The two facets derived from this criterion (reference or application) reflect what appears to be the main dichotomy in relation to the intended functionality of these artifacts as they are currently developed [32]. In the second dichotomy, it is the semantic expressiveness associated with the two languages (Description Logics and RDF-Schema) that is contrasted. These languages are two possible implementations of formal coding that make automatic processing by computers possible [35]. Formal coding is, therefore, a procedural stage in the development of these artifacts and should not be seen as an intrinsic characteristic of ontologies [8][36]. Finally, the third subsection addresses the difference between a truly ontological reading and one closer to the epistemological approach taken in conceptual modeling. Poli and Obrst [36] (p. 3) describe the two approaches as follows: “Ontology is primarily about the entities, relations, and properties of the world, the categories of things. Epistemology is about the perceived and belief-attributed entities, relations, and properties of the world, i.e., ways of knowing or ascertaining things.”
Smith [37] and Jansen [38] distinguish two types of ontologies: reference ontologies, for use in scientific domains, and application ontologies, for practical and specific purposes. While for the latter an ad-hoc development can be applied and, in some cases, is even unavoidable, this approach is totally discarded for the former [37][38]. Reference ontologies, designed to serve scientific purposes, should be developed according to the “principle of orthogonality,” which would not only address the data silos problem but would also bring additional benefits such as: mutual consistency of ontologies; no need for mapping between ontologies; reduced redundancy; easier findability of specific ontological resources; and optimized management of the division of ontological labor [37].
Essential to this approach to reference ontologies is the articulation with the ontological study deriving from its disciplinary area of origin: “information-systems ontology is itself an enormous new field of practical application that is crying out to be explored by the methods of rigorous philosophy.” [12] This articulation is seen as necessary by several researchers, e.g. [16][36][39], for the development of ontologies as artifacts that seek ontological rigor in the representation of reality. In contrast, in Gruber's understanding, rigor seems less important than the usefulness of the ontology: “if ontologies are engineered things, then we don't have to worry so much about whether they are right and get on with the business of building them to do something useful.” [40] (p. 1) For Smith [37] (p. 33), “it is as if all ontologies, both inside and outside science, are assigned by default the status of application ontologies.”
Linked to the “application” approach is the conception that all ontologies are the result of the common agreement of a community over a portion of the world. As Gruber [40] (p. 5) states: “I find it critical to remember that every ontology is a treaty—a social agreement—among people with some common motive in sharing.” This conception is questioned by Poli and Obrst since this result is usually obtained by the lowest common denominator whose utility will be quite doubtful; “because it is inconsistent, has uneven and wrong levels of granularity, and doesn’t capture real semantic variances that are crucial for adoption by members of a community.” [36] (p. 10)
Also, with regard to interoperability, these application ontologies have limitations derived from the ad-hoc mode in which they are usually constructed [38][41]. If the system is developed in this ad-hoc way, where its quality is essentially measured by the extent to which the needs of the various stakeholders are met [42], the ontological aspects lose their relevance in the modeling process. Alternatively, several application ontologies can be mapped to each other if they are developed “through a choice or combination of types from the reference ontology that are appropriate to the respective aim.” [38] (p. 171) In this situation the reference ontology serves as a common benchmark for the application ontologies.
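The mapping of application ontologies through a common reference ontology can be illustrated with a minimal sketch. It assumes, purely for illustration, that each application ontology is reduced to a table from its own terms to types of a shared reference ontology; the function name and all term names (such as `ref:Person`) are hypothetical and are not taken from any of the cited works.

```python
def map_via_reference(app_a, app_b):
    """Pair each term of app_a with the terms of app_b that were
    derived from the same reference-ontology type."""
    # Index the second application ontology by reference type.
    by_reference_type = {}
    for term, ref_type in app_b.items():
        by_reference_type.setdefault(ref_type, []).append(term)
    # Terms sharing a reference type are treated as mappable.
    return {
        term: sorted(by_reference_type.get(ref_type, []))
        for term, ref_type in app_a.items()
    }


# Two hypothetical application ontologies built from one reference ontology.
clinic = {"Patient": "ref:Person", "Drug": "ref:Substance", "Ward": "ref:Site"}
pharmacy = {"Customer": "ref:Person", "Medication": "ref:Substance"}

mapping = map_via_reference(clinic, pharmacy)
print(mapping)
```

In this sketch no direct pairwise mapping between the two application ontologies is ever written; the correspondence follows from the shared reference types, and a term with no counterpart simply maps to an empty list.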
Kless and others [43] make a similar distinction between ontologies associated with description logic (DL) semantics and ontologies associated with RDF (Resource Description Framework) based semantics. Commonly, this last type of semantics is associated with what are called lightweight ontologies, a term that, according to Zhu and Madnick [44] (p. 9), is used in the literature in a very loose way: “data dictionaries, product catalogs, and topic maps are often considered to be lightweight ontologies.” The authors add that “generally speaking, a lightweight ontology refers to a set of concepts organized in a hierarchy with is_a relationships [and in opposition] are formal ontologies, which often use formal logic to specify constraints, relationships, and other rules that apply to the concepts.” [44] (p. 9)
The proliferation of lightweight ontologies since the 1990s was due to the need to provide the Web with systems capable of automatic inference and interoperability, as a means to achieve the so-called Semantic Web (SW). The ability to generate inferences is, however, restricted by the semantic expressiveness of the language used, which in the case of RDF is quite limited. Even with the increase provided by the RDFS extension (RDF-Schema), it is still a Web-oriented language and not a SW language [35]. The RDF language treats individuals (instances), classes (types) and properties (attributes or relations) indifferently, viewing them all as “resources.” In DL, the abstract description of domain knowledge, that is, the structural and intensional component (the terminology, known as the TBox), is kept separate from the description of facts about objects or individuals (the assertions, called the ABox). The TBox represents the schema or taxonomy of the domain of knowledge, and the ABox describes the assertions (attributes, roles, etc.) about instances, including their membership in the classes of the TBox. Ontologies with RDF-based semantics do not make a clear metaphysical distinction between the different elements, which is why ontologies that “are the result of ontologically driven design processes and aim at reality representation” are likely published using the DL semantics [43] (sec. 2). Without this ontological rigor there is no basis for avoiding false inferences when two ontologies are combined [43]. This potential lack of interoperability perpetuates the problem of “data silos,” to which ontologies should be a solution rather than a contributor.
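The contrast between the two styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation of RDF or of a DL reasoner: the triples reuse the real vocabulary terms `rdf:type` and `rdfs:subClassOf`, but the class `KnowledgeBase`, the single-parent hierarchy, the disjointness axiom and all example names (`Dog`, `rex`, `aibo`) are hypothetical simplifications introduced here.

```python
from dataclasses import dataclass, field

# RDF-style view: one flat set of triples in which classes, individuals
# and properties are all just resources (example names are hypothetical).
rdf_graph = {
    ("Dog", "rdfs:subClassOf", "Animal"),
    ("rex", "rdf:type", "Dog"),
    ("Dog", "rdf:type", "Species"),  # "Dog" used as a class and as an instance
}


def used_as_class_and_instance(graph):
    """Names that appear both as a class and as an instance of something."""
    classes = {o for _, p, o in graph if p == "rdf:type"}
    classes |= {s for s, p, _ in graph if p == "rdfs:subClassOf"}
    classes |= {o for _, p, o in graph if p == "rdfs:subClassOf"}
    instances = {s for s, p, _ in graph if p == "rdf:type"}
    return sorted(classes & instances)


@dataclass
class KnowledgeBase:
    """DL-style view: terminology (TBox) kept apart from assertions (ABox)."""
    subclass_of: dict = field(default_factory=dict)  # TBox: class -> parent
    disjoint: list = field(default_factory=list)     # TBox: (class, class) pairs
    instance_of: dict = field(default_factory=dict)  # ABox: individual -> classes

    def superclasses(self, cls):
        """The class itself plus every ancestor reachable in the TBox."""
        found = {cls}
        while cls in self.subclass_of and self.subclass_of[cls] not in found:
            cls = self.subclass_of[cls]
            found.add(cls)
        return found

    def classes_of(self, individual):
        """Asserted classes of an individual, closed under subclass_of."""
        result = set()
        for cls in self.instance_of.get(individual, ()):
            result |= self.superclasses(cls)
        return result

    def inconsistent_individuals(self):
        """Individuals that fall under two classes declared disjoint."""
        return sorted(
            ind for ind in self.instance_of
            if any(a in self.classes_of(ind) and b in self.classes_of(ind)
                   for a, b in self.disjoint)
        )


kb = KnowledgeBase(
    subclass_of={"Dog": "Animal", "Robot": "Artifact"},
    disjoint=[("Animal", "Artifact")],
    instance_of={"rex": ["Dog"], "aibo": ["Dog", "Robot"]},
)
problems = kb.inconsistent_individuals()
print(used_as_class_and_instance(rdf_graph), problems)
```

In the flat triple set nothing prevents `Dog` from acting as both class and instance, whereas in the separated structure the terminological axioms give a basis for flagging an assertion such as `aibo` being both an animal and an artifact, which is the kind of false inference that may arise when two ontologies are combined.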
Despite the limitations described above it is these RDFS-based “light ontologies” that underlie what might be called the “web of linked data” which, although far behind the intended “Semantic Web,” is a non-negligible achievement [45]. It is also possible to see in this development of the Web a “democratization” of knowledge representation in this digital environment, meeting the original vision of Berners-Lee for it [46].
Yet another distinction can be made between the result of two different approaches in knowledge representation that Grenon [32] calls realist representationalism and pragmatist conceptualism. While in the former the purpose is to capture the categories of the actual world, in what we can call an ontological model, in the latter the intention is to represent our conceptualization of a real or imaginary world, resulting in what may be called a conceptual model. Although there are similarities, the processes differ: “while conceptual modeling seeks to establish relationships between the abstract concepts of a domain, ontological modeling aims to identify objects and understand their nature through the description of their properties” [47] (p. 243, original in Portuguese).
An ontological model results from the application of a philosophically well-founded ontological theory in an information system, following rigorous ontological principles such as whole-part theory, types and instantiation, identity, and unity [16]. This endeavor is admittedly complex; it brings, however, advantages in terms of stability and coherence of the developed model [12]. The ontological principles can also be applied in the development of a conceptual model, although with more focus on the logical aspects that ensure the internal consistency of the model. The process of ontological analysis, in this case, does not necessarily seek an adaptation to reality in the same way as it is carried out in philosophical ontology.
Conceptual models are understood as explicit descriptions of mental models, which are considered to be “partial accounts of the external reality, filtered through the lens of a conceptualization, that people use to interact with the world around them.” [48] (p. 4) The interaction between the individual and the surroundings involves a system of concepts, the conceptualization, “in terms of which the corresponding universe of discourse is divided up into objects, processes, and relations in different sorts of ways,” which can be specified “to render explicit the underlying taxonomy” [12] (pp. 161-162). This can be a strictly pragmatic or epistemological enterprise when conceived as consisting only in representing others’ conceptualization, so that reality falls out of the picture almost entirely [12][32].
However, the conceptual model can be built up from a rigorous ontological analysis: “ontological modeling can constitute a basis for conceptual modeling in order to provide the developer, clearly and unambiguously, with the necessary knowledge about the domain to be modeled.” [47] (p. 243, original in Portuguese) This approach can be seen as a way to provide what Nicola and others [48] (p. 8) call a “grounding requirement” for conceptual models, which can be understood as a “sort of completeness requirement” for them. Ultimately, as Poli and Obrst (p. 6) say: “without ontology, there is no firm basis for epistemology.”
The rigor sought in the ontological representation of reality, advocated for systems such as reference ontologies, is relegated to the background by authors such as Gruber: “[i]f ontologies are engineered things, then we don’t have to worry so much about whether they are right and get on with the business of building them to do something useful.” It is, as Smith [26] says, “as if all ontologies, both inside and outside science, are assigned by default the status of application ontologies.” Linked to this approach to ontologies is the conception that ontologies are the result of the common agreement of a community over a portion of the world. This conception is questioned by Poli and Obrst [28], since this result is usually obtained by the lowest common denominator, whose utility will be quite doubtful “because it is inconsistent, has uneven and wrong levels of granularity, and doesn’t capture real semantic variances that are crucial for adoption by members of a community.”
In summary, two perspectives on ontologies coexist in KOS ecosystems. On the one hand, ontologies are viewed as a more complex conceptual system; on the other, as a system that organizes ontological rather than epistemological knowledge. The focus of ontological analysis is the item to be modeled and not the intentions that motivate the construction of the ontology. This process involves an analytical complexity that makes the development of such systems quite onerous. However, it is the quality of this analysis that determines the true usefulness of the resulting system.
Despite the inherent epistemological interference, i.e., the adjudication of the truth by the modeler, it is the objects, relations, and rules of reality that the ontologist must model. Ultimately "[w]ithout ontology, there is no firm basis for epistemology."[28]