Information Management in Building Information Modeling: Comparison
Please note this is a comparison between Version 1 by Peter Demian and Version 2 by Peter Tang.

As Building Information Modeling (BIM) models grow larger, with more information linked to geometrical 3D models, a dedicated BIM search engine becomes important. A BIM search engine was developed to examine the value of exploiting a 3D object’s topological relationships to other 3D objects when assessing that object’s relevance to a query. This entry is adapted from 10.3390/buildings13071591

  • building information modelling
  • search engine
  • information retrieval
  • 3DIR
  • topology
  • information standards

1. Introduction

As the construction sector undergoes a digital transformation, the amount of information packed into building models is increasing rapidly. Building design, construction and operation are information-intensive activities. For example, even two decades ago in UK construction, on average, one computer-aided design (CAD) document was produced for every 9 m² of building floor space [1]. Researchers [2] have reported the problem of ‘information overload’ in the construction sector. Building Information Modeling (BIM) models are following this general trend and becoming more information-rich. BIM platforms have been identified as a particularly favorable communication medium in construction compared to extranets, email and Enterprise Resource Planning systems [3], and the advantages of BIM over documents and extranets have been reported [4]. Although no absolute measures of the quantities of information in BIM models were found, the implication from such studies is that BIM models are increasingly information-rich, motivating the development of a search engine to enable BIM users to meet their information needs.
Researchers have studied the negative impact of information overload on productivity in general office work [5][6]. Practitioner-based findings from the construction industry agree with those from academic research. In a panel discussion between industry experts organized by Construction Manager magazine [7], all agreed that ‘information overload’ was a huge concern that hinders a project’s productivity. Similarly, a report commissioned by the British Government [8] argues that the industry must modernize and embrace digital technologies to tackle the acute problems of low productivity and poor collaboration.
Although such findings are based on subjective perceptions rather than objective measurements, they suggest that, from a work productivity perspective, the problem lies less in the generation and management of information than in its retrieval and consumption. According to a survey of nearly 600 construction leaders from around the world, construction professionals spend around 14% of their time searching for project information [9]. It is often implicitly assumed that information will be created during the design and construction of the built environment, but particular effort is needed to make this information retrievable and reusable [10]. Beyond construction, a survey of 345 IT and storage professionals [11] highlighted the importance of this later phase of retrieval in the information lifecycle.
Search engines for building model archives and project databases were found to improve aspects of design [12], cost [13] and construction [14]. Information search and retrieval in BIM have been studied from a number of perspectives. A review [15] cites knowledge management, design reuse and continuous improvement as the main motives for developing BIM information retrieval systems (i.e., search engines). Approaches are classified into context-, geometry-, and content-based BIM retrieval. The research reported here uniquely touches upon all three, combining text search augmented by 3D data and interrelationships between 3D objects to account for context.
The science of information retrieval (IR) is usually associated with documents, which can be thought of as relics from the pre-BIM age. It has been shown [12] that parameters in 3D building models could be treated as very short documents, and the application of traditional IR techniques yielded reasonable retrieval performance. The challenge remains of augmenting traditional text-based IR techniques with 3D data. The work reported here builds on earlier inconclusive work [16] and specifically explores the exploitation of interrelationships between 3D objects to improve retrieval performance and the impact of information standards on retrieval performance.

2. Information Management in BIM

Although BIM models are increasingly dense with information, BIM is still emerging as a useful anchor for information management purposes, whether to facilitate the retrieval and flow of information within a single project or between projects. Broadly, knowledge (information) management has been proposed to address many challenges in construction, ultimately benefitting project quality, time and cost [17]. The CoMem (Corporate Memory) system supports design reuse from project to project to avoid wheels being needlessly reinvented [12]. A prototype system, ContextGen [13], retrieves relevant and contextualized cost information from past projects. An automatic retrieval system [14] has been proposed that provides similar past accident cases in order to improve health and safety on construction sites. IR techniques have been adapted to develop a system for automatic classification of construction documents, associating documents with their corresponding CAD components [18]. Beyond text, techniques have been developed [19] to automatically classify construction site photographs, and IR techniques have been adapted to enhance retrieval of relevant site photographs from project databases. The literature reviewed here demonstrates the potential of applying IR techniques in BIM environments, with their heterogeneous information types. The need for a BIM search engine clearly emerges, prompting a review of information retrieval concepts and literature.

3. Information Retrieval

Information retrieval (IR) is concerned with systems that support users in meeting their information needs. IR textbooks [20] distinguish two modes of interacting with an information repository: retrieval and browsing. This research focuses on retrieval, specifically on how 3D content can improve the quantification of relevance to a query and, ultimately, the ranking of search results. Relevance is generally defined in terms of being connected or pertinent to a matter at hand, but is potentially multifaceted and difficult to understand and quantify [21]. Relevance remains a cornerstone of IR and has been the focus of much research [21][22]. Quantifying relevance remains challenging because of its subjectivity and context dependency. It appears from the literature that there is still scope for research to help meet this challenge. The 3D object orientation of BIM environments offers unexplored opportunities. Applying established information retrieval techniques to the text in BIM models is a starting point to quantify relevance, but 3D information and relationships between 3D objects offer the potential for further improvements in retrieval performance.
Much research has focused on improving the performance of search engines and the underlying quantification of relevance, in particular for built environment design and construction. Query expansion has been proposed [23] to improve retrieval performance of an online product search engine. Domain knowledge in the form of an ontology has also been used [24], which normalizes and expands index terms to improve retrieval of useful documents. Furthermore, domain knowledge has been complemented with natural language processing [25] to improve retrieval from BIM object libraries. Natural language processing and IFC have also been combined to improve retrieval from hierarchical BIM models [26].
The need to consider context has also been recognized in IR in general [27] and particularly in the built environment [15]. Four “levels of context” have been set out [28] that are relevant to information retrieval. Context in the research reported here is most closely aligned to their query level of context, whereby context is information not explicitly encoded in queries or information resources, but which nevertheless might improve query retrieval performance.
Evaluating IR systems is important. The most widely used measures are Recall and Precision [22][29]. Recall is the ratio between the number of relevant items retrieved and the total number of relevant items in the collection. It gauges a search engine’s ability to retrieve relevant items. Precision is the ratio between the number of relevant items retrieved and the total number of items retrieved. It gauges a search engine’s ability to filter out irrelevant items (Section 3.2.1 in [20] gives precise equations for calculating Precision and Recall). Precision and Recall are complementary and are usually combined in a Precision–Recall curve where Precision is recorded as more and more results are retrieved (and Recall increases). Section 3.2.1 in [20] also gives the procedure for generating a Precision–Recall curve, which is the procedure used for the curves presented in this entry. A Precision–Recall curve can be summarized in a single figure by averaging the Precision at 11 standard Recall levels. This is useful for comparing relevance computations. Although Precision and Recall have been used over the last four decades, they are still recognized as valid and preferred among researchers.
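The Precision and Recall definitions above, and the single-figure summary over 11 Recall levels, can be illustrated with a minimal Python sketch. The function names and the example data are hypothetical. The interpolation rule used here (taking, at each standard Recall level, the highest Precision observed at that level of Recall or beyond) is a common textbook convention and is an assumption; the exact procedure in [20] may differ in detail.

```python
from typing import Hashable, Sequence, Set, Tuple


def precision_recall(retrieved: Sequence[Hashable],
                     relevant: Set[Hashable]) -> Tuple[float, float]:
    """Precision and Recall for one set of retrieved items."""
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def eleven_point_average_precision(ranking: Sequence[Hashable],
                                   relevant: Set[Hashable]) -> float:
    """Average Precision over the Recall levels 0.0, 0.1, ..., 1.0.

    Assumption: Precision at a Recall level is interpolated as the
    maximum Precision seen at any Recall greater than or equal to it.
    """
    if not relevant:
        return 0.0
    # Record a (recall, precision) point each time a relevant item appears.
    points = []
    hits = 0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    total = 0.0
    for step in range(11):
        level = step / 10
        # Small tolerance guards against floating-point rounding.
        candidates = [p for r, p in points if r >= level - 1e-12]
        total += max(candidates, default=0.0)
    return total / 11
```

For a ranked list such as `["a", "x", "b", "y", "c"]` with relevant items `{"a", "b", "c"}`, Precision falls as Recall rises, and the averaged figure gives one number that can be compared across alternative relevance computations.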

4. Topological Modeling in Buildings

Objects in BIM models are interrelated, and this is often referred to as model topology. Exploiting such relationships may help to account for context and can conceivably improve retrieval performance. In a general sense, topology is the study of the way in which constituent parts are interrelated or arranged. In spatial modelling, topology is concerned with the notions of “interior”, “boundary”, or “exterior”. These notions could be captured by the Industry Foundation Classes, as buildings were modelled in 3D Euclidean space [30]. Algorithms have been presented [31] for the standard topological operators in 3D space: within, contain, touch, overlap, disjoint and equal. Conceptually related, spatial–topological relationships within floorplans have been captured [32], creating a sketch-based search engine aiding retrieval of similar architectural floor plans. A Query Language for Building Information Models (QL4BIM) has been proposed [33], with algorithms and implementation methods focusing on spatial semantic queries. Exploiting topology in 3D models appears to be a promising avenue for improving retrieval from BIM models.
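To make the six topological operators concrete, the following Python sketch classifies the relationship between two objects that are approximated by axis-aligned bounding boxes. This is a deliberate simplification and an assumption for illustration only: the algorithms cited in [31] operate on actual 3D geometry, and the `Box` type and `relation` function here are hypothetical names. Boxes are assumed to be non-degenerate (each `lo` coordinate strictly less than the matching `hi` coordinate).

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box given by its lowest and highest corners."""
    lo: Point
    hi: Point


def relation(a: Box, b: Box) -> str:
    """Return the topological relation of box a to box b."""
    axes = range(3)
    if a == b:
        return "equal"
    # Separated along at least one axis: no shared points at all.
    if any(a.hi[i] < b.lo[i] or b.hi[i] < a.lo[i] for i in axes):
        return "disjoint"
    # Boundaries meet but the interiors do not intersect.
    interiors_meet = all(a.lo[i] < b.hi[i] and b.lo[i] < a.hi[i] for i in axes)
    if not interiors_meet:
        return "touch"
    if all(b.lo[i] <= a.lo[i] and a.hi[i] <= b.hi[i] for i in axes):
        return "within"
    if all(a.lo[i] <= b.lo[i] and b.hi[i] <= a.hi[i] for i in axes):
        return "contain"
    return "overlap"
```

For example, a desk box lying inside a room box yields "within", while two rooms sharing a wall plane yield "touch". Relations of this kind are candidates for the relationships between 3D objects discussed in this section.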

5. Graph Theoretic Formulation of Information Linked to 3D Models

A graph-theoretic formulation and accompanying relevance computations [16] are invoked as the points of departure for this research. This theoretical lens has proven to be extremely useful in researching information linked to 3D space, as in information-rich BIM models. The graph theoretic vocabulary of vertices and edges serves as an elegant language to convey 3D objects enriched with attributes and possibly linked to external documents. Edges in graphs are particularly important for this research, as relationships between 3D objects can be thought of as edges. Principles of graph theory have been used by several researchers [34][35] to capture the topological relationships between building objects. However, unlike the work reported here, none of the studies encountered in the literature had clearly distinguished between 3D and textual information or between different relationship types in a 3D model. Figure 1 gives the graph theoretic formulation that makes these distinctions. A graph consists of a set of vertices connected by edges, where each edge links exactly two vertices. Hence, mathematically modelling any graph X consists of listing its set of vertices V(X) and set of edges E(X) [36]. In the case of BIM, this formulation distinguishes two sets of vertices and two sets of edges:
Figure 1. A graph theoretic representation of BIM models as a lens for studying information retrieval.
  • V3D: The set of vertices representing 3D objects in the model.
  • Vi: The set of vertices representing information items linked to the 3D objects, which can be either 3D object properties treated as short documents or full-text documents.
  • En: A ‘natural’ edge joining a vertex in set V3D and a vertex in set Vi, which originates from the nature of parametric modelling in 3D BIM environments.
  • Et: A ‘topological’ edge joining two related vertices in set V3D, representing the relationship between these two 3D objects. Such relationships and their exploitation to improve retrieval performance are focal points of this research.
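The two vertex sets and two edge sets listed above can be sketched as a small Python data structure. This is a minimal illustration under stated assumptions, not the implementation used in the research: the class name `BimGraph`, its method names, and the use of string identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass
class BimGraph:
    """Hypothetical container for the four sets V3D, Vi, En and Et."""
    v_3d: Set[str] = field(default_factory=set)             # 3D object ids
    v_i: Dict[str, str] = field(default_factory=dict)       # item id -> text
    e_n: Set[Tuple[str, str]] = field(default_factory=set)  # (object, item)
    e_t: Set[FrozenSet[str]] = field(default_factory=set)   # {object, object}

    def add_object(self, obj_id: str) -> None:
        self.v_3d.add(obj_id)

    def link_item(self, obj_id: str, item_id: str, text: str) -> None:
        """Add an information item and a 'natural' edge to its 3D object."""
        if obj_id not in self.v_3d:
            raise KeyError(obj_id)
        self.v_i[item_id] = text
        self.e_n.add((obj_id, item_id))

    def relate(self, a: str, b: str) -> None:
        """Add an undirected 'topological' edge between two 3D objects."""
        if a == b or a not in self.v_3d or b not in self.v_3d:
            raise ValueError("edge must join two distinct known 3D objects")
        self.e_t.add(frozenset((a, b)))

    def items_of(self, obj_id: str) -> List[str]:
        """Ids of information items joined to obj_id by natural edges."""
        return sorted(item for obj, item in self.e_n if obj == obj_id)

    def neighbours(self, obj_id: str) -> List[str]:
        """Ids of 3D objects joined to obj_id by topological edges."""
        return sorted(other for edge in self.e_t if obj_id in edge
                      for other in edge if other != obj_id)
```

With such a structure, a search engine could look up not only the text items attached to a matching 3D object but also those attached to its topological neighbours, which is the kind of context the Et edges are meant to expose.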

6. Information Standards

BIM integrates the fragmented construction industry by increasing collaboration and facilitating sharing of information between organizations and across all project phases. With a variety of BIM platforms available, interoperability between software applications remains an issue. Information standards, i.e., neutral file formats and standard classifications, are being continuously developed and are necessary to support data exchange. The concept of Open BIM is based on open standards and workflows that allow different stakeholders to share data across any BIM platform. From previous research on information management in a digitalized construction industry in Britain [37], Industry Foundation Classes (IFC), Uniclass-2015 and Construction Operations Building Information Exchange (COBie) emerged as pertinent standards. It is noteworthy that IFC is a schema, while Uniclass-2015 and COBie are classification systems. Limited research was found on the effect of information standards on retrieval performance, especially for Uniclass-2015 and COBie. Previous research has concluded that the complex nature of IFC makes retrieval from IFC models difficult [38][39]. Furthermore, Uniclass-2015 has been identified [37] as having a significant professional following in the UK but having less flexibility in terms of classifying object parameters compared to the ISO 15926 series of standards. Ultimately, the literature review seems to suggest poor IR performance for both IFC and Uniclass-2015 compliant models. (Interestingly, researchers [40] have attempted to characterize and classify BIM users.)