Metaverse and AI for Internet of City Things: Comparison
Please note this is a comparison between Version 1 by Simon Elias Bibri and Version 2 by Lindsay Dong.

The Metaverse represents an always-on 3D network of virtual spaces, designed to facilitate social interaction, learning, collaboration, and a wide range of activities. This emerging computing platform originates from the dynamic convergence of Extended Reality (XR), Artificial Intelligence of Things (AIoT), and platform-mediated everyday life experiences in smart cities. However, the research community faces a pressing challenge in addressing the limitations posed by the resource constraints associated with XR-enabled IoT applications within the Internet of City Things (IoCT). Additionally, there is a limited understanding of the synergies between XR and AIoT technologies in the Metaverse and their implications for IoT applications within this framework. 

  • the Internet of City Things
  • Metaverse
  • Artificial Intelligence of Things
  • IoT applications
  • Extended Reality
  • Virtual Reality
  • Augmented Reality

1. Introduction

The realm of immersive experiences has been dramatically reshaped by the rapid strides in Extended Reality (XR) technologies, an umbrella term for Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR) technologies. These dynamic advancements have unveiled a wealth of captivating possibilities for crafting interactive and spellbinding encounters. These immersive technologies, when combined with the Internet of Things (IoT) and Artificial Intelligence (AI), hold the potential to revolutionize the way we interact with the digital world as part of everyday life [1][2]. In particular, the emergence of the Metaverse, a digitally interconnected and immersive urban environment where virtual and physical realities converge, presents exciting opportunities for XR and Artificial Intelligence of Things (AIoT) technologies, which intersect with “the distinctive features of platform-mediated everyday life experiences in cities” [1] in the realm of the Internet of City Things (IoCT). As an emerging concept (e.g., [3]), IoCT reflects the integration of numerous IoT devices, sensors, and technologies with VR/AR/MR within urban environments to create smart cities, offering promising avenues for creating intelligent, interconnected cities. The idea revolves around what have been identified as virtual forms of smart cities of the future, depicted in the Metaverse as speculative fiction (e.g., [1][4][5][6]). Urban scholars have long explored the role of fictional representations of the city and urban life in shaping urban change (e.g., [7][8]). The Metaverse plays a role in shaping alternatives to the socio-technical imaginaries of smart cities [9][10].
IoCT enables the collection, analysis, and sharing of data from different aspects of urban life, such as mobility, transportation, energy, healthcare, education, and more. By connecting and interlinking smart IoT devices, cities can optimize resource management, enhance public services, and create a more efficient and livable urban environment. IoCT plays a crucial role in the development of smart cities, where data-driven decision making and technology integration are utilized to address urban challenges and improve the overall quality of life for citizens [3]. The role of advanced technologies, such as AIoT and Big Data, 5G, Digital Twin (DT), blockchain, and edge computing, is crucial for IoT deployments in the context of IoCT. These technologies bring various capabilities and advantages that enhance the efficiency, functionality, and overall performance of smart city implementations [11][12][13]. The integration of these advanced technologies in the realm of IoCT empowers cities to become more intelligent, sustainable, and efficient, offering a wide range of benefits to both the city administration and its residents.
Additionally, AR/VR/MR technologies have enormous potential to unleash the power of IoT in a diversified set of applications in smart cities. These technologies make full use of the data generated in IoT deployments in an interactive way, leading to an immersive user experience with useful and attractive visualizations and end applications [14]. The use of these technologies in IoT applications is on the rise across several domains. Companies and businesses are investing more and more in this duo of technologies to explore the full potential of IoT devices and the data they generate. The interest is mainly due to the visualization capabilities of AR and VR applications, which enable operators and employees to better understand IoT data by providing an intuitive three-dimensional view. These technologies also help to better understand and diagnose an underlying problem by merging and combining data from different sources in a single view [3].
Despite their outstanding capabilities for exploring the potential of IoT devices and enormous growth in the market, VR and AR technologies face several challenges affecting their adoption in several IoT applications. Some key challenges include the lack of hardware and of regulations/documentation on their usage and suitability in different applications, the unavailability of creative content for novel applications, public skepticism, and risks associated with physical safety [15]. More importantly, the cost associated with VR and AR hardware and software components is one of the main hurdles to their widespread adoption.
Several factors, such as the type of VR environment, instructional design and programming, type of content, and VR headsets, contribute to the overall cost of AR and VR solutions. The overall cost could be reduced by making appropriate choices in the selection of the different components. For instance, several types of VR headsets are available on the market at a wide range of prices. Similarly, the choice of off-the-shelf or custom content also has a direct impact on the cost of VR and AR solutions.
The scope of the study revolves around the combination of cost-effective XR and synergistic AIoT technologies. Cost-effective XR entails the utilization of XR technologies in a manner that is efficient and economical. It involves deploying XR solutions that offer high value and immersive experiences without incurring excessive costs, making these technologies more accessible and feasible for various IoT applications. Synergistic AIoT is about the integration of AI and IoT as powerful technologies to create a collaborative and mutually reinforcing environment. AIoT leverages AI’s capabilities in analyzing and interpreting IoT-generated data to enhance decision-making, efficiency, and automation. This synergy results in an ecosystem where IoT devices gather data, and AI algorithms process and interpret it, leading to more informed actions and better outcomes in relation to IoT applications.

2. Potential of the Metaverse and Artificial Intelligence for the Internet of City Things

2.1. IoT, IoCT, and the Metaverse

In the dynamic landscape of technological advancement, two related concepts have risen to prominence, each wielding the power to reshape our digital realm: the IoT and the IoCT. These interconnected frameworks have ushered in a new era of connectivity and intelligence, revolutionizing how we interact with both our physical surroundings and the virtual world. While IoT has initiated a transformative shift by interlinking devices and enabling data exchange, IoCT, with its focus on urban environments, holds the potential to fundamentally enhance how we experience and manage cities using IoT. In its Annual Internet Report (2018–2023), Cisco [16] states that machine-to-machine connections will reach 14.7 billion by 2023. IoT Analytics [17] predicts up to 27 billion IoT-connected devices globally by 2025. This expansion is increasingly spanning almost all urban and industrial spheres through a variety of applications. The distinction between IoT and IoCT lies at the heart of their impact, influencing applications across various domains. Moreover, as the digital horizon extends to encompass the Metaverse, these two concepts take on renewed significance in shaping immersive experiences that bridge the tangible and the virtual within urban landscapes [1]. To fully grasp their implications and potential, it is essential to explore their definitions, commonalities, differences, and their intricate relationship with the emerging concept of the Metaverse.
IoT refers to the interconnected network of physical devices, vehicles, buildings, and other objects embedded with sensors, software, and network connectivity that enables them to collect and exchange data. These devices can communicate with each other and with central systems, enabling them to perform tasks and make decisions based on the data they collect. IoT is commonly used in various domains, including smart urbanism, platform urbanism, and the Metaverse [18]. From a technical perspective, IoCT represents a sophisticated framework that leverages digital connectivity and data integration to transform urban environments into intelligent and responsive entities. IoCT harnesses a wide array of technologies to facilitate seamless communication, data collection, analysis, and decision making across diverse sectors within a city’s infrastructure. In essence, IoCT’s technical underpinnings involve a complex ecosystem of sensors, data integration mechanisms, computing resources, analytics tools, and communication infrastructure [19][20]. By effectively harnessing these components, IoCT transforms cities into smart, interconnected entities that can respond intelligently to challenges and opportunities.
1. Sensor Networks and Data Collection: IoCT relies heavily on sensor networks strategically placed throughout the city to capture real-time data. These sensors can monitor various parameters such as temperature, air quality, traffic flow, energy consumption, waste management, and more. These devices play a pivotal role in gathering data points that reflect the current state of the urban environment.
2. Data Integration and Interoperability: IoCT involves integrating data from a multitude of sources across different sectors. This requires establishing interoperability standards and protocols to ensure seamless communication between various devices, systems, and platforms. This integration enables a holistic view of the city’s operations and helps in making informed decisions.
3. Edge and Cloud Computing: The vast volume of data generated by IoCT sensors demands efficient processing and analysis. Edge computing, where data are processed closer to the data source, ensures real-time insights and reduces latency. Cloud computing is also utilized for more complex analytics, storage, and long-term data aggregation.
4. Data Analytics and Insights: IoCT employs advanced data analytics techniques, including Machine Learning (ML) and AI, to extract meaningful insights from the collected data. These insights help identify patterns, trends, and anomalies, enabling city planners and administrators to make informed decisions for optimizing urban operations and services.
5. Smart Decision-Making: IoCT enables data-driven decision making by providing real-time and predictive information. For instance, real-time traffic data can optimize traffic signal timings to reduce congestion, or energy consumption patterns can help adjust lighting and HVAC systems in public spaces. Predictive analytics can anticipate maintenance needs, preventing infrastructure failures.
6. Communication Infrastructure: IoCT relies on robust communication infrastructure, such as high-speed internet connectivity, wireless networks (e.g., 5G), and communication protocols, to ensure reliable data transmission between devices and systems.
7. Security and Privacy: Given the sensitivity of urban data, security measures including encryption, access controls, and data anonymization are paramount for protecting both citizen privacy and the integrity of the system.
8. User Interfaces and Visualization: IoCT systems often provide user-friendly interfaces and visualizations that display real-time data and insights. These interfaces allow city officials, administrators, and citizens to monitor and engage with the city’s various aspects, enhancing transparency and participation.
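Taken together, points 1–5 above describe a sense–filter–decide loop in which the edge tier summarizes raw readings and forwards only aggregates and alerts to the cloud tier. The following Python sketch illustrates that pattern; all names, units, and thresholds here are hypothetical, not part of any specific IoCT deployment:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Reading:
    sensor_id: str   # e.g., a hypothetical air-quality sensor
    value: float     # PM2.5 in µg/m³ (illustrative unit)

def edge_process(readings, alert_threshold=35.0):
    """Edge tier: flag anomalies locally, forward only a summary to the cloud."""
    alerts = [r for r in readings if r.value > alert_threshold]
    return {
        "count": len(readings),
        "mean": round(mean(r.value for r in readings), 2),
        "alerts": [r.sensor_id for r in alerts],
    }

readings = [Reading("pm25-north", 12.4), Reading("pm25-centre", 48.1),
            Reading("pm25-south", 20.0)]
print(edge_process(readings))
```

Processing at the edge in this way reduces latency and bandwidth, matching the edge/cloud split described in point 3: the cloud receives the compact summary rather than every raw reading.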
Both IoT and IoCT involve the interconnection of physical objects with digital systems, enabling data exchange and automation. They both contribute to the concept of a more connected and intelligent world, where devices and city components work together to improve efficiency, convenience, and decision making. The main difference lies in the scope of the application. While IoT encompasses a wide range of applications across various industries, IoCT specifically focuses on the urban environment and its infrastructure, spanning applications across various domains, such as smart transportation and mobility, environmental monitoring and sustainability, smart energy management, smart waste management, smart water management, smart healthcare, smart infrastructure and building management, connected public services, and data-driven urban planning and management.
From a general perspective, the Metaverse represents a digital universe comprising interconnected virtual worlds, environments, and spaces where users can socialize, interact, work, play, and conduct various activities through digital representations of themselves, called avatars. From a technological perspective, the Metaverse is an advanced iteration of the internet, incorporating VR, AR, and MR as immersive technologies. It provides a persistent and immersive digital environment where users can seamlessly transition between different experiences and platforms, blurring the lines between the physical and digital realms. It offers opportunities for new forms of entertainment, social interaction, education, and business. Companies are beginning to explore the Metaverse’s potential to craft virtual events, interactive conferences, dynamic marketplaces, and other innovations, to shape novel urban digital economies and cultural landscapes. 

2.2. Extended Reality (XR)

XR is the conglomeration of VR, AR, and MR: these three main realities together compose XR technology. XR devices encompass all three realities, instead of requiring a separate device for each realm. A large category of VR/AR devices exists that is capable of providing rich immersive experiences for the user. A few of the types of VR/AR devices are as follows:
  • Mobile devices: Smartphones and tablet PCs are playing a leading role in the VR/AR market by providing a rich experience for industrial tasks, business, entertainment, gaming, and social networking.
  • Special VR/AR devices: These are devices designed solely to provide a rich VR/AR experience. Head-mounted displays (HMDs) are one category of special VR/AR devices, overlaying data directly within the user’s field of view.
  • VR/AR glasses: See-through wearable glasses supporting VR/AR functionalities are capable of displaying information from smartphones directly in the VR/AR glasses, providing hands-free operations. These types of VR/AR glasses are capable of assisting workers in industries, to gain quick hands-free access to the internet and gain valuable information.
  • VR/AR contact lenses: Paving the way for new VR/AR experiences, VR/AR contact lenses worn directly on the eyes can interface with smartphones and are capable of performing functions similar to a digital camera, providing enhanced VR/AR experiences.

2.2.1. Virtual Reality (VR)

In general, VR creates a whole new environment and provides a completely immersive experience for the users. It uses computer technology to create a simulated experience that may be similar or completely different from the real world. Standard VR systems use either headsets or multi-projected environments to generate realistic sounds and visuals. The following are the most vital elements of VR.
  • Virtual world: Independent of the real world, the virtual world is an imaginary space populated with digital objects. Simulation and computer graphics models are used to create such a virtual world by rendering digital objects. Designers establish links between the digital objects according to a predefined set of rules.
  • Immersion: Specially designed VR headsets provide a wide field of vision to deliver an immersive experience for the users. The users are detached from the real world at the sensory level and immersed in the virtual space. Apart from the immersive visual aid, VR headsets also provide audio for the users.
  • Sensory feedback: Changes in the user’s position and movements of the head and other body parts are tracked by the VR headset, which applies corresponding changes in the virtual world. This sustains the illusion for users of VR headsets that they are moving in a virtual world.
  • Interactivity: Users of VR headsets can attain an interactive experience that provides a real feel for the digital objects in the virtual world. They can pick up any virtual object and manipulate it within the virtual environment.
The major design principles of VR devices center on comfort and better user interaction for end users. Based on the experience delivered by the graphical simulation, the following are popular types of VR simulations.
  • Fully immersive: With an appropriate HMD or VR glasses, a more realistic immersive experience can be gained, with complete sight and sound inputs to the users. Fully immersive experiences encompass a wide field of view with high resolution and sound effects for the digital content.
  • Semi-immersive: With realistic environments created using 3D graphics, semi-immersive VR provides a partial virtual environment for the users. It maintains connectivity with the physical scenario while focusing on digital models of objects. These systems are most often used for training and educational activities, since they replicate the functions and design aspects of real-world mechanisms.
  • Non-immersive: Non-immersive experiences do not completely fall under the VR category, since most of them involve everyday common usage of computer-generated environments. They allow users to control a virtual environment projected on a console or computer with the aid of keyboards, mice, and controllers.

2.2.2. Augmented Reality (AR)

AR, on the other hand, keeps real-world objects as they are and adds digital objects to the real world. AR systems integrate three different features: (1) the combination of the real and virtual worlds, (2) real-time interaction, and (3) accurate 3D registration of virtual and real objects. The following are the most vital components used for providing rich AR experiences for end users.
  • Sensors and Cameras: For imparting successful AR performance, the role of cameras and sensors is very critical. They help to locate objects in the environment, measure their features, and assist in creating equivalent 3D models.
  • Processing modules: Conversion of captured real-life images into augmented ones is performed by processing units, such as RAM, CPU, and GPU modules. Higher-specification processing modules help render AR content realistically in the deployed environments.
  • Projection areas: These are mostly present in AR devices or headsets used for AR applications. They provide interactive visualization of the environment and change the view as necessary. The projection surface for a visualization could be a wall or a floor.
  • Reflection: For a pleasant view of 3D augmented images, reflections from the environment to the user’s eye provide a path for the graphically modified digital images. Curved, double-sided mirrors are used in AR devices to reflect the light, separate the images for both eyes, and reflect the RGB color components as well.
Different technological approaches are used by manufacturers to provide different AR experiences. The following are popular types of AR technologies.
  • Simultaneous Localization and Mapping (SLAM): For rendering augmented real-life images, SLAM provides one of the most effective approaches. It assists in mapping the complete structure of the environment considered for visualization by localizing the sensors present in the AR devices that support SLAM functionality.
  • Recognition-based: This is a marker-based AR technology, which uses a camera to locate objects or visual markers. The recognition-based method depends on the camera to distinguish between real-world objects and markers. Here, 3D virtual graphics immediately replace a marker once it is recognized by the device.
  • Location-based: Unlike the recognition-based technology, the location-based approach uses a compass, GPS, etc., to obtain location data for the implementation of AR based on the location information. This technology can be comfortably deployed using smartphones running location-based AR applications.
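As a concrete illustration of the location-based AR approach described earlier, the placement of an annotation can be derived purely from GPS and compass data: compute the great-circle bearing from the user to a point of interest, then the signed offset from the direction the device is facing. The sketch below uses the standard initial-bearing formula; the function names are hypothetical illustrations, not any particular AR toolkit’s API:

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from the user (lat1, lon1) to a point
    of interest (lat2, lon2), in degrees clockwise from true north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

def overlay_offset(user_heading_deg, target_bearing_deg):
    """Signed angle in (-180, 180] between the device's compass heading and
    the target bearing; the AR layer shifts the annotation by this amount."""
    return (target_bearing_deg - user_heading_deg + 180.0) % 360.0 - 180.0
```

For example, a point of interest due east of the user has a bearing of 90°; if the phone faces 80°, the annotation is drawn 10° to the right of screen center.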

2.2.3. Mixed Reality (MR)

MR is the merging of real and virtual worlds to produce new environments and visualizations. Here, the physical and digital objects coexist and interact in real time. Unlike AR, users can interact with virtual objects. To provide different user experiences from fully immersive to light information layering of environments, MR developers have provided robust tools to bring virtual experiences to life. The following are popular types of MR apps that integrate HCI, perception, and conventional reality.
  • Enhanced environmental apps: As contextual placement of digital objects in virtual environments is becoming popular, enhanced environmental apps can facilitate this feature with the support of HoloLens HMDs. The placement of digital content in the field-of-view environment of the users is one of the key features imparted through enhanced environmental apps.
  • Immersive environmental apps: These apps completely change the perspective of users’ view with respect to time and space, driven through an environment-centered approach. In this approach, the context in the real-world environment might not play a significant role in providing immersive experiences for the users.
  • Blended environmental apps: The complete transformation of an element into a different digital object is supported through blended environmental apps. It helps to map and recognize the environment of the users and build a digital layer to completely overlay the space of the users. Even though the complete transformation of digital objects is enabled through this blended environmental app, it retains the dimension of the base object.
  • MR headset-based apps: Most of the leading semiconductor manufacturers have initiated the making of MR headsets that could provide inside-out tracking and six degrees of freedom of movement across the field-of-view environment. This kind of headset supports plug-and-play features with MR-enabled PCs and thereby provides an amazing immersive experience for the users.

2.3. Integration of VR/AR Technologies and IoT

Immersive visualization experiences have made major contributions to the growth of IoT, but their integration creates an inherent need for cost-effective VR and AR solutions to meet all possible IoT end use cases. Through VR/AR integration with IoT devices, we gain incredible insights into their operation, services, and outcomes. This allows us to visualize a digital model sourcing data from all sensors interfaced with the IoT devices involved in perceiving the environment, bring them together in one place, and see how real-world data impact services and operations, right on the interactive screen. Achieving an absolutely immersive experience in IoT systems using VR/AR technology is challenging, considering the vastly varying types, features, and costs of VR/AR HMDs. Some VR/AR headsets and devices have resource constraints, and most of them might not meet economic constraints. Therefore, it is vital to choose VR/AR devices based on their roles, compatibility with the IoT application, and economic factors. The choice of integration of IoT devices with VR/AR solutions depends on the requirements of the IoT applications. For instance, when there is a need for home automation and appliance control in IoT, cost-effective VR/AR solutions with simple HMDs make sense. In applications that require high performance, such as image or video processing, economical VR/AR devices may not be sufficient, and it would make sense to use hybrid solutions.
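A minimal sketch of the sensor-to-visualization bridge described above: a reading from an IoT device is normalized and mapped onto a green-to-red gradient that a VR/AR overlay could use as a color cue. The function, value range, and sensor feed are hypothetical illustrations, not any specific platform’s API:

```python
def value_to_rgb(value, lo, hi):
    """Map a sensor reading onto a green-to-red gradient for an AR overlay.
    Values are clamped to [lo, hi]; returns an (r, g, b) tuple in 0..255."""
    t = max(0.0, min(1.0, (value - lo) / (hi - lo)))
    return (int(255 * t), int(255 * (1.0 - t)), 0)

# Hypothetical temperature feed (°C) driving overlay colors.
for temp in (18.0, 25.0, 32.0):
    print(temp, value_to_rgb(temp, lo=15.0, hi=35.0))
```

In a real deployment this mapping would run on whatever tier has spare capacity (device, edge, or cloud), which is exactly the cost trade-off discussed above: simple color-coded overlays need little compute, whereas image or video processing pushes toward hybrid solutions.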

2.4. Artificial Intelligence of Things (AIoT) and Its Relation to the Metaverse and IoCT

AIoT is the incorporation of AI technology into the IoT infrastructure, enabling real-time data processing, advanced analytics, improved human–machine interaction, and enhanced decision making. AIoT brings together AI and IoT, relocating AI capabilities closer to the data generated by IoT devices and systems. This integration empowers intelligent and autonomous behavior, enhancing the overall performance and capabilities of IoT-based solutions. AIoT acts through control and interaction to respond to the dynamic environment, a process where ML and Deep Learning (DL) have shown value in enhancing control accuracy and facilitating multimodal interactions [21]. The resurgence of AI is driven by the abundance and potency of IoT-enabled Big Data, thanks to enhanced computing storage capacity and real-time data processing speed. IoT produces Big Data, which in turn requires “AI to interpret, understand, and make decisions that provide optimal outcomes” [22] pertaining to a wide variety of practical applications spanning various urban domains [23]. The role of AIoT in advancing XR technologies within the context of the Metaverse is a catalyst for transformative experiences, revolutionizing how we interact with our digital and physical surroundings [1][2]. This advancement not only propels XR’s capabilities but also infuses AIoT’s intelligence to amplify the potential of both technologies. AIoT’s synergy with XR technologies transforms the Metaverse and elevates XR’s role in current IoT applications within the IoCT. The marriage of AIoT and XR enriches user experiences, optimizes resource utilization, and enhances data visualization, fostering more intelligent, immersive, and connected urban environments.
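The AIoT pattern described above, a lightweight model running near the data source that turns a raw IoT stream into a decision signal, can be sketched minimally. The rolling z-score detector below is an illustrative stand-in for the ML/DL models an AIoT deployment would actually use; the class name, window size, and threshold are all hypothetical:

```python
from collections import deque
from statistics import mean, pstdev

class StreamAnomalyDetector:
    """Rolling z-score detector: a toy stand-in for ML models deployed
    near IoT devices to flag unusual sensor behavior in real time."""
    def __init__(self, window=20, z_threshold=3.0):
        self.window = deque(maxlen=window)
        self.z_threshold = z_threshold

    def update(self, value):
        """Return True if `value` is anomalous relative to the recent window."""
        anomalous = False
        if len(self.window) >= 5:              # wait for a minimal history
            mu, sigma = mean(self.window), pstdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.z_threshold:
                anomalous = True
        self.window.append(value)
        return anomalous

det = StreamAnomalyDetector()
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 9.8, 10.2, 25.0]  # spike at the end
flags = [det.update(v) for v in stream]
```

Running the detector at the edge means only the anomaly flag (and perhaps the offending reading) needs to travel upstream, which is the essence of "relocating AI capabilities closer to the data".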

2.5. Smart Cities and Their Relationship to the Metaverse

Smart cities are urban areas that leverage digital technologies, data analytics, IoT, and AIoT to enhance the quality of life of their citizens and the sustainability and efficiency of their services. These cities employ advanced technologies and data-driven strategies to manage resources, infrastructure, transportation, and public services more effectively. The underlying components of smart cities include [24]:
  • Digital infrastructure: Smart cities have robust digital infrastructure, including high-speed internet connectivity and data networks that enable seamless communication between devices, sensors, and citizens.
  • Big data analytics: They collect vast amounts of data from various sources, such as sensors, smartphones, and public services. These data are analyzed to gain insights, optimize operations, and improve decision-making processes.
  • IoT: They rely on sensors and connected devices to monitor and manage various aspects of urban life such as traffic flow, mobility patterns, energy consumption, air quality, water management, and waste management.
  • Enhanced public services: They improve public services such as healthcare, education, and safety by using technology to enhance access and efficiency.
  • Citizen engagement: They encourage citizen participation through digital platforms, enabling residents to provide feedback, access services, and engage in decision-making processes.
  • Sustainable practices: They incorporate sustainable development in their strategies by implementing practices and initiatives that support and advance the environmental, economic, and social goals of sustainability.
While smart cities primarily exist in the physical world, they share common goals with the Metaverse in terms of leveraging advanced technology for enhanced experiences, connectivity, and sustainability [11]. In terms of digital overlap, the physical and digital worlds may become more intertwined in a future scenario. Smart city data could be integrated into the Metaverse, providing real-time information on urban life to virtual city inhabitants. Regarding simulation and modeling, smart cities can benefit from the Metaverse’s ability to create digital twins or simulations of real-world environments. These simulations can be used for urban planning and testing sustainability strategies. Concerning data integration, smart cities can tap into Metaverse data to gain insights into virtual representations of urban spaces, potentially influencing real-world decisions and optimizations. Overall, while smart cities and the Metaverse are distinct concepts, they share common objectives related to digital technology, data, connectivity, and sustainability. As technology continues to advance, there is potential for increased synergy between smart cities and the Metaverse, leading to more immersive and data-driven urban experiences [9].
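The digital-twin overlap sketched above can be illustrated with a toy example: live city sensor readings update a twin's state, which a Metaverse scene or planning simulation then queries each render tick. All identifiers, sensors, and metrics below are hypothetical:

```python
class DistrictTwin:
    """Toy digital twin of a city district: mirrors live sensor feeds so a
    virtual scene or planning simulation can render the current state."""
    def __init__(self):
        self.state = {}

    def ingest(self, sensor_id, metric, value):
        """Update the twin with one physical-world measurement."""
        self.state.setdefault(sensor_id, {})[metric] = value

    def snapshot(self):
        """Copy of the state a virtual scene would query each render tick."""
        return {sid: dict(metrics) for sid, metrics in self.state.items()}

twin = DistrictTwin()
twin.ingest("junction-7", "vehicles_per_min", 42)
twin.ingest("junction-7", "avg_speed_kmh", 23.5)
twin.ingest("park-2", "air_quality_index", 61)
```

Returning a copy from `snapshot` keeps the rendering side read-only, so a simulation can experiment on the snapshot (e.g., testing a traffic-light change) without corrupting the live mirror of the physical city.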