Comprehensive Background on Tiny Machine Learning

IoT systems frequently generate vast quantities of data, posing substantial management and analysis challenges. Researchers have introduced several frameworks and architectures to address these challenges in IoT Big Data management and knowledge extraction. One such proposal is the Cognitive-Oriented IoT Big Data Framework (COIB Framework), as outlined in Mishra’s works [47,48]. This framework encompasses an implementation architecture, layers for IoT Big Data, and a structure for organizing data. An alternative method involves employing a Big-Data-enhanced system, adhering to a data lake architecture [49]. Key features of this system include a multi-threaded parallel approach for data ingestion, strategies for storing both raw and processed IoT data, a distributed cache layer, and a unified SQL-based interface for exploring IoT data. Furthermore, blockchain technologies have been investigated for their potential to maintain continuous integrity in IoT Big Data management [50]. This involves five integrity protocols implemented across three stages of IoT operations.

  • Edge AI
  • IoT
  • IoT data engineering
  • IoT Big Data management
  • IoT systems

2. Big Data Challenges, Internet of Things, and TinyML

The rapid expansion of the Internet of Things (IoT) marks a significant shift in the digital landscape, characterized by an extensive network of devices and sensors that continuously collect and send data. This convergence of data-rich environments intensifies the challenges of Big Data, particularly its large volume, high velocity, and wide variety.

2.1. The Big Data Dilemma in IoT

As IoT systems evolve, they inherently generate data that challenge conventional processing and storage infrastructures. Key challenges arising from this scenario include:
  • Storage Capacity and Scalability: Traditional storage systems grapple with the ever-growing influx of data from IoT sources, necessitating the development of more scalable and adaptive solutions.
  • Data Processing and Analysis: The heterogeneity of IoT data requires sophisticated, adaptable algorithms and infrastructures to derive meaningful insights efficiently.
  • Data Transfer and Network Load: Ensuring efficient and timely data transmission across a myriad of devices without overburdening the network infrastructure remains a paramount concern.
  • Data Integrity and Security: As data become increasingly decentralized across devices, ensuring their authenticity and safeguarding them from potential threats are critical.
These challenges, which highlight the wider complexities that IoT introduces to Big Data management, are outlined in Table 1.

2.2. TinyML

Tiny Machine Learning (TinyML) has emerged as a growing field in machine learning, characterized by its application in highly constrained Internet of Things (IoT) devices such as microcontrollers (MCUs) [51]. This technology facilitates the use of deep learning models across a multitude of IoT devices, thereby broadening the range of potential applications and enabling ubiquitous computational intelligence. The implementation of TinyML is challenging, primarily due to the limited memory resources of these devices and the necessity for simultaneous algorithm and system stack design. TinyML has attracted substantial interest in both research and development, and numerous studies have examined its challenges, applications, and advantages [52,53].
An essential goal of TinyML is to bring machine learning capabilities to battery-powered intelligent devices, allowing them to locally process data without necessitating cloud connectivity. This ability to operate independently from cloud services not only enhances functionality but also provides a more cost-effective solution for IoT applications [3,54,55,56]. The academic community has thoroughly examined TinyML, with systematic reviews, surveys, and research papers delving into aspects such as its hardware requirements, frameworks, datasets, use cases, algorithms/models, and broader applications. Notably, the development of specialized TinyML frameworks and libraries, coupled with its integration with networking technologies, has been explored to facilitate its deployment in various sectors, including healthcare, smart agriculture, environmental monitoring, and anomaly detection. One practical application of TinyML is in the development of soft sensors for economical vehicular emission monitoring, showcasing its real-world applicability [57]. In essence, TinyML marks a significant progression in the domain of machine learning, enabling the execution of machine learning tasks on resource-constrained IoT devices and microcontrollers, thus laying the groundwork for an expansive ecosystem surrounding this technology.

2.2.1. TinyML as a Novel Facilitator in IoT Big Data Management

Within this challenging landscape, TinyML emerges as an innovative intersection between machine learning and embedded systems. Specifically tailored for resource-constrained devices, it presents several avenues for mitigating Big Data challenges:
  • Localized On-Device Processing: TinyML facilitates local data processing, markedly reducing the need for continuous data transfers, thus optimizing network bandwidth and improving system responsiveness.
  • Intelligent Data Streamlining: With the ability to perform preliminary on-device analysis, TinyML enables IoT systems to discern and selectively transmit pivotal data, ensuring efficient utilization of storage resources.
  • Adaptive Learning Mechanisms: IoT devices embedded with TinyML can continuously refine their data processing algorithms, fostering adaptability to dynamic data patterns and environmental changes.
  • Reinforced Security Protocols: By integrating real-time anomaly detection at the device level, TinyML significantly enhances the security framework, providing an early detection system for potential data breaches or threats.
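The local processing, selective transmission, and device-level anomaly detection described in the list above can be sketched as a running z-score filter. This is a minimal illustration rather than a method taken from the cited works: the class name `StreamingAnomalyFilter`, the threshold of 3.0, and the warm-up length of 10 readings are assumptions chosen for the example, and a simple statistical test stands in for a trained TinyML model. Python is used here for readability only.

```python
import math


class StreamingAnomalyFilter:
    """Running z-score filter using Welford's online mean/variance update."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # hypothetical z-score cutoff
        self.warmup = warmup        # readings to observe before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # sum of squared deviations from the mean

    def update(self, value):
        """Return True if `value` looks anomalous and should be transmitted."""
        is_anomaly = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford update keeps O(1) state. Anomalous readings are
        # excluded so that they do not skew the learned baseline.
        if not is_anomaly:
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (value - self.mean)
        return is_anomaly


readings = [20.0, 20.1, 19.9, 20.2, 19.8, 20.0, 20.1, 19.9, 20.0, 20.1, 35.0, 20.0]
detector = StreamingAnomalyFilter()
# Only readings flagged as anomalous would leave the device.
transmitted = [r for r in readings if detector.update(r)]
```

In this sketch the device keeps three numbers of state regardless of stream length, and only the outlying reading is forwarded, which is the bandwidth-saving behavior the list describes.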
The challenges associated with Big Data in the Internet of Things (IoT) paradigm are diverse and complex, encompassing data management, processing, unstructured data analytics, visualization, interoperability, data semantics, scalability, data fusion, integration, quality, and discovery [58]. These issues are closely related to the growing trend of “big data” within cloud computing environments and the progressive development of IoT technologies, and they significantly affect various industries, including the power sector, smart cities, and large-scale petrochemical plants [58,59,60]. Additionally, IoT architectures face pressing security and privacy threats that must be addressed promptly and effectively [61]. The efficiency and completeness of IoT Big Data, coupled with security concerns, have emerged as critical areas of research and development [62]. Furthermore, blockchain technology is being explored as a way to ensure continued integrity in IoT Big Data management, particularly regarding data correctness, resource sharing, and the generation and verification of service-level agreements (SLAs) [50].
In the IoT Big Data landscape, as delineated in Table 1, key challenges include managing the vast volume of data from numerous devices, necessitating advanced storage and processing systems. Rapid data generation requires real-time analysis and response, highlighting the importance of data velocity. The variety of data, both structured and unstructured, from diverse sources, complicates integration and analysis. Ensuring data veracity, or accuracy and trustworthiness, is increasingly challenging. Integrating various data sources while maintaining integrity is vital. Security and privacy concerns are paramount due to heightened interconnectivity, necessitating robust protocols. Lastly, minimizing latency to avoid obsolete insights is crucial in IoT Big Data management.
Table 2 highlights how TinyML addresses key challenges in Big Data and IoT. It offers solutions to data overload by facilitating on-device data filtering and summarization, significantly reducing the amount of data that needs to be transmitted to central systems. This approach is pivotal for real-time processing needs, where localized TinyML models enable instant data analysis, ensuring timely insights without the dependency on external servers. Such capability is crucial in scenarios with limited or no connectivity, maintaining device functionality.
Table 2. TinyML solutions for Big Data and IoT challenges.
Additionally, TinyML greatly enhances energy efficiency by optimizing models for specific tasks, thereby conserving resources and extending battery life. This technology also bolsters security and privacy; by processing data locally, it minimizes the risks associated with data transmission and ensures that sensitive information remains within the user’s control. Furthermore, TinyML contributes to the longevity of devices by reducing the strain on their components through local processing, potentially extending their operational lifespan. These enhancements demonstrate TinyML’s significant role in improving the efficiency, security, and sustainability of IoT systems.
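The on-device summarization noted in the discussion of Table 2 can be illustrated with a small windowed aggregator. The function names, the summary fields, and the window size of four samples are hypothetical choices made for this sketch; a real deployment would pick statistics and window lengths to suit its sensors.

```python
def summarize_window(samples):
    """Collapse a window of raw readings into a compact summary record."""
    n = len(samples)
    if n == 0:
        raise ValueError("window must contain at least one sample")
    return {
        "count": n,
        "min": min(samples),
        "max": max(samples),
        "mean": sum(samples) / n,
    }


def summarize_stream(samples, window_size):
    """Yield one summary per full window, plus any trailing partial window."""
    for start in range(0, len(samples), window_size):
        yield summarize_window(samples[start:start + window_size])


raw = [21.0, 21.5, 22.0, 21.5, 30.0, 30.5, 31.0, 30.5]
# Eight raw readings are reduced to two summary records for transmission.
summaries = list(summarize_stream(raw, window_size=4))
```

Sending one record per window instead of every reading trades temporal detail for a smaller payload, which is the trade-off behind the reduced transmission volume described above.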

2.2.2. Characteristics of Large-Scale IoT Systems

The characteristics of large-scale IoT systems and the enhancements introduced by TinyML are effectively outlined in Table 3. In these systems, a distributed topology with devices spread across various locations results in data decentralization and increased latency; TinyML tackles this by facilitating edge computation, enabling local data processing to reduce latency and provide real-time insights. The voluminous data streams generated continuously can burden storage and transmission channels, but TinyML assists by prioritizing, compressing, and filtering data at the device level, managing storage needs and reducing data transmission demands.
The varied landscape of IoT devices, each with different data formats and communication protocols, is harmonized by TinyML, which standardizes data processing and extraction at the source, ensuring consistent data representation across diverse device types. Power and resource constraints, especially in battery-operated devices, pose significant challenges in IoT systems. TinyML models are designed for optimal computational efficiency, performing tasks effectively without draining device resources. Finally, in applications that require real-time processing, such as health monitoring or predictive maintenance, delays in processing can be critical. TinyML enables rapid on-device processing, allowing immediate responses to changing data patterns, thus enhancing the overall functionality and effectiveness of large-scale IoT systems.

2.2.3. Applications of TinyML on Embedded Devices

Table 4 illustrates various applications of TinyML and machine learning in embedded devices across different sectors. In predictive maintenance, TinyML models analyze real-time sensor data from machinery, enabling early detection of potential failures and reducing maintenance costs. This technology is also pivotal in health monitoring, where wearable devices equipped with TinyML offer continuous health tracking, instantly analyzing critical health metrics while ensuring user privacy.
Table 4. Applications of TinyML and machine learning on embedded devices.
In agriculture, TinyML enhances efficiency by adjusting operations based on real-time environmental data, leading to optimal resource usage and increased yield. Voice and face recognition technologies in embedded devices benefit from TinyML through faster localized processing, enhancing reliability and privacy. TinyML also plays a crucial role in energy management within smart grids and home automation, optimizing energy use for cost and environmental benefits. In urban development, it contributes to traffic flow optimization by analyzing real-time vehicle and pedestrian movements, improving urban mobility. These examples showcase TinyML’s significant impact in enhancing operational efficiency, user experience, and sustainable practices across various industries.
Table 5 presents a detailed overview of TinyML applications across a range of fields. It includes concise descriptions of each application and corresponding academic references. The table illustrates the versatility of TinyML, from implementing CNN models on microcontrollers for material damage analysis to its use in environmental monitoring. Each example not only provides a clear application scenario but also cites relevant studies, showcasing TinyML’s extensive impact in practical situations. This presentation highlights TinyML’s role in enhancing the capabilities of embedded devices in various industries.

2.3. TinyML Algorithms

Table 6 provides a structured overview of various TinyML algorithms and their specific applications in different domains. It categorizes these algorithms into areas such as predictive maintenance, data compression, tool usage monitoring, and more, illustrating the range of TinyML’s applicability. Each entry in the table is linked to a corresponding reference, offering a direct connection to the source material. This format effectively showcases the diversity of TinyML’s real-world applications, highlighting its potential to transform various sectors through intelligent on-device data processing and analysis.
Table 7 presents an organized summary of cutting-edge research in the field of Tiny Machine Learning (TinyML). This table methodically categorizes various studies into distinct focus areas, covering a broad spectrum from optimizing deep neural networks on microcontrollers to applying federated meta-learning techniques in environments with limited resources. This structured presentation not only underscores the multifaceted nature of TinyML research but also highlights its significant role in advancing the functionalities of embedded devices for a wide array of applications.

2.4. Data Management Techniques Utilizing TinyML in IoT Systems

In the field of Tiny Machine Learning (TinyML), data management techniques are essential for handling machine learning models on devices with limited resources, such as those in IoT networks. One method involves augmenting thing descriptions (TD) with semantic modeling to provide comprehensive information about applications on devices, facilitating the efficient management of both TinyML models and IoT devices on a large scale [81]. Additionally, employing TinyML for training devices can lead to the creation of a decentralized and adaptive software ecosystem, enhancing both performance and efficiency. This approach has been effectively implemented in the development of a smart edge computing robot (SECR) [82]. Such methodologies are increasingly important in sectors like supply chain management, where they play a crucial role in predicting product quality parameters and extending the shelf life of perishable goods, including fresh fruits in modified atmospheres [83]. Moreover, the growing complexity in communication systems, spurred by diverse emerging technologies, underscores the need for AI and ML techniques in the analysis, design, and operation of advanced communication networks [84].
In the context of IoT systems, data management techniques incorporating TinyML focus on effective data handling, ensuring privacy and security, and leveraging machine learning for insightful data analysis. One strategy employs distributed key management for securing IoT wireless sensor networks, utilizing the principles of elliptic curve cryptography [85]. Another method involves applying data-driven machine learning frameworks to enhance the accuracy of vessel trajectory records in maritime IoT systems [86]. Power management also plays a crucial role in IoT systems, particularly those reliant on compact devices and smart networks. This often includes the adoption of low-power communication protocols and the integration of autonomous power systems, which are frequently powered by renewable energy sources [87]. Furthermore, AI-based analytics, processed in the cloud, are increasingly being utilized for healthcare-related data management, such as systems designed for managing diabetic patient data [88].
Table 8 illustrates how TinyML is revolutionizing data management techniques in IoT systems, bringing efficiency and accuracy to various processes. Techniques like predictive imputation and adaptive data quantization exemplify this transformation. Predictive imputation, using TinyML, maintains data integrity by filling in missing values based on historical and neighboring data, thereby ensuring dataset completeness. Adaptive data quantization, on the other hand, optimizes data storage and transmission. TinyML’s role here is to analyze current data trends and dynamically adjust quantization levels for optimal data representation.
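The two techniques named in this paragraph can be sketched together under simplifying assumptions. In the sketch below, a neighbor average stands in for the learned predictor used in predictive imputation, and the quantizer adapts by choosing the smallest bit depth that meets a target resolution for the current window. All function names and the 0.1 resolution are hypothetical and are not drawn from the cited works.

```python
import math


def impute_missing(series):
    """Fill None gaps with the mean of the nearest known neighbors.

    Stand-in for a learned predictor; a deployed model would supply
    its own estimate in place of the neighbor average.
    """
    filled = list(series)
    for i in range(len(filled)):
        if filled[i] is not None:
            continue
        # Look backward in `filled` (earlier gaps are already repaired)
        # and forward in the raw series for the nearest known values.
        prev = next((filled[j] for j in range(i - 1, -1, -1)
                     if filled[j] is not None), None)
        nxt = next((series[j] for j in range(i + 1, len(series))
                    if series[j] is not None), None)
        neighbors = [v for v in (prev, nxt) if v is not None]
        if neighbors:
            filled[i] = sum(neighbors) / len(neighbors)
    return filled


def choose_bits(samples, resolution, max_bits=16):
    """Pick the smallest bit depth whose step size is <= resolution."""
    span = max(samples) - min(samples)
    if span <= 0:
        return 1
    needed = math.ceil(math.log2(span / resolution + 1))
    return max(1, min(max_bits, needed))


def quantize(samples, bits):
    """Uniformly quantize samples over their own min/max range."""
    levels = (1 << bits) - 1
    low, high = min(samples), max(samples)
    step = (high - low) / levels if high > low else 1.0
    codes = [round((s - low) / step) for s in samples]
    return codes, low, step


def dequantize(codes, low, step):
    """Map integer codes back to approximate sample values."""
    return [low + c * step for c in codes]


raw = [20.0, None, 21.0, 22.0, None, None, 20.5]
clean = impute_missing(raw)
bits = choose_bits(clean, resolution=0.1)
codes, low, step = quantize(clean, bits)
restored = dequantize(codes, low, step)
```

Because the bit depth is recomputed per window, a quiet window with a narrow range is encoded with fewer bits than a volatile one, while the reconstruction error stays within half a quantization step.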
Sensor data fusion, another critical technique, is enhanced by TinyML’s ability to process and merge data from various sensors in real time, thus providing a more comprehensive view and enhancing the accuracy of insights. Anomaly detection is particularly vital in IoT systems, and TinyML enhances this by continuously monitoring data streams to quickly identify and act upon unusual patterns or malfunctions. Intelligent data caching, enabled by TinyML, predicts future data needs, ensuring that frequently used or critical data are cached for instant access.
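As one concrete instance of sensor data fusion, the sketch below combines several readings of the same quantity by inverse-variance weighting, a standard statistical technique chosen here for illustration; the source does not specify a fusion method. The function name `fuse_readings` and the example variances are assumptions.

```python
def fuse_readings(readings):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Sensors with lower variance (higher confidence) receive more weight,
    and the fused variance is smaller than that of any single input.
    """
    weights = [1.0 / variance for _, variance in readings]
    total = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, readings)) / total
    fused_variance = 1.0 / total
    return fused_value, fused_variance


# Three hypothetical temperature sensors; the third is the most precise.
sensors = [(20.0, 1.0), (22.0, 1.0), (21.0, 0.5)]
fused_value, fused_variance = fuse_readings(sensors)
```

The calculation needs only a handful of multiplications per update, so it fits the real-time, on-device setting the paragraph describes, although it assumes the sensor errors are independent.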
Further, TinyML facilitates edge-based clustering, grouping similar data at the network’s edge to simplify analytics and processing. This on-device clustering leads to more efficient data aggregation and transmission. Real-time data augmentation and local data lifespan management are also key areas where TinyML makes a significant impact. TinyML augments sensor data in real time to enhance machine learning performance while also predicting the utility of data for effective local storage management.
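On-device clustering of the kind described here can be sketched with sequential (online) k-means, which updates one centroid per incoming sample and never stores the raw stream. The class name `OnlineKMeans`, the initial centroids, and the sample points are hypothetical; the source does not say which clustering algorithm is used.

```python
class OnlineKMeans:
    """Sequential k-means: each sample nudges its nearest centroid."""

    def __init__(self, initial_centroids):
        self.centroids = [list(c) for c in initial_centroids]
        self.counts = [0] * len(self.centroids)

    def _nearest(self, point):
        """Index of the centroid with the smallest squared distance."""
        distances = [
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in self.centroids
        ]
        return distances.index(min(distances))

    def update(self, point):
        """Assign `point` to a cluster and move that centroid toward it."""
        k = self._nearest(point)
        self.counts[k] += 1
        # Running-mean step: the centroid equals the mean of its members,
        # so the first assigned point replaces the initial guess.
        rate = 1.0 / self.counts[k]
        self.centroids[k] = [
            c + rate * (p - c) for c, p in zip(self.centroids[k], point)
        ]
        return k


model = OnlineKMeans(initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
stream = [(1.0, 1.0), (9.0, 9.0), (2.0, 2.0), (11.0, 11.0), (0.0, 0.0), (10.0, 10.0)]
labels = [model.update(point) for point in stream]
```

A device running this sketch could transmit only `model.centroids` and `model.counts` instead of every sample, which is one way to realize the more efficient aggregation and transmission mentioned above.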
Contextual data filtering and on-device data labeling are other areas where TinyML shows its prowess. By using environment-aware models, TinyML filters data relevant to the current context, thereby enhancing decision-making processes. Additionally, it can automatically label data based on learned patterns, facilitating efficient data categorization and retrieval. These advanced data management techniques, powered by TinyML, are pivotal in harnessing the full potential of IoT systems, ensuring that they are not only more efficient and accurate but also more responsive to real-time demands.

This entry is adapted from the peer-reviewed paper 10.3390/fi16020042
