Real-Time Information Processing and Visualization
Processing information in real time, through complex event processing, has become essential for the optimal functioning of manufacturing plants. It is what allows artificial intelligence, data extraction, and even business intelligence techniques to be applied, so that the data produced daily can be put to beneficial use, enhancing automation processes and improving service delivery.
1. Software of Real-Time Information Processing and Visualization
2. Event Broker
2.1 Apache Flume
3. Apache Sqoop
4. Solace PubSub+
PubSub+ Event Broker: This layer comprises three sub-layers (software, appliance, and cloud), which are described below. In addition, PubSub+ event brokers can be connected to form an event mesh.
PubSub+ Event Broker - Software: The main function of the Solace software sub-layer is to transport information efficiently in the form of events. This transport can take place between applications, IoT devices, and user interfaces, any of which can be hosted locally or in a cloud. The software supports various open protocols and APIs, such as Advanced Message Queuing Protocol (AMQP), Java Message Service (JMS), Message Queuing Telemetry Transport (MQTT), Representational State Transfer (REST), and WebSocket. There are two editions of this software: a free one (Standard), which supports up to 1000 client connections, and a high-performance one (Enterprise), which scales up to 200,000 client connections.
PubSub+ Event Broker - Appliance: PubSub+ Appliances have three distinguishing characteristics. They are purpose-built with high-speed FPGAs and network processors that support extremely low and predictable latency, they offer built-in redundancy, and they can continuously replicate all messages to standby locations.
PubSub+ Event Broker - Cloud: Solace’s cloud service makes software event brokers available as a service, so that brokers can be provisioned in a short period of time and scaled on demand.
PubSub+ Event Mesh: An event mesh is a layer that dynamically routes events from one application to any other.
PubSub+ Streaming APIs and Integrations: These provide a variety of on-ramps and off-ramps, including the protocols already listed, proprietary messaging APIs, and connectors to technologies such as Kafka, in order to link legacy and modern applications.
PubSub+ Event Portal: The PubSub+ Event Portal is an event management tool, presented through a web-based user interface (UI), that allows for discovering, building, visualizing, sharing, and managing the various aspects of an event-driven architecture (EDA). Its characteristics include run-time EDA, support for Kafka-native objects, event sharing, version control, a REST API, and AsyncAPI support, among others. It also provides tools for building, describing, and discovering events within the system, and for establishing connections between applications and events, which makes it easier to develop event-oriented applications and microservices.
PubSub+ Platform Security: Platform security gives messaging architectures consistent multi-protocol client authentication and authorization management in an enterprise environment, integrated with corporate authentication services and using a minimal number of components.
Publisher: the entity that sends or publishes the message (also called a producer);
Message: what the publisher wants to say to the subscriber. Messages often contain events, but can also carry queries, commands, and other information;
Subscriber: the ultimate receiver of the message (also called a consumer);
Topic: used when the message is intended to be consumed by more than one subscriber;
Queue: used when the message is intended to be consumed by at most one subscriber.
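The difference between topic and queue delivery in the list above can be illustrated with a small in-memory sketch. This is a toy model written for illustration only, not the Solace API: the `ToyBroker` class, its method names, and the topic and queue names are all hypothetical.

```python
from collections import defaultdict, deque


class ToyBroker:
    """In-memory toy broker (hypothetical; not the Solace API)."""

    def __init__(self):
        self.topic_subscribers = defaultdict(list)  # topic name -> callbacks
        self.queues = defaultdict(deque)            # queue name -> pending messages

    def subscribe(self, topic, callback):
        """Register a subscriber callback on a topic."""
        self.topic_subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Topic delivery: every current subscriber receives the message."""
        for callback in self.topic_subscribers[topic]:
            callback(message)

    def enqueue(self, queue, message):
        """Store a message on a queue until a consumer takes it."""
        self.queues[queue].append(message)

    def consume(self, queue):
        """Queue delivery: a message is handed to at most one consumer."""
        pending = self.queues[queue]
        return pending.popleft() if pending else None


broker = ToyBroker()

# Topic: two subscribers both receive the same event.
received_a, received_b = [], []
broker.subscribe("plant/line1/temperature", received_a.append)
broker.subscribe("plant/line1/temperature", received_b.append)
broker.publish("plant/line1/temperature", {"celsius": 71.5})

# Queue: two competing consumers; only the first one gets the message.
broker.enqueue("work-orders", {"order_id": 1})
first = broker.consume("work-orders")
second = broker.consume("work-orders")

print(received_a, received_b)
print(first, second)
```

In this sketch the published event reaches both subscriber lists, while the queued message is returned to the first consumer only and the second consumer finds the queue empty.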
5. Apache Kafka
Spark Core: Spark Core is built around the RDD (resilient distributed dataset), which handles the partitioning of data across all nodes in a cluster. Two kinds of operations can be performed on RDDs: transformations, which lazily define a new RDD, and actions, which trigger the computation and return a result;
Spark SQL: Allows applications to access Spark’s data through a Java Database Connectivity (JDBC) API. That way, SQL queries can be performed on the data, which also allows the use of BI and data visualization tools;
Spark Streaming: Used to process streaming data in real time, based on micro-batch computing;
MLlib (Machine Learning Library): Apache Spark’s scalable machine learning library. MLlib contains a suite of algorithms and utilities that interoperate with programming languages and most scientific computing environments;
GraphX: GraphX is a Spark component for graphs and graph-parallel computation. GraphX extends the Spark RDD by introducing graph abstractions, and includes a “collection of graph algorithms and builders to simplify graph analytics tasks”;
SparkR: SparkR is an R package that provides the ability to use Apache Spark from R, as well as a data frame implementation that supports selections, filtering, and aggregation. Furthermore, SparkR enables distributed machine learning from R through MLlib.
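Two ideas from the component list above, transformations versus actions on a partitioned dataset and micro-batch processing of a stream, can be sketched in plain Python. This is a toy model and deliberately does not use the Spark API: `ToyRDD` and `micro_batches` are hypothetical names, the partitions are ordinary lists rather than data on cluster nodes, and batches are cut by element count, whereas Spark Streaming cuts them by time interval.

```python
from itertools import islice


class ToyRDD:
    """Toy stand-in for an RDD: partitioned data with lazy transformations."""

    def __init__(self, partitions, pipeline=()):
        self.partitions = partitions  # list of lists, one per hypothetical node
        self.pipeline = pipeline      # recorded transformations, not yet executed

    @classmethod
    def parallelize(cls, data, num_partitions):
        data = list(data)
        # Round-robin split so that each partition gets a share of the data.
        return cls([data[i::num_partitions] for i in range(num_partitions)])

    # Transformations: return a new ToyRDD and only record the work to do.
    def map(self, fn):
        return ToyRDD(self.partitions, self.pipeline + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self.partitions, self.pipeline + (("filter", pred),))

    def _run_partition(self, rows):
        # Apply the recorded transformations, in order, to one partition.
        for kind, fn in self.pipeline:
            if kind == "map":
                rows = [fn(r) for r in rows]
            else:
                rows = [r for r in rows if fn(r)]
        return rows

    # Actions: execute the recorded pipeline and return a result.
    def collect(self):
        return [row for part in self.partitions for row in self._run_partition(part)]

    def count(self):
        return sum(len(self._run_partition(part)) for part in self.partitions)


def micro_batches(stream, batch_size):
    """Cut an iterable of events into small batches of at most batch_size."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


readings = ToyRDD.parallelize(range(10), num_partitions=3)
# Two transformations are chained; nothing is computed at this point.
evens_squared = readings.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
# The actions below trigger execution of the recorded pipeline.
print(sorted(evens_squared.collect()))
print(evens_squared.count())

# Each micro-batch is processed as its own small dataset.
for batch in micro_batches(range(7), batch_size=3):
    print(ToyRDD.parallelize(batch, num_partitions=2).map(lambda x: x + 1).collect())
```

The design choice worth noting is that `map` and `filter` only append to a recorded pipeline, so the cost of the computation is paid once, when an action such as `collect` or `count` is called.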
This entry is adapted from 10.3390/app11114800