Topic Review
Training, Test, and Validation Sets
In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms work by making data-driven predictions or decisions, building a mathematical model from input data. The data used to build the final model usually come from multiple datasets. In particular, three datasets are commonly used in different stages of the creation of the model. The model is initially fit on a training dataset, which is a set of examples used to fit the parameters (e.g. the weights of connections between neurons in an artificial neural network) of the model. The model (e.g. a neural net or a naive Bayes classifier) is trained on the training dataset using a supervised learning method (e.g. gradient descent or stochastic gradient descent). In practice, the training dataset often consists of pairs of an input vector and the corresponding "answer" vector (or scalar), commonly denoted as the target (or label). The current model is run on the training dataset and produces a result, which is then compared with the target for each input vector in the training dataset. Based on the result of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. Model fitting can include both variable selection and parameter estimation. Subsequently, the fitted model is used to predict the responses for the observations in a second dataset, called the validation dataset. The validation dataset provides an unbiased evaluation of a model fit on the training dataset while the model's hyperparameters (e.g. the number of hidden units in a neural network) are tuned. Validation datasets can also be used for regularization by early stopping: training is stopped when the error on the validation dataset increases, as this is a sign of overfitting to the training dataset.
This simple procedure is complicated in practice by the fact that the validation error may fluctuate during training, producing multiple local minima. This complication has led to many ad hoc rules for deciding when overfitting has truly begun. Finally, the test dataset is a dataset used to provide an unbiased evaluation of the final model fit on the training dataset. When the data in the test dataset have never been used in training (for example, in cross-validation), the test dataset is also called a holdout dataset.
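The split-and-early-stop procedure described above can be sketched in a few lines. This is a minimal illustration with a toy one-parameter model y = w·x fit by gradient descent; the data, learning rate, and patience value are illustrative assumptions, not part of the entry.

```python
import random

# Toy data: y = 2x plus noise, split into three disjoint datasets.
random.seed(0)
data = [(x, 2.0 * x + random.gauss(0, 0.1)) for x in range(30)]
random.shuffle(data)
train, val, test = data[:18], data[18:24], data[24:]

def mse(w, pairs):
    return sum((w * x - y) ** 2 for x, y in pairs) / len(pairs)

w, lr = 0.0, 0.001
best_w, best_val, patience = w, float("inf"), 0
for epoch in range(200):
    # Adjust the model's parameter on the training set (gradient descent).
    grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
    w -= lr * grad
    # Monitor error on the validation set to decide when to stop.
    v = mse(w, val)
    if v < best_val:
        best_w, best_val, patience = w, v, 0
    else:
        patience += 1
        if patience >= 5:   # early stopping: validation error stopped improving
            break

# Only now is the held-out test set touched, for the final unbiased evaluation.
test_error = mse(best_w, test)
```

Here the validation set drives the stopping decision (a form of hyperparameter tuning), while the test set is consulted exactly once at the end, so its error estimate stays unbiased.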
  • 682
  • 03 Nov 2022
Topic Review
Vietnam SciMath Database Project
The Vietnam SciMath Database Project is a product of the mathematical research history initiative led by Professor Ngô Bảo Châu, a Vietnamese-French mathematician who won the Fields Medal in 2010.
  • 681
  • 19 Jul 2021
Topic Review Peer Reviewed
Fatal Software Failures in Spaceflight
Space exploration has become an integral part of modern society, and since its early days in the 1960s, software has grown in importance, becoming indispensable for spaceflight. However, software is both boon and bane: while it enables unprecedented functionality and cost reductions and can even save spacecraft, its importance and fragility also make it a potential Achilles heel for critical systems. Throughout the history of spaceflight, numerous accidents with significant detrimental impacts on mission objectives and safety have been attributed to software, although unequivocal attribution is sometimes difficult. In this entry, we examine over two dozen software-related mishaps in spaceflight from a software engineering perspective, focusing on major incidents and not claiming completeness. This entry contextualizes the role of software in space exploration and aims to preserve the lessons learned from these mishaps. Such knowledge is crucial for ensuring future success in space endeavors. Finally, we explore prospects for the increasingly software-dependent future of spaceflight.
  • 680
  • 13 Jun 2024
Topic Review
Decentralized Blockchain-Based IoT Data Marketplaces
At present, most data are controlled in a centralized manner. However, as data are in essence the fuel of any application and service, there is a need to make them more findable and accessible. Further problems with centralized data are limited storage and uncertainty about their authenticity. In the Internet of Things (IoT) sector specifically, data are the key to developing the most powerful and reliable applications. For these reasons, there is a rise in works presenting decentralized marketplaces for IoT data, many of which exploit blockchain technology for its security advantages.
  • 680
  • 11 Aug 2022
Topic Review
Hybrid Blockchains
A hybrid blockchain is often advocated where multiple parties, trust, data access management and sharing, friction, regulations, and a combination of centralization and decentralization are involved. A hybrid blockchain is also used by entities that need the benefits of both public and private characteristics, which can be achieved using Interchain, bridges, or other interoperability solutions between legacy systems and blockchains, whether public or private. Private, public, consortium, and permissioned blockchains all have their own setbacks and benefits. Entities that do not want to expose sensitive business data to the internet are limited to private blockchains. Entities looking for no access restrictions at all can leverage public blockchains like Bitcoin or Ethereum. A hybrid blockchain ensures that sensitive business data stay private on business nodes unless sharing is permitted. It also validates the hash of private transactions through consensus algorithms and even public checkpoints on networks such as Bitcoin and Ethereum. Through Interchain, the hash of a private transaction can be placed on the Bitcoin network or any other public blockchain such as Ethereum, making an immutable record of the event with the benefit of a public blockchain's hash power.
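The anchoring idea described above can be sketched simply: the sensitive transaction stays on private nodes, and only its hash is published as a public checkpoint. The transaction structure and field names below are illustrative assumptions, not a real hybrid-blockchain API.

```python
import hashlib
import json

# A private transaction that must not leave the business nodes.
private_tx = {"from": "acct-A", "to": "acct-B", "amount": 1250, "memo": "confidential"}

# Canonical serialization so every node computes the same digest.
canonical = json.dumps(private_tx, sort_keys=True, separators=(",", ":")).encode()

# Only this hash would be placed on a public chain (e.g. via a bridge);
# the transaction body itself is never exposed.
checkpoint = hashlib.sha256(canonical).hexdigest()

def verify(tx, anchored_hash):
    """Anyone holding the private transaction can prove it matches the public anchor."""
    c = json.dumps(tx, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(c).hexdigest() == anchored_hash
```

Because the public chain stores only the digest, the record inherits the public network's immutability and hash power without revealing the private data; any later tampering with the private record makes verification fail.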
  • 679
  • 11 Oct 2022
Topic Review
Lexicographical Order
In mathematics, the lexicographic or lexicographical order (also known as lexical order, dictionary order, alphabetical order or lexicographic(al) product) is a generalization of the way words are alphabetically ordered based on the alphabetical order of their component letters. This generalization consists primarily in defining a total order on the sequences (often called strings in computer science) of elements of a finite totally ordered set, often called an alphabet. There are several variants and generalizations of the lexicographical ordering. One variant widely used in combinatorics orders subsets of a given finite set by assigning a total order to the finite set, and converting subsets into increasing sequences, to which the lexicographical order is applied. Another generalization defines an order on a Cartesian product of partially ordered sets; this order is a total order if and only if the factors of the Cartesian product are totally ordered.
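Both the basic definition and the combinatorics variant above can be demonstrated directly, since Python compares sequences lexicographically by their elements' order:

```python
# Element-wise comparison at the first differing position.
assert "apple" < "apricot"      # 'p' < 'r' decides
assert (1, 3) < (2, 0)          # first component decides
assert (1, 2) < (1, 2, 5)       # a proper prefix precedes its extensions

# Combinatorics variant: order subsets of a finite totally ordered set by
# converting each subset to an increasing sequence, then comparing those.
subsets = [{2, 3}, {1}, {1, 3}, {2}]
ordered = sorted(subsets, key=lambda s: tuple(sorted(s)))
# ordered == [{1}, {1, 3}, {2}, {2, 3}]
```

Note the prefix rule in the third assertion: a sequence that is an initial segment of another always comes first, which is why "car" precedes "carpet" in a dictionary.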
  • 679
  • 16 Nov 2022
Topic Review
The Dichotomy of Neural Networks and Cryptography
Neural networks and cryptographic schemes have come together in war and peace; a cross-impact that forms a dichotomy deserving a comprehensive review study. Neural networks can be used against cryptosystems; they can play roles in cryptanalysis and attacks against encryption algorithms and encrypted data. This side of the dichotomy can be interpreted as a war declared by neural networks. On the other hand, neural networks and cryptographic algorithms can mutually support each other. Neural networks can help improve the performance and the security of cryptosystems, and encryption techniques can support the confidentiality of neural networks. The latter side of the dichotomy can be referred to as the peace. 
  • 678
  • 07 Jul 2022
Topic Review
Using Colored Petri Net for Accounting System
Many learners who are not familiar with accounting terms find blended learning about the computerized accounting system very complex, particularly the journal-entry process and tracing the flow of accounting transactions through the system. A simulation-based model is a viable option to help instructors and learners understand the components of an accounting system and monitor accounting transactions more easily. This entry briefly introduces a colored Petri net (CPN)-based model.
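A Petri net consists of places holding tokens and transitions that consume and produce them; in a colored Petri net the tokens additionally carry data (their "color"). The sketch below models a journal-to-ledger flow in plain Python as an illustration of the mechanism; it is an assumed toy example, not the entry's actual CPN model.

```python
# Places hold colored tokens; here each token is a journal entry (a dict).
places = {"journal": [], "ledger": []}

def record_entry(debit, credit, amount):
    """Source transition: deposit a colored token (journal entry) in 'journal'."""
    places["journal"].append({"debit": debit, "credit": credit, "amount": amount})

def post():
    """Transition 'post': fires while its input place holds a token,
    moving each entry from the journal place to the ledger place."""
    while places["journal"]:
        entry = places["journal"].pop(0)
        places["ledger"].append(entry)

record_entry("Cash", "Sales", 500)
record_entry("Rent", "Cash", 200)
post()
```

A transition is enabled only when its input places hold tokens, so firing rules make the legal orderings of accounting steps explicit, which is what makes such models useful for teaching transaction flows.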
  • 678
  • 28 Mar 2022
Topic Review
Benchmarking Message Queues
Message queues are a way for different software components or applications to communicate with each other asynchronously by passing messages through a shared buffer. This allows a sender to send a message without needing to wait for an immediate response from the receiver, which can help to improve the system's performance, reduce latency, and allow components to operate independently. This entry compares the performance of four popular message queues: Redis, ActiveMQ Artemis, RabbitMQ, and Apache Kafka.
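The shared-buffer pattern described above can be sketched with the standard library, with `queue.Queue` standing in for the broker that systems like Redis, Artemis, RabbitMQ, or Kafka provide at scale; the message names and sentinel shutdown are illustrative choices.

```python
import queue
import threading

buf = queue.Queue()   # the shared buffer between producer and consumer
received = []

def consumer():
    while True:
        msg = buf.get()    # blocks until a message is available
        if msg is None:    # sentinel value: shut down the consumer
            break
        received.append(msg.upper())   # "process" the message

t = threading.Thread(target=consumer)
t.start()

# The producer returns immediately after enqueuing each message;
# it never waits for the consumer to process it.
for m in ("order-1", "order-2", "order-3"):
    buf.put(m)
buf.put(None)
t.join()

print(received)  # ['ORDER-1', 'ORDER-2', 'ORDER-3']
```

Decoupling the two sides this way is what lets a slow consumer lag behind a fast producer without blocking it, at the cost of buffering; benchmarks of real brokers largely measure how well they sustain this under load.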
  • 678
  • 24 Jul 2023
Topic Review
OS-Level Virtualization
OS-level virtualization refers to an operating system paradigm in which the kernel allows the existence of multiple isolated user-space instances. Such instances, called containers (Solaris, Docker), zones (Solaris), virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels (DragonFly BSD), or jails (FreeBSD jail or chroot jail), may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources (connected devices, files and folders, network shares, CPU power, quantifiable hardware capabilities) of that computer. However, programs running inside a container can only see the container's contents and the devices assigned to the container. On Unix-like operating systems, this feature can be seen as an advanced implementation of the standard chroot mechanism, which changes the apparent root folder for the currently running process and its children. In addition to isolation mechanisms, the kernel often provides resource-management features to limit the impact of one container's activities on other containers. The term "container," while most popularly referring to OS-level virtualization systems, is sometimes ambiguously used to refer to fuller virtual machine environments operating in varying degrees of concert with the host OS, e.g. Microsoft's "Hyper-V Containers."
  • 678
  • 20 Oct 2022