Blockchain: Overview and Definitions
Some groundwork for what we nowadays refer to as “blockchain” was performed by Stuart Haber and W. Scott Stornetta in 1991, in their article entitled “How to Time-Stamp a Digital Document” [13], while the popular paper by Satoshi Nakamoto in 2008 [14] established bitcoin as a cryptocurrency and devised the first blockchain database. According to Mayank Raikwar et al., a very general definition of blockchain is: “Blockchain is a distributed ledger maintaining a continuously growing list of data records that are confirmed by all of the participating nodes” [15].
A block is a record that contains data, the hash of the previous block and, finally, its own hash. The hash serves as the digital fingerprint of the block’s contents. Because each block stores the hash of its predecessor, the blocks form a cryptographically linked chain. If anyone tampers with the data of a block, its digital fingerprint changes, the link to the following block no longer matches and the chain becomes invalid. Many more concepts surround blockchain, such as mining, distributed peer-to-peer network protocols, consensus ledgers and cryptographic hashes.
The “fingerprint” attribute uniquely identifies every block and is one of the core principles of the blockchain architecture. A commonly used cryptographic hash function for any digital data source is the well-known SHA256, which was developed by the National Security Agency (NSA). SHA stands for secure hash algorithm and ‘256’ is the length, in bits, of the hash value it produces. There are five basic requirements that SHA256 satisfies:
A hashing algorithm is used because reversing it is not practical: hashing is a one-way function, so the original data cannot realistically be recovered from the hash. The importance of the SHA256 algorithm thus lies in the fact that any attempt to crack it is not realistically feasible, owing to its inherent characteristics.
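The fingerprint behaviour described above can be illustrated with a minimal sketch based on Python's standard `hashlib` module. The helper names `sha256_hex` and `differing_bits` and the sample strings are assumptions made for illustration only; they do not come from any blockchain implementation.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Hypothetical sample records that differ in a single character.
first = sha256_hex("Alice pays Bob 10 coins")
second = sha256_hex("Alice pays Bob 11 coins")

print(len(first) * 4)                                   # digest length in bits
print(first == sha256_hex("Alice pays Bob 10 coins"))   # same input, same hash
print(differing_bits(first, second))                    # a small change flips many bits
```

The digest length is fixed regardless of the input size, the same input always yields the same digest, and a one-character change produces a very different digest, which is what makes the hash usable as a fingerprint.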
Blockchain is also an immutable digital ledger: if anyone tampers with or corrupts the data of a specific block, the hash of that block changes and no longer matches the previous-hash value stored in the next block, which disrupts the cryptographic link. Because of the change in one block, all the blocks after it are no longer valid, which means that they are no longer connected to the chain. Therefore, once data have entered the blockchain, they cannot be altered unless all the entries following the tampered block are altered as well. Because of this fundamental structure, it is practically impossible to change a single block in the chain, especially as more and more blocks are added. It has to be noted that any digital ledger (blockchain in this case) is only as reliable as the organization that maintains it [17].
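The way a single altered block invalidates the chain can be sketched as follows. This is a simplified illustration, assuming a block is a plain dictionary and that the hash covers the block number, the data and the previous hash; the names `block_hash`, `build_chain` and `first_invalid_block`, and the sample records, are hypothetical.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index, data, prev_hash):
    """Hash the block contents with SHA-256 (the field layout is an assumption)."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link the records into a chain where each block stores its predecessor's hash."""
    chain, prev_hash = [], GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails validation, or None if valid."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return block["index"]  # link to the previous block is broken
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if recomputed != block["hash"]:
            return block["index"]  # contents no longer match the stored fingerprint
        prev_hash = block["hash"]
    return None


chain = build_chain(["harvest lot 1", "shipment lot 1", "retail lot 1"])
print(first_invalid_block(chain))  # intact chain

chain[1]["data"] = "shipment lot 2"  # tamper with one block
print(first_invalid_block(chain))  # validation now fails at the tampered block
```

Even if the attacker also recomputes the hash of the tampered block, validation fails one block later, because the next block still stores the old hash; the attacker would have to rewrite every subsequent block.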
If one or more blocks of the chain are maliciously changed, or a system error occurs during data input, the cryptographic link is broken and the original data have to be restored. This problem is solved through the deployment of distributed peer-to-peer (P2P) networks, which are one of the key components of blockchain technology and offer a significant enhancement of data storage compared to traditional centralized models. A P2P distributed network consists of a great number of computers (nodes) in which every device is interconnected (locally, wirelessly or by cable); ideally, the more nodes that connect, the better. The way in which this type of distributed network is used can affect the whole blockchain scenario.
A P2P network stores and transfers data between the clients (or nodes) of the network without the need for a central point of storage (central server), so the data are less vulnerable to being hacked or lost. In a blockchain P2P network, the blocks of the chain can be copied across all the existing computers of the network (thousands or millions of computers) through the usage of the appropriate cryptographic key across all the peers. As time passes, the blockchain grows and the system becomes more and more complex. Anyone who attempts a malicious alteration of one or more blocks of the chain has to control, attack or manipulate more than 50% of the network’s peers in order to break the blockchain; otherwise, the other peers of the network will immediately notice the difference and will alert the whole network by sending a signal to replace the broken copy. In this way, distributed P2P networks add an extra level of security on top of the existing cryptographic hash, which on its own is a one-level security schema. In a consensus protocol, the more layers of security used, the stronger and safer the blockchain is made [18].
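The replacement of a tampered copy by the version held by most peers can be sketched as below. This is a deliberately simplified model that counts peers and represents each chain copy as a list of block hashes; the names `majority_version` and `repair_network` and the peer labels are assumptions, and real networks apply more elaborate rules (Bitcoin, for example, follows the chain with the most accumulated work rather than counting peers).

```python
from collections import Counter


def majority_version(copies):
    """Return the chain version held by more than half of the peers, or None."""
    counts = Counter(tuple(chain) for chain in copies.values())
    version, votes = counts.most_common(1)[0]
    return list(version) if 2 * votes > len(copies) else None


def repair_network(copies):
    """Overwrite every deviating copy with the majority version, if one exists."""
    agreed = majority_version(copies)
    if agreed is None:
        return dict(copies)  # no majority: nothing can be repaired safely
    return {peer: list(agreed) for peer in copies}


# Hypothetical network of five peers; one copy has been tampered with.
honest = ["h0", "h1", "h2"]
copies = {
    "peer_a": list(honest),
    "peer_b": list(honest),
    "peer_c": ["h0", "tampered", "h2"],
    "peer_d": list(honest),
    "peer_e": list(honest),
}

repaired = repair_network(copies)
print(repaired["peer_c"] == honest)  # the tampered copy has been replaced
```

In this model an attacker succeeds only by controlling more than half of the copies, which mirrors the threshold stated in the text.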
Every block that composes the blockchain contains at least four fields, namely:
- The number of the block;
- The stored data (or stored transactions);
- The hash of the previous block;
- The hash of the current block.
Another field in the block is called the nonce (number used only once) and relates to the meaning of “mining”. The hash of every block is dictated by four components, i.e., the block number, the previous hash, the stored data and the nonce, and is generated by providing these as input to the SHA256 hashing algorithm. The block number, the previous hash (which is linked directly to the contents of the previous block) and the stored data cannot be changed, because this would essentially mean that data tampering is attempted. The nonce provides the additional control and flexibility that makes it possible to obtain a valid hash value (one that meets certain requirements) without changing any of the other components. In Proof of Work blockchains, miners must find (by brute force and with significant computational power) a nonce value that, when plugged into the hashing algorithm, generates an output that meets specific requirements (e.g., a certain number of leading zeros).
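The brute-force search for a nonce can be sketched in a few lines. The example below is a toy model, not the Bitcoin algorithm: the way the four components are concatenated, the function names `block_hash` and `mine`, and the requirement of three leading hexadecimal zeros are all assumptions chosen so that the search finishes quickly.

```python
import hashlib


def block_hash(number, nonce, data, prev_hash):
    """SHA-256 over the four components named in the text (the layout is assumed)."""
    payload = f"{number}|{nonce}|{data}|{prev_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mine(number, data, prev_hash, leading_zeros=3):
    """Try nonces 0, 1, 2, ... until the hash starts with the required zeros."""
    prefix = "0" * leading_zeros
    nonce = 0
    while True:
        candidate = block_hash(number, nonce, data, prev_hash)
        if candidate.startswith(prefix):
            return nonce, candidate
        nonce += 1


# Hypothetical block contents; only the nonce is varied during the search.
nonce, mined_hash = mine(1, "Alice pays Bob 10 coins", "0" * 64)
print(nonce, mined_hash)
```

Each additional required hexadecimal zero multiplies the expected number of attempts by 16, whereas checking a proposed nonce takes a single hash computation; this asymmetry is what makes the found nonce a proof of the work performed.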
Hence, mining can be perceived as the process of creating a new block for the blockchain and enriching it with a number of transactions. The mining difficulty is the number that indicates how much work a node’s computer has to perform in order to create a new block. In the bitcoin blockchain, this is not a constant number but is automatically adjusted every 2016 blocks. Normally, the creation of this number of blocks should take approximately two weeks. If the blocks of the last period were created faster than that, the mining difficulty is increased; otherwise, it is reduced. For the blockchain, this means that if more miners try to solve the cryptographic puzzle and blocks are therefore found in a shorter timeframe, the system increases the difficulty in order to maintain the block creation rate. In the opposite situation (many users stop mining), the difficulty is reduced. In other words, the difficulty adjustment ensures that creating a block takes a specific amount of time on average, no matter how many users are mining or how fast the hardware is. One important disadvantage of this process is that it tends to lead to centralization.
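The adjustment rule can be sketched numerically. The snippet below is a simplified model of the Bitcoin retargeting rule: the ten-minute block target follows from dividing two weeks by 2016 blocks, the limit of a factor of four per adjustment reflects the Bitcoin rule as commonly documented, and the function name `retarget` and the sample difficulty values are illustrative assumptions.

```python
TARGET_BLOCK_SECONDS = 10 * 60          # two weeks / 2016 blocks
RETARGET_INTERVAL = 2016                # blocks between adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # 1,209,600 s


def retarget(old_difficulty, actual_timespan):
    """Scale the difficulty by expected/actual time, limited to a factor of four."""
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4), EXPECTED_TIMESPAN * 4)
    return old_difficulty * EXPECTED_TIMESPAN / clamped


# Blocks arrived twice as fast as intended (one week): difficulty doubles.
print(retarget(100.0, EXPECTED_TIMESPAN // 2))
# Blocks arrived half as fast as intended (four weeks): difficulty halves.
print(retarget(100.0, EXPECTED_TIMESPAN * 2))
```

When the last 2016 blocks took exactly the expected time, the difficulty is left unchanged.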
Decentralized applications (Dapps) on the blockchain are interfaces that enable people to connect to and interact with many of its components. They are programs that are stored and run on the peers’ computers in the blockchain network rather than on a centralized server. The core visionary idea is to build a global super-computer in a distributed manner, facilitated by a blockchain in which everything (programs, transactions) is recorded, tracked and stored in an immutable manner. Simultaneously, a copy of that blockchain application resides with the clients.
Ethereum and Smart Contracts
Ethereum is a platform that was proposed in 2013 by Vitalik Buterin. Essentially, the idea behind the Ethereum protocol is that all peers of a network are interconnected. Blockchain technology not only allows one to store transactional data but also to store and execute programs, enabling any application to be decentralized [19]. An article by Vitalik Buterin, the founder of Ethereum, discussed the meaning of decentralization and presented three different levels of (de)centralization:
Even today, many applications use a certification access mechanism that does not provide full visibility to the peers of the network. To address this problem in applications such as product supply chains, an environment that facilitates “smart contracts” is a promising approach. The term “smart contract” was first used by Nick Szabo in 1997. His main vision for smart contracts was the creation of a distributed ledger to store contracts. A contract is a set of rules or clauses that the parties have agreed upon to govern the relationship between them. Smart contracts are just like contracts in the real world, with the difference that they are completely digital. They are small programs that are stored and executed in blockchains and that contain tamper-proof logic. Smart contracts inherit some important blockchain attributes: they are immutable and distributed because they are stored inside the blockchain. Being immutable ensures that no one can tamper with the code of the contract, while being distributed means that the output of a smart contract is validated by everyone on the network. Smart contracts are used in many types of blockchain applications, such as in supply chains and in the health sector.
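The idea of a contract as a set of agreed rules enforced by code can be illustrated with a toy model. The sketch below is written in Python purely for illustration; it is not Solidity and does not run on any blockchain. The class `DeliveryContract`, its methods and the party names are hypothetical, and the immutability and distributed validation described above are not modelled here.

```python
class DeliveryContract:
    """Toy escrow rule: the seller is paid only after the buyer confirms delivery."""

    def __init__(self, buyer, seller, price):
        self.buyer = buyer
        self.seller = seller
        self.price = price
        self.deposited = 0
        self.delivered = False
        self.paid_out = False

    def deposit(self, sender, amount):
        """Accept exactly one deposit of the agreed price from the buyer."""
        if sender != self.buyer or amount != self.price or self.deposited:
            raise ValueError("deposit rejected")
        self.deposited = amount

    def confirm_delivery(self, sender):
        """Only the buyer may confirm, and only after funds have been deposited."""
        if sender != self.buyer or not self.deposited:
            raise ValueError("confirmation rejected")
        self.delivered = True

    def release_payment(self):
        """Release the escrowed funds to the seller once delivery is confirmed."""
        if not self.delivered or self.paid_out:
            raise ValueError("payment cannot be released")
        self.paid_out = True
        return (self.seller, self.deposited)


contract = DeliveryContract(buyer="retailer", seller="farmer", price=100)
contract.deposit("retailer", 100)
contract.confirm_delivery("retailer")
print(contract.release_payment())
```

The rules are checked on every call, so no party can obtain the payment by skipping a step; on a blockchain, the same checks would be executed and verified by the nodes of the network.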
When a block (or transaction) is scanned and sourced in a completely digitized way, the specific transaction is confirmed and the block is appended to the chain. After the execution of the contract, a certificate is issued, from which a variety of information related to the blockchain transactions can be retrieved. Finally, every client in the network retains a copy of the smart contract and, as a result, each node has the following:
Ethereum was specifically created and designed for smart contracts support. There are many examples of programming languages that allow software coding within the blockchain. A widely used tool for this purpose is Ethereum’s Solidity programming language, a Turing-complete language. Solidity defines and determines a sequence of specific rules that dictate how a program operates and executes [19].
Table 1 provides an overview of the utilization of smart contracts in different agriculture traceability systems where blockchain technology is applied. As may be observed, the majority of current research works adopt smart contract technology on the Ethereum blockchain in order to implement various types of transactions (such as transactions between farmers, suppliers and distributors).
Table 1. Survey of research works in agriculture traceability systems: classification depending on the usage (or not) of smart contracts.
| Use of Smart Contracts | Literature Works |
|---|---|
| Literature involving smart contracts in agriculture traceability systems | [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58] |
| Literature involving blockchain technology but without the use of smart contracts in agriculture traceability systems | [10,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73] |
Consensus Methodologies
A very important characteristic, not only for blockchain technology but also for any type of decentralized system, is Byzantine fault tolerance. The Byzantine generals’ problem was conceived in 1982 as a logical dilemma that illustrates how a group of generals, who may have communication problems, try to agree on their next move. Applied to the operation of blockchains, each general represents a network node, and all nodes need to reach a consensus on the current state of the system. This means that the majority of the participants within a distributed network have to agree on and execute the same action in order to avoid failures. The Byzantine generals’ problem gave birth to the concept of Byzantine fault tolerance, which is the property of a system to resist the class of failures derived from the Byzantine generals’ problem. In other words, a Byzantine fault tolerant system is able to continue operating even if some of the nodes fail to communicate or act in a malicious way. There are multiple ways of building a Byzantine fault tolerant blockchain system, and these are related to the different types of consensus algorithms/protocols [74,75].
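How many faulty nodes such a system can withstand can be made concrete with a short sketch. It relies on the classical bound from the Byzantine generals literature, which is an addition here and is not stated in the text above: with unauthenticated messages, agreement among n nodes is possible only if fewer than one third of them are faulty (n ≥ 3f + 1). The function names and the two-thirds decision rule are simplifying assumptions, not a complete consensus protocol.

```python
from collections import Counter


def max_tolerated_faults(node_count):
    """Largest f such that node_count >= 3 * f + 1 (classical Byzantine bound)."""
    return (node_count - 1) // 3


def decide(votes):
    """Return the value backed by more than two thirds of the votes, else None."""
    value, count = Counter(votes).most_common(1)[0]
    return value if 3 * count > 2 * len(votes) else None


print(max_tolerated_faults(4))                           # faults tolerated by four nodes
print(decide(["attack", "attack", "attack", "retreat"]))  # three of four agree
print(decide(["attack", "attack", "retreat"]))            # two of three is not enough
```

With four nodes, one traitor cannot prevent the three honest nodes from reaching the two-thirds threshold, whereas with three nodes a single traitor can.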
A consensus algorithm can be defined as a mechanism through which a blockchain network reaches consensus. The consensus protocol for a blockchain has to solve two major challenges: the first one is to protect the network from attackers, and the second one is to tackle competing chains. The most celebrated consensus protocols are the Proof of Work (PoW) and the Proof of Stake (PoS). In addition to these, there are some other consensus algorithms, such as Proof of Authority (PoA), Proof of Burn (PoB), Proof of Elapsed Time (PoET), Proof of Capacity (PoC), and others.
More specifically, PoW gives more rewards to participants that own more and better equipment: the higher the hash rate of a user, the higher the chance of creating the next block and receiving the mining reward. In the end, the hash of every block is itself the proof of work, since it can only be obtained by solving a cryptographic puzzle. The term “mining pools” refers to groups of miners that combine their hashing power and distribute the reward across every node in the winning pool. The use of mining pools renders the blockchain more centralized, as opposed to decentralized schemas.
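The relationship between hash rate, the chance of winning a block and the sharing of a pool reward can be sketched with simple arithmetic. The function names, the hash-rate figures and the proportional payout rule are assumptions for illustration; actual pools use a variety of payout schemes.

```python
def block_win_probability(miner_hash_rate, network_hash_rate):
    """Approximate chance of mining the next block: the share of total hash rate."""
    return miner_hash_rate / network_hash_rate


def split_pool_reward(reward, contributions):
    """Divide a block reward among pool members in proportion to their hash rate."""
    total = sum(contributions.values())
    return {member: reward * rate / total for member, rate in contributions.items()}


# Hypothetical figures: a pool of three members inside a larger network.
pool = {"miner_a": 10, "miner_b": 20, "miner_c": 30}
network_hash_rate = 1000

print(block_win_probability(pool["miner_a"], network_hash_rate))    # solo chance
print(block_win_probability(sum(pool.values()), network_hash_rate)) # pooled chance
print(split_pool_reward(6.0, pool))                                  # shared reward
```

Pooling raises the frequency of payouts for each member without changing the expected income, which explains why hashing power tends to concentrate in a small number of pools.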
In order to solve the above issue, an idea for a new blockchain consensus protocol, as an alternative to the Proof of Work protocol, was born. In 2011, a Bitcointalk forum user called QuantumMechanic proposed a new technique called Proof of Stake (PoS). The differences between PoW and PoS are quite significant. PoS is more decentralized than PoW, it does not let every user mine new blocks, it is far less expensive because it does not require PoW mining equipment and, finally, it encourages more people to set up a node, making the network more decentralized as well as more secure.
Besides these advantages, the PoS protocol entails additional risks when compared to PoW. Specifically, if a single miner or a group of miners obtains 51% of the hashing power, they can effectively control and manipulate the blockchain. This attack is known as the 51% Attack. PoS makes this type of attack very impractical, depending on the value of the cryptocurrency, so it is less likely to occur with this type of consensus protocol, yet it remains an important risk. Another important risk lies in the way that PoS algorithms select the next validator. The process is random, and validators are typically selected based on a combination of the lowest hash value and the highest stake, or based on how long their tokens have been staked. A further problem of PoS is that the user selected as the next validator may fail to perform the duties of that role. One approach to solving this is to choose a large number of backup validators [76].
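A stake-weighted random selection with backup validators can be sketched as follows. This is a minimal illustration, assuming that the chance of selection is simply proportional to the stake; the function names, the stake values and the number of backups are hypothetical, and real PoS protocols use more elaborate, verifiable sources of randomness.

```python
import random


def select_validator(stakes, rng):
    """Pick one validator at random, weighted by the amount each has staked."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def select_with_backups(stakes, rng, backups=2):
    """Return a primary validator followed by distinct backup validators."""
    remaining = dict(stakes)
    order = []
    for _ in range(min(backups + 1, len(remaining))):
        chosen = select_validator(remaining, rng)
        order.append(chosen)
        del remaining[chosen]  # a validator cannot be its own backup
    return order


# Hypothetical stakes; a seeded generator keeps the illustration reproducible.
stakes = {"node_a": 50, "node_b": 30, "node_c": 15, "node_d": 5}
rng = random.Random(42)
print(select_with_backups(stakes, rng, backups=2))
```

If the primary validator does not perform its duties, the next entry in the returned list can take over, which is the purpose of choosing backups in advance.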
Public vs. Private Blockchain
Adhering to the view of open access for everyone, many widely used blockchains (Bitcoin, Ethereum and Dash) are public networks. Public blockchain networks are characterized as permissionless: anyone can join, read and/or write data, create smart contracts or even run a node within them, which ensures one hundred percent transparency as well as a high level of anonymity. Public networks are recommended for entities involved mainly in crypto-economics. In contrast to public blockchain networks, private networks restrict participant or validator access, forming a classic closed ecosystem in which all peers are well defined and only pre-approved entities can run nodes. Following a business-to-business approach, many companies use private blockchain networks in order to benefit from this technology without sacrificing their autonomy. Private blockchains, in contrast to public ones, typically use a type of consensus other than PoW, and might aim at keeping certain information private from the public. In summary, public blockchain networks enjoy freedom in decentralization, whereas private blockchain networks enjoy freedom in configurational flexibility [77]. Hybrid approaches also exist; these are called permissioned blockchain systems. Different types and configurations of permissioned blockchain systems exist, but typically the consensus process is controlled by a predefined list of participants, and users cannot participate without permission. Access to the full information of transactions on the blockchain might be restricted, depending on the user role.
Table 2 provides an overview of the types of blockchain used in different agriculture traceability systems. The majority of research works have used public blockchain types, in association with either Ethereum or Hyperledger, whereas a smaller percentage prefer the adoption of permissioned or fully private blockchains. Nonetheless, it is notable that many research studies attempt to make a more theoretical contribution and do not explicitly refer to a specific blockchain implementation.
Table 2. Literature survey: use of different blockchain types and implementations in agriculture traceability systems.
| Public vs. Private Blockchain | Blockchain Implementation | Research Works |
|---|---|---|
| Public | Ethereum | [28,29,30,31,33,38,51,52,56,62,63] |
| Public | Hyperledger Sawtooth | [54] |
| Public | Hyperledger (other than Sawtooth) | [34,49] |
| Public | Ethereum or Hyperledger Sawtooth | [43] |
| Public | Ethereum or Hyperledger (other than Sawtooth) | [23] |
| Public | Not specified | [10,22,24,26,36,39,41,42,44,46,50,55,59,60,64,65,67,69,70,72,73,78,79] |
| Private | Ethereum | [51] |
| Private | Ethereum or Hyperledger Sawtooth | [23] |
| Private | Not specified | [26,32,47,50,66] |
| Permissioned (Hybrid) | Ethereum | [27,40,48] |
| Permissioned (Hybrid) | Hyperledger Sawtooth | [37,45,57,58] |
| Permissioned (Hybrid) | Hyperledger Fabric | [35,53,68] |
| Permissioned (Hybrid) | Not specified | [25,61,71] |