Supply chain sustainability (SCS) in the age of Industry 4.0 and Big Data is a growing area of research. However, there are no systematic and extensive studies that classify the different types of research and examine the general trends in this area of research. This paper reviews the literature on sustainability, Big Data, Industry 4.0 and supply chain management published since 2009 and provides a thorough insight into the field by using bibliometric and network analysis techniques. A total of 87 articles published in the past 10 years were evaluated and the top contributing authors, countries, and key research topics were identified. Furthermore, the most influential works based on citations and PageRank were obtained and compared. Finally, six research categories were proposed in which scholars could be encouraged to expand Big Data and Industry 4.0 research on SCS. This paper contributes to the literature on SCS in the age of Industry 4.0 by discussing the challenges facing current research but also, more importantly, by identifying and proposing these six research categories and future research directions.
Sustainable development is defined as development that meets the needs of the present without compromising the ability of future generations to meet their own needs [1], and it has become a key strategic objective worldwide. Sustainable development requires the consideration and integration of economic, social, cultural, political and ecological factors in decision-making, in an attempt to balance economic development, social development and environmental protection [2]. As a result, the majority of large companies and an important number of small and medium-sized enterprises have been incorporating policies and actions aimed at improving their sustainability and the sustainability of their supply chains (SC) [3]. This consideration of sustainability as an objective in supply chain management, also known as Sustainable Supply Chain Management (SSCM), is due mainly to three factors: (1) pressure from stakeholders (such as investors, shareholders, customers and non-profits) to alleviate the enormous environmental impacts that are being generated (deterioration of the environment, scarcity of resources, increase in waste generated, increased pollution); (2) the generation of brand value that serves as a differentiating element against competitors; and (3) increasingly restrictive regulations [4].
However, the incorporation of sustainability into supply chain management faces a number of obstacles and difficulties. Pagell [5] states that present knowledge in SSCM is not sufficient to create truly sustainable supply chains, and identifies problems with assumptions, norms, institutions, measures, and methods that future research needs to address. In addition to these shortcomings, Luthra & Mangla [6] add another: the technical issues related to the generation, processing and analysis of data that will allow greater effectiveness and efficiency in business processes, as well as performance control and support for supply chain decision-making.
This is where the concept of Industry 4.0 and the use of Big Data technology come into play. Industry 4.0 has become a buzzword that describes the trend towards digitalization and automation of manufacturing. It allows products, machines, components, individuals and systems to create a smart network so that it can integrate cyber-physical systems to act quickly by linking information and physical memory to the smart network for faster and more effective service environments [7]. The contribution of Industry 4.0 to more sustainable industrial value creation will be remarkable in the future [8]. For example, it will allow efficient allocation of resources, including water, energy, raw materials and other products, based on data that is collected in real time, resulting in new sustainable green practices. Big Data analytics, in turn, are technologies and techniques used to analyse large-scale, complex data from various applications in order to acquire intelligence and to extract unknown, hidden, valid and useful relationships, patterns and information [9]. Big Data can be very useful to enhance supply chain sustainability as it can be used to identify the sustainability impact of the SC in the past, as well as to predict its future sustainability impact [10]. Therefore, the combined use of digitalization and automation of Industry 4.0 together with Big Data technology has great potential to overcome some of the problems affecting sustainable supply chain management [11].
However, research on Big Data and Industry 4.0 to enhance supply chain sustainability is still limited, and its application in real business cases is usually at an early stage [12]; hence, practitioners have problems concerning how to use Big Data and Industry 4.0 in supply chain sustainability [13]. This could limit its impact, since existing SSCM literature has been primarily focused on other problems, such as building the definitions of SSCM; discussing the implementation of SSCM; proposing strategic decisions to be considered in SSCM; analysing SSCM governance mechanisms; and developing frameworks for SSCM [14]. Therefore, more research is required on the technological issues concerning the Industry 4.0 and Big Data solutions that support sustainable practices in the supply chain [6, 15-16].
To bridge the gap between the growing interest in how Industry 4.0 and Big Data can help to support sustainable supply chains and the scarcity of systematic and extensive reviews of the recent research on this subject, this entry seeks to explore the status of research in the domains of Industry 4.0, Big Data and sustainable supply chains, as well as to establish a categorical framework that brings together research conducted on the basis of relevant common points. The hypothesis is that a bibliographic analysis of current research can facilitate the advancement of future research in this area. In particular, the following research questions (RQs) are elaborated:
RQ1: Which are the top contributing authors, countries, and institutions in the field of Big Data and Industry 4.0 applied to SSCM?
RQ2: Is it possible to define research categories on the basis of relevant common points?
RQ3: Which are the future research necessities in the field of Big Data and Industry 4.0 applied to SSCM?
To answer the above research questions, and to address Industry 4.0 and Big Data as means to support sustainable supply chains, this entry (1) reviews the literature on Big Data, Industry 4.0 and supply chain management, published since 2009; (2) provides a thorough insight into the field by using bibliometric and network analysis techniques and by evaluating 87 articles published over the past 10 years, and identifies top contributing authors, countries and key research topics related to the field; (3) obtains and compares the most influential works based on citations and PageRank; (4) identifies and proposes six established and emerging research categories which would encourage scholars to expand research on Industry 4.0, Big Data and SSCM; and (5) identifies the future research necessities in every category. The methodology followed has combined Rowley and Slack's proposal [17] and the snowball method [18] for item selection. To perform the bibliographic and network analyses, the paper by Mishra et al. [19] has been used as a model. Finally, for the identification of the categories, the comparative method proposed by Collier [20] has been followed.
2.1. Sustainable Supply Chain Management
A Supply Chain can be defined as “a set of three or more entities (organizations or individuals) directly involved in the upstream and downstream flows of products, services, finances, and/or information from a source to a customer” [21]. The supply chain should be designed, managed and coordinated as a single entity rather than separately as individual functions. Therefore, an adequate Supply Chain Management (SCM) must:
Nowadays, one of the main challenges in SCM is the integration of sustainability principles into the supply chain, considering a multidimensional (economic, environmental and social impact) and multiscale (institutional, geographical and temporal) approach [25]. The objective is to adopt social and environmental practices in the supply chain, in alignment with diverse stakeholder expectations, to mitigate sustainability-related risks [26]. This has led to the emergence of a new stream within SCM, sustainable supply chain management (SSCM). SSCM can be defined as “the designing, organizing, coordinating, and controlling of supply chains to become truly sustainable with the minimum expectation of a truly sustainable supply chain being to maintain economic viability, while doing no harm to social or environmental systems” [5].
There is a growing movement towards adopting social and environmental practices in the supply chain [27]. Many papers have discussed the drivers and enablers for organizations implementing SSCM [14]. After an analysis of the literature, Zimon et al. [28] summarize these drivers as: management commitment, organisational involvement, supportive culture, productivity improvement, waste elimination, competitive opportunity, business social compliance, environmental regulation compliance, green product requirement, reverse logistics requirement, customer and supplier involvement, regulatory pressure, institutional pressures, international environmental regulation, competition, reputation, and social responsibility. Therefore, SSCM has become a synergistic and dynamic part of corporate competitive advantage, and understanding how SSCM contributes to the creation of value is an important part of management [29].
SSCM involves different organizational, human and technological elements, such as: (1) the involvement of all members of the supply chain in defining a new mission, vision, values, policies, objectives and strategies of the SC that include the sustainability approach; (2) the re-design of business processes and the creation of new ones, such as reverse logistics or closed-loop supply chains, to incorporate sustainable practices; (3) training and generation of new skills among human resources; and (4) the support of information and communications technologies to optimize business processes and decision-making. This makes SSCM a complex task that has to overcome different difficulties, such as the diversity of products and services provided by the supply chain [30], the geographical dispersion of the members of the supply chain with different legal regulations [31], the need to measure the impacts of the supply chain upstream and downstream [32], and the difficulty of obtaining data beyond the supply chain [33].
To solve these problems, different frameworks/models have been developed to integrate sustainability in supply chains. Zimon et al. [29] identify four categories of SSCM models/frameworks: implementation models, conceptual models, performance models, and contextual factor models. A thorough review of the different frameworks can be found in [34]. However, none of these frameworks addresses the possibilities of combining Big Data and Industry 4.0 to support SSCM.
2.2. Industry 4.0
Industry 4.0 is a German project that has amalgamated manufacturing with information technology. Its aim is to work with a higher level of automation in order to achieve a higher level of operational productivity and efficiency by connecting the physical to the virtual world [35]. This connection is achieved using technology-based production processes and equipment that communicate autonomously with each other along the value chain [36]. Therefore, Industry 4.0 is an industrial approach based on three fundamental principles: (1) Equipment and processes are connected to each other and operate with the maximum possible degree of autonomy, allowing horizontal and vertical integration across the entire value creation network; (2) Digitalization of the product and service offerings, and end-to-end engineering throughout the entire life cycle; (3) Innovative digital business models [37].
Industry 4.0 implies complete communication among the different components of the supply chain, such as companies, factories, suppliers, logistics, resources and customers [38]. Each of them optimizes its configuration in real time depending on the demands and status of the other members of the supply chain, which will allow for the incorporation of sustainable practices [38-42]. For example, costs, pollution, raw material consumption and CO2 emissions will be reduced [43].
The information technology part of Industry 4.0 consists of cyber-physical systems (CPS) operating in a self-organized and decentralized manner, using cloud computing and the Internet of Things (IoT) to communicate and cooperate with each other and with humans in real time [15]. This cooperation is based on Interoperability, Virtuality, Decentralization, Real-Time Capability, Modularity and Service Orientation [43].
2.3. Big Data
Currently, there is no globally accepted definition of the term Big Data, although the 6Vs framework has emerged as a structure commonly used to describe it: Volume, very large amount of data; Velocity, the data are generated very quickly and must be processed in a very short time; Variety, a large number of structured and unstructured data types are processed; Value, the goal is to generate significant value for the organization; Veracity, reliability of the processed data; and Variability, flexibility to adapt to new data formats, by collecting, storing and processing them [44].
Big Data analytics are technologies and techniques used to analyse large-scale, complex data from various applications in order to acquire intelligence and to extract unknown, hidden, valid and useful relationships, patterns and information. Different methods are used to deal with such data. Some of the most important include: Text analytics; Audio analytics; Video analytics; Social Media analytics and Predictive analytics [45-46]. Therefore, Big Data involves a complex, interconnected and multilayered ecosystem of high-capacity networks, users, applications and services needed to store, process, visualize and deliver results to destination applications from different data sources [47].
Big Data analytics are being used in different aspects of SCS, such as to support world-class sustainable manufacturing [48], to design sustainable ship routing [49] and scheduling [50] or to assess environmental efficiency [51].
The bibliographic search was conducted using the Web of Science. Web of Science and Scopus are the main sources of bibliographic citations used for bibliometric analyses. This is mainly because they are the only ones that combine both a rigorous selection process and a wide interdisciplinary coverage, which represents a significant strength over other databases. On the one hand, there are other popular interdisciplinary databases such as Google Scholar, but the low data quality found in Google Scholar raises questions about its suitability for research evaluation [52]. On the other hand, there are prestigious bibliographic databases, but they are focused on specialised fields, or their coverage is regional or even limited to a single country.
In this case, a systematic and extensive review of Industry 4.0 and Big Data to support sustainable supply chains requires relevant, high-quality papers from different scientific fields, because sustainability is an interdisciplinary field addressed from different research domains such as business management, environmental and social sciences, engineering, etc. Therefore, Web of Science and Scopus are the best databases for this study.
Even so, both have limitations and biases that lead to the under-representation of articles by field, language or country. The most comprehensive approach would be to combine different search engines so that they can each cover the biases of the others [52]. In the study presented in this paper only the Web of Science has been used because it offers several analysis tools that will be necessary and that are not compatible with other systems. The reason for not using Scopus is that its search tool works by study areas and does not offer the versatility to carry out the search by keywords referring to content, title or author, which is a crucial element for this study.
The desktop version of EndNote has been used for the management of the bibliographic libraries extracted from the search in the Web of Science. This software was selected because it is a tool that is very easy to use, very powerful, compatible with many bibliographic search engines, compatible with the MS Office tools and can be downloaded as a trial version.
A tool integrated within the Web of Science has been used for the bibliographic analysis. This tool performs an analysis of the list of articles, grouping them into different categories such as country, institution in which they were developed, the journal that publishes them or their authors. It also allows a report to be created that provides an analysis of the evolution of the number of citations over time and to obtain some reference indicators such as the h-index.
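As an illustration of one of the indicators mentioned above, the h-index can be computed from a list of citation counts. The following is a minimal sketch, not the Web of Science implementation; the function name `h_index` and the citation counts are hypothetical and are not data from the study.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here, so h cannot grow
    return h


# Hypothetical citation counts for seven papers (illustrative only).
example_counts = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_counts))
```

With these illustrative counts, three papers have at least three citations each, but there are not four papers with at least four, so the sketch yields an h-index of 3.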
The list of articles was extracted from the Web of Science in plain text format and was then processed with the BibExcel tool to generate a file with the necessary format and structure for the network analysis performed in Gephi. Gephi is an open source tool widely used for network and graph analysis. It offers the possibility of performing an advanced exploratory analysis of entity-relationship data. It also allows a significant number of layout, organization and clustering algorithms to be applied in order to determine the interactions that occur in the data. These techniques allow, among other things: identification of the most influential nodes or entities; identification of groups related by affinity; calculation of relevance indicators such as PageRank; and visualization of information in order to facilitate the identification of elements of interest.
The model for the methodology followed to perform the bibliographic analysis was taken from the article by Mishra et al. [19], which in turn is based on the five-step method proposed by Rowley and Slack [17].
First, the keywords that will define the search for the articles were selected. Three keyword combinations were established: (1) Big Data, Industry 4.0, sustainability, supply chain; (2) Big Data, sustainability, supply chain; and (3) Industry 4.0, sustainability, supply chain. The search was conducted in English as there are a greater number of bibliographic sources in that language [35]. Web of Science has the search criterion 'Topic', which searches for the keyword combinations in the title, abstract and keywords of the papers. This is the search criterion that was used in the study.
The search method used in this study is based on the snowball system described in detail in Wohlin's article [18]. An initial list of articles was obtained from the combination of the aforementioned keywords. The articles in the initial list were then filtered by analysing each of them individually. First, it was checked whether at least two of the search keywords were contained among the keywords of the article in question. If that was the case, references to the third keyword were sought in the abstract. If the keyword did not appear there, the abstract was analysed in detail to determine whether the article dealt with the topic in question, even if the keyword was not explicitly mentioned. If the article did not deal with the topic, it was discarded. Otherwise, the article was selected, and its cited papers were then analysed to determine whether they could also be selected. If so, the process was continued in a similar manner until no further adequate articles were found in the references of a selected article. When this happened, the search went back to the previous level and continued from there.
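The screening and snowballing steps described above can be sketched as a depth-first traversal of reference lists. This is a hedged illustration under several assumptions, not the authors' actual tooling: the `Paper` structure, the function names, the sample library and the `manual_review` callback (which stands in for the detailed human reading of the abstract) are all hypothetical, and treating articles with fewer than two matching keywords as discarded is an assumption, since the text does not state it explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    pid: str
    keywords: set
    abstract: str
    references: list = field(default_factory=list)  # ids of cited papers


def passes_screen(paper, terms, manual_review):
    """Apply the keyword/abstract filter to a single paper."""
    kw = {k.lower() for k in paper.keywords}
    found = [t for t in terms if t.lower() in kw]
    if len(found) < 2:
        return False  # assumption: fewer than two keyword matches means discard
    missing = [t for t in terms if t.lower() not in kw]
    abstract = paper.abstract.lower()
    if all(t.lower() in abstract for t in missing):
        return True   # remaining term is mentioned explicitly in the abstract
    return manual_review(paper)  # human judgement on whether the topic is covered


def snowball(seed_ids, library, terms, manual_review):
    """Depth-first backward snowballing over the references of selected papers."""
    selected, visited = [], set()

    def visit(pid):
        if pid in visited or pid not in library:
            return
        visited.add(pid)
        paper = library[pid]
        if not passes_screen(paper, terms, manual_review):
            return
        selected.append(pid)
        for ref in paper.references:  # go one level deeper, then come back up
            visit(ref)

    for pid in seed_ids:
        visit(pid)
    return selected


# Hypothetical mini-library (illustrative only).
terms = ("big data", "sustainability", "supply chain")
library = {
    "A": Paper("A", {"Big Data", "Supply Chain"},
               "Analytics for sustainability in logistics networks.", ["B", "C"]),
    "B": Paper("B", {"Sustainability", "Supply Chain"},
               "Uses big data to track emissions.", ["D"]),
    "C": Paper("C", {"Logistics"}, "Routing heuristics.", []),
    "D": Paper("D", {"Big Data", "Sustainability"},
               "Energy forecasting in smart grids.", []),
}
result = snowball(["A"], library, terms, manual_review=lambda p: False)
print(result)
```

In this toy run the reviewer callback always rejects, so only papers whose keywords and abstract jointly cover all three terms are kept. In practice the callback would encode the reader's judgement on whether the abstract addresses the topic.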
Applying this process resulted in a list of 87 articles. Once the list had been defined, the analysis tools available on the Web of Science were used. The following tasks were performed with these tools:
The plain text file containing the list of papers was processed with BibExcel to generate a file that was compatible with the Gephi software. This file contained only the relationships between the articles and a label that identified each of them. The relationships reflect the citations that papers on the list make to other papers on the list. The analysis of networks and relationships carried out using Gephi was aimed at: (1) analysing the relationship of articles based on citations made among them; (2) identifying the relevant articles and those that are marginal; (3) detecting clusters or sub-communities on the list based on the relationships generated through citations; and (4) calculating the PageRank indicator.
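To make the structure of such a relationship file concrete, the sketch below keeps only the citations whose citing and cited papers are both on the list and writes them as a simple edge table. It is an illustration under assumptions, not the BibExcel output format: the function names and paper labels are hypothetical, and the `Source`/`Target` column headers are assumed here because they are the ones commonly used when importing edge tables into Gephi.

```python
import csv
import io


def internal_citation_edges(references_by_paper):
    """Keep only citations whose citing and cited papers are both on the list."""
    on_list = set(references_by_paper)
    edges = []
    for citing in sorted(references_by_paper):
        for cited in references_by_paper[citing]:
            if cited in on_list and cited != citing:
                edges.append((citing, cited))
    return edges


def edges_to_csv(edges):
    """Serialize the edges as a two-column table (assumed Source/Target headers)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical reference lists; "X9" is a cited work that is not on the list.
references_by_paper = {
    "P1": ["P2", "X9"],
    "P2": ["P3"],
    "P3": [],
    "P4": ["P1", "P3"],
}
edges = internal_citation_edges(references_by_paper)
csv_text = edges_to_csv(edges)
print(csv_text)
```

Citations to works outside the list, such as the hypothetical "X9", are dropped, which mirrors the statement that the file contains only relationships among the listed articles.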
PageRank, proposed by Brin and Page [53], was the first algorithm used by Google to rank web pages. The algorithm assigns a value to each web page based on a network analysis that measures the interactions among the pages. This algorithm can be applied to any set of elements that are related to each other through citations or references. This is why, shortly after its appearance, it was applied in bibliographic studies and in determining the relevance and prestige of publications [54]. The key feature of this algorithm is that it takes into account not only the number of times an article is cited but also the importance of the articles that cite it.
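The idea can be illustrated with a small power-iteration sketch over a citation network. This is a simplified illustration, not the Gephi implementation used in the study: the damping factor of 0.85, the uniform redistribution of rank from papers that cite nothing on the list, the tolerance and the sample edges are all assumptions made for the example.

```python
def pagerank(edges, nodes, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed citation graph (citing -> cited)."""
    nodes = list(nodes)
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for citing, cited in edges:
        out_links[citing].append(cited)

    rank = {v: 1.0 / n for v in nodes}  # start from a uniform distribution
    for _ in range(max_iter):
        # Rank held by papers with no outgoing citations is spread uniformly
        # (an assumption; other treatments of dangling nodes exist).
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if out_links[v]:
                share = damping * rank[v] / len(out_links[v])
                for target in out_links[v]:
                    new_rank[target] += share
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical citation edges among four listed papers (illustrative only).
example_nodes = ["P1", "P2", "P3", "P4"]
example_edges = [("P1", "P2"), ("P2", "P3"), ("P4", "P1"), ("P4", "P3")]
ranks = pagerank(example_edges, example_nodes)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 4))
```

In this toy network P1 and P2 are each cited exactly once, yet P2 receives the higher score because the paper citing it (P1) is itself cited, whereas the paper citing P1 (P4) is cited by no one and splits its influence between two references. This is the difference between PageRank and a plain citation count.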
Finally, a detailed analysis of the content of all the articles was carried out in order to classify them in different categories. A system inspired by the comparative method proposed by Collier [20] was used to establish the categories. According to this system, a preliminary classification based on a previous content analysis was proposed. To establish these categories, the common points shared by the articles were identified. The fundamental objective of the article and the contributions and advances that it offers the state of the art were the main factors considered. This classification was taken as a starting hypothesis. After that, the adequacy of the categories to classify all the articles was checked paper by paper. When an article was found that did not fit in any category, the classification was rethought with a view to integrating the dissonant element. Several reviews were performed until all the items on the list were properly distributed in the proposed classification.
The results of the bibliometric analysis, the research categories, and future trends can be found at https://www.mdpi.com/2071-1050/12/10/4108
This entry is adapted from the peer-reviewed paper 10.3390/su12104108