Big Data on the Farm: Comparison
Please note this is a comparison between Version 1 by Giovanni Franzo and Version 3 by Jason Zhu.

The demand for poultry meat and eggs is predicted to considerably increase in pace with human population growth. Although this expansion clearly represents a remarkable opportunity for the sector, it conceals a multitude of challenges. Pollution and land erosion, competition for limited resources between animal and human nutrition, animal welfare concerns, limitations on the use of growth promoters and antimicrobial agents, and increasing risks and effects of animal infectious diseases and zoonoses are several topics that have received attention from authorities and the public. The increase in poultry production must be achieved mainly through optimization and increased efficiency. The increasing ability to generate large amounts of data (“big data”) is pervasive in both modern society and the farming industry. Information accessibility—coupled with the availability of tools and computational power to store, share, integrate, and analyze data with automatic and flexible algorithms—offers an unprecedented opportunity to develop tools to maximize farm profitability, reduce socio-environmental impacts, and increase animal and human health and welfare. A detailed description of all topics and applications of big data analysis in poultry farming would be infeasible. Therefore, the present work briefly reviews the application of sensor technologies, such as optical, acoustic, and wearable sensors, as well as infrared thermal imaging and optical flow, to poultry farming. The principles and benefits of advanced statistical techniques, such as machine learning and deep learning, and their use in developing effective and reliable classification and prediction models to benefit the farming system, are also discussed.

  • poultry
  • big data
  • sensors
  • machine learning

1. Introduction

Poultry production plays a critical role in the global economy. Pressure on the agricultural system will increase with the continuing expansion of the human population. By the end of 2050, the demand for poultry meat is estimated to double, and the demand for eggs is estimated to increase by 40%; both represent an important source of highly valuable and inexpensive protein [1][2]. Beyond industrial farming, particularly in the small-scale village context, chicken farming can significantly contribute to poverty alleviation through income generation and household food security [3][4]. Although the increase in poultry demand represents a great opportunity for the industry, it also conceals a multitude of challenges. Pollution and land erosion, competition for limited resources between animal and human nutrition, animal welfare concerns, limitations in the use of antimicrobial agents, and increasing risks and impacts of animal infectious diseases and zoonoses are only some of the topics that have concerned authorities and the public [5]. Whether real or perceived, these aspects pose severe limitations to further expansion of traditional poultry production. A clear solution would be the improvement of production efficiency. Accurate, prompt, and dynamic collection, integration, and analysis of large amounts of data have been key to the success of many productive activities and have become an essential part of our lives. Technologies such as sensors, cloud computing, machine learning (ML), and artificial intelligence (AI) are transforming several industries. Although data collection is already routinely applied in certain agricultural and farming settings, including poultry farming, skepticism persists regarding this approach [6][7].
Because poultry, one of the fastest-growing production species, is raised with highly similar management strategies worldwide and high levels of integration, it offers ideal conditions for the application of new technological developments. Moreover, most animal farmers now have access to modern technologies (such as high-speed internet, smartphones, and inexpensive computing power) that were unavailable a decade ago [8][9]. Unfortunately, many strategies for big data integration, sharing, and analysis remain at early stages of development. Hardware sensors (such as cameras or vision sensors; infrared thermal imaging sensors; temperature sensors; radio frequency identification tags; accelerometers; motion sensors; or microphones) can generate an astonishing amount of information (big data) [1][10]. Similarly, advances in sequencing technologies are continuously increasing the number of available host, microorganism, and pathogen genomes and gene expression profiles. Advanced AI and ML algorithms can be integrated in the data analysis process and make use of these extensive data to analyze, predict, and notify farmers of abnormal occurrences, identifying patterns and suggesting solutions to pressing problems in modern animal farming, and thereby driving strategies to improve the sector’s profitability [8][11].

2. Sensors and Data Generation

In response to the above-mentioned challenges, changes in farming strategies and the implementation of new smart management technologies are highly relevant. These include precision livestock farming practices, in addition to other technologies associated with the collection and use of farm-generated data. Precision livestock farming is the management of livestock production benefitting from automatic data acquisition, access, and processing [12][13]. In intensive poultry production, many factors, such as stocking density, environmental deterioration, unsuitable social environments, thermal stress, or difficulties in accessing essential resources, can be major sources of stress leading to welfare deterioration and reduced performance [14][15]. The collection of environmental variables, such as temperature, air velocity, ventilation rate, litter quality, humidity, and gas concentration, has clear benefits for poultry welfare, mortality, and performance, thereby helping producers reach the desired level of production [1] (Table 1).
Table 1. Examples of sensors used to evaluate and detect alterations in different fields of poultry farming.

| Field | Topic | Sensor | Reference |
|---|---|---|---|
| Infectious disease | Avian influenza | Wearable sensor; Imaging; Sound analysis; Thermal images | [16][17][18][19][20][21][22][23] |
| Infectious disease | Clostridium perfringens | Sound analysis | [24] |
| Infectious disease | Coccidiosis | Volatile organic compounds; Imaging | [25][26][27] |
| Infectious disease | Infectious bronchitis | Sound analysis | [21][22][23] |
| Infectious disease | Newcastle disease | Sound analysis; Imaging | [21][28][29][30][31] |
| Welfare and health | Distress | Thermal imaging; Imaging | [32][33][34][35] |
| Welfare and health | Footpad dermatitis | Imaging | [13][36] |
| Welfare and health | Gait score and lameness | Imaging | [33][37][38] |
| Welfare and health | Management and equipment malfunctioning | Imaging | [31][39] |
| Welfare and health | Thermal comfort | Sound analysis | [35][40] |
| Production | Broiler performances | Feed nutritional composition | [41] |
| Production | Chicken embryo sex assessment | Raman spectroscopy | [42][43][44] |
| Production | Egg production | Multiple environmental sensors | [45][46] |
| Production | Embryo monitoring | Thermal images | [47][48] |
| Production | Live weight of broilers | Imaging | [49][50] |
| Production | Poultry house environmental monitoring | Multiple environmental sensors | [51][52][53][54] |
| Production | Precision feeding systems | Weight sensor; Thermal images | [55][56][57] |

However, although environmental and animal data can be acquired by a multitude of sensors [53], such data, except those for temperature, are not commonly collected in most commercial poultry farms. Temperature, relative humidity, carbon dioxide, and ammonia level monitoring have been effectively used to predict broiler weight as many as 72 h in advance [51]. Such systems can enable early interventions and the achievement of target weight. Integration with other non-invasive surveillance technologies developed and implemented in poultry houses, including those for health, welfare, and feeding, would enable more data to be incorporated into predictive production models, thus potentially enhancing their capabilities. Acoustic sensors have been developed to exploit the vocalizations that birds use for social interaction and alarm signaling; some vocalizations can also be considered reliable stress indicators [13]. Using acoustic parameters such as vocalization frequency has enabled detection of episodes of food deprivation or the inadequacy of the thermal environment in broilers and laying hens [58][59]. Similarly, higher rates of squawks and total vocalizations have been observed in laying hen flocks with feather pecking problems [60]. Detection of infections with pathogenic microorganisms is also possible with this technology. The frequency of rales produced by chickens infected with infectious bronchitis virus (IBV) has been shown experimentally to enable detection of infections before clinical signs are evident [22][23]. Sadeghi et al. have recorded broiler vocalizations in healthy and Clostridium perfringens-infected birds. An artificial neural network model was able to differentiate between infected and healthy birds with an accuracy of 66.6% on day 2 after infection and 100% on day 8 [24].
Air sensors in the poultry industry can now predict the onset of coccidiosis by monitoring volatile organic compounds in the air that increase as the number of infected birds increases, thus enabling much earlier detection of infection spread than would be achievable by farmers or veterinarians [26]. Alerted farmers would be able to take timely measures to prevent further spread of the infection. Such systems could save animal lives and prevent financial losses [10]. Similarly, wearable sensors such as accelerometers have been shown to be useful in identifying influenza viral infection in chickens, by detecting changes in physiology and movement patterns [16]. Although this sensing equipment can prevent economic losses and welfare issues due to disease spread, it would be impractical and too expensive to fit all individuals in a typically large poultry flock with surveillance equipment. However, sensors could be used in a subpopulation of sentinel birds, and may be effective for prevention or early detection, at least in high-risk areas [13]. Therefore, smart poultry management practices can mitigate the risks of infection and disease, and the consequent health threat to both animals and humans, through prompt diagnosis and detection at the point of care (i.e., performing medical diagnostic testing in an area where a patient can receive care) [61]. Rapid detection systems continuously monitoring poultry for disease can complement pre-existing approaches to infectious disease detection and diagnosis. The combination of early warning systems and rapid diagnosis could enable immediate action to be taken, preventing subsequent spread of infection to other flocks, and thereby avoiding potential losses and risks for animals and humans that would probably have occurred with use of traditional methods [61]. An alternative approach to animal-movement pattern monitoring is automatic image acquisition and analysis.
Eyenamic™ software has been used to calculate birds’ activity levels by processing calibrated recorded video images. The differences in pixel intensity values with respect to those of the previous image enable calculation of an activity index. This system has been used to assess the relationship between automatic gait evaluation and gait scores obtained by human experts and to develop an automatic activity-index tool capable of detecting leg problems [31][38]. Another approach, optical flow analysis (OF), developed for applications such as traffic flows, movement of glaciers, or cell and sperm motility, has also been applied in the analysis of movement in confined broilers [13][31]. OF may provide a practical approach for the assessment of movement-associated welfare issues in commercial poultry through the automatic and continuous assessment of moving images containing hundreds of individuals [62]. Recent studies have indicated that OF technology can even be useful in detecting Campylobacter-infected flocks. Colles et al. have shown that flocks likely to become positive for Campylobacter can be identified in the first 7–10 days of life, and are characterized by a lower mean flow rate and consistently higher kurtosis than observed in non-infected flocks [63]. If positive results continue to be supported by research, these technologies may greatly influence poultry management, because they benefit animals, producers, and consumers by reducing economic losses and improving food safety. Furthermore, these methods are non-invasive and relatively easy to apply in large flocks. It is probably only a matter of time before OF and other technologies are commonly applied to commercial laying hens or other poultry species. Among imaging techniques, infrared imaging can determine the surface temperatures of objects and create image maps with colors representing different temperatures [64].
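The activity index described above, computed from pixel-intensity differences between consecutive frames, can be illustrated with a minimal sketch. It is not the algorithm used by the commercial software: the function names, the grey-level threshold, and the tiny hand-written frames are all assumptions made for illustration. The mean and kurtosis helpers only show the kind of flock-level summary statistics cited for optical flow; here they are applied to a frame-difference series, which is a much simpler stand-in for true optical flow.

```python
def activity_index(prev_frame, curr_frame, threshold=10):
    """Fraction of pixels whose grey level changed by more than `threshold`.

    Frames are lists of rows of grey levels (0-255). The threshold value is
    an illustrative assumption, not a published calibration.
    """
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for prev_px, curr_px in zip(prev_row, curr_row):
            total += 1
            if abs(curr_px - prev_px) > threshold:
                changed += 1
    return changed / total


def mean(values):
    return sum(values) / len(values)


def excess_kurtosis(values):
    """Population excess kurtosis (0 for a normal distribution)."""
    m = mean(values)
    variance = mean([(v - m) ** 2 for v in values])
    if variance == 0:
        return 0.0  # a constant series has no tails to measure
    fourth_moment = mean([(v - m) ** 4 for v in values])
    return fourth_moment / variance ** 2 - 3.0


# Three hypothetical 2x3 frames: two pixels change, then nothing moves.
frame0 = [[10, 10, 10], [10, 10, 10]]
frame1 = [[10, 50, 10], [10, 10, 90]]
frame2 = [[10, 50, 10], [10, 10, 90]]

frames = [frame0, frame1, frame2]
series = [activity_index(a, b) for a, b in zip(frames, frames[1:])]
print("activity series:", series)
print("mean activity:", mean(series))
print("excess kurtosis:", excess_kurtosis(series))
```

In practice such a series would be computed per camera over many thousands of frames, and the summary statistics compared between flocks or against the flock's own history.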
Heat stress is detrimental to poultry health, and body temperature is indicative of physiological abnormalities that can lead to elevated rates of mortality. Infrared thermal imaging can be used to detect chicken temperature after changes in diet, poultry house environments, and stress levels [1]. Near-infrared spectroscopy has been applied in the assessment of the barn thermal environment, and in compliance regarding comfort zones and insulation [65]. Other aspects of meat production have benefitted from this technology, such as the non-destructive detection and grading of wooden breast syndrome in chicken breast fillets [66]. The above examples describe only a few of the plethora of automatic data generation systems that are already available or are under development for the poultry industry. Further benefits are, and will be, associated with the common implementation of mobile apps dedicated to welfare, health, and productive performance assessment, because they provide easy and user-friendly access to substantial computational power and connectivity, and enable extremely effective geolocation. Therefore, they have clear applications in monitoring and reconstructing the movements of employees, trucks, and other fomites, as well as in evaluating whether established flows and biosecurity measures are being followed [9][67][68].
Although not exhaustive, the reported overview of data collection methods and sensors demonstrates the breadth of fields that can be investigated using different technologies, ranging from management efficacy improvement and assessment to animal welfare and health monitoring, biosecurity implementation, and early-stage disease detection in animals. Nevertheless, the volume and variability of the generated data can be overwhelming and hamper their interpretability. Therefore, proper data organization, analysis, and reporting are mandatory to produce an effective output and fully benefit from the obtained information (Figure 1).
Figure 1. Potential poultry-farm-generated data flow, from collection to output generation.

3. Data Management: Computational Approaches, Storage and Sharing

As new sensors and technologies become incorporated into poultry farming operations, larger amounts of data will be generated. Such development must be paired with adequate infrastructure for collecting, interpreting, and applying all this information. Local resources are typically insufficient for such purposes, and connectivity in a broader sense is critically important. The internet of things (IoT) is leading to massive changes in how humans live and work. The IoT infrastructure consists of several components, including hardware to collect data from the environment; connectivity to transmit data; software to store, analyze and process data; and an interface to allow users to interact with the IoT platform [69]. The implementation of IoT technologies in poultry production will consist of a variety of internet-connected smart devices that enable enhanced device communication, thereby leading to automation of operations, and allowing humans to focus on monitoring farms and acting on processes requiring higher levels of intelligence [70]. The main advantage that IoT provides for the poultry industry is the capability for communication between sensors and equipment that are used on the farm; storage of information in remote or cloud datasets; analysis of data with algorithms requiring intensive computational resources; and provision of an automatic response action or feedback to farmers [70]. The need for more complex data processing and analysis approaches is a key feature of big data. Basic and traditional statistical models, based primarily on variants and extensions of linear regression models, are typically unsuitable for large datasets including several parameters, and for modeling the large variability and complexity of biological phenomena and productive processes. The application of ML and deep learning (DL) algorithms is thus becoming increasingly common [11][71][72][73].
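The IoT chain outlined above (hardware that collects data, connectivity that transmits it, software that stores and analyzes it, and feedback to the farmer) can be sketched in a few lines. Everything named here is hypothetical: the `Reading` and `CloudStore` classes, the JSON message format, the house identifier, and the ammonia limit are assumptions chosen for illustration, not a description of any real platform or a recommended threshold.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Reading:
    """One measurement produced by a sensor in a poultry house."""
    house_id: str
    sensor: str
    value: float
    unit: str


def encode(reading):
    """Connectivity step: serialize a reading for transmission."""
    return json.dumps(asdict(reading))


class CloudStore:
    """Software step: store readings remotely and apply a simple rule."""

    def __init__(self, limits):
        self.limits = limits    # hypothetical per-sensor upper limits
        self.records = []       # stands in for a remote dataset

    def ingest(self, payload):
        reading = Reading(**json.loads(payload))
        self.records.append(reading)
        limit = self.limits.get(reading.sensor)
        if limit is not None and reading.value > limit:
            # Interface step: feedback that could be pushed to the farmer.
            return (f"ALERT {reading.house_id}: {reading.sensor} = "
                    f"{reading.value} {reading.unit} exceeds {limit}")
        return None


store = CloudStore({"ammonia": 25.0})  # illustrative limit only
for value in (12.0, 31.0):
    message = store.ingest(encode(Reading("house-1", "ammonia", value, "ppm")))
    if message:
        print(message)
```

A fixed limit is the simplest possible analysis step; the ML and DL methods discussed in this section would replace that rule while the surrounding chain stays the same.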
ML refers to computer systems and algorithms that can learn and adapt automatically from experience (i.e., from data) without being explicitly programmed. ML typically requires the input data to be pre-processed to make them more amenable to processing by these methods (so-called “feature engineering”). DL, in contrast, can be viewed as a further extension that completely automates this step. The use of a complex structure of algorithms such as artificial neural networks inspired by the human brain enables the processing of unstructured data. These advances have greatly simplified ML workflows, and sophisticated multistage pipelines have often been replaced by a single simple end-to-end DL model [74][75]. In recent years, these methods have found many applications in all sectors of society and have demonstrated excellent categorization and prediction capabilities. However, because of the complexity of the methods and the data that they address, the interpretability is limited or absent [74]. The methods behave in a manner similar to “black boxes”, to which inputs are provided, and from which outputs are received; therefore, the underlying causes, intermediate processing, mathematical models, and relevance of the different variables involved are obscure [73]. This aspect differentiates these approaches from traditional statistical ones, whose mathematical formulations are well-known and operator-defined, and are based primarily on causal association, either known or hypothesized. Consequently, considerable mistrust in ML and DL has arisen among non-experts in the field. A brief explanation of the key principles of ML and DL development and validation is thus warranted. In most instances, ML and DL are used to predict a quantitative or categorical outcome. For this purpose, the methods learn (are “trained”) from a dataset of records with known features and outcomes of interest.
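As a rough illustration of this workflow, the sketch below trains a very simple classifier on one part of a dataset and then scores it on records that were held out from training. The data are synthetic and entirely made up: the two features (an activity index and a vocalization rate), their distributions, the 70/30 split, and the nearest-centroid model are assumptions chosen to keep the example short, and do not reproduce any study cited here.

```python
import random


def make_synthetic_flock(n_per_class, rng):
    """Generate fake (features, label) records; the values are not real data."""
    centres = {"healthy": (0.60, 20.0), "sick": (0.30, 35.0)}
    records = []
    for label, (activity_mu, vocal_mu) in centres.items():
        for _ in range(n_per_class):
            features = (rng.gauss(activity_mu, 0.08), rng.gauss(vocal_mu, 3.0))
            records.append((features, label))
    return records


def train_centroids(records):
    """Training step: learn one mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in records:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(centre):
        return sum((x - c) ** 2 for x, c in zip(features, centre))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, records):
    hits = sum(predict(centroids, f) == label for f, label in records)
    return hits / len(records)


rng = random.Random(42)                  # fixed seed for repeatability
data = make_synthetic_flock(100, rng)
rng.shuffle(data)
split = len(data) * 7 // 10              # 70% training, 30% held out
train, test = data[:split], data[split:]

model = train_centroids(train)           # parameters fitted on training data only
train_acc = accuracy(model, train)
test_acc = accuracy(model, test)         # records never seen during training
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

A large gap between the two accuracies would suggest that the model is too specific to its training data; the held-out score is the one that indicates how the model might behave on future records.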
During the training, the method parameters are automatically optimized to maximize predictive performance (i.e., minimize errors). Nevertheless, the effectiveness of the developed tool in predicting future data is not ensured. That is, the tool could be too specific for the training dataset, and the prediction could be inaccurate for external data. Therefore, an additional check must be performed on a test dataset, i.e., a dataset with the same features as the training dataset (and comparable with the datasets that will be provided thereafter, during routine application of the method) and with known outcomes, whose records were not used in the training step. In this way, the performance and generalizability of the ML or DL approach can be evaluated objectively and empirically, thus supporting its applicability to future data. Therefore, although the process might appear obscure, its reliability can be considered to be even higher than that of traditional methods, being validated on the basis of empirical demonstrations rather than mathematical assumptions. The outcome of this process is an automatic response or an easily understandable and effective warning/reporting system for farmers or other workers. Typically, farmers address diseases in their animals by taking no action, proactively consulting veterinarians, using a mix of antibiotics, or, in many cases, following a combination of these three approaches [10]. Modern technologies such as sensors, big data, AI, and ML present new possibilities for farmers. Instead of reacting to diseases after they become evident, farmers can continuously monitor key animal health parameters, such as movement, air quality, and consumption of food and fluids. By collecting these data and using advanced AI and ML algorithms to predict deviations or abnormalities, farmers can now identify and predict disease, and intervene before large-scale outbreaks occur.
That is, sensors, instead of humans, can perform continuous monitoring of animal health [61][76]. The first advantage of this system is that it enables fewer farmers to care for many more animals, thereby decreasing production costs [10]. Second, this system can notify farmers about the possibility of a disease, even during pre-clinical stages, thus helping farmers take timely action to prevent catastrophic losses [10][61][77].
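The idea of continuously monitoring a health parameter and flagging deviations can be sketched with a rolling baseline. This is one simple possibility, not the method of any cited system: the function name, the five-reading window, the three-standard-deviation limit, and the water-intake figures are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_deviations(readings, window=5, z_limit=3.0):
    """Return (index, value) pairs that lie far from the rolling baseline.

    A reading is flagged when it differs from the mean of the previous
    `window` accepted readings by more than `z_limit` standard deviations.
    Both parameters are illustrative assumptions.
    """
    baseline = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(baseline) == window:
            mu = mean(baseline)
            sigma = pstdev(baseline)
            if sigma > 0 and abs(value - mu) > z_limit * sigma:
                flagged.append((index, value))
                continue  # keep flagged readings out of the baseline
        baseline.append(value)
    return flagged


# Hypothetical daily water intake per bird (litres); day 6 drops sharply.
water = [0.20, 0.21, 0.19, 0.20, 0.21, 0.20, 0.12, 0.20]
for day, litres in detect_deviations(water):
    print(f"day {day}: intake {litres} L deviates from the recent baseline")
```

A flagged day would prompt a check by the farmer or veterinarian rather than a diagnosis; a trained ML model of the kind described in this section could take the place of the fixed rule.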