2. Applications and Hardware
Thermal cameras are increasingly popular in consumer-oriented applications. One of the markets driving commercial thermal imaging is the automotive market, where thermal imaging is used for pedestrian and car detection for increased safety and collision avoidance
[11][12][13]. However, there are other applications such as object inspection in construction
[14][15][16], medicine
[17][18], veterinary science
[19], access control
[20], agriculture
[21], astronomy
[22][23], gas leakage detection
[24][25], inspection of solar panels
[26], autonomous drone navigation
[27], helping visually impaired subjects
[28], and so on. A more in-depth presentation of thermal imaging applications can be found in
[2][29]; the application range is evidently broad and diverse.
In
[12], a system for avoiding animal–vehicle collisions was developed. The system used a histogram of oriented gradients (HOG) algorithm for image enhancement and a convolutional neural network (CNN) for wild animal detection. It was tested with wild deer images captured with a FLIR One Pro thermal imaging device and achieved a detection accuracy of 91%
[30]. Besides HOG, another histogram-based method used in thermal imaging applications is CENTRIST (CENsus TRansform hISTogram), which in
[31] demonstrated better performance in human detection in terms of accuracy and computational cost. Thermal images can also be used to improve autonomous vehicle safety in low-visibility conditions
[32]. Here, the authors fused thermal images (acquired with five FLIR Automotive Development Kit cameras at 20 fps) with radar data in near real time. In
[11], the authors fused data from two spectra (visible and infrared light) to increase image readability. This was achieved on a field programmable gate array (FPGA) device with an image fusion competition approach based on image quality parameters. However, no results were presented to demonstrate the effectiveness of the approach. A similar approach, based on fusing visible and thermal image data, was used in
[33], where the authors used an encoder-decoder neural network to fuse the two information channels. It is worth noting that large producers of thermal cameras, such as FLIR, encourage the fusion of visible and thermal imagery by providing large, well-organized test databases free of charge (https://www.flir.eu/oem/adas/adas-dataset-form/, accessed on 26 April 2023).
Thermal imaging is also combined with neural networks in other works, as in
[34], where the combination was applied to pedestrian detection. The authors used a custom thermal image database (recorded with a FLIR ThermaCAM P10 thermal camera) and YOLO-based neural networks to detect pedestrians near the road’s edge. The results showed that the YOLO network needed additional training to increase the average precision (AP) up to 30%. By comparison, an identical network had a 90% AP on visible-range images. Besides detecting objects of interest directly in thermal images, neural networks can be used to enhance the images themselves. This approach was used in a two-step manner in
[30]. First, the thermal image was translated to a grayscale image using Texture-Net, which was based on a generative adversarial architecture. Then, the image was colorized using a deep convolutional neural network. In
[35], several U-Net-based neural network architectures were tested for infrared image colorization. The authors reported good results and noted that the approach achieved 3 fps without special optimization. Some works, such as
[36], target medical applications and focus on algorithms for image segmentation and region-of-interest selection, since these steps can significantly influence applications such as medical thermography
[37][38]. A good overview of state-of-the-art methods emphasizing available thermal image pedestrian databases can be found in
[39]. A more technical overview of thermal image sensors can be found in
[40].
It should be noted that one of the driving forces behind thermal imaging applications is the availability of thermal cameras, which can even be used as smartphone add-ons (such as the FLIR ONE,
https://www.flir.com/flirone/) with good practical results
[41]. Interestingly, a smartphone-based approach can be used for advanced applications with appropriate signal processing, as was the case in
[18]. Here, a thermal image of the patient’s back was used to determine whether the patient had COVID-19. The results showed a sensitivity of 92% and an area under the curve of 0.85. While these and other commercial thermal cameras are becoming increasingly available and powerful, they are still too expensive for many targeted applications and are sold as black boxes. Thus, some works aim to develop affordable and open thermal imaging devices
[42][43].
3. Image Enhancement
Once the thermal imaging process (i.e., data acquisition) has been completed, data processing algorithms come into focus since they can significantly influence overall system performance and overcome some of the limitations present in image acquisition systems (usually poor contrast, low resolution, lack of color information, etc.). These algorithms also contribute to the increased performance of subsequent image analysis algorithms (e.g., object detection). When working with thermal images, two main objectives should be considered: dynamic range reduction and detail enhancement
[44], since a raw image with a high bit depth per pixel needs to be mapped onto 8-bit gray values (before false colorization). The two most widely used approaches are based on automatic gain control (AGC) and histogram equalization (HE). HE is the most notable representative of global contrast enhancement (GCE) methods, in which a single global mapping function is used; local contrast enhancement (LCE) methods, by contrast, use multiple local maps
[45]. The main disadvantage of HE is that gray levels with a high probability density take up most of the output intensity range and are thus significantly enhanced, while parts of the image with a low probability density (the foreground, which is usually of interest) are suppressed. Thus, the approach over-enhances the homogeneous regions of the image and/or amplifies noise in the background. To avoid this issue, a number of improvements can be made to the original algorithm
[46], with plateau histogram equalization (PHE) being a good example
[47]. In PHE, the probability density function is limited by a threshold. The idea can be taken even further with a double plateau (i.e., two thresholds)
[48]: an upper threshold to eliminate over-enhancement of background noise and a lower threshold to preserve details in the image. A short overview of HE algorithms can be found in
[49]. Approaches also exist that try to leverage the advantages of both the local and the global-based approaches by fusing them
[50].
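The HE, PHE, and double-plateau ideas described above can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited works: the function name `plateau_equalize`, the 14-bit default for `n_levels`, and the threshold values in the usage lines are assumptions made for illustration. With both thresholds left as `None`, the function reduces to plain global HE.

```python
import numpy as np

def plateau_equalize(raw, upper=None, lower=None, n_levels=2**14):
    """Map a raw integer thermal frame to 8 bits with (plateau) HE.

    upper=None, lower=None -> plain global HE
    upper=T                -> PHE: histogram bins are clipped at T
    upper=T1, lower=T2     -> double plateau: non-empty bins below T2
                              are additionally raised to T2
    All names and defaults are illustrative assumptions.
    """
    raw = np.asarray(raw)  # must hold non-negative integers
    hist = np.bincount(raw.ravel(), minlength=n_levels).astype(np.float64)
    if upper is not None:
        # Limit dominant (usually background) gray levels.
        hist = np.minimum(hist, upper)
    if lower is not None:
        # Protect rare but occupied gray levels (fine details).
        hist = np.where((hist > 0) & (hist < lower), lower, hist)
    cdf = np.cumsum(hist)
    lut = np.round(255.0 * cdf / cdf[-1]).astype(np.uint8)
    return lut[raw]  # one global look-up table for the whole frame

# Synthetic frame: a dominant, narrow background and a small warm object.
rng = np.random.default_rng(0)
frame = rng.integers(3000, 3010, size=(64, 64))
frame[28:36, 28:36] = np.arange(3400, 3464).reshape(8, 8)

he = plateau_equalize(frame)            # plain HE
phe = plateau_equalize(frame, upper=8)  # single plateau
obj = (slice(28, 36), slice(28, 36))
print("object gray-level span, HE :", int(he[obj].max()) - int(he[obj].min()))
print("object gray-level span, PHE:", int(phe[obj].max()) - int(phe[obj].min()))
```

In this synthetic case the background occupies almost all pixels, so plain HE assigns it most of the output range, while the plateau leaves more gray levels for the small object.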
LCE histogram-based methods are widely used contrast enhancement techniques since they are rather simple and intuitive in both theory and implementation
[51]. One of the most widely used algorithms in thermal imaging applications is contrast limited adaptive histogram equalization (CLAHE)
[52] alongside its variations (such as
[53][54]) and fusions with other algorithms (as in
[55]). It is based on adaptive histogram equalization (AHE) but avoids one of its pitfalls, namely the over-amplification of noise in homogeneous image regions. It belongs to the LCE group of enhancement methods since it divides the input image into multiple non-overlapping blocks and applies the equalization to each of them. It uses interpolation to smooth out inconsistencies at the borders between the blocks. The algorithm has two tunable parameters: the clip limit (which controls noise amplification) and the number of tiles (which controls the number of non-overlapping areas in the image). Lately, there have been attempts to improve CLAHE performance by estimating its parameters using supervised machine learning algorithms
[51]. Other approaches to parameter optimization are also used, such as multi-objective meta-heuristics, which include a structural similarity index (SSIM)
[56] and an entropy-based approach
[57]. One way of tackling the parameter selection issue is to use predefined histogram models, e.g., for an ideal far-infrared image that consists only of emissions and would therefore be piecewise constant
[58]. Alternatively, non-parametric methods, such as the one in
[59], can completely avoid the parameter selection procedure.
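The CLAHE steps described above (tiling, clipping the per-tile histogram, and interpolating between tiles) can be sketched in NumPy as follows. This is a simplified illustration, not the implementation from any cited work: the function name `clahe_sketch` is hypothetical, the clip limit is assumed to be expressed as a multiple of the mean bin count, the clipped excess is redistributed uniformly in a single pass, and the image size is assumed to be divisible by the tile grid.

```python
import numpy as np

def clahe_sketch(img, tiles=(4, 4), clip_limit=2.0, n_bins=256):
    """Simplified CLAHE for an 8-bit image (illustrative assumptions only)."""
    img = np.asarray(img, dtype=np.uint8)
    h, w = img.shape
    ty, tx = tiles
    th, tw = h // ty, w // tx
    assert th * ty == h and tw * tx == w, "image must divide evenly into tiles"

    # 1) One look-up table per tile, built from a clipped histogram.
    luts = np.empty((ty, tx, n_bins), dtype=np.float64)
    for i in range(ty):
        for j in range(tx):
            tile = img[i * th:(i + 1) * th, j * tw:(j + 1) * tw]
            hist = np.bincount(tile.ravel(), minlength=n_bins).astype(np.float64)
            limit = max(clip_limit * tile.size / n_bins, 1.0)
            excess = np.sum(np.maximum(hist - limit, 0.0))
            # Clip, then spread the clipped mass evenly over all bins.
            hist = np.minimum(hist, limit) + excess / n_bins
            cdf = np.cumsum(hist)
            luts[i, j] = 255.0 * cdf / cdf[-1]

    # 2) Bilinear interpolation between the LUTs of neighbouring tile centres.
    yc = (np.arange(h) + 0.5) / th - 0.5  # row position in tile-centre units
    xc = (np.arange(w) + 0.5) / tw - 0.5
    y0 = np.clip(np.floor(yc).astype(int), 0, ty - 1)
    x0 = np.clip(np.floor(xc).astype(int), 0, tx - 1)
    y1 = np.clip(y0 + 1, 0, ty - 1)
    x1 = np.clip(x0 + 1, 0, tx - 1)
    wy = np.clip(yc - y0, 0.0, 1.0)[:, None]
    wx = np.clip(xc - x0, 0.0, 1.0)[None, :]
    Y0, Y1 = y0[:, None], y1[:, None]
    X0, X1 = x0[None, :], x1[None, :]
    out = ((1 - wy) * (1 - wx) * luts[Y0, X0, img]
           + (1 - wy) * wx * luts[Y0, X1, img]
           + wy * (1 - wx) * luts[Y1, X0, img]
           + wy * wx * luts[Y1, X1, img])
    return np.round(out).astype(np.uint8)

# Low-contrast test image: all gray values lie between 100 and 120.
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 121, size=(64, 64)).astype(np.uint8)
enhanced = clahe_sketch(low_contrast, tiles=(4, 4), clip_limit=2.0)
print("input range :", low_contrast.min(), low_contrast.max())
print("output range:", enhanced.min(), enhanced.max())
```

Raising `clip_limit` moves the result toward plain AHE (stronger contrast, more noise amplification), while increasing the number of tiles makes the mapping more local.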
There are also approaches that separate raw images into two channels: a base channel and a detail channel
[44][55][60]. The base channel is usually processed for noise removal and/or compression, while the detail channel is used for gain control during subsequent channel fusion. Fusion can also be implemented on a per-image basis, as in
[61], where a CLAHE-processed image was used for the fusion of visible and infrared images using sparse representation. Another way of improving the CLAHE algorithm is to provide image- or situation-specific information to the approach
[62]. This motivated our approach, in which additional information was also used for image enhancement (the temperature of the surroundings alongside a predefined histogram model). This approach, however, might fall short in long-range surveillance applications
[58] due to possible large changes in the surrounding temperature. The CLAHE algorithm is also used in other image processing applications, not only for thermal images. It is also well suited for real-time implementation on FPGA devices
[63]. It is worth noting that some authors (as in
[64]) state that, in thermal images, GCE-based methods are better since they preserve thermal distribution information, which is important for some temperature-sensitive applications.
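The base/detail channel separation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited works: a plain box (mean) filter is swapped in as the low-pass filter, whereas methods in the literature often use edge-preserving filters such as the bilateral or guided filter, and the names `box_blur` and `base_detail_enhance` as well as the parameters `radius`, `detail_gain`, and `base_range` are hypothetical.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with a (2*radius+1)^2 window and edge replication."""
    k = 2 * radius + 1
    padded = np.pad(np.asarray(img, dtype=np.float64), radius, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.pad(padded.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, w = img.shape
    window_sum = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
                  - ii[k:k + h, :w] + ii[:h, :w])
    return window_sum / (k * k)

def base_detail_enhance(raw, radius=4, detail_gain=3.0, base_range=200.0):
    """Compress the base channel, amplify the detail channel, and fuse them."""
    raw = np.asarray(raw, dtype=np.float64)
    base = box_blur(raw, radius)   # large-scale temperature distribution
    detail = raw - base            # small-scale structure (and noise)
    span = base.max() - base.min()
    if span > 0:
        base_c = (base - base.min()) / span * base_range
    else:
        base_c = np.zeros_like(base)
    # Centre the compressed base so that amplified detail has headroom.
    out = base_c + (255.0 - base_range) / 2.0 + detail_gain * detail
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# Synthetic raw frame: a wide temperature ramp plus a faint +/-5 count texture.
yy, xx = np.mgrid[0:64, 0:64]
raw = np.linspace(1000.0, 9000.0, 64)[None, :] + 5.0 * (((yy + xx) % 2) * 2 - 1)
enhanced = base_detail_enhance(raw)
linear = np.round((raw - raw.min()) / (raw.max() - raw.min()) * 255).astype(np.uint8)
print("mean vertical contrast, linear mapping:",
      np.abs(np.diff(linear.astype(int), axis=0)).mean())
print("mean vertical contrast, base/detail   :",
      np.abs(np.diff(enhanced.astype(int), axis=0)).mean())
```

Because the box filter is not edge-preserving, this sketch produces halos near strong edges and image borders, which is one reason more elaborate filters tend to be preferred in practice.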