Improving Visual Defect Detection

Please note this is a comparison between Version 1 by Sasha Behrouzi and Version 3 by Dean Liu.

Reliable anomaly detection in thermal image datasets is crucial for detecting defects in industrial products. Achieving it is challenging, however, especially when the datasets are image sequences captured during equipment runtime with a smooth transition from healthy to defective images, because this transition contaminates the healthy training data with defective samples. Autoencoder-based anomaly detection methods are sensitive to even slight contamination of the training dataset, which makes it difficult to determine a threshold for sample classification. This paper shows that combining anomaly scores leads to better threshold determination that effectively separates healthy and defective data. Our research results show that our approach helps to overcome these challenges. The autoencoder models in our research are trained on healthy images using two loss functions: mean squared error (MSE) and structural similarity index measure (SSIM). The anomaly score outputs are used for classification. Three anomaly scores are applied: MSE, SSIM, and kernel density estimation (KDE). The proposed method is trained and tested on 32 × 32-sized thermal images, including one contaminated dataset. The model achieved the following average accuracies across the datasets: MSE, 95.33%; SSIM, 88.37%; and KDE, 92.81%. Combining anomaly scores can help to remedy low classification accuracy, and the use of KDE improves performance when the healthy training data are contaminated. The MSE+ and SSIM+ methods, as well as two parameters to control quantitative anomaly localization using SSIM, are introduced.

anomaly detection
deep learning
novelty detection
autoencoder
industrial image

1. Introduction

Anomaly detection is a machine learning problem in which datasets are heavily biased in favor of the normal class because the abnormal class is too small. Moreover, the need to detect previously unseen anomalies makes utilizing supervised methods impossible. These challenges lead researchers to lean towards methods that infer latent spaces. Autoencoders lie at the heart of these methods. They compress the input image into a latent code and subsequently reconstruct it. Anomaly scores can then be calculated from either the latent vector or the reconstruction loss. Using industrial thermal images, our research comprises training autoencoders with two different loss functions, mean squared error (MSE) and structural similarity index measure (SSIM), and finding anomalies in the test phase by applying three anomaly scores: MSE, SSIM, and kernel density estimation (KDE). This paper uses the term anomaly score, denoted by A(x). A larger anomaly score A(x) suggests that the test image may contain anomalies. We group all the approaches above, present our ideas, and report the survey results using classification measures. While autoencoders and alternative metrics have been extensively studied, this paper introduces a distinctive approach by utilizing KDE in conjunction with MSE or SSIM to effectively address the challenge of contaminated training data. In addition, we provide quantitative visualization of localized anomalies. Our goal is to assist researchers in comprehending the basic concepts behind autoencoder-based visual anomaly detection, to design autoencoder models with high classification performance, and to localize defect areas on an image to find the actual defective part of the product.
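To make the notion of a reconstruction-based anomaly score A(x) concrete, the following is a minimal sketch rather than the models used in this work. It assumes images are flattened lists of pixel values in [0, 1], and it replaces the usual sliding-window SSIM with a single global window; the function names and the toy images are hypothetical. Plain Python is used so that the sketch has no dependencies.

```python
def mse_score(x, x_hat):
    """A(x) as the mean squared error between an image and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def global_ssim(x, x_hat, data_range=1.0):
    """SSIM over the whole image as one window (a simplification of windowed SSIM)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(x_hat) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in x_hat) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, x_hat)) / n
    # Standard SSIM stabilizing constants.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_score(x, x_hat, data_range=1.0):
    """A(x) as 1 - SSIM, so that a larger value means a less similar reconstruction."""
    return 1.0 - global_ssim(x, x_hat, data_range)


# Toy 32 x 32 images: a uniform "healthy" frame and the same frame with a hotspot.
healthy = [0.2] * (32 * 32)
defective = list(healthy)
for i in range(16):
    defective[i] = 0.9  # hypothetical hotspot pixels

# Assumption: an autoencoder trained on healthy data reproduces the healthy frame
# and fails to reproduce the hotspot, so both reconstructions look healthy.
reconstruction = list(healthy)

healthy_mse = mse_score(healthy, reconstruction)
defective_mse = mse_score(defective, reconstruction)
healthy_ssim = ssim_score(healthy, reconstruction)
defective_ssim = ssim_score(defective, reconstruction)
```

In this toy setting both scores are zero for the healthy frame and strictly positive for the frame with the hotspot, which is the behavior the thresholding step relies on.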

2. Methodology

This section first introduces the characteristics of the datasets and the data pipeline and then discusses the research method in detail. The approach consists of two phases. First, the proposed autoencoder models are developed and trained in the training phase. Then, in the test phase, the anomaly score outputs are used for classification performance analyses.

2.1. Dataset and Pipeline

To evaluate our anomaly detection approaches, we used 32 × 32-sized thermal images of switchgear equipment. The images were captured using four infrared cameras monitoring low-voltage switchgear busbar cable connections. The cameras recorded a healthy baseline for a period of time; then, several faults were introduced to the switchgear. The faults were different loose cable connections that create noticeable, intense hotspots on the thermal images.

The datasets are image sequences captured during equipment runtime. The cameras were pointed in the direction of the switchgear busbar and captured thermal images over approximately 4 h of test time. After three hours of recording baseline healthy images, defects were introduced. Consequently, the temperature changes in the healthy data emerged on the images within a 20–30 min time window. Four scenarios were defined according to the type of loose connection, resulting in four different defective datasets: FaultL1K1, FaultL1K1K5, FaultL2K3, and FaultL2K6-7. In this context, “L” corresponds to the phase number, and “K” denotes the contact number on the busbar. For example, FaultL1K1 is a loose connection at the first contact of phase 1 on the busbar. After labeling the healthy and defective images, only a small amount of defective labeled data was available, which led us to the classic problem of imbalanced data in anomaly detection.
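As an illustration of the naming convention, the short sketch below splits a dataset name into its phase number and contact labels. The helper `parse_fault_name` is hypothetical and not part of the original pipeline. A label such as "6-7" is returned unchanged because the text does not state how it should be interpreted.

```python
import re


def parse_fault_name(name):
    """Split a name such as 'FaultL1K1K5' into (phase, [contact labels])."""
    match = re.fullmatch(r"FaultL(\d+)((?:K\d+(?:-\d+)?)+)", name)
    if match is None:
        raise ValueError(f"unrecognized fault dataset name: {name!r}")
    phase = int(match.group(1))
    # Contact labels are kept as strings so that '6-7' is not interpreted.
    contacts = re.findall(r"K(\d+(?:-\d+)?)", match.group(2))
    return phase, contacts


parsed = {
    name: parse_fault_name(name)
    for name in ["FaultL1K1", "FaultL1K1K5", "FaultL2K3", "FaultL2K6-7"]
}
```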

The datasets are used in our experiments according to the following pipeline:

In the training phase, a model was trained with $X_{\text{train}}$ with its respective loss functions.
In the test phase, the anomaly score distributions for $X_{\text{train}}$ and $X_{\text{defect-test}}$ were visualized, and the threshold for classification was determined as $\text{Threshold} = \max(A(X_{\text{train}}))$.
$X_{\text{healthy-class}}$ and $X_{\text{defect-class}}$ were combined and used for supervised classification using the data labels $y_{\text{healthy-class}}$ and $y_{\text{defect-class}}$.
The classification performance measures with $\text{Threshold} = \max(A(X_{\text{train}}))$ were calculated.
The false-positive rate and the true-positive rate for 30 thresholds were determined; a receiver operating characteristic (ROC) curve was drawn; and accordingly, the area under the curve (AUC) was calculated.
Result tables were prepared with the performance measures, ROC, and the AUC results.
A visualization of the healthy and defective samples was created, along with residual maps for quantitative analysis.
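The thresholding and ROC steps of this pipeline can be sketched as follows. This is an illustration under stated assumptions, not the original implementation. The anomaly scores are made-up numbers, the 30 thresholds are assumed to be evenly spaced between the smallest and largest test score, and the AUC is approximated with the trapezoidal rule. All function names are hypothetical.

```python
def classify(scores, threshold):
    """Label a sample defective (1) when its anomaly score exceeds the threshold."""
    return [1 if s > threshold else 0 for s in scores]


def confusion_rates(y_true, y_pred):
    """Return (accuracy, false-positive rate, true-positive rate)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, fpr, tpr


def roc_auc(y_true, scores, n_thresholds=30):
    """ROC points for evenly spaced thresholds (an assumption) and trapezoidal AUC."""
    lo, hi = min(scores), max(scores)
    step = (hi - lo) / (n_thresholds - 1)
    points = []
    for i in range(n_thresholds):
        _, fpr, tpr = confusion_rates(y_true, classify(scores, lo + i * step))
        points.append((fpr, tpr))
    # Add the trivial end points so the curve spans the full FPR range.
    points = sorted(points + [(0.0, 0.0), (1.0, 1.0)])
    auc = sum((x1 - x0) * (y0 + y1) / 2
              for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return points, auc


# Made-up anomaly scores standing in for A(X_train) and the labeled test samples.
train_scores = [0.010, 0.012, 0.015, 0.011]
threshold = max(train_scores)  # Threshold = max(A(X_train))

healthy_class_scores = [0.009, 0.013, 0.014]
defect_class_scores = [0.030, 0.045, 0.020]
test_scores = healthy_class_scores + defect_class_scores
y_true = [0] * len(healthy_class_scores) + [1] * len(defect_class_scores)

y_pred = classify(test_scores, threshold)
accuracy, fpr, tpr = confusion_rates(y_true, y_pred)
roc_points, auc = roc_auc(y_true, test_scores)
```

Because the threshold is the maximum training score, a single contaminated training sample with a large score raises the threshold for every test sample, which illustrates why contamination makes threshold determination difficult.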

3. Results

Table 1 presents the classification performance results based on the threshold T determined from the anomaly scores generated using the autoencoder. The classification measures indicate high performance on all the datasets except the contaminated dataset (camera 94693). This high accuracy is due to the nature of thermal images and may be lower depending on the texture of the input images. Nevertheless, the SSIM anomaly score outperforms the MSE anomaly score in all cases.

To improve performance, we combined the MSE and SSIM thresholds with KDE thresholds, yielding the MSE+ and SSIM+ methods. For a comparative analysis, we took the results of MSE and SSIM as the baseline approaches and evaluated the performance gain achieved using the MSE+ and SSIM+ thresholds. Figure 1 displays the improvement in accuracy.
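The text does not spell out how the thresholds are combined, so the following is only one plausible reading. It assumes that the KDE score is the negative log-density of a latent vector under a Gaussian KDE fitted on the training latents, and that a sample is flagged as defective when either its MSE score or its KDE score exceeds its own threshold (a logical OR). The bandwidth, the function names, and all numbers are hypothetical.

```python
import math


def kde_score(z, train_latents, bandwidth=0.5):
    """Negative log-density of z under an isotropic Gaussian KDE (assumed form)."""
    dim = len(z)
    norm = len(train_latents) * (bandwidth * math.sqrt(2 * math.pi)) ** dim
    kernel_sum = sum(
        math.exp(-sum((a - b) ** 2 for a, b in zip(z, zi)) / (2 * bandwidth ** 2))
        for zi in train_latents
    )
    # A tiny constant avoids log(0) for points far from every training latent.
    return -math.log(kernel_sum / norm + 1e-300)


def combined_flags(primary_scores, kde_scores, primary_threshold, kde_threshold):
    """Assumed combination rule: defective if either score exceeds its threshold."""
    return [
        1 if (p > primary_threshold or k > kde_threshold) else 0
        for p, k in zip(primary_scores, kde_scores)
    ]


# Hypothetical 2-D latent vectors of healthy training images.
train_latents = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.0), (0.0, -0.1)]
kde_threshold = max(kde_score(z, train_latents) for z in train_latents)

# Two test samples: one healthy, one defective whose MSE score stays below
# the MSE threshold but whose latent vector lies far from the healthy cluster.
test_latents = [(0.05, 0.05), (3.0, 3.0)]
test_mse = [0.012, 0.014]
mse_threshold = 0.015

test_kde = [kde_score(z, train_latents) for z in test_latents]
mse_only = [1 if s > mse_threshold else 0 for s in test_mse]
mse_plus = combined_flags(test_mse, test_kde, mse_threshold, kde_threshold)
```

In this toy case the MSE threshold alone misses the defective sample, while the combined rule flags it without raising a false alarm on the healthy one.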

Figure 1. (a) Impact of combining KDE and MSE anomaly scores. (b) Impact of combining KDE and SSIM anomaly scores.