1. Breast
Breast cancer is one of the principal causes of cancer death in women, and its incidence is increasing. The probability that a woman will die from a breast tumor is about 1 in 39. Only 10% of cases are detected at the initial stages. Breast cancer can begin in different parts of the breast, which is made up of lobules, ducts, and stromal tissues. Most breast cancers begin in the cells that line the ducts, some begin in the cells that line the lobules, and a small number begin in the other tissues
[1]. Breast cancer manifests itself mainly through a breast nodule or thickening that feels different from the surrounding tissue, lymph node enlargement, nipple discharge, a retracted nipple, or persistent tenderness of the breast.
A successful diagnosis in the early stages of breast cancer makes better treatment possible and increases the probability of survival
[2]. Furthermore, the cost of breast cancer treatment is high. For such reasons, in recent years, several breast diagnostic approaches have been investigated, such as mammography, magnetic resonance imaging, computerized tomography, biopsy, and ultrasound imaging. The latter, in the last few years, has started to become an integral part of the characterization of breast masses because of the advantages previously described. In addition, compared to mammography, ultrasound is the most accessible imaging modality, is age-independent
[3] and allows the assessment of breast density, which often represents a predictor in breast cancer risk evaluation and prevention. The breast density percentage is defined as the ratio between the area of the fibroglandular tissue and the total area of the breast. Breast ultrasound is also used to distinguish benign from malignant lesions.
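The density percentage defined above is a straightforward area ratio once the breast and fibroglandular regions have been segmented. A minimal sketch, assuming two hypothetical binary masks (how the masks are obtained is left out):

```python
import numpy as np

def breast_density_percentage(fibroglandular_mask, breast_mask):
    """Percent density = area of fibroglandular tissue / total breast area * 100.

    Both arguments are boolean arrays of the same shape, e.g. produced by a
    segmentation step (hypothetical masks here, not a specific method).
    """
    fg_area = np.count_nonzero(fibroglandular_mask & breast_mask)
    breast_area = np.count_nonzero(breast_mask)
    if breast_area == 0:
        raise ValueError("empty breast mask")
    return 100.0 * fg_area / breast_area

# toy example: a 4x4 breast region, half of it dense tissue
breast = np.ones((4, 4), dtype=bool)
dense = np.zeros((4, 4), dtype=bool)
dense[:2, :] = True
print(breast_density_percentage(dense, breast))  # 50.0
```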
Detection is fundamental in ultrasound analysis because it provides support for segmentation and/or classification between malignant and benign tumors. In a recent study, Gao et al.
[5] proposed a method for the recognition of breast ultrasound nodules with little labeled data. Nodule detection was achieved by employing a faster region-based CNN. Benign and malignant nodules were classified through a semi-supervised classifier, based on the mean teacher model, trained on a small amount of labeled data. The results demonstrated that semi-supervised learning (SSL) achieved performance comparable to supervised learning (SL) trained on a much larger labeled dataset.
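The mean teacher model cited above keeps a "teacher" copy of the network whose weights are an exponential moving average (EMA) of the "student's" weights. A minimal numpy sketch of that update (the values are illustrative, not taken from the cited study):

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Mean-teacher update: teacher weights become an exponential moving
    average of the student weights after each training step.
    `teacher` and `student` are lists of numpy weight arrays."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]

teacher = [np.zeros(3)]
student = [np.ones(3)]
teacher = ema_update(teacher, student, alpha=0.9)
print(teacher[0])  # [0.1 0.1 0.1]
```

In the full semi-supervised scheme the student is additionally trained to agree with the teacher's predictions on unlabeled images, which is how the small labeled set is leveraged.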
Segmentation
[11] has an important role in the clinical diagnosis of breast cancer due to its capability to discriminate different functional tissues, providing valuable references for image interpretation, tumor localization, and breast cancer diagnosis. A segmentation approach that combines fuzzy logic and deep learning was suggested by Badawy et al.
[12] for automatic semantic segmentation of tumors in breast ultrasound images. The proposed scheme consists of two steps: preprocessing based on a fuzzy intensification operator, followed by CNN-based semantic segmentation, for which eight well-known models were experimented with. It is applied in two modes: batch and one-by-one image processing. The results demonstrated that fuzzy preprocessing was able to enhance the automatic semantic segmentation for each evaluated metric, but only in the case of batch processing. Another automatic semantic segmentation approach was proposed by Huang et al.
[14]. In this approach, BUS images are first preprocessed using wavelet features; then, the augmented images are segmented through a fuzzy fully convolutional network; finally, an accurate fine-tuning post-processing step based on breast anatomy constraints is performed through conditional random fields (CRFs). The experimental results showed that the fuzzy FCN provided better performance than the non-fuzzy FCN, both in terms of robustness and accuracy; moreover, it outperformed all the other methods used for comparison and remained strong when small datasets were used. Ilesanmi et al.
[15] used contrast-limited adaptive histogram equalization (CLAHE) to improve image quality. Semantic segmentation was performed through a variant of UNET, named VEU-NET, based on a variant enhanced (VE) block, which encoded the preprocessed image, and concatenated convolutions that produced the segmentation mask. The results indicated that the VEU-Net produced better segmentation than the other classic CNN methods that were tested for comparison. An approach based on the integration of deep learning with visual saliency for breast tumor segmentation was proposed by Vakanski et al.
[18]. Attention blocks were introduced into a U-Net architecture and feature representations which prioritized spatial regions with high saliency levels were learned. The results demonstrated that the accuracy of tumor segmentation was better than for models without salient attention layers. An important merit of this investigation was the use of US images collected from different systems, which demonstrated the robustness of the technique.
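Segmentation accuracy in the studies above is conventionally scored by overlap between the predicted and ground-truth tumor masks; the Dice coefficient is the standard metric. A minimal sketch:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|): the standard overlap score used to
    compare a predicted segmentation mask against the ground truth.
    `eps` avoids division by zero when both masks are empty."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.count_nonzero(pred & target)
    return (2.0 * inter + eps) / (
        np.count_nonzero(pred) + np.count_nonzero(target) + eps)

pred = np.array([1, 1, 1, 0])
target = np.array([1, 1, 0, 0])
print(dice_coefficient(pred, target))  # ≈ 0.8
```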
Image classification is very important in medical diagnostics because it enables benign lesions or tumors to be distinguished from malignant ones, and a particular type of tissue from others. Shia et al.
[23] presented a method based on a transfer learning algorithm to recognize and classify benign and malignant breast tumors from B-mode images. Specifically, feature extraction was performed by employing a deep residual network model (ResNet-101). The features extracted were classified through the linear SVM with a sequential minimal optimization solver. The experimental results highlighted that the proposed method was able to improve the quality and efficacy of clinical diagnosis. Chen et al.
[34] presented a contrast-enhanced ultrasound (CEUS) video classification model for breast cancer into benign and malignant types. The model was based on a 3D CNN with a temporal attention module (DKG-TAM) incorporating temporal domain knowledge and a channel attention module (DKG-CAM) that included feature-based domain knowledge. It was found that the incorporation of domain knowledge led to improvements in sensitivity. A study aimed at testing the capability of AutoML Vision, a highly automatic machine learning model, for breast lesion classification was presented by Wan et al.
[36]. The performance of AutoML Vision was compared with traditional ML models with the most used classifiers (RF, KNN, LDA, LR, SVM and NB) and a CNN designed in a Tensorflow environment. The AutoML Vision performances were, on average, comparable to the others, demonstrating its reliability for clinical practice. Finally, Huo et al.
[38] experimentally evaluated six machine learning models (LR, RF, extra trees, SVM, multilayer perceptron, and XGBoost) for differentiating between benign and malignant breast lesions using data from different sources. Two examples of the ultrasound depictions of malignant breast lesions are shown in
Figure 1. The experimental results demonstrated that the LR model exhibited better diagnostic efficiency than the others and was also better than clinician diagnosis (see
Table 1).
Figure 1. Ultrasound depictions of malignant breast lesions: (a) lesion characterized by an irregular shape, with calcification indicated by the large arrow and a non-circumscribed margin by the thin arrow; (b) lesion characterized by an oval shape, with circumscribed margins indicated by the thin arrow and posterior enhancement features by the large arrow
[38].
Table 1. Summary of Detection ML algorithms employed in analyzed studies with respect to organ investigated, diagnosis objective, dataset used, and main results achieved.
| Ref. | Organ | Objective | Technique | Results | Datasets |
| --- | --- | --- | --- | --- | --- |
| [40] | Breast | Recognition of breast ultrasound nodules with low labeled images | Faster R-CNN for detection of nodules and SSL for classification | Mean accuracy: 87%; performances of SSL and SL are comparable | Public; 6746 and 2220 nodules |
| [41] | Arteries | Detection of end-diastolic frames in NIRS-IVUS images of coronary arteries | Bi-GRU NN trained by a segment of 64 frames | Mean accuracy: 80%; better accuracy than expert analysts with Doppler criteria | Private; 20 coronary arteries |
| [42] | Heart | Evaluation of biomarkers from echocardiogram videos | CNN with residual connections and spatio-temporal convolutions for estimation of biomarker values | Mean ROC AUC: anemia: 80%, BNP: 84%, troponin I: 75%, BUN: 71.5% | Public; 108,521 echocardiogram studies |
| [43] | Heart | Extract information associated with myocardial remodeling from still ultrasound images | Texture-based features extracted with unsupervised similarity networks; ML models (DT, RF, LR, NN) for prediction of functional remodeling; LR for predicting presence of fibrosis | ROC AUC: 80%, sensitivity: 86.4%, specificity: 83.3%; prediction of myocardial fibrosis only from textures of ultrasound images | Public; 392 subjects |
| [44] | Liver | Detection of gallstones and acute cholecystitis with still images for preliminary diagnosis | SSD and FPN to classify gallstones with features extracted by ResNet-50, and MobileNetV2 to classify cholecystitis | ROC AUC: ResNet-50: 92%, MobileNetV2: 94%; detects cholecystitis and gallstones with acceptable discrimination and speed | Public; 89,000 images |
| [45] | Fetus | Automatic gestational age estimation from TC diameter as a POCUS solution | AlexNet variation for TC frame extraction; FCN for TC localization and measurement | Accuracy: TC plane detection: 99%, TC segmentation: 97.98%; accurate GA estimation | Private; 5000 TC images |
| [46] | Fetus | Automatic recognition and classification of FFUSP for diagnosis of cardiac conditions | LH-SVM: SVM for learning of features extracted by LBP and HOG | Accuracy: 94.67%, average precision: 94.25%, average recall rate: 93.88%, average F1 score: 94.88%; effective prediction and classification of FFUSP | Private; 943 standard planes, 424 nasolabial coronal planes, 50 nonstandard planes |
| [47] | Lungs | Assist diagnosis of COVID-19 on LUS images | Pre-trained ResNet50; fully connected layer for feature extraction; global average pooling for feature classification | Average F1-score: balanced dataset: 93.5%, unbalanced dataset: 95.3%; improves performance in radiologists' diagnosis | Public; 3909 images |
|
2. Arteries
Another major cause of death worldwide is cardiovascular disease (CVD), caused principally by a pathological condition called atherosclerosis, which is characterized by alterations of the artery walls, which lose their elasticity because of the accumulation of calcium, cholesterol, or inflammatory cells. It is the principal cause of stroke and infarction. Early detection of plaques in the arteries has a fundamental role in the prevention of brain strokes. Ultrasound imaging represents a useful method for the analysis of carotid diseases through the visualization and interpretation of carotid plaques, because a correct characterization of this disease is fundamental to identifying vulnerable plaques that require surgery. A reliable and useful indicator of atherosclerosis is the so-called intima-media (IM) thickness, defined as the distance from the lumen-intima (LI) to the media-adventitia (MA) interface. Most studies have been devoted to the improvement of early atherosclerosis diagnosis; in this respect, three main issues are considered: detection
[48][49][50][51][52][53][54], segmentation
[55][56][57][58][59][60][61][62], and classification
[40][63][64][65][66][67][68][69][70][71][72][73][74].
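Once the LI and MA interfaces have been traced, the IM thickness defined above reduces to an average distance between the two contours. A minimal sketch, assuming per-column interface coordinates and a hypothetical pixel calibration (not taken from any cited study):

```python
import numpy as np

def intima_media_thickness(li_contour, ma_contour, pixel_size_mm=0.06):
    """Mean intima-media thickness: the distance from the lumen-intima (LI)
    to the media-adventitia (MA) interface, averaged along the vessel wall.
    Contours are per-column row coordinates in pixels; pixel_size_mm is an
    assumed calibration value."""
    thickness_px = np.abs(ma_contour - li_contour)
    return float(np.mean(thickness_px) * pixel_size_mm)

li = np.array([50.0, 51.0, 50.5])
ma = np.array([60.0, 61.0, 60.5])
print(intima_media_thickness(li, ma))  # ≈ 0.6 mm
```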
As far as detection is concerned, Bajaj et al.
[48] designed a novel deep-learning methodology for the automated detection of end-diastolic frames in intravascular ultrasound (IVUS) images. Near-infrared spectroscopy (NIRS)-IVUS images were collected from 20 coronary arteries and co-registered with the concurrent electrocardiographic (ECG) signal for identification of end-diastolic frames. A bidirectional gated recurrent unit (Bi-GRU) neural network was trained on segments of 64 frames, each incorporating at least one cardiac cycle, and the test set was then processed to identify the end-diastolic frames. The proposed method demonstrated higher accuracy than expert analysts and conventional image-based (CIB) methodologies.
Two recent segmentation approaches based on DL have been proposed by Blanco et al.
[58] and Zhou et al.
[59]. The first method
[58] employs small datasets for algorithm training. Specifically, the algorithm is trained on three small databases of 2D carotid B-mode images, and plaques are segmented through a UNet++ ensemble algorithm, which uses eight individual UNet++ networks with different backbones and encoder architectures. Good segmentation accuracy was achieved for different datasets without retraining. The second method
[59] involves the concatenation of a multi-frame convolutional neural network (MFCNN), which exploits adjacency information present in longitudinally neighboring IVUS frames to deliver a preliminary segmentation, followed by a Gaussian process (GP) regressor to construct the final lumen and vessel contours by filtering high-dimensional noise. The results obtained with the model developed demonstrated accurate segmentation in terms of image metrics, contour metrics, and clinically relevant variables, potentially enabling its use in clinical routine by reducing the costs involved in the manual management of IVUS datasets. Lo Vercio et al.
[74] suggested an automatic detection method fundamentally based on two machine learning algorithms: SVM and RF. The first is employed to detect the lumen, media, and surrounding tissues, and the second to detect different morphological structures and to modify the initial layer classification depending on the detected structures. The resulting classification maps are then fed into a segmentation method based on deformable contours to detect the LI and MA interfaces. The main steps of LI and MA segmentation are described in
Figure 2.
Figure 2. Main steps of LI and MA segmentation: (a) B-mode image, (b) edge map, (c) contour segmentation, (d) final segmentation. LI and MA are marked in red and green, respectively
[74].
With respect to classification, Saba et al.
[66] focused on the classification of plaque tissues by employing four ML systems, one transfer learning system, and one deep learning architecture with different layers. Two types of plaque characterization were used: an AI-based mean feature strength and a bispectrum analysis. The results demonstrated that the proposed method was able to accurately characterize symptomatic carotid plaques, clearly discriminating them from asymptomatic ones. Another study on carotid diseases was published by Luo et al.
[67], who proposed an innovative classification approach based on lower extremity arterial Doppler (LEAD) and carotid duplex ultrasound studies. They developed a hierarchical deep learning model for the classification of aortoiliac, femoropopliteal, and trifurcation disease, and an RF algorithm for the classification of the degree of carotid stenosis from duplex carotid ultrasound studies. An automated interpretation of the LEAD and carotid duplex ultrasound studies was then developed through artificial intelligence. Subsequently, a statistical analysis was performed using a confusion matrix, and the reliability of the novel machine learning models in differentiating normal from diseased arterial systems was evaluated.
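The confusion-matrix analysis used to validate normal-versus-diseased classification reduces to a few ratios of the four cell counts. A minimal sketch with made-up counts:

```python
def binary_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and accuracy from a 2x2 confusion matrix,
    as used to evaluate a binary normal-vs-diseased classifier.
    tp/fp/fn/tn are the true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)          # fraction of diseased cases caught
    specificity = tn / (tn + fp)          # fraction of normals cleared
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy

print(binary_metrics(tp=45, fp=5, fn=5, tn=45))  # (0.9, 0.9, 0.9)
```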
3. Heart
Echocardiography is one of the most employed diagnostic tests in cardiology, where heart images are created through Doppler ultrasound. It is routinely employed in the diagnosis, management, and follow-up of patients with any suspected or known heart disease.
The heart is a muscular organ that pumps blood through the body and is fundamentally divided into four chambers: the upper left and right atria and the lower left and right ventricles. Heart activity can be divided into two principal phases: systole and diastole. During systole, the myocardium contracts, ejecting blood to the lungs (right ventricle) and the body (left ventricle). During diastole, the cardiac muscle relaxes, expanding the heart’s volume and causing blood to flow in. The heart has four valves, including the mitral valve, which separates the left atrium from the left ventricle and plays a fundamental role in regulating the blood transit from atrium to ventricle: it opens during diastole, while during systole it closes and prevents reflux towards the left atrium. Echocardiography can provide information about different anatomical heart aspects including position, shape of the atrium and ventricles
[75], and even other variables such as cardiac output, ejection fraction and diastolic function. In addition, echocardiography enables detection of a series of heart diseases, including cardiomyopathy, congenital heart diseases, aneurysm, and mitral valve diseases. However, one of the major issues in echocardiography is the difficulty of automatically classifying and identifying large databases of echocardiogram views in order to provide a diagnosis. The classification task is challenging because of several properties of echocardiograms, including the presence of noise, redundant information, acquisition errors, and the variability of different scans due to different acquisition techniques.
Several studies have been devoted to the automation of algorithms for the detection of anomalies and heart anatomy
[76][77][78][79][80], and the classification of echocardiogram views to provide a full and reliable assessment of cardiac functionality improving diagnosis accuracy
[81][82].
As far as detection is concerned, an advanced method for the evaluation of several biomarkers from echocardiogram videos based on DL has been developed by Hughes et al.
[76]. The method, named EchoNet-Labs, is a CNN with residual connections and spatio-temporal convolutions that provides a beat-by-beat estimate for biomarker values. Experimental results have demonstrated high accuracy in detecting abnormal values of hemoglobin, troponin I, and other proteins, and better performance compared to models based on traditional risk factors. A detection method based on radiomics-based texture analysis and supervised learning was proposed by Kagiyama et al.
[77], who designed a low-cost texture-based pipeline for the prediction of fibrosis and myocardial tissue remodeling. The first part of the method consists of the extraction of 328 texture-based features of the myocardium from ultrasound images and exploration of the phenotypes of myocardial textures through unsupervised similarity networks. The second part involves the employment of supervised machine learning models (decision trees, RF, logistic regression models, neural network) for the prediction of functional left ventricular remodeling, while, in the third part, supervised models (logistic regression models) for predicting the presence of myocardial fibrosis are employed.
Figure 3 shows a comparison of two myocardial fibrosis predictions from ultrasound and magnetic resonance images.
Figure 3. Prediction of myocardial fibrosis: three ultrasound renderings and the corresponding myocardial textures
[77].
A classification deep learning approach was developed by Vaseli et al.
[81]. They defined a method for obtaining a lightweight deep learning model for the classification of 12 standard echocardiography views by employing a large echocardiography dataset. For this purpose, three different teacher networks were implemented, each consisting of a CNN module and a fully-connected (FC) module, where the CNN module is based on one of three advanced deep learning architectures, i.e., VGG-16, DenseNet, and ResNet. A dataset of 16,612 echo cines obtained from 3151 unique patients across several ultrasound imaging machines was employed for the development and evaluation of the networks. The proposed models were shown to be lightweight and faster than large state-of-the-art deep models, and to be suitable for POCUS diagnosis.
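Obtaining a lightweight model from large teacher networks, as described above, follows the generic knowledge-distillation recipe: the student is trained against the teacher's temperature-softened output distribution. A sketch of that standard formulation (not necessarily the exact loss of the cited work):

```python
import numpy as np

def soft_targets(logits, temperature=4.0):
    """Temperature-softened softmax: the 'soft labels' a large teacher
    network provides when distilling into a lightweight student."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between teacher and student soft distributions;
    minimized when the student reproduces the teacher's outputs."""
    p = soft_targets(teacher_logits, T)
    q = soft_targets(student_logits, T)
    return float(-(p * np.log(q + 1e-12)).sum(axis=-1).mean())
```

A higher temperature spreads probability mass over the wrong classes too, exposing the teacher's learned similarity structure between echo views to the student.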
4. Liver
Liver disease is one of the principal causes of death worldwide and comprises a wide range of diseases with varied or unknown origins. In 2017, about 1.32 million deaths worldwide were due to cirrhosis. Furthermore, liver cancer represents the fifth most common cancer and the second leading cause of cancer death according to the World Health Organization (WHO). The studied pathologies can be summarized as:
-
focal liver lesions, solid formations that can be benign or malignant,
-
liver fibrosis, excessive accumulation of extracellular matrix proteins, such as collagen,
-
fatty liver or liver steatosis, conditions based on the accumulation of excess fat in the liver,
-
liver tumors.
A number of studies have sought to develop automated algorithms for detection
[83][84][85][86][87], segmentation
[86], and classification,
[86][88][89][90][91][92][93][94][95][96][97][98][99][100] of the diseases described above.
Yu et al.
[83] developed a machine learning system to detect and localize gallstones and to detect acute cholecystitis using still images for preliminary, rapid, and low-cost diagnoses. A single-shot multibox detector (SSD) and a feature pyramid network (FPN) were used to classify and localize objects using image features extracted by ResNet-50 for gallstones, and MobileNetV2 was used to classify cholecystitis. The deep learning models were pretrained using public datasets. The experimental results demonstrated the capability of the proposed system to detect cholecystitis and gallstones with acceptable discrimination and speed, and its suitability for point-of-care ultrasound (POCUS).
A recent study by Cha et al.
[101] proposed a deep learning model aimed at automatically quantifying the hepatorenal index (HRI) for ultrasound evaluation of fatty liver, in order to overcome limitations due to interobserver and intraobserver variability. They developed an organ segmentation method based on a deep convolutional neural network (DCNN) with Gaussian mixture modeling for automated quantification of the HRI from B-mode abdominal ultrasound images. Interobserver agreement for the measured brightness of the liver and kidney and the calculated HRI was analyzed between two board-certified radiologists and the DCNN using intraclass correlation coefficients. The automatic quantification of the HRI through the DCNN was found to give results similar to those obtained by expert radiologists.
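Once the liver and kidney regions are available, the HRI itself is a simple brightness ratio. A minimal sketch, assuming hypothetical boolean ROI masks (the cited pipeline derives them from the DCNN segmentation, which is sketched away here):

```python
import numpy as np

def hepatorenal_index(image, liver_roi, kidney_roi):
    """HRI = mean liver echogenicity / mean renal-cortex echogenicity,
    computed from B-mode pixel brightness inside two regions of interest.
    A brighter-than-kidney liver (HRI above ~1) suggests fatty change."""
    liver_brightness = image[liver_roi].mean()
    kidney_brightness = image[kidney_roi].mean()
    return float(liver_brightness / kidney_brightness)

# toy image: liver pixels at brightness 120, kidney pixels at 100
image = np.zeros((4, 4))
liver = np.zeros((4, 4), dtype=bool); liver[:2] = True
kidney = np.zeros((4, 4), dtype=bool); kidney[2:] = True
image[liver] = 120.0
image[kidney] = 100.0
print(hepatorenal_index(image, liver, kidney))  # 1.2
```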
Regarding classification, Wang et al.
[89] proposed a method to differentiate malignant from benign focal liver lesions through two-dimensional shear-wave elastography (2D-SWE)-based ultrasomics (ultrasound-based radiomics). The ultrasomics technique was employed to extract from 2D-SWE images features that were used to define an ultrasomics score model, while SWE measurements and ultrasomics features were used to define a combined score model through an SVM algorithm. Good diagnostic accuracy for the combined score in differentiating malignant from benign focal liver lesions was demonstrated. The authors highlighted, however, that, to achieve more reliable results, a higher number of cases would be required to better train the ML model. An alternative approach based on ultrasomics was proposed by Peng et al.
[90] who concentrated on the differentiation of infected focal lesions from malignant mimickers. In particular, they defined an ultrasomics model based on machine learning methods with ultrasomics features extracted from grayscale images, and dimensionality reduction methods and classifiers employed to carry out feature selection and predictive modeling. The experimental results demonstrated the usefulness of ultrasomics in differentiating focal liver lesions from malignant mimickers. An alternative approach focusing on ultrasound SWE was proposed by Brattain et al.
[92], who developed an automated method for the classification of liver fibrosis stages. This method was based on the integration of three modules for the evaluation of SWE image quality, selection of a region of interest, and use of machine learning-based (SVM, RF, CNN and FCNN) multi-image SWE classification for fibrosis stage ≥ F2. The performance of the system was compared with manual methods, showing that the proposed method improved classification accuracy. A study focused on liver steatosis was published by Neogi et al.
[98]. They presented a novel set of features that exploited the anisotropy of liver texture. The features were obtained using a gray level difference histogram, pair correlation function, probabilistic local directionality statistics, and randomness of texture. Three datasets that included anisotropy features were employed for the classification of images using five classifiers: MLP, PNN, LVQ, SVM, Bayesian. The best results were achieved with PNN and anisotropy features.
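One of the feature families listed above, the gray-level difference histogram, can be computed along several pixel-pair directions; comparing the histograms across directions is what exposes the anisotropy of liver texture. A minimal sketch (non-negative offsets only, for brevity):

```python
import numpy as np

def gray_level_difference_histogram(img, dy=0, dx=1, levels=256):
    """Normalized histogram of absolute gray-level differences between each
    pixel and its neighbor at offset (dy, dx). Computed for several
    directions, it captures directional (anisotropic) texture statistics.
    Offsets are assumed non-negative."""
    a = img[: img.shape[0] - dy, : img.shape[1] - dx]
    b = img[dy:, dx:]
    diff = np.abs(a.astype(int) - b.astype(int))
    hist = np.bincount(diff.ravel(), minlength=levels)
    return hist / hist.sum()

# horizontal stripes of 0 and 10: horizontal neighbors always differ by 10,
# vertical neighbors never do, so the two histograms differ strongly
img = np.array([[0, 10, 0, 10],
                [0, 10, 0, 10]])
h = gray_level_difference_histogram(img)          # dx=1: h[10] == 1.0
v = gray_level_difference_histogram(img, dy=1, dx=0)  # v[0] == 1.0
```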
5. Fetus
Ultrasound imaging was introduced into the field of obstetrics by Donald et al.
[41], and, since then, it has become the most commonly used imaging modality for investigating several factors related to fetal diagnosis, such as fetal biometric measurements, including head and abdominal circumferences and biparietal diameter, and fetal cardiac activity. Several scientific studies have been devoted to advancing the quality of prenatal diagnoses by focusing on three main issues: detection of anomalies, fetal measurements, scanning planes and heartbeat
[42][43][44][45][46][47][102][103], segmentation of fetal anatomy in ultrasound images and videos
[42][103][104][105][106][107][108][109] and classification of fetal standard planes, congenital anomalies, biometric measures, and fetal facial expressions
[42][43][102][107][109][110][111][112][113][114][115].
A detection approach based on DL was proposed by Maraci et al.
[42]. They designed a method for point-of-care ultrasound estimation of fetal gestational age (GA) from the trans-cerebellar (TC) diameter. In the first step, TC plane frames are extracted from a short ultrasound video using a standard CNN based on a variation of the AlexNet architecture. Then, an FCN is employed to localize TC structure and to perform TC diameter estimation. GA is finally achieved through a standard equation. A good agreement was found between the automatic and manual estimation of GA. A recent ML detection method has been published by Wang et al.
[43], who focused on the accurate identification of the fetal facial ultrasound standard plane (FFUSP), which has a significant role in facial deformity detection and disease screening, such as cleft lip and palate detection. The authors proposed an LH-SVM texture feature fusion method for automatic recognition and classification of FFUSP. Texture features were extracted from US images through a local binary pattern (LBP) and a histogram of oriented gradients (HOG); subsequently, the features were fused and an SVM was employed for predictive classification. The performance obtained demonstrated that the proposed method was able to effectively predict and classify FFUSP.
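The LBP half of that feature fusion assigns each pixel an 8-bit code describing which neighbors are at least as bright as it; the normalized code histogram is then a texture feature vector for the SVM. A minimal sketch of the basic 8-neighbor LBP (HOG is omitted):

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbor local binary pattern: each interior pixel becomes
    an 8-bit code, one bit per neighbor that is >= the center pixel."""
    img = img.astype(int)
    c = img[1:-1, 1:-1]
    # 8 neighbors, clockwise from top-left, each contributing one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (n >= c).astype(int) << bit
    return code

def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes, usable as an SVM feature."""
    h = np.bincount(lbp_image(img).ravel(), minlength=256)
    return h / h.sum()
```

On a perfectly uniform image every neighbor ties with the center, so all codes are 255 and the histogram collapses to a single bin; textured regions spread mass across many bins.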
With respect to segmentation, Dozen et al.
[104] proposed a novel segmentation method called cropping-segmentation-calibration (CSC) of the ventricular septum in fetal cardiac ultrasound videos. This method was based on time-series information of videos and specific information for U-Net output calibration, obtained by cropping the original frame. The experimental results demonstrated a clear improvement in performance with respect to general segmentation methods, such as DeepLab v3+ and U-net.
A novel model-agnostic DL method (MFCY) was proposed by Shozu et al.
[105] in order to improve the segmentation performance of the thoracic wall in ultrasound videos. Three standard U-Net (DeepLabV3+) networks, pre-trained with the original video sequence and labels of the thoracic wall (TW), thoracic cavity (TC) and whole thorax (WT), were used to perform a preliminary segmentation of the video sequence. Then, a multi-frame method (MF) was used to extract predictions for each labeled target. Finally, a cylinder method (CY) integrated the three prediction labels for the final segmentation. The results showed an improvement in the segmentation performance of the thoracic wall in fetal ultrasound videos without altering the neural network structure.
Perez-Gonzalez et al.
[106] presented a method, named probabilistic learning coherent point drift (PL-CPD), for automatic registration of real 3D ultrasound fetal brain volumes with a significant degree of occlusion artifacts, noise, and missing data. Different acquisition planes of the brain were preprocessed to extract confidence maps and texture features, which were used for segmentation purposes and to estimate probabilistic weights by means of random forest classification. Point clouds were finally registered using a variation of the coherent point drift (CPD) method that basically assigns probabilistic weights to the point cloud. The experimental results, although obtained from a relatively small dataset, demonstrated the high suitability of the proposed method for automatic registration of fetal head volumes.
A recent deep learning classification model was developed by Rasheed et al.
[107] for the automation of fetal head biometry employing a live ultrasound feed. Initially, the headframes, obtained in this case from the ultrasound videos, were classified through the CNN AlexNet. The classified headframes were then validated through occipitofrontal diameter (OFD) measurement. Subsequently, the classified headframes were segmented by a U-Net with mask and annotated images. Then, least-squares ellipse (LSE) fitting was employed to compute the biparietal diameter (BPD) and head circumference (HC). This approach enabled accurate computation of the gestational age with very limited interaction of the sonographer with the system.
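Once the head ellipse is fitted, HC follows from its two diameters. A sketch using Ramanujan's ellipse-perimeter approximation with BPD and OFD as the short and long diameters (the clinical formula actually used in the cited pipeline may differ):

```python
import numpy as np

def head_circumference(bpd_mm, ofd_mm):
    """Head circumference from the fitted ellipse diameters: BPD (short)
    and OFD (long). Uses Ramanujan's first approximation for the
    perimeter of an ellipse with semi-axes a and b."""
    a, b = ofd_mm / 2.0, bpd_mm / 2.0   # semi-axes
    return float(np.pi * (3 * (a + b) - np.sqrt((3 * a + b) * (a + 3 * b))))

print(round(head_circumference(bpd_mm=80, ofd_mm=100), 1))
```

As a sanity check, when BPD equals OFD the ellipse is a circle of that diameter and the formula returns exactly pi times the diameter.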
6. Lungs
Computed tomography (CT) is considered the imaging gold standard for pulmonary disease due to its high reliability. However, CT presents a series of disadvantages: it exposes the patient to radiation and is expensive and non-portable. A valid alternative is represented by lung ultrasound (LU), which is cheap, safe, portable, and capable of generating medical images in real time. LU has been used for many years for the evaluation of several lung diseases, including tumors
[116][117], interstitial diseases
[118][119], post-extubation distress
[120], lung edemas
[121], and subpleural lesions
[122]. In very recent years, research activity into lung ultrasonography has been growing significantly due to the diffusion of the pandemic worldwide. In particular, in COVID-19 evaluation, the use of AI has assumed an increasingly important role concerning the analysis of images in order to make rapid decisions and relieve the heavy workload of radiologists.
AI techniques reduce operator dependence, standardize the interpretation of images, and provide stable results; they have focused principally on COVID-19 syndrome detection
[123][124][125][126][127][128], segmentation of lung regions
[124][125][126][127][128][129], and classification of lung diseases as COVID-19-positive or COVID-19-negative
[124][125][126][127][128][130][131][132][133].
With respect to detection, Shang et al.
[123] proposed a CAD system, based on feature extraction from LUS images through a residual network (ResNet), to assist radiologists in distinguishing COVID-19 syndrome from healthy lungs and non-COVID pneumonia. The architecture of the ResNet, pre-trained on ImageNet, was modified by adding a fully connected layer for feature extraction and global average pooling for feature classification. Then, the gradient-weighted class activation mapping (Grad-CAM) method was used to create an activation map that highlights the crucial areas to aid radiologist visualization. In the experiments carried out, the CAD system proved capable of improving radiologists’ performance in COVID-19 diagnosis.
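The Grad-CAM map mentioned above is built from a convolutional layer's activations and the gradients of the class score with respect to them: each channel is weighted by the global average of its gradients, the channels are summed, and the result is rectified. A minimal numpy sketch on synthetic arrays (not the CAD system's actual tensors):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap. feature_maps and gradients are (C, H, W) arrays:
    a conv layer's activations and the class-score gradients w.r.t. them.
    Channel weights are the spatial mean of the gradients; the map is the
    ReLU of the weighted channel sum, normalized to [0, 1]."""
    weights = gradients.mean(axis=(1, 2))              # global average pooling
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted channel sum
    cam = np.maximum(cam, 0)                           # ReLU
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize to [0, 1]
    return cam
```

In the real system the (C, H, W) inputs come from a forward and backward pass through the ResNet; the resulting (H, W) map is upsampled and overlaid on the LUS image.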
An interesting segmentation method for accurate COVID-19 diagnosis has been proposed by Xue et al.
[129]. The method is based on a dual-level supervised multiple instances learning module (DSA-MIL) and can predict patient severity from heterogeneous LUS data of multiple lung zones. An original modality alignment contrastive learning module (MA-CLR) is proposed for the combination of LUS data and clinical information. Nonlinear mapping was trained through a staged representation transfer (SRT) strategy. This method demonstrated great potential in real clinical practice for COVID-19 patients, particularly for pregnant women and children.
A classification deep learning procedure was proposed by Tsai et al.
[130], who defined a standardized protocol combined with a deep learning model based on a spatial transformer network for automatic pleural effusion classification. Supervised and weakly supervised approaches, based on frame and video ground-truth labels, respectively, were then used to train the deep learning models. The method was compared with expert clinical image interpretation, with similar accuracy obtained for both, bringing closer the possibility of automatic, efficient, and reliable diagnosis of lung diseases.
7. Other Organs
Machine learning for ultrasound is being successfully applied to a number of other organs, including:
-
Prostate [134][135][136]: research activity has mainly focused on prostate segmentation on ultrasound images, fundamental in biopsy needle placement and radiotherapy treatment planning; it is quite challenging due to the relatively low quality of US images. In recent years, segmentation based on deep learning techniques has been widely developed due to several benefits compared to classical techniques which are difficult to apply in real-time image-guided interventions.
-
Thyroid [33][137][138][139][140][141][142][143][144][145][146][147][148][149][150][151]: the risk of malignancy of thyroid nodules can be evaluated on the basis of nodule ultrasonographic characteristics, such as echogenicity and calcification. Much activity has been devoted to automating thyroid nodule detection through CAD systems, mainly based on CNNs.
-
Kidneys [152][153][154][155][156][157][158][159][160][161][162][163][164][165]: US image-based diagnoses are widely used for the detection of kidney abnormalities, including cysts and tumors. For the early diagnosis of kidney diseases, DNN and SVM are very often used as machine learning models for abnormality detection and classification.
Figure 4 presents a summary histogram in which the frequency of application of the different ML techniques in the analyzed period is reported for each analyzed organ. As can be seen, DL techniques based on CNN are clearly the most popular for almost all organs. In particular, for breast and liver, which are the most investigated organs, CNN is employed in about 63 and 50 percent of cases, respectively. Only for arteries is a slight predominance of SVM methods observed.
Figure 4. Frequency of ML algorithms application across all organs.