Artificial intelligence (AI) is increasingly used in medicine and dentistry. It may improve the quality of health care as diagnostic methods become more accurate and diagnostic errors become rarer in daily medical practice. The accuracy of cephalometric landmark detection by widely available commercial AI-based software and by advanced AI algorithms is summarized here. Most AI algorithms used for the automated positioning of landmarks on cephalometric radiographs achieved relatively high accuracy. However, the effectiveness of AI in cephalometry varies with the algorithm and the type of application, which must be taken into account when interpreting the results.
No | Study | No. of Cephalograms | Patients’ Age (in Years) | Type of Algorithm | No. of Examiners | No. of Landmarks/Mean SDR | No. of Measurements/Mean Error | Time for Analysis (in Seconds) |
---|---|---|---|---|---|---|---|---|
1 | Leonardi et al., 2009 [11] | 41 | 10–17 | Authors’ algorithm/CNN, Borland C++ | 5 | 10/ n.s. | n.s. | 257 for 10 landmarks |
2 | Tanikawa et al., 2010 [12] | 859 (400: permanent dentition; 459: mixed dentition) | 5–60; mean age: 23.6 (permanent dentition); 8.9 (mixed dentition) | Authors’ algorithm/PPED system | 2 | 18/ n.s. | n.s. | n.s. |
3 | Lindner et al., 2016 [13] | 400 | 7–76 | Authors’ algorithm/FALA system, RFRV-CLM | 2 | 19/ 84.7% in the range of 2 mm | 8/ 78.4 ± 2.61% | <3 |
4 | Park et al., 2019 [14] | 1311 (1028: training set; 283: testing set) | n.s. | Authors’ algorithm/YOLOv3 and SSD | 1 | 80/ YOLOv3: 80.4% in the range of 2 mm | n.s. | 0.05 for YOLOv3; 2.89 for SSD |
5 | Hwang et al., 2020 [15] | 1311 (1028: training set; 283: testing set) | n.s. | Authors’ algorithm/YOLOv3 and manual analysis | 2 | 80/ mean detection error: 1.46 ± 2.97 mm | n.s. | n.s. |
6 | Moon et al., 2020 [16][10] | 2400 (2200: training set; 200: testing set) | n.s. | Authors’ algorithm/YOLOv3 | 2 | 80/ n.s. | n.s. | n.s. |
7 | Lee et al., 2020 [17][16] | 400 | n.s. | Authors’ algorithm/Bayesian CNN | 2 | 19/ 82.11% in the range of 2 mm | n.s. | 512/38 for 19 landmarks (1 GPU/4 GPU) |
8 | Kunz et al., 2020 [18][17] | 1792 (96.6%: training set; 3.4%: validation set) | n.s. | Authors’ algorithm/CNN, Keras and Google TensorFlow | 12 | 18/ n.s. | 12/ <0.37° (angular measurements); <0.20 mm (metric measurements); <0.25% (proportional measurements) | n.s. |
9 | Kim et al., 2020 [19][18] | 2075 | n.s. | Authors’ algorithm/DL, SHG, TensorFlow, Python | 2 | 23/ 84.7% in the range of 2 mm | n.s. | 0.4 for 23 landmarks |
10 | Kim et al., 2021 [20][19] | 950 (800: training set; 100: validation set; 50: testing set) | n.s. | Authors’ algorithm/CNN | 2 | 13/ 64.3% in the range of 2 mm | n.s. | n.s. |
11 | Tanikawa et al., [21][20] | 1785 | 5.4–56.5; mean age: 12.2 | Authors’ algorithm/CNN-PC & CNN-PE, Adam | 2 | 26/ success rates from 85% to 91% | n.s. | n.s. |
12 | Tanikawa et al., 2021 [22][21] | 2385 | 5.8–77.9 | Authors’ algorithm/CNN-PC & CNN-PE, Adam | 2 | 26/ success rates from 85% to 90% | n.s. | n.s. |
13 | Yao et al., 2022 [23][22] | 512 (312: training set; 100: validation set; 100: testing set) | 9–40 | Authors’ algorithm/CNN, PyTorch | 2 | 37/ 45.95% in the range of 1 mm; 97.3% in the range of 2 mm | n.s. | 3 for 37 landmarks |
14 | Uğurlu, 2022 [24][23] | 1620 (1360: training set; 140: validation set; 180: testing set) | 9–20 | Authors’ algorithm/CNN, PyTorch, Python | 1 | 21/ 76.2% in the range of 2 mm | n.s. | n.s. |
15 | Popova et al., 2023 [25][24] | 890 (387: training set; 43: validation set; 460: testing set) | All ages | Authors’ algorithm/CNN, Keras and TensorFlow, Python | 3 | 16/ 84.73% in the range of 2 mm | n.s. | n.s. |
16 | Jeon et al., 2021 [26][25] | 35 | Mean age: 23.8 | Commercial analysis/CephX | 1 | 16 | 26/ 0.1–0.3° (angular measurements); 0.1–0.3% (linear measurements) | n.s. |
17 | Bulatova et al., 2021 [27][26] | 110 | n.s. | Commercial analysis/Ceppro | 2 | 16/ ±0.13 mm in the range of 2 mm for 75% of landmarks; mean difference 2.0 ± 3.0 in X plane and 2.1 ± 3.0 in Y plane | n.s. | n.s. |
18 | Ristau et al., 2022 [28][27] | 60 | Patients with a full complement of teeth | Commercial analysis/AudaxCeph | 2 | 13/max. mean error: <2.6 mm in X plane; <2.3 mm in Y plane | n.s. | n.s. |
19 | Kılınç et al., 2022 [29][28] | 110 | 10–24, mean age: 15.83 ± 2.85 | Commercial analysis/WebCeph and CephNinja | 1 | n.s. | 11/ ICC from 0.170 to 0.884 | n.s. |
20 | Çoban et al., 2022 [30][29] | 105 | >15, mean age: 17.25 ± 2.85 | Commercial analyser/WebCeph | 1 | n.s. | 22/ ICC from 0.418 to 0.959 | n.s. |
21 | Mahto et al., 2022 [31][30] | 30 | Mean age: 20.17 ± 6.72 | Commercial analyser/WebCeph | 1 | n.s. | 12/ ICC from 0.795 to 0.966 | n.s. |
22 | Tsolakis et al., 2022 [32][31] | 100 | Mean age: 15.9 ± 4.8 | Commercial analyser/CS imaging V8 | 1 | 16 | 18/ ICC from 0.70 to 0.92 | n.s. |
23 | Jiang et al., 2023 [33][32] | 9870 (8611: training set; 1000: validation set; 259: testing set) | 6–50 | Commercial analyser/CNN/CephNet | 5/100 | 28/ 66.15% in the range of 1 mm; 91.73% in the range of 2 mm | 11/ 89.33% | n.s. |
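
Many rows in the table report a successful detection rate (SDR) "in the range of 2 mm". As a rough illustration of how such a figure can be derived, the following minimal Python sketch computes the radial error of each landmark and the share of landmarks within a threshold. It does not reproduce the evaluation code of any cited study. The function names, the coordinates, and the `mm_per_pixel` scale are hypothetical, and the inclusive comparison (`<=`) is an assumption, since individual studies may define the threshold differently.

```python
import math


def radial_errors(predicted, reference, mm_per_pixel=1.0):
    """Return the Euclidean distance (in mm) between paired landmarks.

    `predicted` and `reference` are equal-length lists of (x, y) pixel
    coordinates; `mm_per_pixel` is an assumed image calibration factor.
    """
    if len(predicted) != len(reference):
        raise ValueError("landmark lists must have the same length")
    return [
        math.hypot(px - rx, py - ry) * mm_per_pixel
        for (px, py), (rx, ry) in zip(predicted, reference)
    ]


def successful_detection_rate(errors_mm, threshold_mm=2.0):
    """Return the percentage of landmarks with error <= threshold_mm."""
    if not errors_mm:
        raise ValueError("at least one landmark error is required")
    hits = sum(1 for error in errors_mm if error <= threshold_mm)
    return 100.0 * hits / len(errors_mm)


# Hypothetical example: four landmarks on an image with 0.1 mm per pixel.
reference = [(100.0, 200.0), (150.0, 220.0), (300.0, 410.0), (320.0, 95.0)]
predicted = [(103.0, 204.0), (159.0, 232.0), (330.0, 450.0), (320.0, 100.0)]

errors = radial_errors(predicted, reference, mm_per_pixel=0.1)
mean_radial_error = sum(errors) / len(errors)
sdr_2mm = successful_detection_rate(errors, threshold_mm=2.0)
sdr_1mm = successful_detection_rate(errors, threshold_mm=1.0)

print(f"Mean radial error: {mean_radial_error:.3f} mm")
print(f"SDR within 2 mm: {sdr_2mm:.1f}%")
print(f"SDR within 1 mm: {sdr_1mm:.1f}%")
```

Reporting the SDR at more than one threshold, as Yao et al. and Jiang et al. do for 1 mm and 2 mm, shows how quickly accuracy falls off when the tolerance is tightened.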