Pubmed for Handhelds

Title: Spine surgeon versus AI algorithm full-length radiographic measurements: a validation study of complex adult spinal deformity patients.
Author: Haselhuhn JJ, Soriano PBO, Grover P, Dreischarf M, Odland K, Hendrickson NR, Jones KE, Martin CT, Sembrano JN, Polly DW.
Journal: Spine Deform; 2024 May; 12(3):755-761. PubMed ID: 38336942.
Abstract:
INTRODUCTION: Spinal measurements play an integral role in surgical planning for a variety of spine procedures. Full-length imaging eliminates distortions that can occur with stitched images. However, these images take radiologists significantly longer to read than conventional radiographs. Artificial intelligence (AI) image analysis software that can make such measurements quickly and reliably would be advantageous to surgeons, radiologists, and the entire health system.
MATERIALS AND METHODS: Institutional Review Board approval was obtained for this study. Preoperative full-length standing anterior-posterior and lateral radiographs were obtained for patients whose images had previously been measured by fellowship-trained spine surgeons at our institution. The measurements included lumbar lordosis (LL), greatest coronal Cobb angle (GCC), pelvic incidence (PI), coronal balance (CB), and T1-pelvic angle (T1PA). Inter-rater intraclass correlation coefficient (ICC) values were calculated from an overlapping sample of 10 patients measured by surgeons. Full-length standing radiographs of an additional 100 patients were provided for AI software training. The AI algorithm then measured the radiographs, and ICC values were calculated.
RESULTS: ICC values for inter-rater reliability between surgeons were excellent: 0.97 (95% CI 0.88-0.99) for LL, 0.78 (0.33-0.94) for GCC, 0.86 (0.55-0.96) for PI, 0.99 (0.93-0.99) for CB, and 0.95 (0.82-0.99) for T1PA. The algorithm computed the five selected parameters with ICC values between 0.70 and 0.94, indicating good to excellent reliability. As examples of the AI-versus-surgeon comparison, the ICC was 0.88 (95% CI 0.83-0.92) for LL and 0.93 (0.90-0.95) for CB. GCC, PI, and T1PA were determined with ICC values of 0.81 (0.69-0.87), 0.70 (0.60-0.78), and 0.94 (0.91-0.96), respectively.
CONCLUSIONS: The AI algorithm presented here demonstrates excellent reliability for most of the parameters and good reliability for PI, with ICC values comparable to those obtained between experienced surgeons. In the future, it may facilitate the analysis of large data sets and aid physicians in diagnostics, preoperative planning, and postoperative quality control.
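
The abstract reports ICC values but does not say which ICC form was computed or which scale was used to label the results. As an illustration only, the sketch below assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1), and assumes the Cicchetti (1994) cutoffs, which happen to be consistent with the "good" label for PI (0.70) and the "excellent" label for the other parameters. The function names `icc_2_1` and `cicchetti_label` and the lordosis values in `example` are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per patient, with one
    measurement per rater in each row. This form is an assumption;
    the abstract does not state which ICC model was used.
    """
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of raters (e.g. surgeon and AI)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 patients and >= 2 raters, no missing values")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cicchetti_label(icc):
    """Qualitative label using the Cicchetti (1994) cutoffs (assumed scale)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical lumbar lordosis values in degrees: [surgeon, AI] per patient.
example = [[42.0, 44.5], [55.0, 53.0], [31.5, 33.0], [60.0, 61.5], [48.0, 46.0]]
print(round(icc_2_1(example), 3), cicchetti_label(icc_2_1(example)))

# AI-versus-surgeon ICC point estimates as reported in the abstract.
reported = {"LL": 0.88, "GCC": 0.81, "PI": 0.70, "CB": 0.93, "T1PA": 0.94}
for name, value in reported.items():
    print(name, value, cicchetti_label(value))
```

Because this form measures absolute agreement, a constant offset between two raters lowers the ICC even when their measurements are perfectly correlated. That property matters when an algorithm is meant to stand in for a surgeon's measurement rather than merely track it.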