Volume 5, Issue 1 (March 2025) – 15 articles
Cover Picture:
Aim: A growing body of literature reports on prediction models for patient-reported outcomes of spine surgery, carrying broad implications for use in value-based care and decision making. This review assesses the performance and transparency of reporting of these models.
Methods: We queried four databases for studies reporting the development and/or validation of prediction models for patient-reported outcome measures (PROMs) following elective spine surgery, with performance metrics such as area under the receiver operating characteristic curve (AUC). Adherence to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD-AI) guidelines was assessed. One representative model was selected from each study.
Results: Of 4,471 screened studies, 35 were included: nine development, 24 development and evaluation, and two evaluation studies. Sixteen machine learning models and 19 traditional prediction models were represented. The Oswestry Disability Index (ODI) and modified Japanese Orthopaedic Association (mJOA) scores were the most commonly used outcome measures. Among 29 categorical outcome prediction models, the median [interquartile range (IQR)] AUC was 0.79 [0.73, 0.84]; it was 0.825 [0.76, 0.84] among machine learning models and 0.74 [0.71, 0.81] among traditional models. Adherence to TRIPOD-AI guidelines was inconsistent: no studies commented on healthcare inequalities in the sample population or model fairness, or disclosed study protocols or registration.
Conclusion: We found considerable variation between studies, not only in chosen patient populations and outcome measures, but also in their manner of evaluation and reporting. Agreement about outcome definitions, more frequent external validation, and improved completeness of reporting may facilitate the effective use and interpretation of these models.
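For readers unfamiliar with the pooled summary statistic quoted in the Results, the short Python sketch below illustrates how a median [interquartile range] AUC can be computed across a set of per-study AUCs. The AUC values here are hypothetical placeholders for illustration only, not data from the review.

    import numpy as np

    # Hypothetical per-study AUCs -- illustrative placeholders, not the review's data.
    ml_aucs = np.array([0.76, 0.79, 0.83, 0.84, 0.86])    # machine learning models
    trad_aucs = np.array([0.70, 0.71, 0.74, 0.78, 0.81])  # traditional models

    def median_iqr(values):
        # Median with the 25th and 75th percentiles (the interquartile range).
        q1, med, q3 = np.percentile(values, [25, 50, 75])
        return med, q1, q3

    for label, aucs in [("machine learning", ml_aucs), ("traditional", trad_aucs)]:
        med, q1, q3 = median_iqr(aucs)
        print(f"{label}: median AUC {med:.2f} [IQR {q1:.2f}, {q3:.2f}]")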