Construction and validation of a machine learning model for preoperative prediction of perineural invasion status in intrahepatic cholangiocarcinoma
Objective: To construct and validate a machine learning model for preoperative prediction of perineural invasion (PNI) status in patients with intrahepatic cholangiocarcinoma (ICC).

Methods: Clinical data of 329 patients with ICC were retrospectively analyzed, including 245 admitted to Zhengzhou University People's Hospital from January 2018 to June 2023 and 84 admitted to the Affiliated Cancer Hospital of Zhengzhou University from January 2013 to January 2020. The patients were divided into a training set (n=231) and a validation set (n=98). Clinical characteristics including age, sex, and hepatitis B virus (HBV) infection status were collected. Predictive variables were determined using least absolute shrinkage and selection operator (LASSO) regression analysis. Six machine learning algorithms, including random forest (RF), logistic regression, and linear-kernel support vector machine, were used to construct models for preoperative prediction of PNI in ICC. Performance metrics of each model were calculated from a confusion matrix, and the final model was selected. The performance of the final model was then evaluated in the validation set. Calibration curves were plotted to evaluate the final model, and a Pareto chart was used to rank the predictive variables by importance.

Results: LASSO regression identified nine predictive variables for inclusion in the prediction model: carbohydrate antigen 19-9 (CA19-9), HBV infection status, alkaline phosphatase, alanine aminotransferase, prothrombin time, total bilirubin, albumin, neutrophil times gamma-glutamyl transferase to lymphocyte ratio, and tumor burden score. Among the six trained models, the RF model had an area under the curve (AUC) of 0.909, a sensitivity of 0.842, and an accuracy of 0.870. The AUCs of the other five models were all lower than that of the RF model, and the differences were statistically significant (all P<0.05). In the validation set, the area under the receiver operating characteristic curve of the RF model for predicting PNI in ICC was 0.736. Calibration curves showed that, in both the training and validation sets, the predictions of the RF model closely followed the diagonal representing an ideal model. The Pareto chart showed that CA19-9 was the most important predictive variable in the model, followed by HBV infection status.

Conclusion: The machine learning model based on the RF algorithm has high accuracy and can be used for preoperative prediction of PNI status in patients with ICC.
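The abstract reports sensitivity and accuracy derived from a confusion matrix, together with the AUC. As a minimal sketch of how these quantities are defined (not the authors' code), the Python below computes them from binary labels and predicted probabilities. The function names, the 0.5 decision threshold, and the ten example cases are all assumptions made for illustration and do not come from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    accuracy: float


def confusion_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Metrics:
    """Build a 2x2 confusion matrix at the given threshold and derive metrics."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return Metrics(
        sensitivity=tp / (tp + fn) if (tp + fn) else float("nan"),
        specificity=tn / (tn + fp) if (tn + fp) else float("nan"),
        accuracy=(tp + tn) / total if total else float("nan"),
    )


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = PNI present) and predicted probabilities, illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

metrics = confusion_metrics(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(metrics, round(auc, 3))
```

Sensitivity and accuracy depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason a model can show a different picture on each metric.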