Construction of a Risk Prediction Model for Ischemic Stroke Based on Lasso-Logistic Regression
Received date: 2025-08-07
Revised date: 2026-01-21
Accepted date: 2026-03-18
Online published: 2026-04-21
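Funding
Heilongjiang Provincial Philosophy and Social Sciences Research Planning Project (18RK069); Qiqihar Medical University Graduate Innovation Fund Project (QYYCX2023-46)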
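Objective: To construct a risk prediction model for the onset of ischemic stroke, giving medical institutions a prediction model based on real-world data as a reference for improving ischemic stroke prevention and treatment measures. Methods: In this real-world data study, 5605 people who underwent physical examinations at three tertiary grade-A hospitals in 2023-2024 were enrolled. t-tests and χ² tests were used to compare clinical characteristics. Lasso regression was used to screen variables associated with onset. Bootstrap resampling was repeated 1000 times to calculate the C-index of the models at the optimal regularization parameter (lambda.min) and at one standard error (lambda.1se). A random forest model was used to rank the importance of the selected variables, and multivariate logistic regression was used to build the risk prediction model. ROC curves and calibration curves were used to assess the accuracy of the model in the training and validation sets, and decision curve analysis (DCA) and clinical impact curves (CIC) were plotted to analyze its predictive value. Results: Based on the Akaike information criterion (AIC), the more parsimonious lambda.1se model was selected, and independent variables were screened at the optimal λ (λ = 0.017). Ranked by Gini-based importance, they were, in descending order, homocysteine, platelet count, hyperlipidemia, hypertension, fasting blood glucose, triglycerides, and diabetes. In multivariate logistic regression, all of these variables were risk factors for ischemic stroke onset (P < 0.05). On the ROC curves, the AUC was 0.771 (95% CI: 0.739-0.802) in the training set and 0.786 (95% CI: 0.736-0.835) in the validation set; the mean absolute errors of the calibration curves were 0.003 and 0.011, respectively. The DCA curve indicated a favorable net benefit from intervention, and the CIC curve indicated high predictive efficiency at threshold probabilities above 0.4. Conclusion: The prediction model shows good discrimination and goodness of fit and offers a favorable benefit from preventive intervention across a wide range of risk thresholds; it can serve as a reference for clinical prediction.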
Keywords: Lasso-Logistic regression model; ischemic stroke; machine learning; prediction; risk assessment
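The abstract contrasts two penalty choices, lambda.min and lambda.1se. As a minimal sketch of how those two values are conventionally picked from cross-validation output (the one-standard-error rule), the Python below uses a hypothetical helper `select_lambdas` and made-up cross-validation numbers; neither comes from the study, and the authors' actual software and settings are not stated in this record.

```python
from typing import Sequence, Tuple


def select_lambdas(lambdas: Sequence[float],
                   cv_mean: Sequence[float],
                   cv_se: Sequence[float]) -> Tuple[float, float]:
    """Return (lambda_min, lambda_1se) from cross-validation results.

    lambda_min: the penalty with the lowest mean CV error.
    lambda_1se: the largest penalty whose mean CV error is still within
                one standard error of that minimum (a sparser model).
    """
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    cutoff = cv_mean[best] + cv_se[best]
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= cutoff)
    return lambdas[best], lambda_1se


# Hypothetical CV errors for illustration only (not data from the study).
lambdas = [0.001, 0.005, 0.01, 0.02, 0.04, 0.08]
cv_mean = [0.912, 0.905, 0.903, 0.906, 0.915, 0.930]
cv_se = [0.004, 0.004, 0.004, 0.004, 0.005, 0.005]

lam_min, lam_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)  # prints: 0.01 0.02
```

A larger penalty shrinks more coefficients to zero, which is why the lambda.1se model is the more parsimonious of the two.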
火利峰, 王玺, 张寒琪, 孙盼盼, 韩云峰. Construction of a Risk Prediction Model for Ischemic Stroke Based on Lasso-Logistic Regression
Objective: To construct a risk prediction model for ischemic stroke, providing a reference for medical institutions to improve relevant prevention and treatment measures. Methods: In this real-world data study, 5605 physical examination subjects from three tertiary hospitals in 2023-2024 were enrolled. t-tests and χ² tests were used to compare clinical characteristics. Lasso regression was applied to screen variables related to disease onset. Bootstrap resampling was repeated 1000 times, and the C-index was calculated for the lambda.min and lambda.1se models. A random forest model was used to rank the importance of the related variables, and multivariate logistic regression was used to construct the risk prediction model for ischemic stroke onset. The accuracy of the prediction model was evaluated in the training and validation sets using ROC and calibration curves, and decision curve analysis (DCA) and clinical impact curves (CIC) were plotted to analyze its predictive value. Results: Based on the Akaike information criterion (AIC), the more parsimonious lambda.1se model was selected. Using the optimal λ value (λ = 0.017), the screened independent variables, in descending order of importance, were homocysteine, platelet count, hyperlipidemia, hypertension, fasting blood glucose, triglycerides, and diabetes. Multivariate logistic regression confirmed all of these variables as significant risk factors for ischemic stroke (P < 0.05). The model demonstrated good discrimination, with AUC values of 0.771 (95% CI: 0.739-0.802) in the training set and 0.786 (95% CI: 0.736-0.835) in the validation set. Calibration was also satisfactory, with mean absolute errors of 0.003 and 0.011 in the training and validation sets, respectively. DCA indicated a favorable net benefit from intervention across a reasonable threshold range, and CIC further showed high prediction efficiency for onset risk when the threshold probability exceeded 0.4. Conclusion: The constructed prediction model exhibits good discrimination and calibration and provides net clinical benefit across a range of threshold probabilities. It can serve as a useful tool to support clinical decision-making in the prevention of ischemic stroke.
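The DCA and CIC results rest on two simple quantities: the net benefit of intervening at a given threshold probability, and the number of people flagged as high risk alongside the number of those who actually have the event. A minimal sketch follows, assuming the standard net-benefit definition (true-positive rate minus false-positive rate weighted by the threshold odds). The function names and the ten-person toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy on a decision curve: intervene on everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def clinical_impact(y_true: Sequence[int], y_prob: Sequence[float],
                    threshold: float) -> Tuple[int, int]:
    """Counts plotted on a clinical impact curve at one threshold:
    (number flagged high risk, number flagged who truly have the event)."""
    flagged = [y for y, p in zip(y_true, y_prob) if p >= threshold]
    return len(flagged), sum(flagged)


# Toy data for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.3, 0.2]

print(net_benefit(y_true, y_prob, 0.5))      # model strategy
print(treat_all_net_benefit(y_true, 0.5))    # treat-all strategy
print(clinical_impact(y_true, y_prob, 0.5))  # prints: (3, 2)
```

A decision curve repeats the first two calculations over a grid of thresholds; a model is useful at a threshold when its net benefit exceeds both the treat-all value and zero (treat none).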