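Chen Qi, Zhao Hua, Cai Tianjing, Gao Yizhe, Gao Ling, Liu Qingjie. Exploration of models of radiosensitive lipid metabolites of human plasma based on multiple machine learning algorithms[J]. Chinese Journal of Radiological Medicine and Protection, 2024, 44(6): 457-463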
Exploration of models of radiosensitive lipid metabolites of human plasma based on multiple machine learning algorithms
Submitted: 2023-12-28
DOI:10.3760/cma.j.cn112271-20231228-00225
Keywords: Ionizing radiation; Lipidomics; Machine learning; Random forest
Funding: National Natural Science Foundation of China (82173463, 82003393)
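Authors, affiliations, and e-mail:
Chen Qi: National Institute for Radiological Protection, Chinese Center for Disease Control and Prevention, China CDC Key Laboratory of Radiological Protection and Nuclear Emergency, Beijing 100088; Department of Acute Infectious Disease Prevention and Control, Institute for Infectious Disease Prevention and Control, Hubei Provincial Center for Disease Control and Prevention, Wuhan 430079
Zhao Hua: National Institute for Radiological Protection, Chinese Center for Disease Control and Prevention, China CDC Key Laboratory of Radiological Protection and Nuclear Emergency, Beijing 100088
Cai Tianjing: National Institute for Radiological Protection, Chinese Center for Disease Control and Prevention, China CDC Key Laboratory of Radiological Protection and Nuclear Emergency, Beijing 100088
Gao Yizhe: National Institute for Radiological Protection, Chinese Center for Disease Control and Prevention, China CDC Key Laboratory of Radiological Protection and Nuclear Emergency, Beijing 100088
Gao Ling: National Institute for Radiological Protection, Chinese Center for Disease Control and Prevention, China CDC Key Laboratory of Radiological Protection and Nuclear Emergency, Beijing 100088
Liu Qingjie: National Institute for Radiological Protection, Chinese Center for Disease Control and Prevention, China CDC Key Laboratory of Radiological Protection and Nuclear Emergency, Beijing 100088. E-mail: liuqingjie@nirp.chinacdc.cn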
Abstract:
      Objective To explore classification models for radiosensitive lipid metabolites in human peripheral blood by combining lipidomics with multiple machine learning (ML) algorithms. Methods A total of 97 peripheral blood samples were collected before and after irradiation from 25 leukemia patients who were scheduled for radiotherapy ahead of bone marrow transplantation at a general hospital in Beijing between March and September 2023. The control group comprised 24 pre-irradiation (0 Gy) samples, and the radiation group comprised 73 samples taken at irradiation doses of 4, 8, and 12 Gy (24, 25, and 24 samples, respectively). Targeted lipidomics on an ultra-high performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) platform was used to identify lipids that differed between the radiation and control groups. Linear regression was then used to screen for dose-responsive lipids over the 0-12 Gy range. Finally, radiation classification models were built on the training set with five ML algorithms and then validated and evaluated on the validation set. Results Compared with the control group, 62 radiation-sensitive lipids in the radiation group showed statistically significant concentration changes (t = -4.91 to 4.74, P < 0.05). These lipids spanned nine classes: sphingomyelins (SMs), cholesteryl esters (CEs), ceramides (Cers), phosphatidylinositols (PIs), hexosylceramides (HexCers), lysophosphatidylcholines (LysoPCs), ether phosphatidylcholines (PCOs), phosphatidylethanolamines (PEs), and lysophosphatidylethanolamines (LysoPEs). Over the 0-12 Gy range, 20 lipids with a good dose response were identified: 11 SMs, 7 CEs, 1 Cer, and 1 PI. The five ML models, namely decision tree (DT), support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), and K-nearest neighbors (KNN), all fitted the training set well (F1 = 0.69-1.00) and showed high sensitivity. Judged by the validation metrics, the RF model gave the best radiation classification performance (sensitivity 1.00, accuracy 0.72, F1 score 0.80). Conclusions Targeted lipidomics identified radiation-responsive lipid metabolites and dose-responsive lipids in human samples. The RF model may offer a new approach to exploring models of human radiosensitive lipid metabolites.
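As a reading aid for the validation metrics reported above, the following minimal Python sketch shows how sensitivity, accuracy, and F1 score are derived from a binary confusion matrix (irradiated vs. control). The function name `classification_metrics` and the counts passed to it are hypothetical: the abstract does not give the size or composition of the validation set, and the counts were chosen only because they happen to reproduce the rounded values quoted for the RF model. Other combinations of counts could give the same rounded metrics.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, accuracy, and F1 from binary confusion-matrix counts.

    tp/fn: irradiated samples classified correctly/incorrectly
    tn/fp: control samples classified correctly/incorrectly
    Assumes tp + fn > 0 and tp + fp > 0 (no zero-division handling).
    """
    sensitivity = tp / (tp + fn)                 # recall for the irradiated class
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all samples classified correctly
    precision = tp / (tp + fp)                   # share of "irradiated" calls that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "accuracy": accuracy, "f1": f1}


# Hypothetical counts for illustration only; NOT taken from the study.
metrics = classification_metrics(tp=14, fp=7, tn=4, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts the script prints a sensitivity of 1.00, an accuracy of 0.72, and an F1 of 0.80. A sensitivity of exactly 1.00 means there are no false negatives, so an accuracy below 1.00 can only come from control samples being classified as irradiated.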