Food Science, 2022, Vol. 43, Issue 24: 310-317. doi: 10.7506/spkx1002-6630-20211129-354

• Safety Detection •

基于GC-MS指纹图谱和XGBoost机器学习的泸型基酒贮存时间鉴别

刘青茹,孟连君,张晓娟,翟伟绩,柴丽娟,陆震鸣,许泓瑜,王松涛,张宿义,沈才洪,史劲松,许正宏   

  (1. 江南大学生物工程学院,江苏 无锡 214122;2. 江南大学 粮食发酵与食品生物制造国家工程研究中心,江苏 无锡 214122;3. 江南大学生命科学与健康工程学院,江苏 无锡 214122;4. 国家固态酿造工程技术研究中心,四川 泸州 646000)
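  • Published: 2022-12-28
  • Funding:
    National Key Research and Development Program of China, 13th Five-Year Plan Key Special Project (2018YFC1604104); Sichuan Solid-State Brewing Technology Innovation Center Construction Project (2021ZYD0102)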

Identification of the Age of Luzhou-Flavor Base Baijiu by Gas Chromatography-Mass Spectrometry Fingerprinting and eXtreme Gradient Boosting Machine Learning

LIU Qingru, MENG Lianjun, ZHANG Xiaojuan, ZHAI Weiji, CHAI Lijuan, LU Zhenming, XU Hongyu, WANG Songtao, ZHANG Suyi, SHEN Caihong, SHI Jingsong, XU Zhenghong   

  (1. School of Biotechnology, Jiangnan University, Wuxi 214122, China; 2. National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, Wuxi 214122, China; 3. School of Life Science and Health Engineering, Jiangnan University, Wuxi 214122, China; 4. National Engineering Research Center of Solid-State Brewing, Luzhou 646000, China)
  • Published: 2022-12-28

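Abstract: Volatile-compound fingerprints were acquired by headspace solid-phase microextraction combined with gas chromatography-mass spectrometry, and a regression model was built with the extreme gradient boosting algorithm. Effective modeling variables were determined using variable importance evaluation from extremely randomized trees, together with the univariate linear regression test (F_regression) and the mutual information for a continuous target variable (mutual_info_regression) from the sklearn feature selection module, in order to identify the storage time of baijiu. The model achieved an R² of 0.987, indicating good predictive reliability, and offers a new approach to judging baijiu age.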

Keywords: baijiu vintage; volatile compounds; feature screening; machine learning; identification

Abstract: To identify the age of Luzhou-flavor base baijiu, headspace solid-phase microextraction-gas chromatography-mass spectrometry (HS-SPME-GC-MS) was used to obtain a fingerprint of its volatile composition, and the eXtreme Gradient Boosting (XGBoost) algorithm was used to establish a regression model. Features were selected by combining variable importance evaluation from extremely randomized trees with F_regression and mutual_info_regression from the sklearn feature selection module. The coefficient of determination (R²) of the proposed regression model was 0.987, demonstrating good predictive reliability. This study offers a new approach to identifying baijiu age.
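
As a rough illustration of two quantities named in the abstract, the sketch below computes a univariate linear-regression F score per feature and the coefficient of determination R² using only the Python standard library. It is not the authors' pipeline: the data are synthetic, the names `compound_A`, `compound_B`, `compound_C`, `f_score` and `r_squared` are hypothetical, and the extremely randomized trees, mutual information and XGBoost steps are not reproduced. The F score uses the form r² / (1 − r²) · (n − 2), where r is the Pearson correlation between one feature and the target; this is assumed to correspond to what sklearn's F_regression reports for a centered, single-feature test.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def f_score(x, y):
    """Univariate linear-regression F statistic: r^2 / (1 - r^2) * (n - 2)."""
    r2 = pearson_r(x, y) ** 2
    return r2 / (1.0 - r2) * (len(x) - 2)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (assumption: NOT measurements from the paper).
random.seed(0)
ages = [random.uniform(0, 10) for _ in range(60)]  # storage time, arbitrary units
peaks = {
    "compound_A": [2.0 * a + random.gauss(0, 1.0) for a in ages],   # strong trend
    "compound_B": [-0.5 * a + random.gauss(0, 1.0) for a in ages],  # weaker trend
    "compound_C": [random.gauss(0, 1.0) for _ in ages],             # pure noise
}

# Score every feature against the target and rank from most to least informative.
scores = {name: f_score(values, ages) for name, values in peaks.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A score like this only captures linear dependence on storage time, which may be one reason the abstract pairs it with mutual information and a tree-based importance measure.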

Key words: baijiu age; volatile compounds; feature selection; machine learning; discrimination

Chinese Library Classification (CLC) number: