Food Science ›› 2021, Vol. 42 ›› Issue (8): 248-256. doi: 10.7506/spkx1002-6630-20191016-151

• Safety Detection •

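  • Publication date: 2021-04-25  Release date: 2021-05-14
  • Funding:
    National Natural Science Foundation of China, Regional Science Fund Project (31660591); Yunnan Agricultural Basic Research Joint Special Project, General Program (2018FG001-033)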

Species Identification of Common Wild Edible Bolete in Yunnan by Fourier Transform Mid-infrared Spectroscopy Coupled with Support Vector Machine

HU Yiran, LI Jieqing, LIU Honggao, FAN Maopan, WANG Yuanzhong   

  (1. College of Resources and Environment, Yunnan Agricultural University, Kunming 650201, China; 2. College of Agronomy and Biotechnology, Yunnan Agricultural University, Kunming 650201, China; 3. Institute of Medicinal Plants, Yunnan Academy of Agricultural Sciences, Kunming 650200, China)
  • Online: 2021-04-25  Published: 2021-05-14


Abstract: To provide a reference for the identification and quality control of edible fungi in Yunnan, Fourier transform mid-infrared (FT-MIR) spectroscopy combined with support vector machine (SVM) was used to identify wild bolete species, and the influence of different data mining methods on model classification performance was determined. Mid-infrared spectra of 827 samples from eight common wild bolete species in Yunnan were acquired and their spectral characteristics analyzed, and discriminant models were established using SVM. Spectral information was mined by preprocessing, feature variable extraction, or a combination of the two, and the classification performance of the resulting models was compared to find the optimal method for species identification of wild boletes. The results showed that the original data contained a large amount of noise and interfering information, which reduced model classification performance. All tested data mining methods removed non-effective information to different extents and improved classification performance. Preprocessing combined with feature variable extraction had the strongest ability to mine spectral information and gave the largest improvement in classification performance. The SVM model built on this combination fitted well, with high classification accuracy and wide applicability, allowing accurate and rapid identification of the eight wild bolete species.

Key words: wild bolete; species identification; data mining; Fourier transform mid-infrared spectroscopy; support vector machine
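
The workflow summarized in the abstract (preprocess the spectra, extract feature variables, then classify with SVM) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the preprocessing or feature extraction algorithms, so standard normal variate (SNV), a Savitzky-Golay first derivative, and PCA are assumed here as common choices for infrared spectra. The data are synthetic stand-ins generated by the hypothetical helper `make_synthetic_spectra`, and all parameter values are arbitrary.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC


def snv(X):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)


def sg_first_derivative(X):
    """Savitzky-Golay smoothed first derivative along the spectral axis."""
    return savgol_filter(X, window_length=11, polyorder=2, deriv=1, axis=1)


def make_synthetic_spectra(n_classes=8, n_per_class=40, n_points=300, seed=0):
    """Hypothetical stand-in data: one Gaussian band per class, plus a random
    baseline offset and noise. Not real FT-MIR measurements."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)
    X, y = [], []
    for c in range(n_classes):
        centre = 0.15 + 0.1 * c  # class-specific band position (illustrative)
        band = np.exp(-((axis - centre) ** 2) / (2 * 0.03 ** 2))
        for _ in range(n_per_class):
            offset = rng.normal(0.0, 0.5)            # random baseline shift
            noise = rng.normal(0.0, 0.05, n_points)  # measurement noise
            X.append(band + offset + noise)
            y.append(c)
    return np.array(X), np.array(y)


def build_model():
    """Preprocessing -> feature extraction -> SVM, chained in one pipeline."""
    return make_pipeline(
        FunctionTransformer(snv),                  # assumed preprocessing step 1
        FunctionTransformer(sg_first_derivative),  # assumed preprocessing step 2
        PCA(n_components=10),                      # assumed feature extraction
        SVC(kernel="rbf", C=10.0, gamma="scale"),  # SVM classifier
    )


X, y = make_synthetic_spectra()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model = build_model()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy on synthetic spectra: {accuracy:.3f}")
```

Removing individual steps from `build_model` (for example, fitting the SVM on raw spectra, or on PCA scores without preprocessing) gives a simple way to compare how each data mining strategy affects classification accuracy, in the spirit of the comparison the abstract describes.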

CLC Number: