FOOD SCIENCE ›› 2025, Vol. 46 ›› Issue (20): 318-326. doi: 10.7506/spkx1002-6630-20250430-261

• Safety Detection •

Rapid and Non-destructive Identification of Wuchang Daohuaxiang Rice Using Near-Infrared Spectroscopy and t-Distributed Stochastic Neighbor Embedding

SUN Xinyue, LI Yanlong, CHEN Mingming, SONG Yan, QIAN Lili, ZUO Feng, GUAN Hai’ou, ZHANG Tao, LIU Xingquan, ZHOU Guoxin   

  1. College of Food Science, Heilongjiang Bayi Agricultural University, Daqing 163319, China; 2. College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China; 3. National Food and Strategic Reserves Administration, Beijing 100834, China; 4. College of Food and Health, Zhejiang Agriculture and Forestry University, Hangzhou 311300, China
  • Online: 2025-10-25 Published: 2025-09-17

Abstract: This study proposes a rapid, non-destructive method for identifying Wuchang Daohuaxiang rice using near-infrared (NIR) spectroscopy combined with machine learning algorithms. NIR spectra were collected from different rice varieties, and the first-order derivative was identified as the best spectral preprocessing method using partial least squares regression (PLSR). Two dimensionality reduction methods, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), were compared, and five machine learning models, namely artificial neural network (ANN), K-nearest neighbors (KNN), random forest (RF), decision tree (DT), and naive Bayes (NB), were built and compared for variety classification. The results showed that t-SNE improved the Calinski-Harabasz index by 1078.0051, indicating better clustering performance. After t-SNE dimensionality reduction, all five models performed better than they did without dimensionality reduction. The average classification accuracy was 95.78%. The NB model showed the largest accuracy improvement (18.89%). The RF model gave the best classification performance, with prediction accuracy and precision of 98.89% and 98.96%, respectively. The method offers a rapid, non-destructive way to identify Wuchang Daohuaxiang rice, which can support brand protection and safeguard consumer rights, and it suggests a new approach for identifying other geographical indication agricultural products.
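The workflow summarized in the abstract (first-derivative preprocessing, PCA versus t-SNE compared by the Calinski-Harabasz index, then random forest classification) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The synthetic spectra, the Savitzky-Golay derivative, every parameter value, and the function names `make_synthetic_spectra`, `first_derivative`, and `run_pipeline` are assumptions made for illustration. Because scikit-learn's `TSNE` cannot project unseen samples, the sketch embeds all samples before the train/test split, which may differ from the procedure used in the paper.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.metrics import accuracy_score, calinski_harabasz_score
from sklearn.model_selection import train_test_split


def make_synthetic_spectra(n_per_class=40, n_classes=3, n_wavelengths=200, seed=0):
    """Generate toy 'spectra' (hypothetical data, not real NIR measurements)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_wavelengths)
    spectra, labels = [], []
    for c in range(n_classes):
        # Each class has one absorption-like peak at a different position.
        peak = np.exp(-((x - (0.3 + 0.2 * c)) ** 2) / 0.01)
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_wavelengths))
        # A random per-sample baseline offset mimics baseline drift.
        offset = rng.normal(0.0, 0.5, size=(n_per_class, 1))
        spectra.append(peak + noise + offset)
        labels.append(np.full(n_per_class, c))
    return np.vstack(spectra), np.concatenate(labels)


def first_derivative(spectra, window=11, polyorder=2):
    """First-order derivative along the wavelength axis (Savitzky-Golay, an assumption)."""
    return savgol_filter(spectra, window_length=window, polyorder=polyorder,
                         deriv=1, axis=1)


def run_pipeline(seed=0):
    X, y = make_synthetic_spectra(seed=seed)
    X_d1 = first_derivative(X)

    # Reduce to two dimensions with each method.
    X_pca = PCA(n_components=2, random_state=seed).fit_transform(X_d1)
    X_tsne = TSNE(n_components=2, perplexity=30, init="pca",
                  random_state=seed).fit_transform(X_d1)

    # Higher Calinski-Harabasz values indicate tighter, better-separated classes.
    ch_pca = calinski_harabasz_score(X_pca, y)
    ch_tsne = calinski_harabasz_score(X_tsne, y)

    # Classify varieties from the t-SNE embedding with a random forest.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_tsne, y, test_size=0.3, stratify=y, random_state=seed)
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, rf.predict(X_te))

    return {"ch_pca": ch_pca, "ch_tsne": ch_tsne, "rf_accuracy": acc}


if __name__ == "__main__":
    for name, value in run_pipeline().items():
        print(f"{name}: {value:.4f}")
```

The numbers this sketch prints come from synthetic data and are not comparable to the values reported in the abstract. Swapping `RandomForestClassifier` for another scikit-learn classifier would allow the same kind of model comparison on the reduced features.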

Key words: Wuchang Daohuaxiang rice; near-infrared spectroscopy; random forest; discrimination of similar rice varieties; t-distributed stochastic neighbor embedding

CLC Number: