Food Science, 2025, Vol. 46, Issue (22): 13-22. doi: 10.7506/spkx1002-6630-20250408-059

• Special Column: Food Detection Technologies Based on Computer Vision and Deep Learning •

An Intelligent Question Answering System for Food Safety Regulation Based on Retrieval-Augmented Generation Framework

MAO Dianhui, WANG Kehao, CHEN Junhua, XU Jingting

  1. School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China; 2. China National Institute of Standardization, Beijing 100191, China; 3. Center of Tsinghua Think Tanks, Tsinghua University, Beijing 100084, China
  • Published: 2025-11-21
  • Funding:
    Beijing Natural Science Foundation Project (9232005)

Abstract: Question answering (QA) for food safety regulation places high demands on model accuracy, compliance, and interpretability. Existing large language models (LLMs) fall short in this domain: knowledge recall is imprecise, regulatory interpretation is weak, and computational costs are high. To address these issues, we propose an intelligent QA system based on the retrieval-augmented generation (RAG) framework, whose core is the food safety regulation large language model (FSR-LLM). The system improves the quality and efficiency of food safety regulation QA by optimizing the database storage structure, the retrieval strategy, and the generator. First, we construct a food safety knowledge graph (KG) database that stores regulatory provisions, food safety standards, and related data in structured form, improving the model's ability to organize and use domain knowledge. Second, in the retrieval stage, we design an LLM-guided retrieval strategy in which an LLM parses the query and extracts highly relevant information from the food safety regulation KG, reducing the recall of irrelevant or misleading content. Third, for the generator module, we fine-tune Qwen-7B-Chat with low-rank adaptation (LoRA), which aligns the model more closely with food safety regulation QA while significantly reducing computational cost, so that training can be completed on a single RTX 4090 GPU. Experiments on the proposed food safety QA dataset show that FSR-LLM outperforms the baseline models on BLEU-4, ROUGE-L, and accuracy, with higher precision and semantic coherence. The work offers a low-cost, high-performance, and scalable solution for intelligent food safety regulation.
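
The retrieval flow described in the abstract (parse the query, look up matching facts in the KG, pass them to the generator) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the triples are placeholder data rather than real regulatory content, `parse_query` is a keyword-matching stand-in for the LLM parsing step, and all function and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical placeholder triples (subject, relation, object); NOT real regulatory data.
TRIPLES: List[Triple] = [
    ("Additive-A", "regulated_by", "Standard-X"),
    ("Additive-A", "permitted_in", "Food-Category-1"),
    ("Additive-B", "regulated_by", "Standard-Y"),
    ("Standard-X", "issued_by", "Agency-Z"),
]


@dataclass(frozen=True)
class ParsedQuery:
    entity: str               # subject to look up; "" if none was recognized
    relation: Optional[str]   # None means "any relation"


def parse_query(question: str) -> ParsedQuery:
    """Stand-in for the LLM parsing step (assumption: simple keyword matching)."""
    q = question.lower()
    subjects = sorted({s for s, _, _ in TRIPLES})
    entity = next((s for s in subjects if s.lower() in q), "")
    relation = None
    if "standard" in q or "regulat" in q:
        relation = "regulated_by"
    elif "permitted" in q or "allowed" in q:
        relation = "permitted_in"
    return ParsedQuery(entity, relation)


def retrieve(parsed: ParsedQuery) -> List[Triple]:
    """Return only triples that match the parsed entity and, if given, the relation."""
    return [
        t for t in TRIPLES
        if t[0] == parsed.entity
        and (parsed.relation is None or t[1] == parsed.relation)
    ]


def build_prompt(question: str, facts: List[Triple]) -> str:
    """Assemble the generator prompt from the retrieved facts."""
    lines = [f"- {s} {r} {o}" for s, r, o in facts]
    context = "\n".join(lines) if lines else "- (no matching facts)"
    return (
        f"Known facts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the facts above."
    )


if __name__ == "__main__":
    question = "Which standard regulates Additive-A?"
    print(build_prompt(question, retrieve(parse_query(question))))
```

Filtering on both entity and relation before building the prompt is what keeps unrelated triples out of the generator's context; in a real system the parsing step would be an LLM call and the lookup a graph database query.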

Key words: food safety regulation; retrieval-augmented generation; knowledge graph; low-rank adaptation; fine-tuning
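
To give a sense of why LoRA fine-tuning lowers training cost, the sketch below counts trainable parameters for one weight matrix. LoRA freezes a weight matrix of shape d_out × d_in and trains a low-rank update BA, where B has shape d_out × r and A has shape r × d_in, so the trainable count drops from d_out · d_in to r · (d_out + d_in). The 4096 × 4096 shape and rank 8 used here are illustrative assumptions, not values reported in the paper.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: parameters updated when the whole matrix is fine-tuned.
    lora: parameters in the low-rank factors B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


if __name__ == "__main__":
    # Illustrative shape and rank (assumptions, not taken from the paper).
    full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
    print(f"full fine-tuning: {full} parameters")
    print(f"LoRA (r=8):       {lora} parameters")
    print(f"ratio:            {100 * lora / full:.2f}%")
```

Under these assumed values the low-rank factors hold well under 1% of the matrix's parameters, which is the kind of reduction that makes single-GPU training plausible.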

CLC number: