Advanced Engineering Sciences, 2022, 54(6): 59-66
Incremental Deep Learning Method for Object Detection Model Based on Knowledge Distillation
(School of Computer and Info. Technol., Beijing Jiaotong Univ., Beijing 100044, China)
Received: 2021-09-14    Revised: 2022-11-02
Abstract: With the advent of the Internet of Everything era, the number of IoT devices with object detection capability has grown explosively, generating massive amounts of real-time data at the network edge. Edge computing, characterized by low latency, low bandwidth cost, and high security, has consequently emerged as a new computing paradigm. Traditional deep learning methods usually assume that all data are fully available before model training, yet in real edge computing scenarios large numbers of new data samples and new categories are often generated and obtained gradually over time. To perform object detection efficiently on resource-constrained edge devices while training data are accumulated and updated in batches, this paper proposes an incremental learning method based on knowledge distillation of multiple intermediate layers (ILMIL). First, to properly retain the knowledge in existing data, a distillation metric covering knowledge from multiple intermediate network layers, called MFRRK (multi-layer feature map, RPN and RCN knowledge), is proposed. ILMIL incorporates the discrepancy between the intermediate-layer features of the teacher model and the student model into model training; compared with existing knowledge-distillation-based incremental learning methods, a student model trained with ILMIL can learn more old-class information from the teacher model's intermediate layers, thereby alleviating forgetting. Second, ILMIL uses the knowledge distilled via MFRRK to incrementally train the existing model, avoiding the resource overhead of training multiple independent models; to further reduce model complexity for efficient deployment and inference on edge devices, the existing model can be pruned before knowledge distillation. Experimental comparisons under different scenarios and conditions demonstrate that the proposed method can alleviate catastrophic forgetting of existing knowledge and maintain acceptable inference accuracy while effectively reducing model computation and storage costs.
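The multi-intermediate-layer distillation idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's published formulation: the layer choices, the mean-squared discrepancy term, the weighting coefficient `lam`, and the function name `mfrrk_distill_loss` are all assumptions made for the example.

```python
import numpy as np

def mfrrk_distill_loss(teacher_feats, student_feats, task_loss, lam=1.0):
    """Hypothetical sketch of a multi-intermediate-layer distillation loss.

    teacher_feats / student_feats: lists of same-shaped feature maps taken
    from corresponding intermediate layers of the (frozen) teacher and the
    student. The distillation term penalizes the feature discrepancy at
    each layer, encouraging the student to retain old-class knowledge.
    """
    distill = 0.0
    for ft, fs in zip(teacher_feats, student_feats):
        # Mean-squared discrepancy between corresponding feature maps
        distill += np.mean((ft - fs) ** 2)
    return task_loss + lam * distill

# Toy example: two "intermediate layers" with small random feature maps
rng = np.random.default_rng(0)
teacher = [rng.standard_normal((4, 8, 8)) for _ in range(2)]
student_same = [f.copy() for f in teacher]   # identical features: no penalty
student_diff = [f + 0.1 for f in teacher]    # drifted features: penalized

print(mfrrk_distill_loss(teacher, student_same, task_loss=0.5))  # 0.5
print(mfrrk_distill_loss(teacher, student_diff, task_loss=0.5))  # > 0.5
```

In a real detector the feature maps would be captured from the teacher's and student's backbone, RPN, and RCN stages during the forward pass, and the task loss would be the detector's usual classification and regression loss on the new-class data.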
Article No.: 202100925     CLC No.: TP391.41    Document code:
Funding: National Natural Science Foundation of China (62172031); Beijing Natural Science Foundation–Fengtai Rail Transit Frontier Research Joint Fund (L191019)
About the authors: First author: FANG Weiwei (1981-), male, associate professor, Ph.D. Research interests: Internet of Things; edge computing and edge intelligence. E-mail: fangww@bjtu.edu.cn. Corresponding author: CHEN Aifang, E-mail: 19125157@bjtu.edu.cn
Citation:
FANG Weiwei, CHEN Aifang, MENG Na, CHENG Huwei, WANG Qingli. Incremental Deep Learning Method for Object Detection Model Based on Knowledge Distillation[J]. Advanced Engineering Sciences, 2022, 54(6): 59-66.