
LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray Image





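A novel multi-scale fusion module named RMF. It combines local and global cues of X-ray images by using parameter-based and parameter-free methods simultaneously.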


X-ray images play an important role in quality assurance in the manufacturing industry, because they can reflect the internal condition of the weld region. However, the shape and scale of different defect types vary greatly, which makes it challenging for a model to detect weld defects.


A reinforced multiscale feature (RMF) module is designed to implement both parameter-based and parameter-free multi-scale information extraction. RMF enables the extracted feature map to represent richer information, which is achieved by a superior hierarchical fusion structure.


To improve the performance of the detection network, we propose an efficient feature extraction (EFE) module. To further prove the ability of our method, we test it on the public dataset MS COCO, and the results show that our LF-YOLO has outstanding general-purpose detection performance.



However, both manual and robotic welding inevitably produce weld defects, which are a potential hazard for daily production. People use X-ray technology to render internal weld defects as images, as shown in Fig. 1, and detect them by expert inspection or with a computer vision model.


The context of a weld image is complicated, with blurred boundaries and similar texture between defect and background. In addition, the scales and shapes of defects vary greatly among different classes, as can be seen in Fig. 2.


All of these factors pose great challenges for the detection model [3], which must capture abundant contextual information.


Local features help represent the boundary, shape, and geometric texture of a defect, while global features are vital for classification and for distinguishing foreground from background.


In this paper, we propose a reinforced multiscale feature (RMF) module, which combines both parameter-based and parameter-free operations.


The RMF module first contains a basic parameter-free hierarchical structure, which generates multiple feature maps through maxpool operations of different sizes.


Furthermore, within each branch of the basic hierarchy, new features are produced by implicitly learning latent information, and this process is parameter-based.


Finally, the outputs of each hierarchy are fused for finer estimation. Besides the use of multi-scale features, the original feature extraction also determines the performance of the network.


To effectively extract features of weld defects, we carefully design an efficient feature extraction (EFE) module and build a superior backbone by stacking EFE modules repeatedly.


In summary, this work makes the following contributions.


A novel multi-scale fusion module named RMF is proposed. It can combine local and global cues of an X-ray image by using parameter-based and parameter-free methods simultaneously.


To learn representations efficiently, we design a novel EFE module as the unit of the backbone; it can extract meaningful features with few parameters and low computation.


... deal with multiple defect classes, and the proposed network is memory- and computation-friendly.



Efficient feature extraction (EFE) module and reinforced multiscale feature (RMF) module


A. EFE module

The feature extraction module is the basic block of a deep learning network.


... to better accomplish the corresponding tasks. In addition, feature extraction is the main source of parameters and computation. Therefore, the weight of the feature extraction module determines the weight of the whole network.


Inspired by the inverted residual block in MobileNetV2 [22], the EFE module maps the input data into a higher-dimensional space in the middle stage, because expanding the feature space helps obtain a more meaningful representation.


MobileNetV2 [21] solves this problem by using depthwise separable convolutions. In this paper, we employ a wiser strategy.


Following the idea of [34], we design the middle expansion structure based on the “split-transform-merge” theory. After the first 1×1 Conv, the feature maps are split into two branches, and the split ratio ra is set to 0.25 in this paper.
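The channel arithmetic behind the split can be sketched as follows. This is only an illustration: the helper name `split_channels` is hypothetical, and the assumption that the ratio applies to the first returned group is ours, since the excerpt does not say which branch (identity or dense block) receives the 0.25 share.

```python
from typing import Tuple


def split_channels(num_channels: int, ratio: float = 0.25) -> Tuple[int, int]:
    """Split a channel count into two groups according to `ratio`.

    Assumption: `ratio` is the share given to the first group; the source
    does not state which branch receives it.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be strictly between 0 and 1")
    first = int(num_channels * ratio)
    second = num_channels - first  # the remainder goes to the other branch
    return first, second


print(split_channels(64))   # (16, 48)
print(split_channels(128))  # (32, 96)
```

With a ratio of 0.25, a 64-channel feature map would be divided into groups of 16 and 48 channels under this reading.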


One of them is an identity branch, which applies no operation to the data. The other branch is a dense block from [35], which is used to further extract features.


To reduce complexity, the EFE module introduces Ghost Conv [24].
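To see why Ghost Conv lowers complexity, a rough weight count may help. The sketch below follows the general Ghost convolution idea (a primary convolution produces a fraction of the output channels, and cheap depthwise operations generate the rest). The ratio `s = 2`, the cheap kernel size `d = 3`, and the example channel sizes are assumptions for illustration, not values taken from these notes; biases are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Weights of a Ghost convolution (bias ignored).

    primary: ordinary k x k conv producing c_out // s intrinsic channels
    cheap:   d x d depthwise conv producing (s - 1) ghost channels
             per intrinsic channel
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s in this sketch")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (s - 1) * d * d
    return primary + cheap


standard = conv_params(128, 256, 3)
ghost = ghost_conv_params(128, 256, 3)
print(standard)  # 294912
print(ghost)     # 148608
```

Under these assumed settings the Ghost variant needs roughly half the weights of the standard convolution, because the depthwise part adds only a small number of parameters.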


At the tail of the EFE module, the second 1×1 Conv is used to compress the number of channels back to 2c/c. Finally, the input of the expansion operation and the output of the second 1×1 Conv are added element-wise through a residual branch.


Compared with the conventional residual block, our EFE module greatly reduces the cost of feature extraction.


B. RMF module

Scale is a classical research problem for CNNs, because they are not robust enough to the sizes of objects. Especially when object sizes vary greatly, a plain-topology model performs poorly.


... through a multi-scale strategy, we design an RMF module that combines parameter-based and parameter-free methods.


The RMF module is a hierarchical structure for obtaining multi-scale contextual information.


... which applies multiple maxpool operations with different sizes to the input feature map. No parameters are introduced in this stage, hence we regard it as parameter-free.
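A minimal pure-Python sketch of such a parameter-free stage is given below. The window sizes `(3, 5)`, the stride of 1, the shape-preserving border handling, and keeping the input itself as one level are all assumptions made for illustration; the excerpt does not state the actual pooling configuration. The function names are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def maxpool_same(fmap: Grid, size: int) -> Grid:
    """Stride-1 max pooling with an odd window; output keeps the input shape."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    h, w, r = len(fmap), len(fmap[0]), size // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Clip the window at the borders instead of padding.
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def parameter_free_pyramid(fmap: Grid, sizes: Sequence[int] = (3, 5)) -> List[Grid]:
    """Return the input plus one pooled copy per window size (no weights)."""
    return [fmap] + [maxpool_same(fmap, s) for s in sizes]


fmap = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 5, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 9],
]
levels = parameter_free_pyramid(fmap)
print(len(levels))   # 3
print(levels[1][0])  # [1, 1, 0, 0, 0]
print(levels[2][2])  # [5, 5, 9, 9, 9]
```

Larger windows spread strong responses over a wider neighborhood, so each level summarizes context at a different scale without adding any learnable weights.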


The parameter-free method makes the most of existing data but, in a sense, does not generate new information.


Dilated convolution can enhance the ability to extract underlying information by changing the receptive field [5].


If we used dilated convolution directly at the tail of the backbone, it would be expensive in storage and computation.


To address this problem, GDConv implements dilation in a lighter form. Specifically, we retain the structure of the original Ghost Conv but replace its depthwise Conv with a dilated version; the inner details are shown in Fig. 5.


GDConv is the core ingredient that lets the RMF module learn implicit information through the parameters of convolution kernels. Three GDConvs form the elements of a hierarchy group, and their dilation rates are set to 1, 5, and 9, respectively.


Note that when the dilation rate is 1, GDConv is equivalent to a normal Ghost Conv. The new features from the different dilation branches are then concatenated.
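The effect of the three dilation rates on the receptive field can be checked with the standard formula for a dilated kernel, k_eff = d × (k − 1) + 1. The 3×3 kernel size used below is an assumption for illustration; the excerpt does not state the kernel size of the depthwise Conv inside GDConv.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def same_padding(kernel_size: int, dilation: int) -> int:
    """Padding per side that keeps the spatial size at stride 1 (odd kernels)."""
    return dilation * (kernel_size - 1) // 2


for rate in (1, 5, 9):
    print(rate, effective_kernel_size(3, rate), same_padding(3, rate))
# 1 3 1
# 5 11 5
# 9 19 9
```

With a rate of 1 the extent equals the plain kernel size, which matches the remark that this branch behaves like a normal Ghost Conv, while the other two branches cover progressively wider context with the same number of weights.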


The parameter-free method provides a multi-scale base by optimizing existing feature maps, and the parameter-based method derives new multi-scale data from that base. Hence, the base and the expansion pyramid of the hierarchy have a superposition effect and enhance the ability to develop effective representations.


C. The architecture of LF-YOLO


In this paper, we propose a highly effective EFE module as the basic feature extraction block; it can encode sufficient information from X-ray weld images at low cost.


The parameter-free stage provides a basis containing existing multi-scale information, and the parameter-based stage further learns implicit features across different receptive fields.


