Last edited by matlab的旋律 on 2020-9-3 09:15
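1. Preface
A lightweight network is, intuitively, a deep learning network that has fewer parameters, less computation, and a shorter inference time than a heavyweight network (computation and inference time cannot be converted into each other directly). The split into lightweight and heavyweight networks is only a rough classification of some classic networks, with no sharp boundary.
Common ways to obtain a lightweight network include:
a) Pruning a classic heavyweight network, that is, removing redundant parts of the pre-trained network to gain speed.
b) Designing the network by hand based on experience. Typical techniques: reducing the number of output channels in each layer; replacing a single large convolution kernel with several small ones; replacing ordinary convolutions with grouped or depthwise separable convolutions. MobileNet V1/V2 belong to this category.
c) Searching for the network with NAS. MobileNet V3 belongs to this category. NAS (neural architecture search, here driven by reinforcement learning) has surpassed earlier hand-designed networks on image classification and language modeling tasks.

2. Network structure
The microstructure and main features of MobileNet V1/V2/V3 are as follows.
MobileNet V1: microstructure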
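1) Uses an old-fashioned straight-through structure: layers are simply stacked in sequence.
2) Uses depthwise separable convolution, which factorizes a standard convolution into a depthwise convolution (DW) and a pointwise convolution (PW). The benefit is a large reduction in both parameter count and computation.
3) Uses the ReLU6 activation. Compared with ReLU, it is capped at 6, which makes the model more robust under low-precision computation.
MobileNet V2: microstructure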
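Compared with MobileNet V1, MobileNet V2:
1) Adds a 1×1 "expansion" layer (PW) before the DW. A DW convolution cannot change the number of channels by itself: it outputs as many channels as it receives. If the previous layer provides only a few channels, the DW can only extract features in a low-dimensional space, and the result is not good enough. MobileNet V2 therefore puts a PW in front of every DW to raise the dimension, so that features are extracted in a higher-dimensional space.
2) Replaces the ReLU6 after the second PW with a linear activation. Non-linearity helps in high-dimensional space but destroys features in low-dimensional space, where a linear mapping works better. Since the job of the second PW is to reduce the dimension, ReLU6 should not follow it.
MobileNet V3: microstructure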
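Compared with MobileNet V1 and V2, MobileNet V3:
1) Introduces a lightweight attention module based on the Squeeze-and-Excitation structure (the SE module), placed after the DW. In effect it computes a weight for every feature map. The steps are: a global average pooling first produces a one-dimensional vector with as many elements as there are feature maps; this vector then passes through a fully connected layer with ReLU and a fully connected layer with h-sigmoid, and the result is used to scale the feature maps.
2) Introduces a new activation function, h-swish(x) = x · ReLU6(x + 3) / 6. First, optimized implementations of ReLU6 are available on almost all software and hardware frameworks, so ReLU6 is used to approximate swish, making swish "hard". Second, in quantized mode it removes the potential loss of numerical precision caused by differing implementations of the approximate sigmoid.
3) Modifies the last stage of the MobileNet V2 network. The authors found experimentally that the final feature-extracting 1×1 convolution gives the same accuracy whether it is applied to a 7×7 or a 1×1 feature map. The last bottleneck, whose purpose was to reduce computation, therefore becomes redundant, and those two layers are removed, which lowers both the total computation and the latency considerably.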
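4) Changes the number of channels in the head convolution. Because replacing ReLU6 with h-swish improves accuracy, the first layer can use fewer filters: 16 filters of size 3×3 instead of 32. This saves 3 ms while preserving accuracy.
5) Combines two techniques in the architecture search: resource-constrained NAS (platform-aware NAS) and NetAdapt. (Only people with very deep pockets get to explore this technique in detail, so it is not covered here.)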
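The building blocks described in this section can be checked numerically. The sketch below uses base MATLAB only. It defines ReLU6, h-sigmoid and h-swish as anonymous functions, measures how far h-swish departs from swish, and compares the cost of a standard 3×3 convolution with its depthwise separable counterpart. The layer sizes (a 112×112 feature map, 32 input and 64 output channels) are assumptions chosen for illustration and are not taken from any particular MobileNet layer.

```matlab
% Activations from the text, written as anonymous functions.
relu6    = @(x) min(max(x, 0), 6);      % ReLU capped at 6
hsigmoid = @(x) relu6(x + 3) / 6;       % piecewise-linear stand-in for sigmoid
hswish   = @(x) x .* hsigmoid(x);       % h-swish(x) = x * ReLU6(x + 3) / 6
swish    = @(x) x ./ (1 + exp(-x));     % reference: x * sigmoid(x)

x = linspace(-6, 6, 241);
maxGap = max(abs(hswish(x) - swish(x)));
fprintf('max |h-swish - swish| on [-6, 6]: %.4f\n', maxGap);

% Standard 3x3 convolution vs. depthwise separable convolution.
% Illustrative sizes (assumed): HxW feature map, M input, N output channels.
k = 3; M = 32; N = 64; H = 112; W = 112;
paramsStd = k*k*M*N;          % standard convolution weights (bias ignored)
paramsSep = k*k*M + M*N;      % depthwise weights + pointwise weights
maddsStd  = paramsStd * H * W;
maddsSep  = paramsSep * H * W;
ratio     = paramsSep / paramsStd;   % equals 1/N + 1/k^2

fprintf('params: standard %d, separable %d\n', paramsStd, paramsSep);
fprintf('MAdds:  standard %d, separable %d\n', maddsStd, maddsSep);
fprintf('separable / standard = %.4f\n', ratio);
```

The ratio depends only on the kernel size and the number of output channels, which is why the saving approaches a factor of k² for wide layers.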
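3. Experimental results
The Google team's comparison of MobileNet V1/V2/V3 on the ImageNet classification task is shown below. It covers Top-1 accuracy (the fraction of images whose top-ranked predicted class matches the true class), multiply-accumulate operations (MAdds), and the number of network parameters (Params):
MobileNet V2/V3 use a width multiplier alpha to adjust the width of every layer. It affects both the number of input channels M and the number of output channels N: when alpha is applied, the values actually used in the computation are alpha × M and alpha × N.

4. MATLAB implementation
MATLAB's deep learning library does not yet include a built-in MobileNet V3, so the MATLAB code for MobileNet V2 is used as the example here. The dataset consists of images of two classes, coal and gangue: 408 coal images and 396 gangue images, 804 samples in total. The samples are shuffled and split 8:2 for training and for prediction. With the Adam optimizer chosen here, a separate validation set is not set up.
numClasses = 2;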
lgraph = layerGraph();
% Stem, first depthwise separable block, and block_1 (stride 2).
% XTrain is the H-by-W-by-C-by-N array of training images.
tempLayers = [
    imageInputLayer([size(XTrain,1) size(XTrain,2) size(XTrain,3)],"Name","input_1","Normalization","zscore")
    convolution2dLayer([3 3],32,"Name","Conv1","Padding","same","Stride",[2 2])
    batchNormalizationLayer("Name","bn_Conv1","Epsilon",0.001)
    clippedReluLayer(6,"Name","Conv1_relu")
    groupedConvolution2dLayer([3 3],1,32,"Name","expanded_conv_depthwise","Padding","same")
    batchNormalizationLayer("Name","expanded_conv_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","expanded_conv_depthwise_relu")
    convolution2dLayer([1 1],16,"Name","expanded_conv_project","Padding","same")
    batchNormalizationLayer("Name","expanded_conv_project_BN","Epsilon",0.001)
    convolution2dLayer([1 1],96,"Name","block_1_expand","Padding","same")
    batchNormalizationLayer("Name","block_1_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_1_expand_relu")
    groupedConvolution2dLayer([3 3],1,96,"Name","block_1_depthwise","Padding","same","Stride",[2 2])
    batchNormalizationLayer("Name","block_1_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_1_depthwise_relu")
    convolution2dLayer([1 1],24,"Name","block_1_project","Padding","same")
    batchNormalizationLayer("Name","block_1_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
% block_2: residual branch, joined by block_2_add.
tempLayers = [
    convolution2dLayer([1 1],144,"Name","block_2_expand","Padding","same")
    batchNormalizationLayer("Name","block_2_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_2_expand_relu")
    groupedConvolution2dLayer([3 3],1,144,"Name","block_2_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_2_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_2_depthwise_relu")
    convolution2dLayer([1 1],24,"Name","block_2_project","Padding","same")
    batchNormalizationLayer("Name","block_2_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
% block_2_add followed by block_3 (stride 2).
tempLayers = [
    additionLayer(2,"Name","block_2_add")
    convolution2dLayer([1 1],144,"Name","block_3_expand","Padding","same")
    batchNormalizationLayer("Name","block_3_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_3_expand_relu")
    groupedConvolution2dLayer([3 3],1,144,"Name","block_3_depthwise","Padding","same","Stride",[2 2])
    batchNormalizationLayer("Name","block_3_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_3_depthwise_relu")
    convolution2dLayer([1 1],32,"Name","block_3_project","Padding","same")
    batchNormalizationLayer("Name","block_3_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],192,"Name","block_4_expand","Padding","same")
    batchNormalizationLayer("Name","block_4_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_4_expand_relu")
    groupedConvolution2dLayer([3 3],1,192,"Name","block_4_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_4_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_4_depthwise_relu")
    convolution2dLayer([1 1],32,"Name","block_4_project","Padding","same")
    batchNormalizationLayer("Name","block_4_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = additionLayer(2,"Name","block_4_add");
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],192,"Name","block_5_expand","Padding","same")
    batchNormalizationLayer("Name","block_5_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_5_expand_relu")
    groupedConvolution2dLayer([3 3],1,192,"Name","block_5_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_5_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_5_depthwise_relu")
    convolution2dLayer([1 1],32,"Name","block_5_project","Padding","same")
    batchNormalizationLayer("Name","block_5_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
% block_5_add followed by block_6 (stride 2).
tempLayers = [
    additionLayer(2,"Name","block_5_add")
    convolution2dLayer([1 1],192,"Name","block_6_expand","Padding","same")
    batchNormalizationLayer("Name","block_6_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_6_expand_relu")
    groupedConvolution2dLayer([3 3],1,192,"Name","block_6_depthwise","Padding","same","Stride",[2 2])
    batchNormalizationLayer("Name","block_6_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_6_depthwise_relu")
    convolution2dLayer([1 1],64,"Name","block_6_project","Padding","same")
    batchNormalizationLayer("Name","block_6_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],384,"Name","block_7_expand","Padding","same")
    batchNormalizationLayer("Name","block_7_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_7_expand_relu")
    groupedConvolution2dLayer([3 3],1,384,"Name","block_7_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_7_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_7_depthwise_relu")
    convolution2dLayer([1 1],64,"Name","block_7_project","Padding","same")
    batchNormalizationLayer("Name","block_7_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = additionLayer(2,"Name","block_7_add");
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],384,"Name","block_8_expand","Padding","same")
    batchNormalizationLayer("Name","block_8_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_8_expand_relu")
    groupedConvolution2dLayer([3 3],1,384,"Name","block_8_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_8_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_8_depthwise_relu")
    convolution2dLayer([1 1],64,"Name","block_8_project","Padding","same")
    batchNormalizationLayer("Name","block_8_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = additionLayer(2,"Name","block_8_add");
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],384,"Name","block_9_expand","Padding","same")
    batchNormalizationLayer("Name","block_9_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_9_expand_relu")
    groupedConvolution2dLayer([3 3],1,384,"Name","block_9_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_9_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_9_depthwise_relu")
    convolution2dLayer([1 1],64,"Name","block_9_project","Padding","same")
    batchNormalizationLayer("Name","block_9_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
% block_9_add followed by block_10 (64 -> 96 channels, no shortcut).
tempLayers = [
    additionLayer(2,"Name","block_9_add")
    convolution2dLayer([1 1],384,"Name","block_10_expand","Padding","same")
    batchNormalizationLayer("Name","block_10_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_10_expand_relu")
    groupedConvolution2dLayer([3 3],1,384,"Name","block_10_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_10_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_10_depthwise_relu")
    convolution2dLayer([1 1],96,"Name","block_10_project","Padding","same")
    batchNormalizationLayer("Name","block_10_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],576,"Name","block_11_expand","Padding","same")
    batchNormalizationLayer("Name","block_11_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_11_expand_relu")
    groupedConvolution2dLayer([3 3],1,576,"Name","block_11_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_11_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_11_depthwise_relu")
    convolution2dLayer([1 1],96,"Name","block_11_project","Padding","same")
    batchNormalizationLayer("Name","block_11_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = additionLayer(2,"Name","block_11_add");
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],576,"Name","block_12_expand","Padding","same")
    batchNormalizationLayer("Name","block_12_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_12_expand_relu")
    groupedConvolution2dLayer([3 3],1,576,"Name","block_12_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_12_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_12_depthwise_relu")
    convolution2dLayer([1 1],96,"Name","block_12_project","Padding","same")
    batchNormalizationLayer("Name","block_12_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
% block_12_add followed by block_13 (stride 2).
tempLayers = [
    additionLayer(2,"Name","block_12_add")
    convolution2dLayer([1 1],576,"Name","block_13_expand","Padding","same")
    batchNormalizationLayer("Name","block_13_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_13_expand_relu")
    groupedConvolution2dLayer([3 3],1,576,"Name","block_13_depthwise","Padding","same","Stride",[2 2])
    batchNormalizationLayer("Name","block_13_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_13_depthwise_relu")
    convolution2dLayer([1 1],160,"Name","block_13_project","Padding","same")
    batchNormalizationLayer("Name","block_13_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],960,"Name","block_14_expand","Padding","same")
    batchNormalizationLayer("Name","block_14_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_14_expand_relu")
    groupedConvolution2dLayer([3 3],1,960,"Name","block_14_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_14_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_14_depthwise_relu")
    convolution2dLayer([1 1],160,"Name","block_14_project","Padding","same")
    batchNormalizationLayer("Name","block_14_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
tempLayers = additionLayer(2,"Name","block_14_add");
lgraph = addLayers(lgraph,tempLayers);
tempLayers = [
    convolution2dLayer([1 1],960,"Name","block_15_expand","Padding","same")
    batchNormalizationLayer("Name","block_15_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_15_expand_relu")
    groupedConvolution2dLayer([3 3],1,960,"Name","block_15_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_15_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_15_depthwise_relu")
    convolution2dLayer([1 1],160,"Name","block_15_project","Padding","same")
    batchNormalizationLayer("Name","block_15_project_BN","Epsilon",0.001)];
lgraph = addLayers(lgraph,tempLayers);
% block_15_add, block_16, final 1x1 convolution and classification head.
tempLayers = [
    additionLayer(2,"Name","block_15_add")
    convolution2dLayer([1 1],960,"Name","block_16_expand","Padding","same")
    batchNormalizationLayer("Name","block_16_expand_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_16_expand_relu")
    groupedConvolution2dLayer([3 3],1,960,"Name","block_16_depthwise","Padding","same")
    batchNormalizationLayer("Name","block_16_depthwise_BN","Epsilon",0.001)
    clippedReluLayer(6,"Name","block_16_depthwise_relu")
    convolution2dLayer([1 1],320,"Name","block_16_project","Padding","same")
    batchNormalizationLayer("Name","block_16_project_BN","Epsilon",0.001)
    convolution2dLayer([1 1],1280,"Name","Conv_1")
    batchNormalizationLayer("Name","Conv_1_bn","Epsilon",0.001)
    clippedReluLayer(6,"Name","out_relu")
    globalAveragePooling2dLayer("Name","global_average_pooling2d_1")
    fullyConnectedLayer(numClasses,"Name","Logits")
    softmaxLayer("Name","Logits_softmax")
    classificationLayer("Name","ClassificationLayer_Logits")];
lgraph = addLayers(lgraph,tempLayers);
% Wire the branches: each shortcut feeds in2 of an addition layer,
% and the matching block output feeds in1.
lgraph = connectLayers(lgraph,"block_1_project_BN","block_2_expand");
lgraph = connectLayers(lgraph,"block_1_project_BN","block_2_add/in2");
lgraph = connectLayers(lgraph,"block_2_project_BN","block_2_add/in1");
lgraph = connectLayers(lgraph,"block_3_project_BN","block_4_expand");
lgraph = connectLayers(lgraph,"block_3_project_BN","block_4_add/in2");
lgraph = connectLayers(lgraph,"block_4_project_BN","block_4_add/in1");
lgraph = connectLayers(lgraph,"block_4_add","block_5_expand");
lgraph = connectLayers(lgraph,"block_4_add","block_5_add/in2");
lgraph = connectLayers(lgraph,"block_5_project_BN","block_5_add/in1");
lgraph = connectLayers(lgraph,"block_6_project_BN","block_7_expand");
lgraph = connectLayers(lgraph,"block_6_project_BN","block_7_add/in2");
lgraph = connectLayers(lgraph,"block_7_project_BN","block_7_add/in1");
lgraph = connectLayers(lgraph,"block_7_add","block_8_expand");
lgraph = connectLayers(lgraph,"block_7_add","block_8_add/in2");
lgraph = connectLayers(lgraph,"block_8_project_BN","block_8_add/in1");
lgraph = connectLayers(lgraph,"block_8_add","block_9_expand");
lgraph = connectLayers(lgraph,"block_8_add","block_9_add/in2");
lgraph = connectLayers(lgraph,"block_9_project_BN","block_9_add/in1");
lgraph = connectLayers(lgraph,"block_10_project_BN","block_11_expand");
lgraph = connectLayers(lgraph,"block_10_project_BN","block_11_add/in2");
lgraph = connectLayers(lgraph,"block_11_project_BN","block_11_add/in1");
lgraph = connectLayers(lgraph,"block_11_add","block_12_expand");
lgraph = connectLayers(lgraph,"block_11_add","block_12_add/in2");
lgraph = connectLayers(lgraph,"block_12_project_BN","block_12_add/in1");
lgraph = connectLayers(lgraph,"block_13_project_BN","block_14_expand");
lgraph = connectLayers(lgraph,"block_13_project_BN","block_14_add/in2");
lgraph = connectLayers(lgraph,"block_14_project_BN","block_14_add/in1");
lgraph = connectLayers(lgraph,"block_14_add","block_15_expand");
lgraph = connectLayers(lgraph,"block_14_add","block_15_add/in2");
lgraph = connectLayers(lgraph,"block_15_project_BN","block_15_add/in1");
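The regularity behind the long listing can be summarized in a few lines. The sketch below uses base MATLAB only and takes the project-layer channel counts and depthwise strides of block_1 to block_16 as read from the code above. From these it derives the expansion widths (factor 6) and the blocks that carry a shortcut. The rule "shortcut only when the stride is 1 and the input and output channel counts match" is the usual MobileNetV2 convention and is assumed here; the variable names are made up for this illustration.

```matlab
% Values read from the listing: output (project) channels and depthwise
% stride of block_1 ... block_16. The stem delivers 16 channels to block_1.
outCh  = [24 24 32 32 32 64 64 64 64 96 96 96 160 160 160 320];
stride = [ 2  1  2  1  1  2  1  1  1  1  1  1   2   1   1   1];
inCh   = [16 outCh(1:end-1)];

expandCh = 6 * inCh;                          % width of each 1x1 "expand" layer
hasAdd   = (stride == 1) & (inCh == outCh);   % shortcut needs matching shapes

disp(table((1:16)', inCh', expandCh', outCh', stride', hasAdd', ...
    'VariableNames', {'block','in','expand','out','stride','add'}));

residualBlocks = find(hasAdd);      % blocks that end in an additionLayer
totalStride    = 2 * prod(stride);  % stem stride 2 times the block strides
fprintf('blocks with shortcut: %s\n', mat2str(residualBlocks));
fprintf('overall downsampling factor: %d\n', totalStride);
```

The derived block indices can be compared against the block_N_add layers in the listing, which is a quick way to catch a mistyped channel count or a missing connectLayers call.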
The parameters used for training and testing the network are set as follows:
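For readers who want to reproduce the run in code, a minimal sketch of such a configuration with trainingOptions could look like the following. It requires Deep Learning Toolbox. Only the choice of the Adam solver comes from this post. The values for MaxEpochs, MiniBatchSize and InitialLearnRate are placeholders assumed for illustration, and XTrain, YTrain, XTest and YTest are assumed to hold the 8:2 split described above.

```matlab
% Only the "adam" solver is taken from the post; the numeric values
% below are assumed placeholders, not the author's settings.
options = trainingOptions("adam", ...
    "MaxEpochs",10, ...              % placeholder
    "MiniBatchSize",32, ...          % placeholder
    "InitialLearnRate",1e-3, ...     % placeholder
    "Shuffle","every-epoch", ...
    "Plots","training-progress", ...
    "Verbose",false);

% With XTrain (H-by-W-by-C-by-N), categorical YTrain and lgraph in the
% workspace, training and evaluation would then be:
% net   = trainNetwork(XTrain,YTrain,lgraph,options);
% YPred = classify(net,XTest);
% acc   = mean(YPred == YTest);
```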
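As the figure below shows, the training-set classification accuracy and loss curves of MobileNet V2 stabilize by the 6th training epoch.
The confusion matrix of the trained MobileNet V2 on the test set is shown in the figure below. The accuracy reaches 88.2%, whereas a two-layer convolutional network used earlier reached over 99%. The dataset is probably too small, so a deep network such as MobileNet V2 may overfit. The purpose here is not to demonstrate the performance of MobileNet V2 but only to provide a reference implementation, so tuning methods for raising its accuracy on this dataset are not discussed further.
The MATLAB function activations can be used to inspect the intermediate-layer features that the trained MobileNet V2 extracts from the test data. The feature maps of layer 8, expanded_conv_project, for one coal image and one gangue image are shown in the two figures below:
A t-SNE plot of layer 151 of MobileNet V2 gives a fairly intuitive view of the structure of the dataset. The samples fall roughly into two clusters, but some of them are almost impossible to separate, which is broadly consistent with the final classification accuracy.

References:
Paper: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Paper: MobileNetV2: Inverted Residuals and Linear Bottlenecks
Paper: Searching for MobileNetV3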