MobileNetV3


Compared with heavyweight networks, lightweight networks have fewer parameters, lower computational cost, and shorter inference time. They are better suited to scenarios with limited storage and power, such as mobile and embedded devices and other edge-computing hardware. Lightweight networks have therefore drawn wide attention, and MobileNet is among the best of them. Building on the accumulated experience of V1 and V2, MobileNetV3 delivers excellent accuracy and speed: its architecture parameters were obtained via NAS (network architecture search), it inherits proven components from V1 and V2, and it introduces the SE channel-attention mechanism, making it a synthesis of prior work. This article is application-oriented and dissects the MobileNetV3 network structure alongside its code.

Key features

  1. The paper presents two versions, Large and Small, for different deployment scenarios;
  2. Uses the NetAdapt algorithm to find the optimal number of convolution kernels and channels;
  3. Inherits the depthwise separable convolutions of V1;
  4. Inherits the inverted residual structure with linear bottlenecks of V2;
  5. Introduces the SE channel-attention structure;
  6. Uses a new activation function, h-swish(x), in place of ReLU6, where "h" stands for hard (see the sketch after this list);
  7. Approximates the sigmoid in the SE module with ReLU6(x + 3)/6 (hard sigmoid);
  8. Redesigns the output head of MobileNetV2.
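
The two hard activations are simple to write down. Below is a minimal sketch (function names are illustrative; PyTorch provides the same operations as nn.Hardswish and nn.Hardsigmoid, which the implementation later in this article uses):

import torch
from torch.nn import functional as F

def h_sigmoid(x: torch.Tensor) -> torch.Tensor:
    # piecewise-linear approximation of sigmoid: ReLU6(x + 3) / 6
    return F.relu6(x + 3.0) / 6.0

def h_swish(x: torch.Tensor) -> torch.Tensor:
    # approximates swish(x) = x * sigmoid(x) without the costly exponential
    return x * h_sigmoid(x)

x = torch.linspace(-6.0, 6.0, steps=7)
print(h_swish(x))        # matches F.hardswish(x)
print(F.hardswish(x))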

Overall structure

The figure above shows the MobileNetV3 network structure. Large and Small share the same overall layout; they differ in the number of bneck basic units and in their internal parameters, chiefly the channel counts (left: Small; right: Large).

The table above gives the detailed parameter settings, where bneck is the network's basic building block. SE marks whether channel attention is used. NL is the type of nonlinearity: HS (h-swish) or RE (ReLU). NBN means no batch normalization. s is the stride; the network downsamples with strided convolutions rather than pooling. These settings are mirrored one-to-one in the inverted_residual_setting lists in the code below.

Official PyTorch code for MobileNetV3

Here is the modified code:

from typing import Callable, List, Optional

import torch
from torch import nn, Tensor
from torch.nn import functional as F
from functools import partial


def _make_divisible(ch, divisor=8, min_ch=None):
    """
    This function is taken from the original tf repo.
    It ensures that all layers have a channel number that is divisible by 8
    It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    """
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch
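
# Examples: _make_divisible(30) -> 32 (rounds to the nearest multiple of 8);
# _make_divisible(10) -> 16 (rounding down to 8 would lose more than 10%, so step up).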


class ConvBNActivation(nn.Sequential):
    def __init__(self,
                 in_planes: int,
                 out_planes: int,
                 kernel_size: int = 3,
                 stride: int = 1,
                 groups: int = 1,
                 norm_layer: Optional[Callable[..., nn.Module]] = None,
                 activation_layer: Optional[Callable[..., nn.Module]] = None):
        padding = (kernel_size - 1) // 2
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.ReLU6
        super(ConvBNActivation, self).__init__(nn.Conv2d(in_channels=in_planes,
                                                         out_channels=out_planes,
                                                         kernel_size=kernel_size,
                                                         stride=stride,
                                                         padding=padding,
                                                         groups=groups,
                                                         bias=False),
                                               norm_layer(out_planes),
                                               activation_layer(inplace=True))


class SqueezeExcitation(nn.Module):
    def __init__(self, input_c: int, squeeze_factor: int = 4):
        super(SqueezeExcitation, self).__init__()
        squeeze_c = _make_divisible(input_c // squeeze_factor, 8)
        self.fc1 = nn.Conv2d(input_c, squeeze_c, 1)
        self.fc2 = nn.Conv2d(squeeze_c, input_c, 1)

    def forward(self, x: Tensor) -> Tensor:
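        # squeeze: global average pool to 1x1; excite: two 1x1 convs acting as FC
        # layers; the hard-sigmoid output gates each channel of the input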
        scale = F.adaptive_avg_pool2d(x, output_size=(1, 1))
        scale = self.fc1(scale)
        scale = F.relu(scale, inplace=True)
        scale = self.fc2(scale)
        scale = F.hardsigmoid(scale, inplace=True)
        return scale * x


class InvertedResidualConfig:
    def __init__(self,
                 input_c: int,
                 kernel: int,
                 expanded_c: int,
                 out_c: int,
                 use_se: bool,
                 activation: str,
                 stride: int,
                 width_multi: float):
        self.input_c = self.adjust_channels(input_c, width_multi)
        self.kernel = kernel
        self.expanded_c = self.adjust_channels(expanded_c, width_multi)
        self.out_c = self.adjust_channels(out_c, width_multi)
        self.use_se = use_se
        self.use_hs = activation == "HS"  # whether using h-swish activation
        self.stride = stride

    @staticmethod
    def adjust_channels(channels: int, width_multi: float):
        return _make_divisible(channels * width_multi, 8)


class InvertedResidual(nn.Module):
    def __init__(self,
                 cnf: InvertedResidualConfig,
                 norm_layer: Callable[..., nn.Module]):
        super(InvertedResidual, self).__init__()

        if cnf.stride not in [1, 2]:
            raise ValueError("illegal stride value.")

        self.use_res_connect = (cnf.stride == 1 and cnf.input_c == cnf.out_c)

        layers: List[nn.Module] = []
        activation_layer = nn.Hardswish if cnf.use_hs else nn.ReLU

        # expand
        if cnf.expanded_c != cnf.input_c:
            layers.append(ConvBNActivation(cnf.input_c,
                                           cnf.expanded_c,
                                           kernel_size=1,
                                           norm_layer=norm_layer,
                                           activation_layer=activation_layer))

        # depthwise
        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.expanded_c,
                                       kernel_size=cnf.kernel,
                                       stride=cnf.stride,
                                       groups=cnf.expanded_c,
                                       norm_layer=norm_layer,
                                       activation_layer=activation_layer))

        if cnf.use_se:
            layers.append(SqueezeExcitation(cnf.expanded_c))

        # project
        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.out_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Identity))

        self.block = nn.Sequential(*layers)
        self.out_channels = cnf.out_c
        self.is_strided = cnf.stride > 1

    def forward(self, x: Tensor) -> Tensor:
        result = self.block(x)
        if self.use_res_connect:
            result += x

        return result


class MobileNetV3(nn.Module):
    def __init__(self,
                 inverted_residual_setting: List[InvertedResidualConfig],
                 last_channel: int,
                 num_classes: int = 1000,
                 block: Optional[Callable[..., nn.Module]] = None,
                 norm_layer: Optional[Callable[..., nn.Module]] = None):
        super(MobileNetV3, self).__init__()

        if not inverted_residual_setting:
            raise ValueError("The inverted_residual_setting should not be empty.")
        elif not (isinstance(inverted_residual_setting, list) and
                  all(isinstance(s, InvertedResidualConfig) for s in inverted_residual_setting)):
            raise TypeError("The inverted_residual_setting should be List[InvertedResidualConfig]")

        if block is None:
            block = InvertedResidual

        if norm_layer is None:
            norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.01)

        layers: List[nn.Module] = []

        # building first layer
        firstconv_output_c = inverted_residual_setting[0].input_c
        layers.append(ConvBNActivation(3,
                                       firstconv_output_c,
                                       kernel_size=3,
                                       stride=2,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        # building inverted residual blocks
        for cnf in inverted_residual_setting:
            layers.append(block(cnf, norm_layer))

        # building last several layers
        lastconv_input_c = inverted_residual_setting[-1].out_c
        lastconv_output_c = 6 * lastconv_input_c
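        # the final 1x1 conv expands channels 6x (e.g. 160 -> 960 for Large, 96 -> 576 for Small)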
        layers.append(ConvBNActivation(lastconv_input_c,
                                       lastconv_output_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        self.features = nn.Sequential(*layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Sequential(nn.Linear(lastconv_output_c, last_channel),
                                        nn.Hardswish(inplace=True),
                                        nn.Dropout(p=0.2, inplace=True),
                                        nn.Linear(last_channel, num_classes))

        # initial weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)

        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)


def mobilenet_v3_large(num_classes: int = 1000,
                       reduced_tail: bool = False) -> MobileNetV3:
    """
    Constructs a large MobileNetV3 architecture from
    "Searching for MobileNetV3" <https://arxiv.org/abs/1905.02244>.

    weights_link:
    https://download.pytorch.org/models/mobilenet_v3_large-8738ca79.pth

    Args:
        num_classes (int): number of classes
        reduced_tail (bool): If True, reduces the channel counts of all feature layers
            between C4 and C5 by 2. It is used to reduce the channel redundancy in the
            backbone for Detection and Segmentation.
    """
    width_multi = 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)

    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, False, "RE", 1),
        bneck_conf(16, 3, 64, 24, False, "RE", 2),  # C1
        bneck_conf(24, 3, 72, 24, False, "RE", 1),
        bneck_conf(24, 5, 72, 40, True, "RE", 2),  # C2
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 3, 240, 80, False, "HS", 2),  # C3
        bneck_conf(80, 3, 200, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 480, 112, True, "HS", 1),
        bneck_conf(112, 3, 672, 112, True, "HS", 1),
        bneck_conf(112, 5, 672, 160 // reduce_divider, True, "HS", 2),  # C4
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
    ]
    last_channel = adjust_channels(1280 // reduce_divider)  # C5

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)


def mobilenet_v3_small(num_classes: int = 1000,
                       reduced_tail: bool = False) -> MobileNetV3:
    """
    Constructs a small MobileNetV3 architecture from
    "Searching for MobileNetV3" <https://arxiv.org/abs/1905.02244>.

    weights_link:
    https://download.pytorch.org/models/mobilenet_v3_small-047dcff4.pth

    Args:
        num_classes (int): number of classes
        reduced_tail (bool): If True, reduces the channel counts of all feature layers
            between C4 and C5 by 2. It is used to reduce the channel redundancy in the
            backbone for Detection and Segmentation.
    """
    width_multi = 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)

    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, True, "RE", 2),  # C1
        bneck_conf(16, 3, 72, 24, False, "RE", 2),  # C2
        bneck_conf(24, 3, 88, 24, False, "RE", 1),
        bneck_conf(24, 5, 96, 40, True, "HS", 2),  # C3
        bneck_conf(40, 5, 240, 40, True, "HS", 1),
        bneck_conf(40, 5, 240, 40, True, "HS", 1),
        bneck_conf(40, 5, 120, 48, True, "HS", 1),
        bneck_conf(48, 5, 144, 48, True, "HS", 1),
        bneck_conf(48, 5, 288, 96 // reduce_divider, True, "HS", 2),  # C4
        bneck_conf(96 // reduce_divider, 5, 576 // reduce_divider, 96 // reduce_divider, True, "HS", 1),
        bneck_conf(96 // reduce_divider, 5, 576 // reduce_divider, 96 // reduce_divider, True, "HS", 1)
    ]
    last_channel = adjust_channels(1024 // reduce_divider)  # C5

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)
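
As a quick sanity check, the following sketch builds both variants and runs a dummy forward pass:

if __name__ == "__main__":
    model = mobilenet_v3_large(num_classes=1000)
    x = torch.rand(1, 3, 224, 224)                  # dummy input batch
    print(model(x).shape)                           # torch.Size([1, 1000])

    small = mobilenet_v3_small(num_classes=5)       # e.g. a 5-class fine-tuning task
    print(small(torch.rand(2, 3, 224, 224)).shape)  # torch.Size([2, 5])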
