Building LinkNet Segmentation Models

Original paper: LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation

Let's get straight to the code.

I. LinkNet

1. Decoder module

from functools import partial

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

# Activation used throughout; the reference implementations define it as an in-place ReLU.
nonlinearity = partial(F.relu, inplace=True)


class DecoderBlock(nn.Module):
    def __init__(self, in_channels, n_filters):  # e.g. in_channels=512, n_filters=256
        super(DecoderBlock, self).__init__()

        self.conv1 = nn.Conv2d(in_channels, in_channels // 4, 1)
        self.norm1 = nn.BatchNorm2d(in_channels // 4)
        self.relu1 = nonlinearity

        self.deconv2 = nn.ConvTranspose2d(in_channels // 4, in_channels // 4, 3, stride=2, padding=1, output_padding=1)
        self.norm2 = nn.BatchNorm2d(in_channels // 4)
        self.relu2 = nonlinearity

        self.conv3 = nn.Conv2d(in_channels // 4, n_filters, 1)
        self.norm3 = nn.BatchNorm2d(n_filters)
        self.relu3 = nonlinearity

    def forward(self, x):
        x = self.conv1(x)
        x = self.norm1(x)
        x = self.relu1(x)
        x = self.deconv2(x)
        x = self.norm2(x)
        x = self.relu2(x)
        x = self.conv3(x)
        x = self.norm3(x)
        x = self.relu3(x)
        return x
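
As a quick sanity check of the block above (illustrative only, using the imports already added): for DecoderBlock(512, 256), the first 1x1 convolution reduces 512 to 128 channels, the stride-2 transposed convolution doubles the spatial size, and the final 1x1 convolution expands to 256 channels.

block = DecoderBlock(512, 256)   # e.g. for the deepest encoder stage
feat = torch.randn(1, 512, 8, 8)
print(block(feat).shape)         # torch.Size([1, 256, 16, 16])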

2. Overall network structure

class LinkNet34(nn.Module):
    def __init__(self, num_classes=1):
        super(LinkNet34, self).__init__()

        filters = [64, 128, 256, 512]
        resnet = models.resnet34(pretrained=False)
        self.firstconv = resnet.conv1
        self.firstbn = resnet.bn1
        self.firstrelu = resnet.relu
        self.firstmaxpool = resnet.maxpool
        self.encoder1 = resnet.layer1
        self.encoder2 = resnet.layer2
        self.encoder3 = resnet.layer3
        self.encoder4 = resnet.layer4

        self.decoder4 = DecoderBlock(filters[3], filters[2])
        self.decoder3 = DecoderBlock(filters[2], filters[1])
        self.decoder2 = DecoderBlock(filters[1], filters[0])
        self.decoder1 = DecoderBlock(filters[0], filters[0])

        self.finaldeconv1 = nn.ConvTranspose2d(filters[0], 32, 3, stride=2)
        self.finalrelu1 = nonlinearity
        self.finalconv2 = nn.Conv2d(32, 32, 3)
        self.finalrelu2 = nonlinearity
        self.finalconv3 = nn.Conv2d(32, num_classes, 2, padding=1)

    def forward(self, x):
        # Encoder
        x = self.firstconv(x) #[1, 64, 128, 128]
        #print(f'x0:{x.shape}')
        x = self.firstbn(x)
        x = self.firstrelu(x)
        x = self.firstmaxpool(x) #[1, 64, 64, 64]
        e1 = self.encoder1(x) #[1, 64, 64, 64]
        e2 = self.encoder2(e1) #[1, 128, 32, 32]
        e3 = self.encoder3(e2) #[1, 256, 16, 16]
        e4 = self.encoder4(e3) #[1, 512, 8, 8]

        # Decoder
        d4 = self.decoder4(e4) + e3 #[1, 256, 16, 16]
        d3 = self.decoder3(d4) + e2
        d2 = self.decoder2(d3) + e1
        d1 = self.decoder1(d2) #[1, 64, 128, 128]
        out = self.finaldeconv1(d1) #[1, 32, 257, 257]
        out = self.finalrelu1(out)
        out = self.finalconv2(out) #[1, 32, 255, 255]
        out = self.finalrelu2(out)
        out = self.finalconv3(out) #[1, 4, 256, 256]

        return torch.sigmoid(out)

if __name__ == '__main__':
    input_tensor = torch.randn((1, 3, 256, 256))
    model = LinkNet34(num_classes=4)
    out1 = model(input_tensor)
    print(out1.shape)
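
Continuing from the __main__ block above (a purely illustrative sketch; the random target stands in for real masks): since the network ends in a sigmoid, a per-pixel binary cross-entropy loss pairs with it naturally.

    # Illustrative only: fake binary masks, one channel per class.
    criterion = nn.BCELoss()
    target = torch.randint(0, 2, (1, 4, 256, 256)).float()
    loss = criterion(out1, target)
    loss.backward()
    print(loss.item())

In practice, nn.BCEWithLogitsLoss applied to the raw logits (i.e. dropping the sigmoid from forward) is the more numerically stable choice.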

II. D-LinkNet

Original paper: D-LinkNet: LinkNet with Pretrained Encoder and Dilated Convolution for High Resolution Satellite Imagery Road Extraction

1. Decoder module (identical to the LinkNet decoder above)

class DecoderBlock(nn.Module):
    def __init__(self, in_channels, n_filters):
        super(DecoderBlock, self).__init__()

        self.conv1 = nn.Conv2d(in_channels, in_channels // 4, 1)
        self.norm1 = nn.BatchNorm2d(in_channels // 4)
        self.relu1 = nonlinearity

        self.deconv2 = nn.ConvTranspose2d(in_channels // 4, in_channels // 4, 3, stride=2, padding=1, output_padding=1)
        self.norm2 = nn.BatchNorm2d(in_channels // 4)
        self.relu2 = nonlinearity

        self.conv3 = nn.Conv2d(in_channels // 4, n_filters, 1)
        self.norm3 = nn.BatchNorm2d(n_filters)
        self.relu3 = nonlinearity

    def forward(self, x):
        x = self.conv1(x)
        x = self.norm1(x)
        x = self.relu1(x)
        x = self.deconv2(x)
        x = self.norm2(x)
        x = self.relu2(x)
        x = self.conv3(x)
        x = self.norm3(x)
        x = self.relu3(x)
        return x

2. Dblock module

class Dblock_more_dilate(nn.Module):
    def __init__(self, channel):
        super(Dblock_more_dilate, self).__init__()
        self.dilate1 = nn.Conv2d(channel, channel, kernel_size=3, dilation=1, padding=1)
        self.dilate2 = nn.Conv2d(channel, channel, kernel_size=3, dilation=2, padding=2)
        self.dilate3 = nn.Conv2d(channel, channel, kernel_size=3, dilation=4, padding=4)
        self.dilate4 = nn.Conv2d(channel, channel, kernel_size=3, dilation=8, padding=8)
        self.dilate5 = nn.Conv2d(channel, channel, kernel_size=3, dilation=16, padding=16)
        for m in self.modules():
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    m.bias.data.zero_()

    def forward(self, x):
        dilate1_out = nonlinearity(self.dilate1(x))
        dilate2_out = nonlinearity(self.dilate2(dilate1_out))
        dilate3_out = nonlinearity(self.dilate3(dilate2_out))
        dilate4_out = nonlinearity(self.dilate4(dilate3_out))
        dilate5_out = nonlinearity(self.dilate5(dilate4_out))
        out = x + dilate1_out + dilate2_out + dilate3_out + dilate4_out + dilate5_out
        return out
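
Note that DinkNet34 below instantiates Dblock(512), which is not included in the snippet above. In the reference implementation it is the same serial cascade of dilated convolutions, just without the dilation=16 branch; a sketch along those lines:

class Dblock(nn.Module):
    def __init__(self, channel):
        super(Dblock, self).__init__()
        # Same cascaded dilated 3x3 convolutions as above, stopping at dilation=8.
        self.dilate1 = nn.Conv2d(channel, channel, kernel_size=3, dilation=1, padding=1)
        self.dilate2 = nn.Conv2d(channel, channel, kernel_size=3, dilation=2, padding=2)
        self.dilate3 = nn.Conv2d(channel, channel, kernel_size=3, dilation=4, padding=4)
        self.dilate4 = nn.Conv2d(channel, channel, kernel_size=3, dilation=8, padding=8)
        for m in self.modules():
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
                if m.bias is not None:
                    m.bias.data.zero_()

    def forward(self, x):
        # The output sums the input with every intermediate dilated response.
        dilate1_out = nonlinearity(self.dilate1(x))
        dilate2_out = nonlinearity(self.dilate2(dilate1_out))
        dilate3_out = nonlinearity(self.dilate3(dilate2_out))
        dilate4_out = nonlinearity(self.dilate4(dilate3_out))
        return x + dilate1_out + dilate2_out + dilate3_out + dilate4_out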

3. Overall network structure

class DinkNet34(nn.Module):
    def __init__(self, num_classes=1):
        super(DinkNet34, self).__init__()

        filters = [64, 128, 256, 512]
        resnet = models.resnet34(pretrained=True)
        self.firstconv = resnet.conv1
        self.firstbn = resnet.bn1
        self.firstrelu = resnet.relu
        self.firstmaxpool = resnet.maxpool
        self.encoder1 = resnet.layer1
        self.encoder2 = resnet.layer2
        self.encoder3 = resnet.layer3
        self.encoder4 = resnet.layer4

        self.dblock = Dblock(512)

        self.decoder4 = DecoderBlock(filters[3], filters[2])
        self.decoder3 = DecoderBlock(filters[2], filters[1])
        self.decoder2 = DecoderBlock(filters[1], filters[0])
        self.decoder1 = DecoderBlock(filters[0], filters[0])

        self.finaldeconv1 = nn.ConvTranspose2d(filters[0], 32, 4, 2, 1)
        self.finalrelu1 = nonlinearity
        self.finalconv2 = nn.Conv2d(32, 32, 3, padding=1)
        self.finalrelu2 = nonlinearity
        self.finalconv3 = nn.Conv2d(32, num_classes, 3, padding=1)

    def forward(self, x):
        # Encoder
        x = self.firstconv(x) #[1, 64, 128, 128]
        x = self.firstbn(x)
        x = self.firstrelu(x)
        x = self.firstmaxpool(x) #[1, 64, 64, 64]
        e1 = self.encoder1(x) #[1, 64, 64, 64]
        e2 = self.encoder2(e1) #[1, 128, 32, 32]
        e3 = self.encoder3(e2) #[1, 256, 16, 16]
        e4 = self.encoder4(e3) #[1, 512, 8, 8]

        # Center
        e4 = self.dblock(e4) #[1, 512, 8, 8]

        # Decoder
        d4 = self.decoder4(e4) + e3
        d3 = self.decoder3(d4) + e2
        d2 = self.decoder2(d3) + e1
        d1 = self.decoder1(d2)

        out = self.finaldeconv1(d1)
        out = self.finalrelu1(out)
        out = self.finalconv2(out)
        out = self.finalrelu2(out)
        out = self.finalconv3(out)

        return torch.sigmoid(out)

if __name__ == '__main__':
    input_tensor = torch.randn((1, 3, 256, 256))
    model = DinkNet34(num_classes=4)
    out = model(input_tensor)
    print(out.shape)

The backbone here can also be swapped for resnet50/resnet101; since their encoder stages output 256/512/1024/2048 channels, the filters list must be adjusted accordingly (see the sketch below).
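
A hedged sketch of what a ResNet-50 variant could look like (DinkNet50 is an illustrative name, not code from the original post; it reuses DecoderBlock and Dblock from above and only changes the encoder and channel widths):

class DinkNet50(nn.Module):
    def __init__(self, num_classes=1):
        super(DinkNet50, self).__init__()

        filters = [256, 512, 1024, 2048]  # output widths of ResNet-50's bottleneck stages
        resnet = models.resnet50(pretrained=True)
        self.firstconv = resnet.conv1
        self.firstbn = resnet.bn1
        self.firstrelu = resnet.relu
        self.firstmaxpool = resnet.maxpool
        self.encoder1 = resnet.layer1
        self.encoder2 = resnet.layer2
        self.encoder3 = resnet.layer3
        self.encoder4 = resnet.layer4

        self.dblock = Dblock(filters[3])

        self.decoder4 = DecoderBlock(filters[3], filters[2])
        self.decoder3 = DecoderBlock(filters[2], filters[1])
        self.decoder2 = DecoderBlock(filters[1], filters[0])
        self.decoder1 = DecoderBlock(filters[0], filters[0])

        self.finaldeconv1 = nn.ConvTranspose2d(filters[0], 32, 4, 2, 1)
        self.finalrelu1 = nonlinearity
        self.finalconv2 = nn.Conv2d(32, 32, 3, padding=1)
        self.finalrelu2 = nonlinearity
        self.finalconv3 = nn.Conv2d(32, num_classes, 3, padding=1)

    def forward(self, x):
        # Same flow as DinkNet34: encoder -> dilated center -> decoder with skip additions.
        x = self.firstmaxpool(self.firstrelu(self.firstbn(self.firstconv(x))))
        e1 = self.encoder1(x)   #[1, 256, 64, 64]
        e2 = self.encoder2(e1)  #[1, 512, 32, 32]
        e3 = self.encoder3(e2)  #[1, 1024, 16, 16]
        e4 = self.dblock(self.encoder4(e3))  #[1, 2048, 8, 8]

        d4 = self.decoder4(e4) + e3
        d3 = self.decoder3(d4) + e2
        d2 = self.decoder2(d3) + e1
        d1 = self.decoder1(d2)

        out = self.finalrelu1(self.finaldeconv1(d1))
        out = self.finalrelu2(self.finalconv2(out))
        return torch.sigmoid(self.finalconv3(out))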

III. NL-LinkNet

Original paper: NL-LinkNet: Toward Lighter but More Accurate Road Extraction with Non-Local Operations

1. Decoder module (identical to the LinkNet decoder above)

class DecoderBlock(nn.Module):
    def __init__(self, in_channels, n_filters):
        super(DecoderBlock, self).__init__()

        self.conv1 = nn.Conv2d(in_channels, in_channels // 4, 1)
        self.norm1 = nn.BatchNorm2d(in_channels // 4)
        self.relu1 = nonlinearity

        self.deconv2 = nn.ConvTranspose2d(in_channels // 4, in_channels // 4, 3, stride=2, padding=1, output_padding=1)
        self.norm2 = nn.BatchNorm2d(in_channels // 4)
        self.relu2 = nonlinearity

        self.conv3 = nn.Conv2d(in_channels // 4, n_filters, 1)
        self.norm3 = nn.BatchNorm2d(n_filters)
        self.relu3 = nonlinearity

    def forward(self, x):
        x = self.conv1(x)
        x = self.norm1(x)
        x = self.relu1(x)
        x = self.deconv2(x)
        x = self.norm2(x)
        x = self.relu2(x)
        x = self.conv3(x)
        x = self.norm3(x)
        x = self.relu3(x)
        return x

2. NONLocalBlock2D_EGaussian module

This module has also appeared in an earlier article: http://t.csdn.cn/wQGan

class _NonLocalBlock2D_EGaussian(nn.Module):
    def __init__(self, in_channels, inter_channels=None, dimension=3, sub_sample=True, bn_layer=True):
        super(_NonLocalBlock2D_EGaussian, self).__init__()

        assert dimension in (1, 2, 3)

        self.dimension = dimension
        self.sub_sample = sub_sample

        self.in_channels = in_channels
        self.inter_channels = inter_channels

        if self.inter_channels is None:
            self.inter_channels = in_channels // 2
            if self.inter_channels == 0:
                self.inter_channels = 1

        conv_nd = nn.Conv2d
        max_pool_layer = nn.MaxPool2d(kernel_size=(2, 2))
        bn = nn.BatchNorm2d

        self.g = conv_nd(in_channels=self.in_channels, out_channels=self.inter_channels,
                         kernel_size=1, stride=1, padding=0)

        if bn_layer:
            self.W = nn.Sequential(
                conv_nd(in_channels=self.inter_channels, out_channels=self.in_channels,
                        kernel_size=1, stride=1, padding=0),
                bn(self.in_channels)
            )
            nn.init.constant_(self.W[1].weight, 0)
            nn.init.constant_(self.W[1].bias, 0)
        else:
            self.W = conv_nd(in_channels=self.inter_channels, out_channels=self.in_channels,
                             kernel_size=1, stride=1, padding=0)
            nn.init.constant_(self.W.weight, 0)
            nn.init.constant_(self.W.bias, 0)

        self.theta = conv_nd(in_channels=self.in_channels, out_channels=self.inter_channels,
                             kernel_size=1, stride=1, padding=0)
        self.phi = conv_nd(in_channels=self.in_channels, out_channels=self.inter_channels,
                           kernel_size=1, stride=1, padding=0)

        if sub_sample:
            self.g = nn.Sequential(self.g, max_pool_layer)
            self.phi = nn.Sequential(self.phi, max_pool_layer)

    def forward(self, x):  # e.g. x: [1, 128, 32, 32]
        batch_size = x.size(0) #1
        # 128, 32, 32--64, 32, 32--64, 16, 16--1, 64, 16*16
        g_x = self.g(x).view(batch_size, self.inter_channels, -1)
        g_x = g_x.permute(0, 2, 1) #1, 16*16, 64
        # print(f'g_x:{g_x.shape}')

        # 128, 32, 32--64, 32, 32--1, 64, 32*32
        theta_x = self.theta(x).view(batch_size, self.inter_channels, -1)
        theta_x = theta_x.permute(0, 2, 1) #1, 32*32, 64
        # print(f'theta_x:{theta_x.shape}')

        # 128, 32, 32--64, 32, 32--64, 16, 16--1, 64, 16*16
        phi_x = self.phi(x).view(batch_size, self.inter_channels, -1)
        f = torch.matmul(theta_x, phi_x) #1, 32*32, 16*16
        # print(f'f:{f.shape}')
        f_div_C = F.softmax(f, dim=-1)

        y = torch.matmul(f_div_C, g_x) #1, 32*32, 64
        y = y.permute(0, 2, 1).contiguous() #1, 64, 32*32
        y = y.view(batch_size, self.inter_channels, *x.size()[2:]) #1, 64, 32, 32
        # print(f'y:{y.shape}')
        W_y = self.W(y) #1, 128, 32, 32
        z = W_y + x #1, 128, 32, 32

        return z
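
A quick shape check of the block in isolation (illustrative), matching the [1, 128, 32, 32] trace in the comments above; the non-local block always returns a tensor of the same shape as its input, and the thin NONLocalBlock2D_EGaussian subclass in the next section simply fixes dimension=2.

nl = _NonLocalBlock2D_EGaussian(in_channels=128)
feat = torch.randn(1, 128, 32, 32)
print(nl(feat).shape)  # torch.Size([1, 128, 32, 32])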

3. Overall network structure

class NONLocalBlock2D_EGaussian(_NonLocalBlock2D_EGaussian):
    def __init__(self, in_channels, inter_channels=None, sub_sample=True, bn_layer=True):
        super(NONLocalBlock2D_EGaussian, self).__init__(in_channels,
                                                        inter_channels=inter_channels,
                                                        dimension=2, sub_sample=sub_sample,
                                                        bn_layer=bn_layer)


class NL34_LinkNet(nn.Module):
    def __init__(self, num_classes=1):
        super(NL34_LinkNet, self).__init__()

        filters = (64, 128, 256, 512)
        resnet = models.resnet34(pretrained=True)
        self.firstconv = resnet.conv1
        self.firstbn = resnet.bn1
        self.firstrelu = resnet.relu
        self.firstmaxpool = resnet.maxpool

        self.encoder1 = resnet.layer1
        self.encoder2 = resnet.layer2
        self.nonlocal3 = NONLocalBlock2D_EGaussian(128)
        self.encoder3 = resnet.layer3
        self.nonlocal4 = NONLocalBlock2D_EGaussian(256)
        self.encoder4 = resnet.layer4

        self.decoder4 = DecoderBlock(filters[3], filters[2])
        self.decoder3 = DecoderBlock(filters[2], filters[1])
        self.decoder2 = DecoderBlock(filters[1], filters[0])
        self.decoder1 = DecoderBlock(filters[0], filters[0])

        self.finaldeconv1 = nn.ConvTranspose2d(filters[0], 32, 4, 2, 1)
        self.finalrelu1 = nonlinearity
        self.finalconv2 = nn.Conv2d(32, 32, 3, padding=1)
        self.finalrelu2 = nonlinearity
        self.finalconv3 = nn.Conv2d(32, num_classes, 3, padding=1)

    def forward(self, x):
        # Encoder
        x = self.firstconv(x) #[1, 64, 128, 128]
        x = self.firstbn(x)
        x = self.firstrelu(x)
        x = self.firstmaxpool(x)
        e1 = self.encoder1(x) #[1, 64, 64, 64]
        e2 = self.encoder2(e1) #[1, 128, 32, 32]
        e3 = self.nonlocal3(e2) #[1, 128, 32, 32]
        e3 = self.encoder3(e3) #[1, 256, 16, 16]
        e4 = self.nonlocal4(e3) #[1, 256, 16, 16]
        e4 = self.encoder4(e4) #[1, 512, 8, 8]

        # Decoder
        d4 = self.decoder4(e4) + e3
        d3 = self.decoder3(d4) + e2
        d2 = self.decoder2(d3) + e1
        d1 = self.decoder1(d2)

        out = self.finaldeconv1(d1)
        out = self.finalrelu1(out)
        out = self.finalconv2(out)
        out = self.finalrelu2(out)
        out = self.finalconv3(out)

        return torch.sigmoid(out)

if __name__ == '__main__':
    input_tensor = torch.randn((1, 3, 256, 256))
    model = NL34_LinkNet(num_classes=4)
    out = model(input_tensor)
    print(out.shape)

Reference:

https://github.com/zstar1003/Road-Extraction
