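Preface

I have recently been working on a project that requires deploying a PyTorch-trained model on mobile devices. I ran into a few pitfalls along the way, so this is a brief record of the whole process. The model being converted is the classic classification network mobilenet_v2.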

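Converting the PyTorch model to ONNX

Environment setup

This step is simple: you only need to install PyTorch (plus torchvision, which the script below imports). I used PyTorch 1.9.1, installed directly with pip.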

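Conversion steps

Code for converting PyTorch to ONNX is easy to find online and fairly simple, but two points need attention: 1) loading a model requires both the network structure and the model parameters; some PyTorch checkpoints store only the parameters, in which case you also have to construct the network structure yourself; 2) the export needs the input size of the ONNX model; some models take a fixed-size input and others a variable-size one, so be careful to distinguish the two cases.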
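The two points above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used for this article: `weights_path` is a hypothetical file saved with `torch.save(model.state_dict(), ...)`, and the helper names are made up for illustration. Only the batch dimension is left variable; the input and output names match the export script in this article.

```python
def batch_dynamic_axes(input_name="input", output_name="pred_logits"):
    """dynamic_axes mapping that leaves only the batch dimension (axis 0) variable."""
    return {input_name: {0: "batch_size"}, output_name: {0: "batch_size"}}


def export_from_state_dict(weights_path, onnx_path, dynamic_batch=False):
    """Rebuild the architecture, load a parameters-only checkpoint, export to ONNX."""
    # Imported here so batch_dynamic_axes() can be used without PyTorch installed.
    import torch
    import torchvision

    # Point 1: create the network structure first, then load the saved parameters.
    model = torchvision.models.mobilenet_v2()
    state_dict = torch.load(weights_path, map_location="cpu")  # hypothetical file
    model.load_state_dict(state_dict)
    model.eval()

    # Point 2: the dummy input fixes the traced input size unless axes are marked dynamic.
    dummy_input = torch.randn(1, 3, 224, 224)
    torch.onnx.export(
        model,
        dummy_input,
        onnx_path,
        opset_version=13,
        input_names=["input"],
        output_names=["pred_logits"],
        dynamic_axes=batch_dynamic_axes() if dynamic_batch else None,
    )
```

With `dynamic_batch=False` the exported model accepts only 1x3x224x224 inputs, which is the fixed-size case described above.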

The conversion code is as follows, for reference:

import torch
import torchvision


def export_onnx():
    onnx_model_name = "mobilenet.onnx"

    print("Loading Model")
    # pretrained=True downloads the ImageNet weights on first use
    model = torchvision.models.mobilenet_v2(pretrained=True)
    model.eval()  # Put in inference mode

    # Create dummy input: [batch_size, channels, height, width]
    dummy_input = torch.randn(1, 3, 224, 224)

    # Export as ONNX (torch.onnx.export is the public API; _export is private)
    print(f"Exporting as ONNX: {onnx_model_name}")
    torch.onnx.export(
        model,
        dummy_input,
        onnx_model_name,  # Output file name
        opset_version=13,  # ONNX opset version
        export_params=True,  # Store the trained parameters in the model file
        do_constant_folding=True,  # Execute constant folding for optimization
        input_names=['input'],  # the model's input names
        output_names=['pred_logits'],  # the model's output names (see forward in the architecture)
        # Uncomment to make the batch dimension variable:
        # dynamic_axes={
        #     'input': {0: 'batch_size'},
        #     'pred_logits': {0: 'batch_size'},
        # },
    )


if __name__ == '__main__':
    export_onnx()

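When the model is loaded through torchvision, as it is here, the script automatically downloads the pretrained weights and saves them locally. On Windows the path is C:\Users\<username>\.cache\torch\hub\checkpoints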
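To find the checkpoint directory on a given machine without searching by hand, something like the following may help. It is a sketch that mirrors the lookup order used by torch.hub (the `TORCH_HOME` environment variable, then `XDG_CACHE_HOME`, then `~/.cache`) using only the standard library; the function name is made up for illustration.

```python
import os


def torch_checkpoint_dir(env=None):
    """Return the directory where torch.hub stores downloaded checkpoints."""
    env = os.environ if env is None else env
    # Lookup order: TORCH_HOME, else XDG_CACHE_HOME/torch, else ~/.cache/torch
    cache_root = env.get("XDG_CACHE_HOME", os.path.join("~", ".cache"))
    torch_home = env.get("TORCH_HOME", os.path.join(cache_root, "torch"))
    return os.path.join(os.path.expanduser(torch_home), "hub", "checkpoints")


if __name__ == "__main__":
    print(torch_checkpoint_dir())
```

Setting `TORCH_HOME` before running the export script is one way to move the download location off the system drive.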

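Converting the ONNX model to ncnn

Environment setup

The software that needs to be installed or built includes: VS2019/2017, CMake, OpenCV (optional), protobuf 3.11.2, Vulkan SDK 1.2.148.0 (optional), and ncnn.

Apart from OpenCV and the Vulkan SDK, everything here is required. OpenCV is an image processing library; if you want to run the converted ncnn model on Windows, you will usually need OpenCV for reading and writing images or for preprocessing. The Vulkan SDK provides GPU acceleration and can be installed or skipped depending on your machine.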

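My environment already had VS2019 and CMake installed, so I will not cover their installation in detail; plenty of tutorials are available online. VS2019 uses an online installer and is straightforward: you mostly just keep clicking Next. For CMake, download the latest version from the official website, extract it locally, and add it to your PATH. If typing cmake at the command line then runs CMake and prints its usage information, the environment is configured correctly.
[Screenshot: output of running cmake at the command line]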
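The same check can be scripted. The sketch below uses only the standard library to report whether the build tools are reachable from the current prompt; checking `cl` (the MSVC compiler, which should be visible inside the x64 Native Tools Command Prompt) is an addition of mine rather than something the steps above require.

```python
import shutil


def find_tools(names):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(["cmake", "cl"]).items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```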

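Building and installing protobuf 3.11.2

Protobuf is a platform-independent, language-independent, extensible, lightweight and efficient protocol for serializing structured data. It can be used for network communication and data storage.
Download the protobuf 3.11.2 archive, extract it, and place it in the folder created earlier. Then, in the Start menu, find Visual Studio 2019 => x64 Native Tools Command Prompt for VS 2019, right-click it, choose More, and run it as administrator. Enter the following commands to build protobuf 3.11.2: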

cd <protobuf-root-dir>
mkdir build-vs2019
cd build-vs2019
cmake -A x64 -DCMAKE_INSTALL_PREFIX=%cd%/install -Dprotobuf_BUILD_TESTS=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ../cmake
cmake --build . --config Release -j 2
cmake --build . --config Release --target install

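Here protobuf-root-dir is the path where you extracted protobuf.

After the install step, the build-vs2019\install directory should contain the bin, include, and lib folders that the ncnn configure command refers to later.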

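Building and installing ncnn

To build ncnn, first download a stable release from the official site, meaning one that carries a version number. I downloaded ncnn-20230223-full-source. Extract the archive and then, just as for protobuf, find Visual Studio 2019 => x64 Native Tools Command Prompt for VS 2019 in the Start menu, right-click it, choose More, run it as administrator, and enter the following commands: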

cd <ncnn-root-dir>
mkdir build-vs2019
cd build-vs2019
cmake -A x64 -DCMAKE_INSTALL_PREFIX=%cd%/install -DProtobuf_INCLUDE_DIR=<protobuf-root-dir>/build/install/include -DProtobuf_LIBRARIES=<protobuf-root-dir>/build/install/lib/libprotobuf.lib -DProtobuf_PROTOC_EXECUTABLE=<protobuf-root-dir>/build/install/bin/protoc.exe ..
cmake --build . --config Release -j 8
cmake --build . --config Release --target install

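Note that ncnn-root-dir is the local path where ncnn-20230223-full-source was extracted, and protobuf-root-dir is the local path of the protobuf package. In addition, the build component that follows protobuf-root-dir must be replaced with the name of the folder protobuf was actually built in. In the previous step that folder was build-vs2019, so use that name here so the two match.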


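At this point ncnn should be built successfully. If your build fails with errors, be sure to check the first cmake command very carefully for mistakes. The first time I built ncnn, the last cmake step failed with an error.
The error message said "file cannot create directory". I spent an hour or two debugging it, found no matching error or solution online, and at one point even assumed it was a permissions problem. Only later did I find the real cause in the first command: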

cmake -A x64 -DCMAKE_INSTALL_PREFIX=%cd%/install -DProtobuf_INCLUDE_DIR=<protobuf-root-dir>/build/install/include -DProtobuf_LIBRARIES=<protobuf-root-dir>/build/install/lib/libprotobuf.lib -DProtobuf_PROTOC_EXECUTABLE=<protobuf-root-dir>/build/install/bin/protoc.exe ..

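The protobuf-root-dir paths were misconfigured: a space was missing between the -DProtobuf_INCLUDE_DIR and -DProtobuf_LIBRARIES arguments. Running that command had in fact already printed clear errors, such as being unable to find libprotobuf.lib or protoc.exe, but I did not read the output carefully. As a result the failure only surfaced in the third step, and I went looking for the fix in the wrong direction.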
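One way to avoid this class of mistake is to assemble the configure command as a list of arguments, so options cannot run together, and to check the protobuf paths before invoking cmake. The following is a sketch under assumptions: the helper names are hypothetical, the protobuf root shown is a placeholder, and the script is meant to be run from inside the ncnn build directory.

```python
import os
import subprocess


def ncnn_configure_args(protobuf_root, protobuf_build_dir="build-vs2019"):
    """Build the ncnn configure command as an argument list, one option per element."""
    install = os.path.join(protobuf_root, protobuf_build_dir, "install")
    return [
        "cmake", "-A", "x64",
        "-DCMAKE_INSTALL_PREFIX=" + os.path.join(os.getcwd(), "install"),
        "-DProtobuf_INCLUDE_DIR=" + os.path.join(install, "include"),
        "-DProtobuf_LIBRARIES=" + os.path.join(install, "lib", "libprotobuf.lib"),
        "-DProtobuf_PROTOC_EXECUTABLE=" + os.path.join(install, "bin", "protoc.exe"),
        "..",
    ]


def missing_protobuf_paths(args):
    """Return the paths given to -DProtobuf_* options that do not exist on disk."""
    missing = []
    for arg in args:
        if arg.startswith("-DProtobuf_"):
            path = arg.split("=", 1)[1]
            if not os.path.exists(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    args = ncnn_configure_args(r"C:\path\to\protobuf")  # hypothetical placeholder path
    problems = missing_protobuf_paths(args)
    if problems:
        print("Fix these paths before running cmake:", *problems, sep="\n  ")
    else:
        subprocess.run(args, check=True)
```

Failing early on a missing libprotobuf.lib or protoc.exe points directly at the misconfigured path instead of at a later, unrelated-looking install error.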

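One more reminder: if an error occurs during the build, delete the build directory, recreate it, and build again. Otherwise the previous error tends to recur, because CMake caches the earlier configuration in CMakeCache.txt.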
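Resetting the build directory can be done in a couple of lines. This is a small sketch using the standard library; the function name is hypothetical, and the path you pass should be the build folder only, since everything inside it is deleted.

```python
import os
import shutil


def reset_build_dir(path):
    """Delete a CMake build directory (including CMakeCache.txt) and recreate it empty."""
    if os.path.isdir(path):
        shutil.rmtree(path)  # removes cached configuration and all build outputs
    os.makedirs(path)
    return path
```

For example, `reset_build_dir("build-vs2019")` run from the ncnn source root gives a clean directory to configure in.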

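Conversion steps

Go to the build folder created in the previous step inside ncnn-20230223-full-source (build-vs2019 if you followed the commands above). This is where the built ncnn files are.
Then go to the tools\onnx\Release directory inside that build folder, open a cmd prompt there, and enter the following command: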

onnx2ncnn.exe mobilenet.onnx mobilenet.param mobilenet.bin

This generates the two ncnn model files, mobilenet.param and mobilenet.bin, in the current directory.
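A quick sanity check on the result is to read the header of the .param file. A text-format ncnn .param file begins with the magic number 7767517, followed by a line holding the layer count and blob count. The sketch below (function name hypothetical) reads those two lines; it also catches the case where the .param and .bin arguments were passed to onnx2ncnn in the wrong order.

```python
NCNN_PARAM_MAGIC = "7767517"  # first line of a text-format ncnn .param file


def read_param_header(path):
    """Return (layer_count, blob_count) from an ncnn .param file, or raise ValueError."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        magic = f.readline().strip()
        counts = f.readline().split()
    if magic != NCNN_PARAM_MAGIC:
        raise ValueError(f"{path} does not start with {NCNN_PARAM_MAGIC}")
    if len(counts) != 2:
        raise ValueError(f"{path} has a malformed layer/blob count line")
    layer_count, blob_count = (int(c) for c in counts)
    return layer_count, blob_count
```

Calling `read_param_header("mobilenet.param")` should return two positive integers for a valid conversion.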

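Note also that ncnn supports only a limited set of operators, so model conversion often fails. Even the mobilenet_v2 (mv2) model used in this experiment failed, with the following messages:
Shape not supported yet!
Unknown data type 0
A failed attempt indeed!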
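When the converter prints many such lines, it can help to collect the operator names it complains about. The sketch below assumes the message format matches the log quoted above ("<Op> not supported yet!"); the function name is made up for illustration, and the text would come from the captured output of onnx2ncnn.

```python
import re

# One operator name per line, in the form "<Op> not supported yet!"
_UNSUPPORTED = re.compile(r"^[ \t]*(\w+) not supported yet!", re.MULTILINE)


def unsupported_ops(converter_output):
    """Return the sorted, de-duplicated operator names reported as unsupported."""
    return sorted(set(_UNSUPPORTED.findall(converter_output)))


if __name__ == "__main__":
    log = "Shape not supported yet!\nUnknown data type 0\n"
    print(unsupported_ops(log))  # prints ['Shape']
```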

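Afterword

After painstakingly building and installing ncnn and all of its dependencies, ending up with an unsupported-operator error was a real letdown. Later I found that the ncnn official site provides prebuilt ncnn libraries, so there is no need to go through the trouble of building it yourself; you can simply download one. I downloaded the version built for Windows with VS2019.
After downloading and extracting it, open ncnn-20230223-windows-vs2019\x64\bin, where you will find the onnx2ncnn.exe tool.

I tried converting the mv2 ONNX model to ncnn with this tool as well and got the same error.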