Notes on Porting a Custom Algorithm to the BM1684X

2025/1/9 15:23:37

Porting steps------------------------------------------------------------------------

First, build your own network model and export it to ONNX format. For details, see -->

https://github.com/warren-wzw/MNIST-pytorch.git

Convert the ONNX model to bmodel format with the tpu-mlir tool. For details, see -->
BM1684X: converting an ONNX model to bmodel (warren@伟's blog, CSDN) https://kdocs.cn/l/coKOub6Bysyf
Set up the sophon-sail environment on the board. For details, see -->
BM1684X development environment setup, SOC mode (warren@伟's blog, CSDN) https://kdocs.cn/l/ce7T9GNtS3D3
On the board, create a folder named MNIST with the following contents: datasets holds the test dataset train-images-idx3-ubyte; test_output_fp16_1b.bmodel and test_output_fp32_1b.bmodel are the bmodel files converted from the ONNX model; test.py is the test code.
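
For orientation, train-images-idx3-ubyte is in the IDX format: a 16-byte big-endian header (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The sketch below shows one way to read that header and find a single image in the file contents. The helper names `read_idx3_header` and `image_pixels` are hypothetical and chosen for illustration, and the data used here is a tiny synthetic file rather than the real dataset.

```python
import struct

# IDX3 header: magic, image count, rows, cols, all big-endian unsigned 32-bit
IDX3_HEADER = struct.Struct(">IIII")
IDX3_MAGIC = 2051  # 0x00000803: unsigned-byte data, 3 dimensions


def read_idx3_header(data: bytes):
    """Return (count, rows, cols) from the contents of an IDX3 image file."""
    magic, count, rows, cols = IDX3_HEADER.unpack_from(data, 0)
    if magic != IDX3_MAGIC:
        raise ValueError(f"unexpected magic number {magic}, expected {IDX3_MAGIC}")
    return count, rows, cols


def image_pixels(data: bytes, index: int):
    """Return the raw 0-255 pixel values of image `index` as a flat list."""
    count, rows, cols = read_idx3_header(data)
    if not 0 <= index < count:
        raise IndexError(f"image index {index} out of range 0..{count - 1}")
    size = rows * cols
    start = IDX3_HEADER.size + size * index  # header size + size * index
    return list(data[start:start + size])


# Synthetic stand-in for the real file: two 2x2 images
demo = IDX3_HEADER.pack(IDX3_MAGIC, 2, 2, 2) + bytes([0, 1, 2, 3, 252, 253, 254, 255])
print(read_idx3_header(demo))
print(image_pixels(demo, 1))
```

For the real MNIST file, each image is 28*28 = 784 bytes, so image n starts at byte 16 + 784*n.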

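The basic idea is to use the API provided by Sophon to load a bmodel built for the BM1684X, and then run inference through the same API. The official sail API reference is here -->

3. API Reference (sophon-sail v23.03.01 documentation)

The test code is walked through below.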

#import cv2
import numpy as np
import sophon.sail as sail
import time

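num = -1               #index of the current image; incremented at the top of the loop
inference_time = []    #per-image inference time in seconds
print("expected labels (index-label): 0-5 1-0 2-4 3-1 4-9 5-2 6-1 7-3 8-1 9-4, e.g. image 9 shows the digit 4")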
 
engine = sail.Engine("./test_output_fp32_1b.bmodel",0,sail.IOMode.SYSIO) #load the FP32 model on TPU 0, with input and output tensors in system memory
#engine = sail.Engine("./test_output_fp16_1b.bmodel",0,sail.IOMode.SYSIO) #alternative: load the FP16 model on TPU 0, with input and output tensors in system memory

graph_name =engine.get_graph_names()[0]                      #first graph in the bmodel, here test_output
input_tensor_name = engine.get_input_names(graph_name)[0]    #first input tensor, here input.1
output_tensor_name = engine.get_output_names(graph_name)[0]  #first output tensor, here 25_LogSoftmax

batchsize,channel,height,width = engine.get_input_shape(graph_name,input_tensor_name) #input shape: batch size 1, 1 channel, height 28, width 28

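#read the whole image file into memory
with open("./datasets/train-images-idx3-ubyte","rb") as f:
    file = f.read()
for i in range(8000): 
    num =num +1  
    offset = 16+784*num   #byte offset of image num: 16-byte file header + 784 bytes (28*28) per image
    #NOTE: int(...,16) reads each byte's decimal digits as hexadecimal (e.g. 255 -> 0x255 = 597);
    #use the raw byte values instead if the model was trained on 0-255 pixels
    image1 = [int(str(item).encode('ascii'),16) for item in file[offset:offset+784]]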

    #reshape the input data
    input_data = np.array(image1,dtype=np.float32).reshape(1,1,28,28)  #reshape the image to (1, 1, 28, 28)
    input_data_final = {input_tensor_name:input_data}     #process() expects a dictionary mapping input tensor names to arrays
    start_time = time.time()
    outputs = engine.process(graph_name,input_data_final) #run model inference
    end_time = time.time()
    inference_time.append(end_time - start_time)  
 
    result = outputs[output_tensor_name]  #look up the output tensor by name
    max_value=np.argmax(result)           #index of the highest score, i.e. the predicted digit
    print("----------------------------------the result is ",max_value,"the time is ",inference_time[-1]*1000,"ms")  #time of the call that just finished

mean = (sum(inference_time) / (num+1))*1000   #average over the num+1 inferences actually run, in ms
print("-----FP32--","loop ",num+1,"times","average time",mean,"ms")
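
The average printed by the script weights every call equally. The first call on an accelerator is often slower than the rest, though that was not measured here. As a rough sketch, the timings could be summarized with the first call excluded. The helper name `summarize_latencies` and the default of one warm-up call are assumptions made for illustration, and the sample timings below are made up.

```python
import statistics


def summarize_latencies(seconds, warmup=1):
    """Summarize per-call latencies given in seconds, reported in milliseconds.

    The first `warmup` calls are dropped before computing the statistics.
    """
    samples = [s * 1000.0 for s in seconds[warmup:]]
    if not samples:
        raise ValueError("no samples left after discarding warm-up calls")
    return {
        "count": len(samples),
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Made-up timings: one slow first call followed by three steady ones
demo = [0.050, 0.002, 0.004, 0.003]
print(summarize_latencies(demo))
```

In the test script, the list passed in would be inference_time. Replacing time.time() with time.perf_counter() is also worth considering, since it is a monotonic clock intended for measuring short intervals.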