(Tencent TNN study) Deploying a PyTorch model to mobile

    Tech · 2023-08-19

    These notes are divided into four parts:

    1. pytorch2onnx
    2. onnx2tnn
    3. TNN result verification
    4. Use on mobile (Android)

    1. pytorch2onnx

    Environment:

    pytorch 1.4.0
    onnx 1.6.0 (conversion)
    onnxruntime 1.3.0 (testing)
    onnx-simplifier 0.2.9 (model simplification; in my tests, skipping this step caused errors in the later steps)
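Since the later steps are sensitive to these versions, it can help to confirm what is actually installed before converting. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `check_versions` are illustrative, and the PyPI distribution names (`torch`, `onnx`, `onnxruntime`, `onnx-simplifier`) are assumptions rather than something stated in the post.

```python
from importlib import metadata

# Versions listed in this post (PyPI distribution names are assumed).
EXPECTED = {
    "torch": "1.4.0",
    "onnx": "1.6.0",
    "onnxruntime": "1.3.0",
    "onnx-simplifier": "0.2.9",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # Local build suffixes such as "1.4.0+cpu" still count as a match.
        matches = found is not None and found.split("+")[0] == want
        report[name] = (want, found, matches)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {found}, {'OK' if ok else 'MISMATCH'}")
```

The `lookup` parameter exists only so the check can be exercised without the packages installed.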

    Conversion code:

    import onnx
    import torch
    import numpy as np
    import cv2
    from test_net import TestModel

    torch_model = TestModel("model.pt")
    torch_model.eval()  # set the model to inference mode

    batch_size = 1  # batch size
    input_shape = (3, 384, 384)  # input shape (C, H, W)
    x = torch.randn(batch_size, *input_shape)  # dummy input tensor
    export_onnx_file = "./model.onnx"  # output ONNX file name

    torch.onnx.export(
        torch_model,
        x,
        export_onnx_file,
        export_params=True,
        opset_version=11,
        do_constant_folding=True,  # whether to run constant folding for optimization
        input_names=["input"],  # the model's input names
        output_names=["output"],  # the model's output names
        dynamic_axes={
            "input": {0: "batch_size"},  # variable-length axes
            "output": {0: "batch_size"},
        },
    )
    print("get onnx ok!")

    Test the converted model with onnxruntime:

    import time

    import cv2
    import numpy as np
    import onnxruntime
    import torch

    (width, height) = (384, 384)


    def to_numpy(tensor):
        return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()


    # Create the session once rather than on every frame.
    ort_session = onnxruntime.InferenceSession("model.onnx")
    input_name = ort_session.get_inputs()[0].name

    cap = cv2.VideoCapture(0)
    while True:
        ret, img = cap.read()
        time_start = time.time()
        if img is None:
            print("no image input!")
            break
        if img.ndim == 2:
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
        in_height, in_width, _ = img.shape
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255.0
        img_resized = cv2.resize(img, (width, height), interpolation=cv2.INTER_AREA)
        img_resized = torch.from_numpy(np.transpose(img_resized, (2, 0, 1))).contiguous().float()
        value = img_resized.unsqueeze(0)

        # The input must be float32. Passing the wrong data type fails with:
        # Actual: (N11onnxruntime17PrimitiveDataTypeIdEE) , expected: (N11onnxruntime17PrimitiveDataTypeIfEE)
        ort_inputs = {input_name: to_numpy(value).astype(np.float32)}
        ort_outs = ort_session.run(None, ort_inputs)
        result = np.array(ort_outs[0][0, 0, :, :])
        print(result.shape)
        result_resized = cv2.resize(result, (in_width, in_height), interpolation=cv2.INTER_AREA)
        print("cost : %.3f (s)" % (time.time() - time_start))
        cv2.namedWindow("re", 2)
        cv2.imshow("re", result_resized)
        if cv2.waitKey(1) == 27:  # Esc
            break

    cap.release()
    cv2.destroyAllWindows()
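The preprocessing done inside the capture loop (BGR to RGB, scaling to [0, 1], HWC to CHW, adding a batch dimension, casting to float32) can be isolated and checked without a camera or a model. This is a NumPy-only sketch under the assumption that the frame has already been resized; the function name `to_model_input` is hypothetical. It also shows why the explicit cast matters: dividing a uint8 image by 255.0 yields float64, which is the type the onnxruntime error above complains about.

```python
import numpy as np


def to_model_input(frame_bgr):
    """Convert an HxWx3 uint8 BGR frame (already resized) to a 1x3xHxW float32 RGB array."""
    rgb = frame_bgr[:, :, ::-1]            # BGR -> RGB by reversing the channel axis
    scaled = rgb / 255.0                   # uint8 / float gives float64 in [0, 1]
    chw = np.transpose(scaled, (2, 0, 1))  # HWC -> CHW
    batch = chw[np.newaxis, ...]           # add the batch dimension
    # The exported model expects float32, so cast explicitly.
    return np.ascontiguousarray(batch, dtype=np.float32)


if __name__ == "__main__":
    dummy = np.zeros((384, 384, 3), dtype=np.uint8)
    dummy[..., 0] = 255  # pure blue in BGR order
    x = to_model_input(dummy)
    print(x.shape, x.dtype)
```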

    Model simplification:

    python -m onnxsim model10.onnx model_sim.onnx --input-shape 1,3,384,384

    2. onnx2tnn

    See the official TNN documentation.

    1. Download the TNN source from https://github.com/Tencent/TNN, enter the ~/TNN-master/tools/onnx2tnn/onnx-converter folder, and run ./build to compile.

    2. Run the conversion command:

    python onnx2tnn.py model/model_sim.onnx -version=algo_version -optimize=1

    The output is:

    algo_optimize 1
    onnx_net_opt_path /home/jiang/TNN-master/tools/onnx2tnn/onnx-converter/model/model_sim.opt.onnx
    1.----onnx_optimizer: /home/jiang/TNN-master/tools/onnx2tnn/onnx-converter/model/model_sim.onnx
    /home/jiang/TNN-master/tools/onnx2tnn/onnx-converter
    ----load onnx model: /home/jiang/TNN-master/tools/onnx2tnn/onnx-converter/model/model_sim.onnx
    ----onnxsim.simplify error: You'd better check the result with Netron
    ----onnxsim.simplify error: <class 'RuntimeError'>
    ----export optimized onnx model: /home/jiang/TNN-master/tools/onnx2tnn/onnx-converter/model/model_sim.opt.onnx
    ----export optimized onnx model done
    2.----onnx2tnn: /home/jiang/TNN-master/tools/onnx2tnn/onnx-converter/model/model_sim.opt.onnx
    get_node_attr_ai [Line 116] name :546
    get_node_attr_ai [Line 116] name :585
    get_node_attr_ai [Line 116] name :624
    get_node_attr_ai [Line 116] name :663
    get_node_attr_ai [Line 116] name :693
    TNNLayerParam [Line 61] resize: coordinate_transformation_mode(pytorch_half_pixel) is not supported, result may be different.
    3.----onnx2tnn status: 0
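The converter output above ends with status 0 even though the optimizer step printed an error, so the final status alone does not tell the whole story. A small script can scan the log for error and warning lines. This is only a sketch: `summarize_log` is a hypothetical helper, and its patterns are derived from the log lines shown in this post, not from any documented log format.

```python
import re


def summarize_log(text):
    """Collect error lines, 'not supported' warnings, and the final status from converter output."""
    errors, warnings, status = [], [], None
    for line in text.splitlines():
        line = line.strip()
        if "error" in line.lower():
            errors.append(line)
        elif "is not supported" in line:
            warnings.append(line)
        match = re.search(r"onnx2tnn status:\s*(-?\d+)", line)
        if match:
            status = int(match.group(1))
    return {"errors": errors, "warnings": warnings, "status": status}


# A few lines copied from the converter output in this post.
SAMPLE = """\
----onnxsim.simplify error: You'd better check the result with Netron
----onnxsim.simplify error: <class 'RuntimeError'>
TNNLayerParam [Line 61] resize: coordinate_transformation_mode(pytorch_half_pixel) is not supported, result may be different.
3.----onnx2tnn status: 0
"""

if __name__ == "__main__":
    summary = summarize_log(SAMPLE)
    print(summary["status"], len(summary["errors"]), len(summary["warnings"]))
```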

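    The run printed the errors "----onnxsim.simplify error: You'd better check the result with Netron" and "----onnxsim.simplify error: <class 'RuntimeError'>". With the parameter algo_optimize=0, that is, without optimization, the error is not reported and the conversion succeeds.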
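Because the converter warns that the resize result "may be different", it is worth comparing the converted model's output with the ONNX output numerically. The sketch below assumes both outputs have already been dumped to NumPy arrays of the same shape; the function name `compare_outputs` and the tolerance `atol=1e-3` are illustrative choices, not values taken from TNN.

```python
import numpy as np


def compare_outputs(reference, candidate, atol=1e-3):
    """Return the max absolute difference, cosine similarity, and a pass flag."""
    ref = np.asarray(reference, dtype=np.float64).ravel()
    cand = np.asarray(candidate, dtype=np.float64).ravel()
    if ref.shape != cand.shape:
        raise ValueError("outputs must have the same number of elements")
    max_abs = float(np.max(np.abs(ref - cand)))
    denom = np.linalg.norm(ref) * np.linalg.norm(cand)
    cosine = float(ref @ cand / denom) if denom > 0 else float("nan")
    return {"max_abs_diff": max_abs, "cosine": cosine, "close": max_abs <= atol}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    onnx_out = rng.random((1, 1, 384, 384)).astype(np.float32)
    tnn_out = onnx_out + 1e-5  # stand-in for a real TNN output
    print(compare_outputs(onnx_out, tnn_out))
```

A large maximum difference combined with a cosine similarity near 1 usually points to a localized discrepancy, which is what a resize-mode mismatch would be expected to produce near edges.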

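    3. TNN result verification

    1) Build TNN. For building the library on other platforms, see the linked build instructions. The Android build is described below.

    Requirements

    Dependencies: cmake (version 3.6 or later); sudo apt-get install attr

    NDK configuration

    Download the NDK (version >= 15c) from https://developer.android.com/ndk/downloads and set the environment variables: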

    sudo gedit ~/.bashrc

    # ndk
    export ANDROID_NDK=/home/jiang/Android/android-ndk-r21b
    export PATH=${PATH}:$ANDROID_NDK
    # tnn
    export TNN_ROOT_PATH=/home/jiang/TNN-master
    export PATH=${PATH}:$TNN_ROOT_PATH
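Before running the build script, the two variables can be sanity-checked from Python. This is a minimal sketch using only the standard library; `check_env` is a hypothetical helper, and the variable names are the ones exported in the .bashrc lines above.

```python
import os


def check_env(env, names=("ANDROID_NDK", "TNN_ROOT_PATH")):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    path_entries = env.get("PATH", "").split(os.pathsep)
    for name in names:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
            continue
        if not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
        if value not in path_entries:
            problems.append(f"{name} is not on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_env(os.environ) or ["environment looks fine"]:
        print(problem)
```

Run it in a new shell after `source ~/.bashrc` so that it sees the updated environment.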

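    source ~/.bashrc

    Build

    Switch to ~/TNN-master/scripts and edit the build_android.sh script, then run ./build_android.sh to build.

    ABIA32="armeabi-v7a with NEON"
    ABIA64="arm64-v8a"
    STL="c++_static"
    SHARED_LIB="ON"             # ON builds a shared library, OFF builds a static library
    ARM="ON"                    # ON builds the library with Arm CPU support
    OPENMP="ON"                 # ON enables OpenMP
    OPENCL="ON"                 # ON builds the library with Arm GPU support
    SHARING_MEM_WITH_OPENGL=0   # 1 means OpenGL textures can be shared with OpenCL

    When the build finishes, the release directory under the current directory contains the armeabi-v7a library, the arm64-v8a library, and the include headers.

    2) TNN model verification

    Enter examples/linux and build with ./build_linux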

    Error 1:

    /home/jiang/TNN-master/TNN-master/examples/base/tnn_sdk_sample.cc: In function ‘void tnn::NMS(std::vector<tnn::ObjectInfo>&, std::vector<tnn::ObjectInfo>&, float, tnn::TNNNMSType)’:
    /home/jiang/TNN-master/TNN-master/examples/base/tnn_sdk_sample.cc:727:10: error: ‘sort’ is not a member of ‘std’; did you mean ‘sqrt’?
      727 | std::sort(input.begin(), input.end(), [](const ObjectInfo &a, const ObjectInfo &b) { return a.score > b.score; });

    Fix: add the following at the top of tnn_sdk_sample.cc:

    #include <algorithm>

    Error 2:

    /usr/bin/ld: cannot find -lTNN
    collect2: error: ld returned 1 exit status
    make[2]: *** [CMakeFiles/demo_arm_linux_facedetector.dir/build.make:279:demo_arm_linux_facedetector] Error 1
    make[1]: *** [CMakeFiles/Makefile2:78:CMakeFiles/demo_arm_linux_facedetector.dir/all] Error 2
    make[1]: *** Waiting for unfinished jobs....
    [100%] Linking CXX executable demo_arm_linux_imageclassify
    /usr/bin/ld: cannot find -lTNN
    collect2: error: ld returned 1 exit status
    make[2]: *** [CMakeFiles/demo_arm_linux_imageclassify.dir/build.make:279:demo_arm_linux_imageclassify] Error 1
    make[1]: *** [CMakeFiles/Makefile2:105:CMakeFiles/demo_arm_linux_imageclassify.dir/all] Error 2
    make: *** [Makefile:84:all] Error 2

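    Fix:

    1. Build TNN: in the TNN directory, run cmake . ; make. When the build succeeds, libTNN.so is produced.
    2. Symlink it into the library directory: sudo ln -s libTNN.so /usr/lib/libTNN.so. (My symlink was always reported as broken, so I copied the file over with cp instead. A relative target such as libTNN.so is resolved relative to /usr/lib, where the link lives, so the link only resolves if the target is given as an absolute path.)

    After a successful build, the build_linux folder contains demo_arm_linux_facedetector and demo_arm_linux_imageclassify, which are the official examples…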
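The broken symlink mentioned above is most likely a consequence of how relative symlink targets are resolved: the stored target is interpreted relative to the directory containing the link, not the directory where `ln -s` was run. The sketch below reproduces this in a temporary directory; the directory and file names are placeholders, and no real library is involved.

```python
import os
import tempfile


def demo_symlink(base):
    """Create one relative-target and one absolute-target symlink and report which resolve."""
    build_dir = os.path.join(base, "build")
    lib_dir = os.path.join(base, "lib")
    os.makedirs(build_dir)
    os.makedirs(lib_dir)

    target = os.path.join(build_dir, "libTNN.so")
    with open(target, "wb") as f:
        f.write(b"placeholder")  # stand-in for the real library

    # Like `ln -s libTNN.so <lib_dir>/rel.so`: the stored target is the literal
    # string "libTNN.so", resolved relative to lib_dir, where no such file exists.
    rel_link = os.path.join(lib_dir, "rel.so")
    os.symlink("libTNN.so", rel_link)

    # Like `ln -s "$(pwd)/libTNN.so" <lib_dir>/abs.so`, run from build_dir.
    abs_link = os.path.join(lib_dir, "abs.so")
    os.symlink(os.path.abspath(target), abs_link)

    # os.path.exists follows symlinks, so it is False for a dangling link.
    return {"relative": os.path.exists(rel_link), "absolute": os.path.exists(abs_link)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_symlink(tmp))
```

On Linux the relative link is reported as unresolved and the absolute one as resolved, which matches the behaviour described in the fix.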
