Installing CUDA Toolkit + cuDNN + TensorRT on Linux

  • Installing CUDA Toolkit + cuDNN on Windows

Installing CUDA Toolkit + cuDNN

Prerequisites

  • Use the command ubuntu-drivers devices to see your GPU model and the recommended driver version.
  • Search for which CUDA versions nvidia-driver-*** supports, or check the NVIDIA documentation for the CUDA release that matches your driver (e.g. 450, 460).


Downloads

Cloud-drive download

  • https://www.aliyundrive.com/s/VXfZQfRqf1y

  • Web download links for CUDA 11.4:

NAME          LINK
CUDA_TOOLKIT  https://musetransfer.com/s/ty5q8bfjm — cuda_11.4.0_470.42.01_linux.run (link valid until March 3)
CUDNN         https://musetransfer.com/s/0wjwsgw9a — cudnn-linux-x86_64-8.8.0.121_cuda11-archive.tar.xz (link valid until March 3)

Official download

cuda_11.0.3_450.51.06_linux.run — this guide installs CUDA 11.0 (10.2 was used previously; check your Ubuntu release with lsb_release -a)

https://developer.nvidia.com/rdp/cudnn-download


TensorRT

https://developer.nvidia.com/nvidia-tensorrt-8x-download

  • Download links:
  • https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.5.1/tars/TensorRT-8.5.1.7.Linux.x86_64-gnu.cuda-11.8.cudnn8.6.tar.gz
  • https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.5.1/tars/TensorRT-8.5.1.7.Linux.x86_64-gnu.cuda-10.2.cudnn8.6.tar.gz

Installation

Check whether a driver is already installed

~/文档/cuda$ nvidia-smi
Command 'nvidia-smi' not found, but can be installed with:
sudo apt install nvidia-utils-390         # version 390.157-0ubuntu0.22.04.1, or
sudo apt install nvidia-utils-450-server  # version 450.216.04-0ubuntu0.22.04.1
sudo apt install nvidia-utils-470         # version 470.161.03-0ubuntu0.22.04.1
sudo apt install nvidia-utils-470-server  # version 470.161.03-0ubuntu0.22.04.1
sudo apt install nvidia-utils-510         # version 510.108.03-0ubuntu0.22.04.1
sudo apt install nvidia-utils-515         # version 515.86.01-0ubuntu0.22.04.1
sudo apt install nvidia-utils-515-server  # version 515.86.01-0ubuntu0.22.04.1
sudo apt install nvidia-utils-525         # version 525.78.01-0ubuntu0.22.04.1
sudo apt install nvidia-utils-525-server  # version 525.60.13-0ubuntu0.22.04.1
sudo apt install nvidia-utils-418-server  # version 418.226.00-0ubuntu4
sudo apt install nvidia-utils-510-server  # version 510.47.03-0ubuntu3

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Installation steps (driver + toolkit)

  • ubuntu-drivers devices — check your GPU model and the recommended driver version

  • sudo apt-get install nvidia-driver-460 (CUDA 11.0 needs a driver of at least 450.80) or sudo apt install nvidia-driver-470

  • If the installed driver is older than that, remove it first: sudo apt-get remove nvidia-driver-460, then sudo apt autoremove

  • sudo reboot

  • nvidia-smi

  • sudo sh cuda_11.0.3_450.51.06_linux.run

  • The installer then offers two choices, abort and continue; choose continue.


  • If you get: Failed to verify gcc version. See log at /var/log/cuda-installer.log for details. Error: unsupported compiler: 11.3.0. Use --override to override this check.

    then run: sudo sh cuda_11.0.3_450.51.06_linux.run --override


  • Select only the toolkit (the driver is already installed via apt).


dell@dell-ThinkPad-Edge:~/文档$ sudo sh cuda_11.0.3_450.51.06_linux.run --override
[sudo] password for dell: 
===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.0/
Samples:  Not Selected

Please make sure that
 -   PATH includes /usr/local/cuda-11.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.0/lib64, or, add /usr/local/cuda-11.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.0/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 450.00 is required for CUDA 11.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
  • vim ~/.bashrc and append the following lines:
export PATH="/usr/local/cuda-11.0/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH"


  • source ~/.bashrc
  • nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
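
  • To double-check that the toolkit and driver actually work together (and not just that nvcc is on PATH), a minimal runtime query helps. This is a sketch of my own, not from the original post — the file name and build line are illustrative, but the calls are the standard CUDA runtime API:

// check_cuda.cpp — verify that the driver, the runtime, and the GPU are all visible.
// Build and run: nvcc check_cuda.cpp -o check_cuda && ./check_cuda
#include <cuda_runtime_api.h>
#include <iostream>

int main() {
    int driverVer = 0, runtimeVer = 0, deviceCount = 0;
    cudaDriverGetVersion(&driverVer);    // highest CUDA version the installed driver supports
    cudaRuntimeGetVersion(&runtimeVer);  // CUDA version of the runtime we linked against
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::cerr << "cudaGetDeviceCount failed: " << cudaGetErrorString(err) << std::endl;
        return 1;
    }
    std::cout << "driver supports CUDA " << driverVer / 1000 << "." << (driverVer % 100) / 10
              << ", runtime is CUDA " << runtimeVer / 1000 << "." << (runtimeVer % 100) / 10
              << ", devices found: " << deviceCount << std::endl;
    return 0;
}

If this prints a sensible pair of versions and at least one device, the toolkit install is good; a failure here usually points back at the driver, not the toolkit.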

Removing the CUDA toolkit

To uninstall it later, run cuda-uninstaller in /usr/local/cuda-11.0/bin (as noted in the installer summary above).

cuDNN

  • There are two ways to install it:

Deb package method

  • sudo dpkg -i cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb
dell@dell-ThinkPad-Edge:~/文档$ sudo dpkg -i cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb 
[sudo] password for dell: 
Selecting previously unselected package cudnn-local-repo-ubuntu2204-8.8.0.121.
(Reading database ... 236160 files and directories currently installed.)
Preparing to unpack cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb ...
Unpacking cudnn-local-repo-ubuntu2204-8.8.0.121 (1.0-1) ...
Setting up cudnn-local-repo-ubuntu2204-8.8.0.121 (1.0-1) ...
The public cudnn-local-repo-ubuntu2204-8.8.0.121 GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/cudnn-local-repo-ubuntu2204-8.8.0.121/cudnn-local-B66125A0-keyring.gpg /usr/share/keyrings/
dell@dell-ThinkPad-Edge:~/文档$ sudo dpkg -i cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb
(Reading database ... 236176 files and directories currently installed.)
Preparing to unpack cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb ...
Unpacking cudnn-local-repo-ubuntu2204-8.8.0.121 (1.0-1) over (1.0-1) ...
Setting up cudnn-local-repo-ubuntu2204-8.8.0.121 (1.0-1) ...

Direct copy method

  • tar -xf cudnn-linux-x86_64-8.8.0.121_cuda11-archive.tar.xz
  • Copy the files into the matching CUDA directories and make them readable/executable:


sudo cp ./include/cudnn* /usr/local/cuda-11.0/include
sudo cp ./lib/libcudnn* /usr/local/cuda-11.0/lib64 
sudo chmod 777 /usr/local/cuda-11.0/include/cudnn* /usr/local/cuda-11.0/lib64/libcudnn*
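
  • A quick way to confirm the copy worked is a tiny program that compares the header version with the version of the library the loader actually picks up. This is a sketch under the path assumptions above (the file name and build line are illustrative; cudnnGetVersion and cudnnCreate are the standard cuDNN API):

// check_cudnn.cpp — header version vs. loaded-library version, plus a real handle creation.
// Build: g++ check_cudnn.cpp -I/usr/local/cuda-11.0/include -L/usr/local/cuda-11.0/lib64 -lcudnn -o check_cudnn
#include <cudnn.h>
#include <iostream>

int main() {
    std::cout << "header  CUDNN_VERSION:     " << CUDNN_VERSION << std::endl;      // compile-time, from cudnn.h
    std::cout << "library cudnnGetVersion(): " << cudnnGetVersion() << std::endl;  // run-time, from libcudnn
    cudnnHandle_t handle;
    cudnnStatus_t status = cudnnCreate(&handle);  // touches the GPU, so it also exercises the driver
    std::cout << "cudnnCreate: " << cudnnGetErrorString(status) << std::endl;
    if (status == CUDNN_STATUS_SUCCESS) cudnnDestroy(handle);
    return status == CUDNN_STATUS_SUCCESS ? 0 : 1;
}

A mismatch between the two version numbers usually means stale headers under include/ or an older libcudnn found first on LD_LIBRARY_PATH.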

TensorRT (official guide)

        Mind which TensorRT version you install: many models were written against TensorRT 7.x, while TensorRT 8.x is what is tested here.

  • tar -xf TensorRT-8.5.1.7.Linux.x86_64-gnu.cuda-11.8.cudnn8.6.tar.gz


  • Copy the files:
  • sudo cp -r ./TensorRT-8.5.1.7 /usr/local
  • sudo chmod -R 777 /usr/local/TensorRT-8.5.1.7
  • vim ~/.bashrc
  • export LD_LIBRARY_PATH=/usr/local/TensorRT-8.5.1.7/lib:$LD_LIBRARY_PATH
  • export LIBRARY_PATH=/usr/local/TensorRT-8.5.1.7/lib:$LIBRARY_PATH

LIBRARY_PATH is used by gcc before compilation to search for directories containing libraries that need to be linked to your program.
LD_LIBRARY_PATH is used by your program to search for directories containing the libraries after it has been successfully compiled and linked.
  • source ~/.bashrc
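
  • The same trick verifies the TensorRT tarball install — a sketch assuming the /usr/local/TensorRT-8.5.1.7 location used above (the file name is illustrative; getInferLibVersion and the NV_TENSORRT_* macros are TensorRT's own API). Note how the build line uses -L (link time, what LIBRARY_PATH covers), while running the binary relies on LD_LIBRARY_PATH:

// check_trt.cpp — compare the headers we compiled against with the libnvinfer the loader finds.
// Build: g++ check_trt.cpp -I/usr/local/TensorRT-8.5.1.7/include -I/usr/local/cuda-11.0/include -L/usr/local/TensorRT-8.5.1.7/lib -lnvinfer -o check_trt
#include <NvInfer.h>
#include <iostream>

int main() {
    // Compile-time version from the headers:
    std::cout << "headers: " << NV_TENSORRT_MAJOR << "." << NV_TENSORRT_MINOR << "." << NV_TENSORRT_PATCH << std::endl;
    // Run-time version reported by the loaded libnvinfer, e.g. 8501 for TensorRT 8.5.1:
    std::cout << "library: " << getInferLibVersion() << std::endl;
    return 0;
}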

Usage example

  • https://github.com/wang-xinyu/tensorrtx
  • https://github.com/wang-xinyu/tensorrtx/blob/master/tutorials/getting_started.md
  • TensorRT 7.x
  • If the archive fails to extract, see: https://www.5axxw.com/questions/content/lkk5xr

  • Because it uses newer APIs, this project only works on TensorRT 8.2.3+ as-is, but you can consult the docs and adapt the relevant API calls yourself.
    Running the conda-environment build of torch turned out noticeably slower (6 ms -> 30 ms).
cmake_minimum_required(VERSION 2.6)

project(alexnet)

add_definitions(-std=c++11)

option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

include_directories(${PROJECT_SOURCE_DIR}/include)
# include and link dirs of cuda and tensorrt; adapt them if yours are different
# cuda
include_directories(/usr/local/cuda/include)
link_directories(/usr/local/cuda/lib64/)
link_directories(/usr/local/cuda-11.0/lib64) # cuDNN lives here; the explicit libcudnn.so.8 path is linked below
# tensorrt
include_directories(/usr/local/TensorRT-7.2.3.4/include)
link_directories(/usr/local/TensorRT-7.2.3.4/lib/)

#include_directories(/usr/include/x86_64-linux-gnu/)
#link_directories(/usr/lib/x86_64-linux-gnu/)

add_executable(alexnet ${PROJECT_SOURCE_DIR}/alex.cpp)
target_link_libraries(alexnet nvinfer)
target_link_libraries(alexnet cudart)

add_definitions(-O2 -pthread)
set(CMAKE_CXX_FLAGS "-Wno-error=deprecated-declarations -Wno-deprecated-declarations ")# https://blog.csdn.net/weixin_42156097/article/details/106091555

target_link_libraries(alexnet "/usr/local/cuda-11.0/lib64/libcudnn.so.8")
#include "NvInfer.h"
#include "cuda_runtime_api.h"
#include "logging.h"
#include <fstream>
#include <map>
#include <chrono>

#define CHECK(status) \
    do\
    {\
        auto ret = (status);\
        if (ret != 0)\
        {\
            std::cerr << "Cuda failure: " << ret << std::endl;\
            abort();\
        }\
    } while (0)

// stuff we know about the network and the input/output blobs
static const int INPUT_H = 224;
static const int INPUT_W = 224;
static const int OUTPUT_SIZE = 1000;

const char* INPUT_BLOB_NAME = "data";
const char* OUTPUT_BLOB_NAME = "prob";

using namespace nvinfer1;

static Logger gLogger;

// Load weights from files shared with TensorRT samples.
// TensorRT weight files have a simple space delimited format:
// [type] [size] <data x size in hex>
std::map<std::string, Weights> loadWeights(const std::string file)
{
    std::cout << "Loading weights: " << file << std::endl;
    std::map<std::string, Weights> weightMap;

    // Open weights file
    std::ifstream input(file);
    assert(input.is_open() && "Unable to load weight file.");

    // Read number of weight blobs
    int32_t count;
    input >> count;
    assert(count > 0 && "Invalid weight map file.");

    while (count--)
    {
        Weights wt{DataType::kFLOAT, nullptr, 0};
        uint32_t size;

        // Read name and type of blob
        std::string name;
        input >> name >> std::dec >> size;
        wt.type = DataType::kFLOAT;

        // Load blob
        uint32_t* val = reinterpret_cast<uint32_t*>(malloc(sizeof(uint32_t) * size));  // allocate exactly `size` 32-bit words
        for (uint32_t x = 0, y = size; x < y; ++x)
        {
            input >> std::hex >> val[x];
        }
        wt.values = val;
        
        wt.count = size;
        weightMap[name] = wt;
    }

    return weightMap;
}

// Create the engine using only the API and not any parser.
ICudaEngine* createEngine(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt)
{
    INetworkDefinition* network = builder->createNetworkV2(0U);

    // Create input tensor of shape { 3, INPUT_H, INPUT_W } with name INPUT_BLOB_NAME
    ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{3, INPUT_H, INPUT_W});
    assert(data);

    std::map<std::string, Weights> weightMap = loadWeights("../alexnet.wts");
    Weights emptywts{DataType::kFLOAT, nullptr, 0};

    IConvolutionLayer* conv1 = network->addConvolutionNd(*data, 64, DimsHW{11, 11}, weightMap["features.0.weight"], weightMap["features.0.bias"]);
    assert(conv1);
    conv1->setStrideNd(DimsHW{4, 4});
    conv1->setPaddingNd(DimsHW{2, 2});

    // Add activation layer using the ReLU algorithm.
    IActivationLayer* relu1 = network->addActivation(*conv1->getOutput(0), ActivationType::kRELU);
    assert(relu1);

    // Add max pooling layer with stride of 2x2 and kernel size of 3x3.
    IPoolingLayer* pool1 = network->addPoolingNd(*relu1->getOutput(0), PoolingType::kMAX, DimsHW{3, 3});
    assert(pool1);
    pool1->setStrideNd(DimsHW{2, 2});

    IConvolutionLayer* conv2 = network->addConvolutionNd(*pool1->getOutput(0), 192, DimsHW{5, 5}, weightMap["features.3.weight"], weightMap["features.3.bias"]);
    assert(conv2);
    conv2->setPaddingNd(DimsHW{2, 2});
    IActivationLayer* relu2 = network->addActivation(*conv2->getOutput(0), ActivationType::kRELU);
    assert(relu2);
    IPoolingLayer* pool2 = network->addPoolingNd(*relu2->getOutput(0), PoolingType::kMAX, DimsHW{3, 3});
    assert(pool2);
    pool2->setStrideNd(DimsHW{2, 2});

    IConvolutionLayer* conv3 = network->addConvolutionNd(*pool2->getOutput(0), 384, DimsHW{3, 3}, weightMap["features.6.weight"], weightMap["features.6.bias"]);
    assert(conv3);
    conv3->setPaddingNd(DimsHW{1, 1});
    IActivationLayer* relu3 = network->addActivation(*conv3->getOutput(0), ActivationType::kRELU);
    assert(relu3);

    IConvolutionLayer* conv4 = network->addConvolutionNd(*relu3->getOutput(0), 256, DimsHW{3, 3}, weightMap["features.8.weight"], weightMap["features.8.bias"]);
    assert(conv4);
    conv4->setPaddingNd(DimsHW{1, 1});
    IActivationLayer* relu4 = network->addActivation(*conv4->getOutput(0), ActivationType::kRELU);
    assert(relu4);

    IConvolutionLayer* conv5 = network->addConvolutionNd(*relu4->getOutput(0), 256, DimsHW{3, 3}, weightMap["features.10.weight"], weightMap["features.10.bias"]);
    assert(conv5);
    conv5->setPaddingNd(DimsHW{1, 1});
    IActivationLayer* relu5 = network->addActivation(*conv5->getOutput(0), ActivationType::kRELU);
    assert(relu5);
    IPoolingLayer* pool3 = network->addPoolingNd(*relu5->getOutput(0), PoolingType::kMAX, DimsHW{3, 3});
    assert(pool3);
    pool3->setStrideNd(DimsHW{2, 2});

    IFullyConnectedLayer* fc1 = network->addFullyConnected(*pool3->getOutput(0), 4096, weightMap["classifier.1.weight"], weightMap["classifier.1.bias"]);
    assert(fc1);

    IActivationLayer* relu6 = network->addActivation(*fc1->getOutput(0), ActivationType::kRELU);
    assert(relu6);

    IFullyConnectedLayer* fc2 = network->addFullyConnected(*relu6->getOutput(0), 4096, weightMap["classifier.4.weight"], weightMap["classifier.4.bias"]);
    assert(fc2);

    IActivationLayer* relu7 = network->addActivation(*fc2->getOutput(0), ActivationType::kRELU);
    assert(relu7);

    IFullyConnectedLayer* fc3 = network->addFullyConnected(*relu7->getOutput(0), 1000, weightMap["classifier.6.weight"], weightMap["classifier.6.bias"]);
    assert(fc3);

    fc3->getOutput(0)->setName(OUTPUT_BLOB_NAME);
    std::cout << "set name out" << std::endl;
    network->markOutput(*fc3->getOutput(0));

    // Build engine
    builder->setMaxBatchSize(maxBatchSize);
    config->setMaxWorkspaceSize(1 << 20);
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    std::cout << "build out" << std::endl;

    // Don't need the network any more
    network->destroy();

    // Release host memory
    for (auto& mem : weightMap)
    {
        free((void*) (mem.second.values));
    }

    return engine;
}

void APIToModel(unsigned int maxBatchSize, IHostMemory** modelStream)
{
    // Create builder
    IBuilder* builder = createInferBuilder(gLogger);
    IBuilderConfig* config = builder->createBuilderConfig();

    // Create model to populate the network, then set the outputs and create an engine
    ICudaEngine* engine = createEngine(maxBatchSize, builder, config, DataType::kFLOAT);
    assert(engine != nullptr);

    // Serialize the engine
    (*modelStream) = engine->serialize();

    // Close everything down
    engine->destroy();
    builder->destroy();
}

void doInference(IExecutionContext& context, float* input, float* output, int batchSize)
{
    const ICudaEngine& engine = context.getEngine();

    // Pointers to input and output device buffers to pass to engine.
    // Engine requires exactly IEngine::getNbBindings() number of buffers.
    assert(engine.getNbBindings() == 2);
    void* buffers[2];

    // In order to bind the buffers, we need to know the names of the input and output tensors.
    // Note that indices are guaranteed to be less than IEngine::getNbBindings()
    const int inputIndex = engine.getBindingIndex(INPUT_BLOB_NAME);
    const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB_NAME);

    // Create GPU buffers on device
    CHECK(cudaMalloc(&buffers[inputIndex], batchSize * 3 * INPUT_H * INPUT_W * sizeof(float)));
    CHECK(cudaMalloc(&buffers[outputIndex], batchSize * OUTPUT_SIZE * sizeof(float)));

    // Create stream
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    // DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
    CHECK(cudaMemcpyAsync(buffers[inputIndex], input, batchSize * 3 * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, stream));
    context.enqueue(batchSize, buffers, stream, nullptr);
    CHECK(cudaMemcpyAsync(output, buffers[outputIndex], batchSize * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, stream));
    cudaStreamSynchronize(stream);

    // Release stream and buffers
    cudaStreamDestroy(stream);
    CHECK(cudaFree(buffers[inputIndex]));
    CHECK(cudaFree(buffers[outputIndex]));
}

//int main(int argc, char** argv)
//{
//    if (argc != 2) {
//        std::cerr << "arguments not right!" << std::endl;
//        std::cerr << "./alexnet -s   // serialize model to plan file" << std::endl;
//        std::cerr << "./alexnet -d   // deserialize plan file and run inference" << std::endl;
//        return -1;
//    }
int main() // the original argc/argv entry point is kept commented out above; the mode is hard-coded here
{
    const char* argv[] = {"-s", "-s"}; // argv[1] selects the mode: "-s" serializes, "-d" deserializes and runs inference

    // create a model using the API directly and serialize it to a stream
    char *trtModelStream{nullptr};
    size_t size{0};

    if (std::string(argv[1]) == "-s") {
        IHostMemory* modelStream{nullptr};
        APIToModel(1, &modelStream);
        assert(modelStream != nullptr);

        std::ofstream p("alexnet.engine", std::ios::binary);
        if (!p)
        {
            std::cerr << "could not open plan output file" << std::endl;
            return -1;
        }
        p.write(reinterpret_cast<const char*>(modelStream->data()), modelStream->size());
        modelStream->destroy();
        return 1;
    } else if (std::string(argv[1]) == "-d") {
        std::ifstream file("alexnet.engine", std::ios::binary);
        if (file.good()) {
            file.seekg(0, file.end);
            size = file.tellg();
            file.seekg(0, file.beg);
            trtModelStream = new char[size];
            assert(trtModelStream);
            file.read(trtModelStream, size);
            file.close();
        }
    } else {
        return -1;
    }


    // Subtract mean from image
    float data[3 * INPUT_H * INPUT_W];
    for (int i = 0; i < 3 * INPUT_H * INPUT_W; i++)
        data[i] = 1;

    IRuntime* runtime = createInferRuntime(gLogger);
    assert(runtime != nullptr);
    ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream, size, nullptr);
    assert(engine != nullptr);
    IExecutionContext* context = engine->createExecutionContext();
    assert(context != nullptr);

    // Run inference
    float prob[OUTPUT_SIZE];
    for (int i = 0; i < 100; i++) {
        auto start = std::chrono::system_clock::now();
        doInference(*context, data, prob, 1);
        auto end = std::chrono::system_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
    }

    // Destroy the engine
    context->destroy();
    engine->destroy();
    runtime->destroy();

    // Print histogram of the output distribution
    std::cout << "\nOutput:\n\n";
    for (unsigned int i = 0; i < OUTPUT_SIZE; i++)
    {
        std::cout << prob[i] << ", ";
        if (i % 10 == 0) std::cout << i / 10 << std::endl;
    }
    std::cout << std::endl;

    return 0;
}

Shared libraries keep not being found ->

sudo zip -r -q lib.zip ./lib # back up /usr/lib/ first
sudo cp ./lib64/* /usr/lib/

sudo rm -f /usr/local/cuda-11.0/include/cudnn* /usr/local/cuda-11.0/lib64/libcudnn*

How to fix "cannot find shared .so library" errors on Linux

<- Shared libraries keep not being found
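
  • When a binary keeps failing with "cannot open shared object file", it can be faster to probe the loader directly than to keep rebuilding. This is a small sketch of my own using dlopen, which searches the same paths your program would use at run time:

// check_so.cpp — ask the dynamic loader whether it can find a given library.
// Build: g++ check_so.cpp -ldl -o check_so
// Run:   ./check_so libcudnn.so.8   (or libnvinfer.so, etc.)
#include <dlfcn.h>
#include <iostream>

int main(int argc, char** argv) {
    const char* name = (argc > 1) ? argv[1] : "libnvinfer.so";
    void* handle = dlopen(name, RTLD_LAZY);
    if (!handle) {
        std::cerr << "dlopen failed: " << dlerror() << std::endl;  // same search order as your program
        return 1;
    }
    std::cout << name << " found and loaded" << std::endl;
    dlclose(handle);
    return 0;
}

If dlopen succeeds here but your program still fails, the program was likely linked against a different soname; running ldd on the binary will show which one.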

[E] [TRT] CUDA initialization failure with error 35. Please check your CUDA installation:

This came up with TensorRT-7.2.3.4 (tarball TensorRT-7.2.3.4.Ubuntu-18.04.x86_64-gnu.cuda-11.0.cudnn8.1); the fix was to swap in the cuDNN 8.1 build that this TensorRT release expects (run the copy commands from inside the extracted cuda/ directory):
$~/文档$ tar -xzvf cudnn-11.2-linux-x64-v8.1.0.77.tgz 
cuda/include/cudnn.h
cuda/include/cudnn_adv_infer.h……
sudo rm -f /usr/local/cuda-11.0/include/cudnn* /usr/local/cuda-11.0/lib64/libcudnn*
sudo cp ./include/cudnn* /usr/local/cuda-11.0/include
sudo cp ./lib64/libcudnn* /usr/local/cuda-11.0/lib64 
sudo chmod 777 /usr/local/cuda-11.0/include/cudnn* /usr/local/cuda-11.0/lib64/libcudnn*

Miscellaneous references

To keep TensorRT libraries discoverable, it is recommended to symlink the TensorRT libraries and headers into the system paths (from https://zhuanlan.zhihu.com/p/430470397):

sudo ln -s /usr/local/TensorRT-7.0.0.11/lib/* /usr/lib/
sudo ln -s /usr/local/TensorRT-7.0.0.11/include/* /usr/include/
  • /home/dell/CLionProjects/TENSORRT/tensorrtx-master/alexnet/cmake-build-debug/alexnet: error while loading shared libraries: libcudnn.so.7: cannot open shared object file: No such file or directory

  • finishing deferred symbolic links


  • sudo ldconfig

https://github.com/dlunion/tensorRTIntegrate

  • https://github.com/ttanzhiqiang/onnx_tensorrt_project

  • 5.5: https://github.com/iwatake2222/InferenceHelper

  • 5.9: Convert an ONNX model to a TRT engine with the TensorRT C++ API and run inference: https://github.com/MAhaitao999/ONNX_TRT_CPP

  • 6/0: Deploy an ONNX model to TensorRT, using OpenCV for I/O: https://github.com/FauxShow/trt_cpp_opencv

  • https://zhuanlan.zhihu.com/p/430470397


  • Overview of CUDA vs. GPU driver versions (amusingly, that blogger has a personal ad under his post — best of luck to him).
    For an older release: sudo sh cuda_9.0.176_384.81_linux.run, from https://developer.nvidia.com/cuda-90-download-archive


  • If your system suddenly loses power, you may get: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. Reinstalling the driver fixes it:
(base):~/Documents/node-v18.10.0-linux-x64/bin$ sudo apt install nvidia-driver-470
[sudo] password for pdd: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  fcitx-config-common fcitx-config-gtk fcitx-frontend-all fcitx-frontend-gtk2 fcitx-frontend-gtk3 fcitx-frontend-qt5 fcitx-module-dbus
  fcitx-module-kimpanel fcitx-module-lua fcitx-module-quickphrase-editor5 fcitx-module-x11 fcitx-modules fcitx-ui-classic g++-11 libfcitx-config4
  libfcitx-core0 libfcitx-gclient1 libfcitx-qt5-1 libfcitx-qt5-data libfcitx-utils0 libgettextpo0 libpresage-data libpresage1v5 libstdc++-11-dev
  libtinyxml2.6.2v5 presage
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
  dkms gcc nvidia-dkms-470
Suggested packages:
  menu gcc-multilib flex bison gcc-doc
The following NEW packages will be installed:
  dkms gcc nvidia-dkms-470 nvidia-driver-470
0 upgraded, 4 newly installed, 0 to remove and 196 not upgraded.
Need to get 0 B/559 kB of archives.
After this operation, 2,064 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Selecting previously unselected package gcc.
(Reading database ... 240297 files and directories currently installed.)
Preparing to unpack .../gcc_4%3a11.2.0-1ubuntu1_amd64.deb ...
Unpacking gcc (4:11.2.0-1ubuntu1) ...
Selecting previously unselected package dkms.
Preparing to unpack .../dkms_2.8.7-2ubuntu2.1_all.deb ...
Unpacking dkms (2.8.7-2ubuntu2.1) ...
Selecting previously unselected package nvidia-dkms-470.
Preparing to unpack .../nvidia-dkms-470_470.161.03-0ubuntu0.22.04.1_amd64.deb ...
Unpacking nvidia-dkms-470 (470.161.03-0ubuntu0.22.04.1) ...
Selecting previously unselected package nvidia-driver-470.
Preparing to unpack .../nvidia-driver-470_470.161.03-0ubuntu0.22.04.1_amd64.deb ...
Unpacking nvidia-driver-470 (470.161.03-0ubuntu0.22.04.1) ...
Setting up gcc (4:11.2.0-1ubuntu1) ...
Setting up dkms (2.8.7-2ubuntu2.1) ...
Setting up nvidia-dkms-470 (470.161.03-0ubuntu0.22.04.1) ...
update-initramfs: deferring update (trigger activated)
INFO:Enable nvidia
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/lenovo_thinkpad
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/dell_latitude
DEBUG:Parsing /usr/share/ubuntu-drivers-common/quirks/put_your_quirks_here
Loading new nvidia-470.161.03 DKMS files...
Building for 5.15.0-60-generic
Building for architecture x86_64
Building initial module for 5.15.0-60-generic
Secure Boot not enabled on this system.
Done.

nvidia.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.15.0-60-generic/updates/dkms/

nvidia-modeset.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.15.0-60-generic/updates/dkms/

nvidia-drm.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.15.0-60-generic/updates/dkms/

nvidia-uvm.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.15.0-60-generic/updates/dkms/

nvidia-peermem.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.15.0-60-generic/updates/dkms/

depmod......
Setting up nvidia-driver-470 (470.161.03-0ubuntu0.22.04.1) ...
Processing triggers for man-db (2.10.2-1) ...
Processing triggers for initramfs-tools (0.140ubuntu13) ...
update-initramfs: Generating /boot/initrd.img-5.15.0-60-generic
(base) :~/Documents/node-v18.10.0-linux-x64/bin$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

(As with the initial install, a reboot is usually still needed before nvidia-smi can talk to the freshly reinstalled driver.)
