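Background:
I want to use a 瀚博半导体 (Vastai) 载天 VA1 accelerator card in place of an NVIDIA graphics card to run deep learning models.
Thanks to Engineer Zhou at 瀚博 for helping answer my questions.
Main text: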
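- Carefully remove the NVIDIA graphics card and insert the 瀚博半导体 载天 VA1 accelerator card into the PCIe slot, as shown in the figure:
The display is now connected to the motherboard's integrated graphics.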
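- Uninstall the old driver (skip this step if no old driver is installed)
(1) Check that the inference accelerator card has been installed successfully.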
$ lspci -d:0100 -vvv
01:00.0 Processing accelerators: Device 1ec6:0100
Subsystem: Device 1ec6:0031
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at f7000000 (32-bit, non-prefetchable) [size=8M]
Region 1: Memory at f6800000 (32-bit, non-prefetchable) [size=8M]
Region 2: Memory at e8000000 (64-bit, prefetchable) [size=32M]
Region 4: Memory at e0000000 (64-bit, prefetchable) [size=128M]
Capabilities: <access denied>
Kernel driver in use: vastai
Kernel modules: vastai_pci
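If the output looks like this, the inference accelerator card has been installed successfully. The line "Kernel driver in use: vastai" shows that a driver is already loaded for it.
(2) Look up the name of the accelerator card driver package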
$ sudo dpkg --get-selections | grep vastai
vastai-pci-d2-3-v2-1-a1-3-hwtype-2-dkms install
(3) Uninstall the accelerator card driver package
sudo dpkg -r vastai-pci-d2-3-v2-1-a1-3-hwtype-2-dkms
(4) Reboot the computer
sudo reboot
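The check and lookup in the steps above can be scripted so that the package name does not have to be copied by hand. The following is a minimal sketch, not a vendor tool: the helper names `driver_in_use` and `vastai_pkgs` are made up for illustration, and it assumes the driver package name starts with `vastai`, as in the output above.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of the vastai driver package).

# Read `lspci -vvv` text on stdin and print the first "Kernel driver in use" value.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p' | head -n 1
}

# Read `dpkg --get-selections` text on stdin and print every package
# whose name starts with "vastai" and whose state is "install".
vastai_pkgs() {
    awk '$1 ~ /^vastai/ && $2 == "install" { print $1 }'
}

# Example use on a real machine (left commented out because it removes packages):
#   lspci -d :0100 -vvv | driver_in_use
#   for pkg in $(dpkg --get-selections | vastai_pkgs); do
#       sudo dpkg -r "$pkg"
#   done
```

Both helpers only parse text from standard input, so they can be tried on saved command output before being pointed at a live system.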
- Install the new driver
(1) First install dkms and dpkg
sudo apt-get install dkms dpkg
Check that dkms and dpkg were installed successfully
$ dkms --version
dkms: 2.2.1.0
$ dpkg --version
Debian 'dpkg' package management program version 1.19.0.5 (amd64).
This is free software; see the GNU General Public License version 2 or
later for copying conditions. There is NO warranty.
(2) Install the driver
sudo dpkg -i /opt/vastai/vaststream/vastai-pci-d2-3-v2-1-a1-3-hwtype-2_00.23.02.06_x86_64.deb
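During installation you are asked to set a password of 8-16 characters. Enter it twice and remember it; it is needed after the reboot.
(3) Reboot the computer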
sudo reboot
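After the reboot, the screen shown in the figure below appears:
Reference: "安装Ubuntu后重启出现perform MOK management 解决" (how to handle the "Perform MOK management" screen that appears on reboot after installing Ubuntu). This step requires the password set earlier.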
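The password prompt during installation and the MOK screen after the reboot are what one would expect when UEFI Secure Boot is enabled and a key has to be enrolled for the DKMS-built module; that is an assumption about this machine, not something confirmed by the vendor. A small pre-install check along those lines is sketched below. The function names are hypothetical; `mokutil --sb-state` is the usual way to query Secure Boot and may need to be installed first.

```shell
#!/bin/sh
# Hypothetical pre-install helpers; names are illustrative only.

# Print each given command name that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Read `mokutil --sb-state` text on stdin; succeed if Secure Boot is reported enabled.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

# Example use on a real machine:
#   missing_tools dkms dpkg          # prints nothing when both are installed
#   if mokutil --sb-state | secure_boot_enabled; then
#       echo "Secure Boot is on: expect a MOK password prompt and enrollment screen"
#   fi
```

If Secure Boot turns out to be disabled, the password prompt and the MOK screen may not appear at all.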
- Check whether the driver was installed successfully
$ ll /dev/va*
crw-rw-rw- 1 root root 321, 4 2月 10 08:22 /dev/vacc0
crw-rw-rw- 1 root root 321, 2 2月 10 08:22 /dev/vastai0_ctl
crw-rw-rw- 1 root root 321, 1 2月 10 08:22 /dev/vastai0_version
crw-rw-rw- 1 root root 321, 5 2月 10 08:22 /dev/vastai_video0
crw-rw-rw- 1 root root 10, 58 2月 10 08:22 /dev/vatools
$ cat /dev/vastai0_version
[DriverV2.3.0_VideoV2.1.0_AIV1.3.0_01_23_01]
[SMCU: d88b356 20230130_194311]
[BMCU: VE1S-A3-002001R 20221011] [active]
[BMCU: no_version no_build time] [backup]
[VDMCU: a118a09 20230203_033124]
[VEMCU: a2d5b1 Build: 20230204_033253]
[VDSP: tag_ks_sw_v1_2_mix-173-g878eb4-dev Build: 20230203_200119]
[CMCU: 35135c Build: 20230109_085228]
[LMCU: 9cdfed Build: 20230109_085248]
[ODSP: 9b7cac Build: 20230111_195135]
[Driver: 00.23.02.06 d2_3_v2_1_a1_3 013db8fa 2023-02-06 03:15:42]
[pcie_phy_1.0.0]
[die 0] [BL0:VE1-2.0.1R d6e1861 1028] [active]
[die 0] [BL0:VE1-2.0.0B 65647d0 1028] [backup]
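To confirm that the loaded driver is the one just installed, the version reported by /dev/vastai0_version can be compared with the version embedded in the .deb file name. A minimal sketch follows; the helper names are hypothetical, and it assumes the `[Driver: <version> ...]` line format and the `<name>_<version>_<arch>.deb` file naming seen above.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# Read the version file text on stdin and print the driver version field.
driver_version() {
    sed -n 's/^\[Driver: \([^ ]*\) .*/\1/p' | head -n 1
}

# Print the version field from a <name>_<version>_<arch>.deb file name.
deb_version() {
    basename "$1" .deb | cut -d_ -f2
}

# Example use on a real machine:
#   deb=/opt/vastai/vaststream/vastai-pci-d2-3-v2-1-a1-3-hwtype-2_00.23.02.06_x86_64.deb
#   if [ "$(driver_version < /dev/vastai0_version)" = "$(deb_version "$deb")" ]; then
#       echo "driver version matches package"
#   fi
```

A mismatch would suggest that the old module is still loaded, for example because the reboot was skipped.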