I. Basic Resources
1. Git repository
Repository link
2. Issues
Issues page
3. References
Reference article on CSDN
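II. Server Information
1. GPU server
- The GPU server can come with CUDA preinstalled (the option must be ticked when the instance is created)
- Choose CUDA 11.3 or later
- The GPU driver is installed automatically after the first login
2. CUDA installation
CUDA comes preinstalled on the GPU server.
Check the CUDA version: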
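The command for this step did not survive in these notes. Two common ways to check are nvidia-smi (which reports the highest CUDA version the driver supports) and nvcc --version (which reports the installed toolkit). The Python sketch below parses the output of the latter; the helper names parse_nvcc_release and installed_cuda_release are illustrative assumptions, not part of the original setup.

```python
import re
import subprocess


def parse_nvcc_release(text):
    """Return (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run `nvcc --version`; return None when nvcc cannot be executed."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found or no release line in its output")
    else:
        ok = release >= (11, 3)  # requirement stated in these notes
        print(f"CUDA toolkit {release[0]}.{release[1]}, meets 11.3 requirement: {ok}")
```

If nvcc is not on PATH the script only reports that fact, which usually means the toolkit option was not selected or its bin directory is not exported.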
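3. Login information
Remove the stored key of a specific host:
ssh-keygen -R 47.107.139.237
The command ssh-keygen -R 47.107.139.237 deletes the keys of the given host from the known_hosts file, which is where SSH stores the public keys of hosts it already knows. The first time you connect to a host, SSH adds that host's public key to known_hosts; on every later connection it checks that the host still presents a matching key, which protects the connection against impersonation. The -R option removes the entries for the given host, which is useful when you know the host key has changed or when you want to clean up stale keys.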
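To make the effect of -R concrete, here is a minimal Python sketch that filters a known_hosts-style text the same way for plain-text entries. It is an illustration only: the function name remove_host is hypothetical, and the real ssh-keygen additionally handles hashed host names and keeps a backup of the original file.

```python
def remove_host(known_hosts_text, host):
    """Drop plain-text known_hosts entries whose host list contains `host`."""
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            kept.append(line)  # keep blank lines and comments untouched
            continue
        # An optional marker such as @cert-authority precedes the host list.
        if fields[0].startswith("@") and len(fields) > 1:
            hosts_field = fields[1]
        else:
            hosts_field = fields[0]
        if host in hosts_field.split(","):
            continue  # entry belongs to the host being removed
        kept.append(line)
    return "\n".join(kept)
```

Each entry starts with a comma-separated host list followed by the key type and the key itself, so matching on the first field is enough for unhashed files.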
Login details:
sshpass -p xxxxx ssh -A -g root@47.107.139.237
# Credentials shared with Hao (host, user, password)
47.107.139.237
root
xxxxx
4. Query system information
[root@lavm-ikopaz5aoj ~]# uname -a
Linux lavm-ikopaz5aoj 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@lavm-ikopaz5aoj ~]# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
[root@lavm-ikopaz5aoj ~]#
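III. Base Environment
1. Install git
# Ubuntu/Debian; on CentOS 7 use "sudo yum install -y git" instead
sudo apt update
sudo apt install git
git --version
2. Environment requirements
Ubuntu 18.04
Python 3.8
PyTorch 1.9.0
CUDA 11.3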
3. Install conda
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash
~/miniconda3/bin/conda init zsh
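# Open ~/.bashrc and append the export line below so that conda is on PATH
vim ~/.bashrc
export PATH=$PATH:~/miniconda3/bin
# Reload the shell configuration
source ~/.bashrc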
4. Install Python
Installing Python through conda is recommended.
# Create a virtual environment
conda create -n m3dm python=3.8
# Activate the virtual environment
conda activate m3dm
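5. Install PyTorch
# torch version required by the GitHub repository
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
# For CUDA 11.3, use PyTorch 1.12.0
pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 -f https://download.pytorch.org/whl/torch_stable.html
Version compatibility:
See the PyTorch version compatibility table.
With torch already installed, pip install torchvision torchaudio resolves versions automatically. Without pinned versions, however, pip picks the newest releases and may replace the installed torch to satisfy their dependency, so pinning matching versions as in the commands above is the safer choice.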
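The version triples used in these notes can be kept in one place so that torch, torchvision and torchaudio are always pinned together. The sketch below only records the combinations that appear in the pip commands of this document; the names TORCH_COMPANIONS and pip_install_command are hypothetical helpers.

```python
# Version triples copied from the pip commands in these notes.
TORCH_COMPANIONS = {
    "1.9.0+cu111": ("0.10.0+cu111", "0.9.0"),
    "1.12.0+cu113": ("0.13.0+cu113", "0.12.0"),
    "2.0.0+cu117": ("0.15.1+cu117", "2.0.1"),
}
WHEEL_INDEX = "https://download.pytorch.org/whl/torch_stable.html"


def pip_install_command(torch_version):
    """Build a fully pinned pip command for a torch build listed above."""
    torchvision, torchaudio = TORCH_COMPANIONS[torch_version]
    return (
        f"pip install torch=={torch_version} torchvision=={torchvision} "
        f"torchaudio=={torchaudio} -f {WHEEL_INDEX}"
    )


if __name__ == "__main__":
    print(pip_install_command("1.12.0+cu113"))
```

Looking up an unlisted torch build raises KeyError, which is intentional: the table should be extended from the official compatibility list rather than guessed.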
6. Network test
# Site that will be needed
https://huggingface.co/
# Check that it is reachable
curl https://huggingface.co/
telnet huggingface.co 443
(m3dm) root@iZwz9c1tow6mi9lnah1hrtZ:/kwan/M3DM# telnet huggingface.co 443
Trying 162.125.7.1...
Connected to huggingface.co.
Escape character is '^]'.
Connection closed by foreign host.
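The same reachability check can be scripted, which is handy before starting a long run that downloads weights. This is a minimal sketch using only the standard library; the function name can_connect is an assumption introduced for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

For the site above the call would be can_connect("huggingface.co", 443). A successful TCP connection only shows that the port is reachable; as with telnet, it does not prove that HTTPS requests will succeed.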
IV. Execution Steps
1. Create directories
mkdir /kwan
cd /kwan
mkdir software
2. Code
git clone https://github.com/nomewang/M3DM.git
3. requirements
cd M3DM
pip install -r requirements.txt
4. Install other dependencies
pip install ninja
pip install open3d
5. knn_cuda
# Install knn_cuda
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
6. pointnet2_ops_lib
# Install pointnet2_ops_lib
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
7. Upload the dataset
# On the server: create the target directory
cd /kwan/M3DM
mkdir -p datasets/mvtec3d
# On the local machine: copy the archive to the server
scp /Users/qinyingjie/Downloads/000-训练/dowel.tar.xz root@47.107.139.237:/kwan/M3DM/datasets/mvtec3d
8. Preprocessing
# Extract the archive
cd /kwan/M3DM/datasets/mvtec3d
tar -xvf dowel.tar.xz
# Preprocess the dataset
cd /kwan/M3DM
python utils/preprocessing.py datasets/mvtec3d/
9. Weights
# Download the weights and place them in /kwan/M3DM/checkpoints
cd /kwan/M3DM
mkdir -p checkpoints
# On the local machine: upload each weight file
scp /Users/qinyingjie/Downloads/001-资源/B_8-i21k-300ep-lr_0.001-aug_medium1-wd_0.1-do_0.0-sd_0.0--imagenet2012-steps_20k-lr_0.01-res_224.npz root@47.107.139.237:/kwan/M3DM/checkpoints
scp /Users/qinyingjie/Downloads/001-资源/B_8-i21k-300ep-lr_0.001-aug_medium1-wd_0.1-do_0.0-sd_0.0.npz root@47.107.139.237:/kwan/M3DM/checkpoints
scp /Users/qinyingjie/Downloads/001-资源/dino_deitsmall8_pretrain.pth.zip root@47.107.139.237:/kwan/M3DM/checkpoints
scp /Users/qinyingjie/Downloads/001-资源/dino_vitbase8_pretrain.pth root@47.107.139.237:/kwan/M3DM/checkpoints
scp /Users/qinyingjie/Downloads/001-资源/Point-BERT.pth root@47.107.139.237:/kwan/M3DM/checkpoints
scp /Users/qinyingjie/Downloads/001-资源/pointmae_pretrain.pth root@47.107.139.237:/kwan/M3DM/checkpoints
scp /Users/qinyingjie/Downloads/001-资源/uff_pretrain.pth root@47.107.139.237:/kwan/M3DM/checkpoints
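After several large uploads it is easy to miss a file, so a quick presence check can save a failed training start. The sketch below lists the file names exactly as they appear in the scp commands above; the helper name missing_checkpoints is hypothetical, and whether M3DM expects the .zip file unpacked is not stated in these notes.

```python
from pathlib import Path

# File names copied from the scp commands in these notes.
EXPECTED_CHECKPOINTS = [
    "B_8-i21k-300ep-lr_0.001-aug_medium1-wd_0.1-do_0.0-sd_0.0--imagenet2012-steps_20k-lr_0.01-res_224.npz",
    "B_8-i21k-300ep-lr_0.001-aug_medium1-wd_0.1-do_0.0-sd_0.0.npz",
    "dino_deitsmall8_pretrain.pth.zip",
    "dino_vitbase8_pretrain.pth",
    "Point-BERT.pth",
    "pointmae_pretrain.pth",
    "uff_pretrain.pth",
]


def missing_checkpoints(checkpoint_dir, expected=EXPECTED_CHECKPOINTS):
    """Return the expected file names that are not present in checkpoint_dir."""
    directory = Path(checkpoint_dir)
    return [name for name in expected if not (directory / name).is_file()]
```

On the server this would be called as missing_checkpoints("/kwan/M3DM/checkpoints"); an empty list means every uploaded file arrived.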
10. Training
mkdir -p datasets/patch_lib
# Start training
python3 main.py \
--method_name DINO+Point_MAE \
--memory_bank multiple \
--rgb_backbone_name vit_base_patch8_224_dino \
--xyz_backbone_name Point_MAE \
--save_feature
Issue 1:
# AttributeError: module 'torch' has no attribute 'frombuffer'
# Fix: upgrade torch
pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 torchaudio==2.0.1 -f https://download.pytorch.org/whl/torch_stable.html
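The error is consistent with the installed build predating torch.frombuffer, which first shipped in PyTorch 1.10, so a 1.9.0 install lacks it. A small pre-flight check along these lines can confirm that before launching training; the function name supports_frombuffer is an illustrative assumption.

```python
import re


def supports_frombuffer(torch_version):
    """True if the version string is 1.10 or newer, where torch.frombuffer exists."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {torch_version!r}")
    return (int(match.group(1)), int(match.group(2))) >= (1, 10)
```

Inside the m3dm environment the check would be supports_frombuffer(torch.__version__) after importing torch. Comparing numeric tuples avoids the string-comparison trap where "1.9" sorts after "1.10".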
Issue 2:
RuntimeError: torch.cat(): expected a non-empty list of Tensors
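No fix was recorded for this error. torch.cat raises it when it receives an empty list, so one thing worth ruling out (an assumption, not a confirmed diagnosis) is that some category produced no samples, for example because only the dowel archive was uploaded. The sketch below counts files in each category's training split; the function name count_training_files is hypothetical and the train subfolder name is assumed from the MVTec 3D-AD layout.

```python
from pathlib import Path


def count_training_files(dataset_root, split="train"):
    """Map each category folder under dataset_root to the file count of its split."""
    counts = {}
    for category in sorted(Path(dataset_root).iterdir()):
        if not category.is_dir():
            continue  # skip archives and other loose files
        split_dir = category / split
        if split_dir.is_dir():
            counts[category.name] = sum(
                1 for path in split_dir.rglob("*") if path.is_file()
            )
        else:
            counts[category.name] = 0
    return counts
```

Running count_training_files("datasets/mvtec3d") and looking for zero counts, or for categories that are absent altogether, shows quickly whether the data on disk matches what the training script iterates over.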
V. Datasets
1. Dataset download
- The MVTec-3D AD dataset can be downloaded from the official website of MVTec-3D AD. (Download link)
- The Eyecandies dataset can be downloaded from the official website of Eyecandies.
After downloading, put the dataset in the datasets folder.
2. Data preparation
To run the preprocessing:
python utils/preprocessing.py datasets/mvtec3d/
The preprocessing may take a few hours.
VI. Checkpoints and Training
1. Checkpoints
The following table lists the pretrained models used in M3DM:
Backbone | Pretrain Method |
---|---|
Point Transformer | Point-MAE |
Point Transformer | Point-BERT |
ViT-b/8 | DINO |
ViT-b/8 | Supervised ImageNet 1K |
ViT-b/8 | Supervised ImageNet 21K |
ViT-s/8 | DINO |
UFF | UFF Module |
Put the checkpoint files in the checkpoints folder.
2. Training
Train and test the double lib version and save the features for UFF training:
mkdir -p datasets/patch_lib
python3 main.py \
--method_name DINO+Point_MAE \
--memory_bank multiple \
--rgb_backbone_name vit_base_patch8_224_dino \
--xyz_backbone_name Point_MAE \
--save_feature