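Contents
I. Environment setup (configuring YOLO, demo test)
II. Dataset preparation (learning the format)
III. Training on the dataset
1. Splitting the dataset
2. Training on the dataset
2.1 Regular training
2.2 Fine-tuning
3. Error log
3.1 AttributeError
3.2 TypeError
3.3 Error while loading conda entry point
4. Training results
4.1 RGB-yolov8n(train16/train)
4.2 RGB-yolov8s(train2)
4.3 RGB-yolov8m
4.4 RGB-yolov8l
4.5 RGB-yolov8x
4.6 NIR-yolov8n
4.7 NIR-yolov8m
4.8 NIR-yolov8l
4.9 NIR-yolov8x
4.10 NIR-yolov8s
IV. Validation (val)
V. Prediction (predict)
VI. Data processing and analysis
VII. Summary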
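Before we start: this whole note took about two days. Part I took an afternoon; Parts II and III took an afternoon plus an evening; Parts IV and V took another afternoon; and the final result analysis took an evening. (I started with zero YOLO experience, as a beginner who had only ever reproduced one deep learning algorithm.)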
I. Environment setup (configuring YOLO, demo test)
First, connect to the server (my local machine cannot handle my dataset). Related material:
Tutorial on setting up the basic environment on a server: https://blog.csdn.net/qq_53826699/article/details/140666990?csdn_share_tail=%7B%22type%22%3A%22blog%22%2C%22rType%22%3A%22article%22%2C%22rId%22%3A%22140666990%22%2C%22source%22%3A%22qq_53826699%22%7D
PyCharm tutorial: https://blog.csdn.net/weixin_45662399/article/details/134499605?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522172129266716800188593845%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=172129266716800188593845&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~top_positive~default-1-134499605-null-null.142%5Ev100%5Epc_search_result_base6&utm_term=YOLOV8&spm=1018.2226.3001.4187
I used VS Code. The only hiccup came while installing the ultralytics package (watch out for the "mysterious force", i.e. your proxy/VPN: keep it off during installation, otherwise you will keep getting SSL Error). In total, the full setup plus the demo test (the bus image and one image of my own) took about two hours.
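For other common problems such as HTTP Error, see an earlier post of mine:
Anaconda HTTP Error and SSLError: how to fix them [the strongest solution after testing 5 methods together] https://blog.csdn.net/qq_53826699/article/details/140520692?spm=1001.2014.3001.5501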
First cd into the directory that contains requirements.txt, then run the following commands:
pip install -r requirements.txt
pip install ultralytics
pip install yolo
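That is all it takes, three commands, however elaborate other blogs make it sound. Things to watch out for:
1. The mysterious force (sometimes the install fails with it on, sometimes it goes faster with it on; I don't know why)
2. Remember to activate your environment before running these pip commands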
Then run pip list to check that the installation succeeded (it did if the output looks like the figure below):
pip list
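The same check can also be done from Python. This is only a minimal sketch: installed_version is a hypothetical helper name (not part of any library), and it assumes the package is distributed under the name ultralytics.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print("ultralytics:", installed_version("ultralytics"))
```

Run it inside the activated environment, for the same reason as note 2 above.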
Then test that it actually runs (mind the paths):
yolo predict model=yolov8n.pt source=ultralytics/ultralytics/assets/bus.jpg
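After the command runs, the detection results are saved under runs/detect/predict; the output image has the bounding boxes drawn on it together with the class labels and confidence scores.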
If any of this is unclear, these two articles may also help:
https://blog.csdn.net/weixin_45819759/article/details/131962654
https://blog.csdn.net/weixin_45662399/article/details/134499605?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522172126313516800222832426%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=172126313516800222832426&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~top_positive~default-1-134499605-null-null.142%5Ev100%5Epc_search_result_base6&utm_term=yolov8&spm=1018.2226.3001.4187
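II. Dataset preparation (learning the format)
Edit the yaml file; it specifies the dataset root path and the locations of the training, validation, and test sets.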
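To make the format concrete, here is a rough sketch that renders such a yaml file from Python. The keys (path, train, val, test, names) follow the Ultralytics dataset config layout as I understand it. Everything passed in below is a placeholder: the root directory, the reuse of test.txt for validation, and the single class name ship are assumptions, not the real contents of MMship.yaml.

```python
def dataset_yaml(root, train, val, test, names):
    """Render a dataset config with the keys path, train, val, test and names."""
    lines = [
        f"path: {root}",    # dataset root directory
        f"train: {train}",  # training images (folder or .txt list), relative to path
        f"val: {val}",      # validation images
        f"test: {test}",    # test images
        "names:",           # class index -> class name
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Placeholder values only; replace them with your own paths and classes.
print(dataset_yaml(
    root="ultralytics/datasets/mmship/data/RGB",
    train="train.txt",
    val="test.txt",
    test="test.txt",
    names=["ship"],
))
```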
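III. Training on the dataset
1. Splitting the dataset
First find the dataset-splitting .py script, and be sure to change the paths inside it!!!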
python splitDataset.py
If the output looks like the figure above, the split succeeded; it also generates two files, test.txt and train.txt.
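The splitDataset.py I used is not reproduced here, so the following is only a sketch of what such a script typically does, under my own assumptions: images are .jpg files in one folder, the split is a random 80/20, and the output is one image path per line. The function name and parameters are hypothetical.

```python
import random
from pathlib import Path


def split_dataset(image_dir, out_dir, train_ratio=0.8, seed=0):
    """Shuffle the .jpg files in image_dir and write train.txt / test.txt to out_dir."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)  # fixed seed keeps the split reproducible
    n_train = int(len(images) * train_ratio)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # One image path per line in each list file.
    (out / "train.txt").write_text("".join(f"{p.as_posix()}\n" for p in images[:n_train]))
    (out / "test.txt").write_text("".join(f"{p.as_posix()}\n" for p in images[n_train:]))
    return n_train, len(images) - n_train
```

As with the real script, the thing to get right is the two paths passed in.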
2. Training on the dataset
2.1 Regular training
- data = ultralytics/datasets/mmship/data/RGB/MMship.yaml
- model = yolov8n.yaml
- pretrained = ultralytics/yolov8n.pt
- epochs = 12
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8n.yaml pretrained=ultralytics/yolov8n.pt epochs=12 batch=4 lr0=0.001 resume=True
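For reference, roughly the same run can be started through the Ultralytics Python API. This is a sketch rather than what I actually ran: it uses the documented YOLO(...).load(...) pattern to build the model from the yaml and copy in pretrained weights, and it leaves out resume, which as far as I understand only matters when continuing an interrupted run.

```python
# Arguments mirrored from the command line above.
TRAIN_ARGS = {
    "data": "ultralytics/datasets/mmship/data/RGB/MMship.yaml",
    "epochs": 12,
    "batch": 4,
    "lr0": 0.001,
}


def train():
    # Imported lazily so this file can be read without ultralytics installed.
    from ultralytics import YOLO

    # Build the architecture from the yaml, then load the pretrained weights into it.
    model = YOLO("yolov8n.yaml").load("ultralytics/yolov8n.pt")
    return model.train(**TRAIN_ARGS)


# train()  # uncomment to start training
```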
2.2 Fine-tuning
The command with weight decay (wd) and data augmentation added:
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8n.yaml pretrained=ultralytics/yolov8n.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
3. Error log
3.1 AttributeError
AttributeError: module 'torch.amp' has no attribute 'GradScaler'
Fix: locate the GradScaler call (Ctrl+click on the name), then change torch.amp.GradScaler to torch.cuda.amp.GradScaler.
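If you patch the call site like this, a small fallback avoids hard-coding either name. This is a sketch with a hypothetical helper, resolve_grad_scaler; it takes the torch module as an argument so it can be tried without torch installed.

```python
def resolve_grad_scaler(torch_module):
    """Return torch.amp.GradScaler if this build has it, else torch.cuda.amp.GradScaler."""
    scaler_cls = getattr(torch_module.amp, "GradScaler", None)
    if scaler_cls is None:
        # Older builds only expose the CUDA-specific class.
        scaler_cls = torch_module.cuda.amp.GradScaler
    return scaler_cls
```

Usage would look like scaler = resolve_grad_scaler(torch)(enabled=True) after import torch. Pass arguments by keyword: as far as I can tell the two classes do not share the same first positional parameter, so check this against your installed version.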
With that fixed, I ran it again:
3.2 TypeError
And got another error: TypeError: full() received an invalid combination of arguments - got (tuple, str, device=torch.device, dtype=torch.dtype), but expected one of:
* (tuple of ints size, Number fill_value, *, tuple of names names, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
* (tuple of SymInts size, Number fill_value, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
Attempted fix:
torch.full((1,), self._init_scale, dtype=torch.float32, device=dev)
def full(size: _size, fill_value: Union[Number, _complex], *, out: Optional[Tensor] = None, layout: _layout = strided, dtype: Optional[_dtype] = None, device: Optional[DeviceLikeType] = None, requires_grad: _bool = False, pin_memory: _bool = False) -> Tensor:
def full(size: _size, fill_value: Union[Number, _complex], *, names: List[Union[str, None]], layout: _layout = strided, dtype: Optional[_dtype] = None, device: Optional[DeviceLikeType] = None, requires_grad: _bool = False, pin_memory: _bool = False) -> Tensor:
def full(size: Sequence[Union[_int, SymInt]], fill_value: Union[Number, _complex], *, out: Optional[Tensor] = None, dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Optional[DeviceLikeType]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor:
def full(size: _size, fill_value: Union[Number, _complex], *, names: Optional[Sequence[Union[str, ellipsis, None]]], dtype: Optional[_dtype] = None, layout: Optional[_layout] = None, device: Optional[Optional[DeviceLikeType]] = None, pin_memory: Optional[_bool] = False, requires_grad: Optional[_bool] = False) -> Tensor:
According to the error message, the combination of arguments passed to torch.full() is wrong. The message lists two valid signatures:
(tuple of ints size, Number fill_value, *, tuple of names names, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
(tuple of SymInts size, Number fill_value, *, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
Of the four function definitions above, the closest match is the first signature. Try changing the call to:
torch.full((1,), self._init_scale, dtype=torch.float32, device=dev)
or
out = torch.empty(1, dtype=torch.float32, device=dev)
torch.full((1,), self._init_scale, out=out)
The key points here are:
- size should be a tuple; even a single element must be written as (1,).
- fill_value should be a Number; the self._init_scale you pass needs to satisfy this.
- The other optional arguments (names, layout, pin_memory, requires_grad) can be left out to use their defaults.
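Hahahaha
Did any of the above actually work?
No!!!
All you need to do is install a newer version of torch!!!
And the problem is gone!!!
It was purely a version issue inside torch's own internals!!!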
3.3 Error while loading conda entry point
Another problem showed up while creating the environment, though it does not seem to affect anything:
Error while loading conda entry point: anaconda-cloud-auth (cannot import name 'Callable' from 'collections' (/data1/zhangjiening/anaconda3/lib/python3.12/collections/__init__.py))
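The message itself points at a well-known cause: since Python 3.10, Callable can no longer be imported from collections, only from collections.abc, so a plugin written against the old location fails to load on Python 3.12. A small illustration (is_callable is just a hypothetical helper for the demo):

```python
import collections
from collections.abc import Callable  # the supported import location


def is_callable(obj):
    """True if obj can be called like a function."""
    return isinstance(obj, Callable)


print(is_callable(len))
# On Python 3.10 and later this attribute no longer exists.
print(hasattr(collections, "Callable"))
```

Since this only breaks one conda plugin's entry point, it fits with the observation that nothing else was affected.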
And with that, training was finally running, lol
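4. Training results
When training finishes, all results are saved under the runs folder, including visualizations of every metric curve;
it also contains the trained weights: best.pt is the best-performing set of weights and is the one used later.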
4.1 RGB-yolov8n(train16/train)
train16 is the run from the command in 2.1; its results are shown in the figure below:
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8n.yaml pretrained=ultralytics/yolov8n.pt epochs=12 batch=4 lr0=0.001 resume=True
train is the run from the command in 2.2; its results are shown in the figure below:
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8n.yaml pretrained=ultralytics/yolov8n.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
The results I am trying to reproduce are shown in the figure below:
4.2 RGB-yolov8s(train2)
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8s.yaml pretrained=ultralytics/yolov8s.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
This then failed with:
Downloading https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt to 'ultralytics/yolov8s.pt'...
⚠️ Download failure, retrying 1/3 https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt...
curl: (28) Failed to connect to github.com port 443: 连接超时 # # # #
Warning: Transient problem: timeout Will retry in 1 seconds. 3 retries left.
Fix: this is purely a network problem; just download it again.
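If the automatic download keeps timing out, the weights can also be fetched by a small script that simply tries again. A sketch using only the standard library; download_with_retry and its parameters are hypothetical names, and the fetch argument exists only so the logic can be exercised without a network.

```python
import time
import urllib.request


def download_with_retry(url, dest, retries=3, wait=1.0, fetch=urllib.request.urlretrieve):
    """Call fetch(url, dest) up to `retries` times; return how many attempts were needed."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            fetch(url, dest)
            return attempt
        except OSError as err:  # URLError and socket timeouts are OSError subclasses
            last_error = err
            if attempt < retries:
                time.sleep(wait)  # brief pause before the next try
    raise RuntimeError(f"download failed after {retries} attempts") from last_error
```

For the case above, the call would use the URL and the target path from the log: download_with_retry("https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt", "ultralytics/yolov8s.pt").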
The results are shown in the figure below:
4.3 RGB-yolov8m(train3)
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8m.yaml pretrained=ultralytics/yolov8m.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.4 RGB-yolov8l(train4)
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8l.yaml pretrained=ultralytics/yolov8l.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.5 RGB-yolov8x(train5)
yolo detect train data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=yolov8x.yaml pretrained=ultralytics/yolov8x.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.6 NIR-yolov8n(train6)
yolo detect train data=ultralytics/datasets/mmship/data/NIR/MMship.yaml model=yolov8n.yaml pretrained=ultralytics/yolov8n.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.7 NIR-yolov8m(train7)
yolo detect train data=ultralytics/datasets/mmship/data/NIR/MMship.yaml model=yolov8m.yaml pretrained=ultralytics/yolov8m.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.8 NIR-yolov8l(train8)
yolo detect train data=ultralytics/datasets/mmship/data/NIR/MMship.yaml model=yolov8l.yaml pretrained=ultralytics/yolov8l.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.9 NIR-yolov8x(train9)
yolo detect train data=ultralytics/datasets/mmship/data/NIR/MMship.yaml model=yolov8x.yaml pretrained=ultralytics/yolov8x.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
4.10 NIR-yolov8s(train10)
yolo detect train data=ultralytics/datasets/mmship/data/NIR/MMship.yaml model=yolov8s.yaml pretrained=ultralytics/yolov8s.pt epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05
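The ten fine-tuning commands in 4.1 to 4.10 differ only in the modality folder (RGB or NIR) and the model size letter, so they can be generated instead of typed by hand. A small sketch; train_command is a hypothetical helper, and it only prints the commands rather than running them.

```python
from itertools import product

MODALITIES = ["RGB", "NIR"]
SIZES = ["n", "s", "m", "l", "x"]


def train_command(modality, size):
    """Build the fine-tuning command used above for one modality and model size."""
    return (
        "yolo detect train "
        f"data=ultralytics/datasets/mmship/data/{modality}/MMship.yaml "
        f"model=yolov8{size}.yaml pretrained=ultralytics/yolov8{size}.pt "
        "epochs=12 batch=4 lr0=0.001 resume=True device=0 augment=True weight_decay=0.05"
    )


commands = [train_command(m, s) for m, s in product(MODALITIES, SIZES)]
for cmd in commands:
    print(cmd)
```

The printed order (all RGB sizes, then all NIR sizes) is not the order I ran them in, so the train2, train3, ... folder numbers will not line up automatically.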
IV. Validation (val)
Validate the model; the model argument here points to the best set of weights from training.
yolo detect val data=ultralytics/datasets/mmship/data/RGB/MMship.yaml model=runs/detect/train16/weights/best.pt batch=4
V. Prediction (predict)
Pick any image and run prediction on it; the output marks each object's class, location, and confidence score.
yolo predict model=runs/detect/train16/weights/best.pt source=ultralytics/datasets/mmship/data/RGB/images/0001032.jpg
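Validation and prediction can also be driven from Python, which makes it easier to get at the numbers directly. A sketch based on the Ultralytics Python API as I understand it (model.val, model.predict, metrics.box.map, results[0].boxes); the paths are the ones from the commands above, and validate_and_predict is a hypothetical wrapper name.

```python
BEST_WEIGHTS = "runs/detect/train16/weights/best.pt"
DATA_YAML = "ultralytics/datasets/mmship/data/RGB/MMship.yaml"
IMAGE = "ultralytics/datasets/mmship/data/RGB/images/0001032.jpg"


def validate_and_predict():
    # Imported lazily so this file can be read without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(BEST_WEIGHTS)

    # Validation: summary detection metrics on the val split of the dataset yaml.
    metrics = model.val(data=DATA_YAML, batch=4)
    print("mAP50-95:", metrics.box.map)
    print("mAP50:", metrics.box.map50)

    # Prediction: one result per input image; save=True also writes the annotated image.
    results = model.predict(source=IMAGE, save=True)
    for box in results[0].boxes:
        print(int(box.cls), float(box.conf), box.xyxy.tolist())


# validate_and_predict()  # uncomment to run
```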
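VI. Data processing and analysis
The jpg files in the run folder are visualizations.
The png files contain the dataset analysis and the metrics from the training process; YOLO generates all of them automatically.
The main things to look at are the two results files; the metrics are AP and mAP.
The parameter configuration is stored in a yaml file, and the trained model weights in a .pt file.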
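To pull the best epoch out of a run without opening the plots, the results table can be read directly. A sketch assuming the table is a results.csv inside the run folder with an epoch column and a metrics/mAP50-95(B) column; those names are my assumption about the file layout, so check the header of your own file first. best_epoch is a hypothetical helper.

```python
import csv


def best_epoch(results_csv, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest `metric` in results_csv."""
    with open(results_csv, newline="") as f:
        # Header cells may be padded with spaces, so strip keys and values.
        rows = [{k.strip(): v.strip() for k, v in row.items()} for row in csv.DictReader(f)]
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])
```

A call would look like best_epoch("runs/detect/train16/results.csv"), using whichever run folder you want to inspect.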
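YOLO has many configurable parameters; see the official documentation for details:
https://docs.ultralytics.com/tasks/detect/#train
I will post the data processing for the experiments above later, as a separate complete note, and put the link here. Feel free to follow.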
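VII. Summary
1. In most cases, the explanations that GPT/Poe give for a code error are fine, but do not readily trust any fix they propose; those are almost always useless.
Instead, search GitHub directly for your error and read the threads that have replies,
or search Google for the error and read the most recent threads with replies,
or come to CSDN (though most posts there are unreliable, especially older ones, where the versions no longer match).
2. Paths, paths, paths. Important things get said three times: always watch your paths!! Whether you are installing or running, the first thing to check is whether the path is correct!
3. Errors involving built-in types, or errors that are hard to find by searching, are very likely caused by version mismatches.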