Symptom
On Windows, Python fails at dist.init_process_group(backend, rank, world_size) with the error "RuntimeError: Distributed package doesn't have NCCL built in".
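Cause
Windows does not support the NCCL backend: PyTorch builds for Windows ship without NCCL, so requesting it fails. The gloo backend is available on Windows and can be used instead.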
Method 1
import os
import sys
if sys.platform == "win32":
    # NCCL is unavailable on Windows, so switch to the gloo backend
    os.environ["PL_TORCH_DISTRIBUTED_BACKEND"] = "gloo"
Method 2
import os
# Select the gloo backend unconditionally, on every platform
os.environ["PL_TORCH_DISTRIBUTED_BACKEND"] = "gloo"
Copy the snippet directly into your project, before the process group is initialized, for example:
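As a rough illustration only (this is not the original author's example), the platform check from Method 1 could be wrapped in a small helper and called at the top of the entry script. The function name configure_distributed_backend and its platform/env parameters are hypothetical and exist only to make the sketch testable. The PL_ prefix suggests the variable is read by PyTorch Lightning; that is an assumption here, and code that calls dist.init_process_group itself would more likely pass backend="gloo" directly.

```python
import os
import sys


def configure_distributed_backend(platform=None, env=None):
    """Hypothetical helper: force the gloo backend when running on Windows.

    Returns the backend name that was set, or None if nothing was changed.
    """
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    if platform == "win32":
        # NCCL is unavailable on Windows, so fall back to gloo.
        env["PL_TORCH_DISTRIBUTED_BACKEND"] = "gloo"
        return "gloo"
    # On other platforms the environment is left untouched.
    return None


# Call this at the very top of the entry script, before the trainer
# or the process group is created, so the setting is seen in time.
configure_distributed_backend()
```

Keeping the check in one place means the same script can run unchanged on Linux, where NCCL remains the usual choice for GPU training.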