ER-NeRF is a NeRF-based method for generating talking digital humans, and it can achieve real-time rendering.
Download the source code
cd D:\Projects\
git clone https://github.com/Fictionarry/ER-NeRF
cd D:\Projects\ER-NeRF
Download the models
Prepare the face-parsing model
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_parsing/79999_iter.pth?raw=true -O data_utils/face_parsing/79999_iter.pth
Prepare the Basel face model
Create a new folder named 3DMM inside data_utils/face_tracking
Download 01_MorphableModel.mat
https://faces.dmi.unibas.ch/bfm/main.php?nav=1-2&id=downloads
Check the required options and fill in your details. After you submit the form, an email containing the download link and account credentials will be sent to your mailbox. Once you enter the credentials correctly, you can download a tar archive; extract it and place 01_MorphableModel.mat into the project's data_utils/face_tracking/3DMM folder.
Other files
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/exp_info.npy?raw=true -O data_utils/face_tracking/3DMM/exp_info.npy
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/keys_info.npy?raw=true -O data_utils/face_tracking/3DMM/keys_info.npy
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/sub_mesh.obj?raw=true -O data_utils/face_tracking/3DMM/sub_mesh.obj
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/topology_info.npy?raw=true -O data_utils/face_tracking/3DMM/topology_info.npy
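Before moving on, it can help to verify that every downloaded asset landed in the right place. Below is a minimal sketch; the file list is taken directly from the download steps above, while the helper name `missing_assets` is made up for illustration:

```python
import os

# Files required by the preprocessing pipeline, relative to the project root
# (paths copied from the download steps above).
REQUIRED_FILES = [
    "data_utils/face_parsing/79999_iter.pth",
    "data_utils/face_tracking/3DMM/01_MorphableModel.mat",
    "data_utils/face_tracking/3DMM/exp_info.npy",
    "data_utils/face_tracking/3DMM/keys_info.npy",
    "data_utils/face_tracking/3DMM/sub_mesh.obj",
    "data_utils/face_tracking/3DMM/topology_info.npy",
]

def missing_assets(root="."):
    """Return the subset of REQUIRED_FILES not present under root."""
    return [f for f in REQUIRED_FILES if not os.path.isfile(os.path.join(root, f))]
```

Run it from the project root; an empty list means all model files are in place.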
Deploy the project
Pull the CUDA 11.6 image
docker pull nvcr.io/nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04
Create the container
# --gpus all exposes the host GPU to the container, which training requires
docker run -it --gpus all --name ernerf -v D:\Projects\ER-NeRF:/ernerf nvcr.io/nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04
Install the system dependencies
apt-get update -yq --fix-missing \
&& DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends \
pkg-config \
wget \
cmake \
curl \
git \
vim
# On Ubuntu, pyaudio needs portaudio to work properly.
apt-get install -y portaudio19-dev
Install Miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh -b -u -p ~/miniconda3
~/miniconda3/bin/conda init
source ~/.bashrc
Create the conda environment
conda create -n ernerf python=3.10
conda activate ernerf
Install the Python dependencies
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
pip install -r requirements.txt
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d==0.7.4 -c pytorch3d -c pytorch -c conda-forge
conda install ffmpeg
pip install tensorflow-gpu==2.8.0
pip install numpy==1.22.4
pip install opencv-python-headless
pip install protobuf==3.20.0
Download deepspeech-0_1_0-b90017e8.pb.zip, extract deepspeech-0_1_0-b90017e8.pb from it, and place it under /root/.tensorflow/models
https://github.com/osmr/deepspeech_features/releases
cp deepspeech-0_1_0-b90017e8.pb /root/.tensorflow/models
Run convert_BFM.py
cd data_utils/face_tracking
python convert_BFM.py
Preprocessing
Video preprocessing
Put the video at data/<ID>/<ID>.mp4
The video must be 25 FPS, with the speaking person visible in every frame. The resolution should be about 512x512 and the duration about 1-5 minutes.
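These requirements can be encoded as a small pre-flight check. The sketch below only validates metadata you pass in; actually reading the fps, resolution, and duration from the file (e.g. with OpenCV or ffprobe) is left out, and the function name and tolerance values are my own:

```python
def check_video_specs(fps, width, height, duration_s,
                      target_fps=25, target_res=512,
                      min_s=60, max_s=300, res_tol=64):
    """Collect problems for clips that deviate from the recommended specs:
    25 FPS, roughly 512x512, and about 1-5 minutes long."""
    problems = []
    if fps != target_fps:
        problems.append(f"fps is {fps}, must be {target_fps}")
    if abs(width - target_res) > res_tol or abs(height - target_res) > res_tol:
        problems.append(f"resolution {width}x{height} is far from {target_res}x{target_res}")
    if not (min_s <= duration_s <= max_s):
        problems.append(f"duration {duration_s}s is outside 1-5 minutes")
    return problems  # empty list means the clip looks usable
```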
Run the script to process the video
python data_utils/process.py data/<ID>/<ID>.mp4
Audio preprocessing
Specify the type of audio feature to use at training and testing time:
--asr_model <deepspeech, esperanto, hubert>
DeepSpeech
python data_utils/deepspeech_features/extract_ds_features.py --input data/<name>.wav
# save to data/<name>.npy
Wav2Vec
python data_utils/wav2vec.py --wav data/<name>.wav --save_feats
# save to data/<name>_eo.npy
HuBERT
# Borrowed from GeneFace. English pre-trained.
python data_utils/hubert.py --wav data/<name>.wav
# save to data/<name>_hu.npy
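The three extractors write their features next to the wav file with different suffixes (see the `# save to` comments above). A tiny helper to resolve the expected output path for a given --asr_model choice; the function and dict names are my own, only the suffixes come from the scripts above:

```python
# Output-file suffix for each audio feature type, as produced by the
# extraction scripts above.
FEATURE_SUFFIX = {
    "deepspeech": ".npy",     # extract_ds_features.py
    "esperanto": "_eo.npy",   # wav2vec.py --save_feats
    "hubert": "_hu.npy",      # hubert.py
}

def feature_path(wav_path, asr_model):
    """Map data/<name>.wav to the feature file the chosen extractor writes."""
    if not wav_path.endswith(".wav"):
        raise ValueError("expected a .wav path")
    if asr_model not in FEATURE_SUFFIX:
        raise ValueError(f"unknown asr_model: {asr_model}")
    return wav_path[:-4] + FEATURE_SUFFIX[asr_model]
```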
Training
The first run needs some time to compile the CUDA extensions.
# train (head and lpips finetune, run in sequence)
python main.py data/obama/ --workspace trial_obama/ -O --iters 100000
python main.py data/obama/ --workspace trial_obama/ -O --iters 125000 --finetune_lips --patch_size 32
# train (torso)
# <head>.pth should be the latest checkpoint in trial_obama
python main.py data/obama/ --workspace trial_obama_torso/ -O --torso --head_ckpt <head>.pth --iters 200000
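Since the three runs must happen in sequence, they can be scripted. This sketch only assembles the command lines shown above (iteration counts and flags mirror the example); actually launching them, e.g. with subprocess.run, and picking the latest head checkpoint are left to the reader:

```python
def build_training_commands(data_dir, workspace, head_ckpt):
    """Return the three training invocations from the example above, in order:
    head training, lips fine-tuning, then torso training."""
    base = ["python", "main.py", data_dir]
    torso_ws = workspace.rstrip("/") + "_torso/"
    return [
        base + ["--workspace", workspace, "-O", "--iters", "100000"],
        base + ["--workspace", workspace, "-O", "--iters", "125000",
                "--finetune_lips", "--patch_size", "32"],
        base + ["--workspace", torso_ws, "-O", "--torso",
                "--head_ckpt", head_ckpt, "--iters", "200000"],
    ]
```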