Contents
- About ESPnet
- Installation and setup
- Running yesno
About ESPnet
- github: https://github.com/espnet/espnet
ESPnet is an end-to-end speech processing toolkit covering end-to-end speech recognition, text-to-speech, speech translation, speech enhancement, speaker diarization, spoken language understanding, and so on.
ESPnet uses pytorch as a deep learning engine and also follows Kaldi style data processing, feature extraction/format, and recipes to provide a complete setup for various speech processing experiments.
Installation and setup
1. Download
git clone https://github.com/espnet/espnet.git
2. Create a symbolic link to Kaldi
cd espnet/tools
ln -s <path to kaldi> .
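As a quick sanity check of the link created in step 2, the sketch below reports whether the tools folder contains a symlink that resolves to a directory. The function name check_kaldi_link is hypothetical (it is not part of ESPnet), and it assumes the Kaldi checkout is itself named kaldi, so that the link created by ln -s is called kaldi.

```shell
# Hypothetical helper, not part of ESPnet: report the state of the
# "kaldi" symlink inside the given tools directory.
# Assumption: the Kaldi checkout is named "kaldi", so "ln -s <path> ."
# produces a link with that name.
check_kaldi_link() {
    tools_dir=$1
    link="$tools_dir/kaldi"
    if [ ! -L "$link" ]; then
        echo "missing"   # no symlink named kaldi in the tools directory
    elif [ ! -d "$link" ]; then
        echo "broken"    # the symlink exists but its target is not a directory
    else
        echo "ok"        # the symlink resolves to a directory
    fi
}
```

Calling check_kaldi_link espnet/tools from the directory that holds the clone prints one of missing, broken, or ok.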
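3. Install dependencies (cupy-cuda92 is the CuPy build for CUDA 9.2; pick the package that matches your CUDA version)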
pip install chainer==6.0.0 cupy-cuda92==6.0.0
Then run check_install.py from espnet/tools:
python3 check_install.py
4. Build with make (adjust the KALDI and PYTHON paths and CUDA_VERSION to your own machine)
make KALDI=~/xxcode/kaldi PYTHON=~/miniconda3/bin/python CUDA_VERSION=11.3
Running yesno
Go to the espnet/egs/yesno folder, which contains the tts1 and asr1 folders. Enter one of them and run:
sh run.sh
When the tts recipe succeeds, it prints:
Succeeded creating wav for test_yesno
Succeeded creating wav for train_dev
Finished.
When the asr recipe succeeds, it prints:
2023-01-28 19:57:40,756 (json2trn:46) INFO: reading exp/train_nodev_pytorch_train/decode_test_yesno_decode/data.json
2023-01-28 19:57:40,756 (json2trn:50) INFO: reading data/lang_1char/train_nodev_units.txt
write a CER (or TER) result in exp/train_nodev_pytorch_train/decode_test_yesno_decode/result.txt
| SPKR | # Snt # Wrd | Corr Sub Del Ins Err S.Err |
| Sum/Avg| 30 835 | 47.9 51.1 1.0 47.2 99.3 100.0 |
Finished
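To read the score without opening result.txt by hand, something like the following sketch could pull the Err value out of the Sum/Avg row. The function name summary_err is hypothetical, and it assumes the file uses the same |-separated layout as the table printed above.

```shell
# Hypothetical helper: print the Err column of the Sum/Avg row from a
# result.txt laid out like the table printed by the asr recipe.
summary_err() {
    awk -F'|' '/Sum\/Avg/ {
        # With "|" as the separator, field 4 holds
        # "Corr Sub Del Ins Err S.Err"; Err is the fifth number.
        split($4, cols, " ")
        print cols[5]
        exit
    }' "$1"
}
```

For the run shown above, summary_err exp/train_nodev_pytorch_train/decode_test_yesno_decode/result.txt would print 99.3.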
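If the run fails because a file or command cannot be found, search for it manually. If it exists, add the folder that contains it to your PATH environment variable. If it does not exist, check whether one of the build steps failed to compile.
If none of that turns up a problem, check whether Kaldi itself was installed and configured correctly.
For Kaldi installation and setup, see: https://blog.csdn.net/lovechris00/article/details/128347128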
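The "find it, then add its folder to PATH" advice can be sketched as a small shell function. The name add_cmd_to_path and its two arguments (command name, directory to search) are hypothetical; the change to PATH only lasts for the current shell session.

```shell
# Hypothetical helper: search a directory tree for an executable file
# with the given name and prepend its folder to PATH for this session.
add_cmd_to_path() {
    name=$1
    root=$2
    # Regular files with the owner-execute bit set; keep the first match.
    found=$(find "$root" -type f -name "$name" -perm -u+x 2>/dev/null | head -n 1)
    if [ -z "$found" ]; then
        echo "$name not found under $root; a build step may have failed" >&2
        return 1
    fi
    PATH="$(dirname "$found"):$PATH"
    export PATH
}
```

For example, add_cmd_to_path compute-fbank-feats ~/xxcode/kaldi would search the Kaldi tree used in step 4 for that binary; a non-zero return suggests the binary was never built.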
2023-01-28 (Saturday)
Seventh day of the Lunar New Year and the first day back at work. 伊织 wishes everyone success in their studies.