Short audio clips: https://huggingface.co/datasets/PolyAI/minds14/viewer/en-US
Longer audio clips: https://huggingface.co/datasets/librispeech_asr?row=8
Download the dataset
from datasets import load_dataset
minds_14 = load_dataset("PolyAI/minds14", "en-US", split="train", cache_dir="./datasets")  # for en-US
# to download all data for multi-lingual fine-tuning, uncomment the following line
# minds_14 = load_dataset("PolyAI/minds14", "all")
# see structure
print(minds_14)
# load audio sample on the fly
audio_input = minds_14[0]["audio"] # first decoded audio sample
intent_class = minds_14[0]["intent_class"]  # integer intent label of the first sample
intent = minds_14.features["intent_class"].names[intent_class]  # human-readable intent name
# use audio_input and intent_class to fine-tune your model for audio classification
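Since these notes separate short clips from long ones, it can help to measure a clip's length directly. The sketch below assumes the decoded sample is a dict with "array" and "sampling_rate" keys, which is how the audio column has commonly been returned; newer datasets releases may return a decoder object instead. The helper name clip_duration_seconds and the stand-in sample are hypothetical.

```python
def clip_duration_seconds(audio: dict) -> float:
    """Return the clip length in seconds from a decoded audio dict."""
    # Number of samples divided by samples per second gives seconds.
    return len(audio["array"]) / audio["sampling_rate"]


# Stand-in for minds_14[0]["audio"] so the sketch runs without a download:
# 8000 samples at 8000 Hz is exactly one second.
fake_audio = {"array": [0.0] * 8000, "sampling_rate": 8000}
print(clip_duration_seconds(fake_audio))  # 1.0
```

With the real dataset loaded, the same call would be clip_duration_seconds(minds_14[0]["audio"]).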
Test with Whisper
Using faster-whisper
Download and install
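A minimal usage sketch, assuming the library meant here is the faster-whisper package (usually installed with pip install faster-whisper). WhisperModel and transcribe are the package's documented entry points. The model size, the compute types, the helper names format_segment and transcribe_file, and the file path are illustrative choices, not settings taken from these notes.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcribed segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_file(path: str, model_size: str = "small", device: str = "cuda"):
    """Transcribe one audio file and return a list of formatted lines."""
    # Imported lazily so this module still loads when faster-whisper is absent.
    from faster_whisper import WhisperModel

    # float16 on GPU and int8 on CPU are common choices, not requirements.
    compute_type = "float16" if device == "cuda" else "int8"
    model = WhisperModel(model_size, device=device, compute_type=compute_type)

    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding only happens while the generator is consumed.
    segments, info = model.transcribe(path, beam_size=5)
    print(f"Detected language: {info.language}")
    return [format_segment(s.start, s.end, s.text) for s in segments]


# Example call ("sample.wav" is a placeholder path):
# for line in transcribe_file("sample.wav"):
#     print(line)
```

Passing device="cpu" avoids the CUDA libraries entirely, at the cost of speed.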
Error message:
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
Please make sure libcudnn_ops_infer.so.8 is in your library path!
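This message means the dynamic loader cannot find the cuDNN 8 shared library. The usual remedy is to install cuDNN 8 and make sure its directory is on the library path (LD_LIBRARY_PATH on Linux). The check below is a small sketch using only the standard library; the helper name can_load_shared_library is hypothetical, and it only reports whether the loader can open the file, not whether the version matches what the GPU backend expects.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can find and open `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the file is missing or one of its dependencies is.
        return False


# Library name copied from the error message above.
lib = "libcudnn_ops_infer.so.8"
status = "found" if can_load_shared_library(lib) else "NOT found on the library path"
print(f"{lib}: {status}")
```

Running this before and after changing the library path gives a quick confirmation that the fix took effect.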