https://www.cnblogs.com/ytxwzqin/p/12673661.html
"A boon for accompaniment extraction: the vocal separation framework Spleeter": https://mp.weixin.qq.com/s/e8CO6JQE0RFaRFlNUDEjNw
"How can deep learning be used for single-channel speech separation?" (Tencent Cloud Developer Community): https://cloud.tencent.com/developer/article/1460644
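https://colab.research.google.com/github/deezer/spleeter/blob/master/spleeter.ipynb#scrollTo=wB58Tiv1ikJ1
Spleeter is an open-source audio source separation tool from Deezer, the French music streaming company. The project was published on GitHub (repository deezer/spleeter) in late 2019. It is a machine learning tool driven from code or the command line that separates the vocals and the individual instruments in a piece of music, into at most five parts: vocals, drums, bass, piano and other (in practice, splitting into just two tracks, accompaniment plus vocals, is usually recommended).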
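In terms of functionality, the pretrained models currently available are 2stems (vocals / accompaniment), 4stems (vocals / drums / bass / other) and 5stems (vocals / drums / bass / piano / other). In terms of performance, the Spleeter README states that with GPU acceleration the 4stems model can separate about 100 s of music in roughly 1 s. In terms of quality, Spleeter's reported metrics compared favourably with the other open-source models available when it was released.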
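To make the three configurations concrete, here is a minimal sketch that maps each one to the files a separation run would be expected to produce. `STEMS` and `expected_outputs` are hypothetical helpers written for this post, not part of Spleeter, and the directory layout assumes Spleeter's default naming of `{filename}/{instrument}.{codec}`.

```python
from pathlib import PurePosixPath

# Stem names per pretrained configuration (hypothetical lookup table).
STEMS = {
    "spleeter:2stems": ("vocals", "accompaniment"),
    "spleeter:4stems": ("vocals", "drums", "bass", "other"),
    "spleeter:5stems": ("vocals", "drums", "bass", "piano", "other"),
}


def expected_outputs(config, destination, audio_file, codec="wav"):
    """Return the output paths we would expect for one input file.

    Assumes the default layout "{filename}/{instrument}.{codec}";
    a custom filename format would change these paths.
    """
    track_name = PurePosixPath(audio_file).stem
    return [
        PurePosixPath(destination) / track_name / f"{stem}.{codec}"
        for stem in STEMS[config]
    ]


for path in expected_outputs("spleeter:2stems", "output_vocals", "audio/demo.wav"):
    print(path)
```

For the 2stems model this lists `output_vocals/demo/vocals.wav` and `output_vocals/demo/accompaniment.wav`; the first of these is the file the script in section 2 reads back.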
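Spleeter separates tracks in the frequency domain. In its network architecture each output track has its own U-Net: 2stems uses two U-Nets and 4stems uses four. Each U-Net takes the magnitude spectrogram of the mixture as input and outputs the magnitude spectrogram of one track. During training, the loss is the L1 distance between the predicted magnitude spectrogram and the reference one. Inference differs slightly: from the magnitude spectrograms predicted for all tracks, Spleeter computes the share of the input's energy that belongs to each track, that is, one mask per track. Multiplying the input spectrogram by each mask gives the output spectrogram of that track, which is then converted back to a WAV signal.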
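One training example is a triple (mixture, accompaniment, vocals), and the three signals must be aligned on the time axis as exactly as possible. The spectrograms of all three are extracted and their magnitude spectrograms computed. The mixture's magnitude spectrogram is fed to both the vocal U-Net and the accompaniment U-Net, giving the predicted vocal and accompaniment magnitude spectrograms; the distance between each prediction and its reference is computed and the two distances are averaged. The parameters of both U-Nets are updated continually as training data is fed in.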
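The training objective described above can be sketched with NumPy on toy data. This is only an illustration: the function names are hypothetical, the arrays stand in for real U-Net outputs, and using the mean (rather than the sum) of absolute differences as the L1 distance is an assumption.

```python
import numpy as np


def l1_distance(predicted, reference):
    """Mean absolute difference between two magnitude spectrograms."""
    return float(np.mean(np.abs(predicted - reference)))


def two_stem_loss(pred_vocals, pred_accomp, ref_vocals, ref_accomp):
    """Average of the per-stem L1 distances (vocals and accompaniment)."""
    return 0.5 * (
        l1_distance(pred_vocals, ref_vocals)
        + l1_distance(pred_accomp, ref_accomp)
    )


# Toy magnitude spectrograms of shape (time frames, frequency bins).
ref_vocals = np.array([[1.0, 0.0], [0.5, 0.5]])
ref_accomp = np.array([[0.0, 2.0], [1.0, 1.0]])
pred_vocals = ref_vocals + 0.1  # every bin is off by 0.1
pred_accomp = ref_accomp - 0.3  # every bin is off by 0.3

loss = two_stem_loss(pred_vocals, pred_accomp, ref_vocals, ref_accomp)
print(round(loss, 6))  # 0.2
```

In real training this scalar would be minimised by gradient descent over the parameters of both U-Nets.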
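At inference time there are no reference vocals or accompaniment, only the mixed music. Once the vocal and accompaniment magnitude spectrograms have been predicted, Spleeter squares each of them to obtain the vocal energy Engv and the accompaniment energy Enga. Maskv = Engv / (Engv + Enga) then gives the vocal share of the music in every frequency band at every time frame, and Maska = Enga / (Engv + Enga) gives the accompaniment share. The input music spectrogram is multiplied by Maskv and by Maska to obtain the vocal and accompaniment spectrograms, and finally an ISTFT produces the vocal and accompaniment WAV audio files.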
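The masking step can be sketched in a few lines of NumPy. This follows only the formulas in the paragraph above and is not Spleeter's actual code: the function names are hypothetical, the small `eps` that guards against division by zero is an addition, and the final ISTFT step is left out.

```python
import numpy as np


def ratio_masks(mag_vocals, mag_accomp, eps=1e-10):
    """Compute Maskv and Maska from two predicted magnitude spectrograms."""
    eng_v = mag_vocals ** 2  # vocal energy, Engv
    eng_a = mag_accomp ** 2  # accompaniment energy, Enga
    total = eng_v + eng_a + eps  # eps avoids 0/0 in silent bins (assumption)
    return eng_v / total, eng_a / total


def apply_masks(mix_stft, mag_vocals, mag_accomp):
    """Multiply the complex mixture spectrogram by each mask."""
    mask_v, mask_a = ratio_masks(mag_vocals, mag_accomp)
    return mix_stft * mask_v, mix_stft * mask_a


# Toy data: one time frame, two frequency bins.
mix_stft = np.array([[2.0 + 0.0j, 0.0 + 4.0j]])
mag_vocals = np.array([[3.0, 1.0]])
mag_accomp = np.array([[4.0, 1.0]])

mask_v, mask_a = ratio_masks(mag_vocals, mag_accomp)
vocals_stft, accomp_stft = apply_masks(mix_stft, mag_vocals, mag_accomp)
print(np.round(mask_v, 3))  # [[0.36 0.5 ]]
```

Because the two masks sum to one in every bin, the two masked spectrograms add back up to the mixture, and the phase of the mixture is reused unchanged for both outputs.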
1. Install
Package Version
-------------------------------- ---------------------
absl-py 1.4.0
aiohttp 3.8.4
aiosignal 1.3.1
alabaster 0.7.13
albumentations 1.2.1
altair 4.2.2
anyio 3.7.0
appdirs 1.4.4
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
array-record 0.4.0
arviz 0.15.1
astropy 5.2.2
astunparse 1.6.3
async-timeout 4.0.2
attrs 23.1.0
audioread 3.0.0
autograd 1.6.1
Babel 2.12.1
backcall 0.2.0
beautifulsoup4 4.11.2
bleach 6.0.0
blis 0.7.9
blosc2 2.0.0
bokeh 2.4.3
branca 0.6.0
build 0.10.0
CacheControl 0.13.1
cached-property 1.5.2
cachetools 5.3.1
catalogue 2.0.8
certifi 2023.5.7
cffi 1.15.1
chardet 4.0.0
charset-normalizer 2.0.12
chex 0.1.7
click 7.1.2
click-plugins 1.1.1
cligj 0.7.2
cloudpickle 2.2.1
cmake 3.25.2
cmdstanpy 1.1.0
colorcet 3.0.1
colorlover 0.3.0
community 1.0.0b1
confection 0.0.4
cons 0.4.6
contextlib2 0.6.0.post1
contourpy 1.1.0
convertdate 2.4.0
cufflinks 0.17.3
cupy-cuda11x 11.0.0
cvxopt 1.3.1
cvxpy 1.3.1
cycler 0.11.0
cymem 2.0.7
Cython 0.29.35
dask 2022.12.1
datascience 0.17.6
db-dtypes 1.1.1
dbus-python 1.2.16
debugpy 1.6.6
decorator 4.4.2
defusedxml 0.7.1
distributed 2022.12.1
dlib 19.24.2
dm-tree 0.1.8
docutils 0.16
dopamine-rl 4.0.6
duckdb 0.8.1
earthengine-api 0.1.357
easydict 1.10
ecos 2.0.12
editdistance 0.6.2
en-core-web-sm 3.5.0
entrypoints 0.4
ephem 4.1.4
et-xmlfile 1.1.0
etils 1.3.0
etuples 0.3.9
exceptiongroup 1.1.1
fastai 2.7.12
fastcore 1.5.29
fastdownload 0.0.7
fastjsonschema 2.17.1
fastprogress 1.0.3
fastrlock 0.8.1
ffmpeg-python 0.2.0
filelock 3.12.2
Fiona 1.9.4.post1
firebase-admin 5.3.0
Flask 2.2.5
flatbuffers 23.5.26
flax 0.6.11
folium 0.14.0
fonttools 4.40.0
frozendict 2.3.8
frozenlist 1.3.3
fsspec 2023.6.0
future 0.18.3
gast 0.4.0
gcsfs 2023.6.0
GDAL 3.3.2
gdown 4.6.6
gensim 4.3.1
geographiclib 2.0
geopandas 0.13.2
geopy 2.3.0
gin-config 0.5.0
glob2 0.7
google 2.0.3
google-api-core 2.11.1
google-api-python-client 2.84.0
google-auth 2.17.3
google-auth-httplib2 0.1.0
google-auth-oauthlib 1.0.0
google-cloud-bigquery 3.10.0
google-cloud-bigquery-connection 1.12.0
google-cloud-bigquery-storage 2.20.0
google-cloud-core 2.3.2
google-cloud-datastore 2.15.2
google-cloud-firestore 2.11.1
google-cloud-functions 1.13.0
google-cloud-language 2.9.1
google-cloud-storage 2.8.0
google-cloud-translate 3.11.1
google-colab 1.0.0
google-crc32c 1.5.0
google-pasta 0.2.0
google-resumable-media 2.5.0
googleapis-common-protos 1.59.1
googledrivedownloader 0.4
graphviz 0.20.1
greenlet 2.0.2
grpc-google-iam-v1 0.12.6
grpcio 1.56.0
grpcio-status 1.48.2
gspread 3.4.2
gspread-dataframe 3.0.8
gym 0.25.2
gym-notices 0.0.8
h11 0.12.0
h2 4.1.0
h5netcdf 1.2.0
h5py 3.8.0
holidays 0.27.1
holoviews 1.15.4
hpack 4.0.0
html5lib 1.1
httpcore 0.13.7
httpimport 1.3.0
httplib2 0.21.0
httpx 0.19.0
humanize 4.6.0
hyperframe 6.0.1
hyperopt 0.2.7
idna 3.4
imageio 2.25.1
imageio-ffmpeg 0.4.8
imagesize 1.4.1
imbalanced-learn 0.10.1
imgaug 0.4.0
importlib-resources 5.12.0
imutils 0.5.4
inflect 6.0.4
iniconfig 2.0.0
intel-openmp 2023.1.0
ipykernel 5.5.6
ipython 7.34.0
ipython-genutils 0.2.0
ipython-sql 0.4.1
ipywidgets 7.7.1
itsdangerous 2.1.2
jax 0.4.10
jaxlib 0.4.10+cuda11.cudnn86
jieba 0.42.1
Jinja2 3.1.2
joblib 1.2.0
jsonpickle 3.0.1
jsonschema 4.3.3
jupyter-client 6.1.12
jupyter-console 6.1.0
jupyter_core 5.3.1
jupyter-server 1.24.0
jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7
kaggle 1.5.13
keras 2.12.0
kiwisolver 1.4.4
langcodes 3.3.0
lazy_loader 0.2
libclang 16.0.0
librosa 0.8.1
lightgbm 3.3.5
lit 16.0.6
llvmlite 0.38.1
locket 1.0.0
logical-unification 0.4.6
LunarCalendar 0.0.9
lxml 4.9.2
Markdown 3.4.3
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.7.1
matplotlib-inline 0.1.6
matplotlib-venn 0.11.9
mdurl 0.1.2
miniKanren 1.0.3
missingno 0.5.2
mistune 0.8.4
mizani 0.8.1
mkl 2019.0
ml-dtypes 0.2.0
mlxtend 0.14.0
more-itertools 9.1.0
moviepy 1.0.3
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
multipledispatch 0.6.0
multitasking 0.0.11
murmurhash 1.0.9
music21 8.1.0
natsort 8.3.1
nbclient 0.8.0
nbconvert 6.5.4
nbformat 5.9.0
nest-asyncio 1.5.6
networkx 3.1
nibabel 3.0.2
nltk 3.8.1
norbert 0.2.1
notebook 6.4.8
numba 0.55.2
numexpr 2.8.4
numpy 1.22.4
oauth2client 4.1.3
oauthlib 3.2.2
opencv-contrib-python 4.7.0.72
opencv-python 4.7.0.72
opencv-python-headless 4.7.0.72
openpyxl 3.0.10
opt-einsum 3.3.0
optax 0.1.5
orbax-checkpoint 0.2.6
osqp 0.6.2.post8
packaging 23.1
palettable 3.3.3
pandas 1.5.3
pandas-datareader 0.10.0
pandas-gbq 0.17.9
pandocfilters 1.5.0
panel 0.14.4
param 1.13.0
parso 0.8.3
partd 1.4.0
pathlib 1.0.1
pathy 0.10.2
patsy 0.5.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.4.0
pip 23.1.2
pip-tools 6.13.0
platformdirs 3.7.0
plotly 5.13.1
plotnine 0.10.1
pluggy 1.2.0
polars 0.17.3
pooch 1.6.0
portpicker 1.5.2
prefetch-generator 1.0.3
preshed 3.0.8
prettytable 0.7.2
proglog 0.1.10
progressbar2 4.2.0
prometheus-client 0.17.0
promise 2.3
prompt-toolkit 3.0.38
prophet 1.1.4
proto-plus 1.22.3
protobuf 3.20.3
psutil 5.9.5
psycopg2 2.9.6
ptyprocess 0.7.0
py-cpuinfo 9.0.0
py4j 0.10.9.7
pyarrow 9.0.0
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycocotools 2.0.6
pycparser 2.21
pyct 0.5.0
pydantic 1.10.9
pydata-google-auth 1.8.0
pydot 1.4.2
pydot-ng 2.0.0
pydotplus 2.0.2
PyDrive 1.3.1
pyerfa 2.0.0.3
pygame 2.4.0
Pygments 2.14.0
PyGObject 3.36.0
pymc 5.1.2
PyMeeus 0.5.12
pymystem3 0.2.0
PyOpenGL 3.1.7
pyparsing 3.1.0
pyproj 3.6.0
pyproject_hooks 1.0.0
pyrsistent 0.19.3
PySocks 1.7.1
pytensor 2.10.1
pytest 7.2.2
python-apt 0.0.0
python-dateutil 2.8.2
python-louvain 0.16
python-slugify 8.0.1
python-utils 3.7.0
pytz 2022.7.1
pyviz-comms 2.3.2
PyWavelets 1.4.1
PyYAML 6.0
pyzmq 23.2.1
qdldl 0.1.7
qudida 0.0.4
regex 2022.10.31
requests 2.27.1
requests-oauthlib 1.3.1
requests-unixsocket 0.2.0
requirements-parser 0.5.0
resampy 0.4.2
rfc3986 1.5.0
rich 13.4.2
rpy2 3.5.5
rsa 4.9
scikit-image 0.19.3
scikit-learn 1.2.2
scipy 1.10.1
scs 3.2.3
seaborn 0.12.2
Send2Trash 1.8.2
setuptools 67.7.2
shapely 2.0.1
six 1.16.0
sklearn-pandas 2.2.0
smart-open 6.3.0
sniffio 1.3.0
snowballstemmer 2.2.0
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.4.1
soxr 0.3.5
spacy 3.5.3
spacy-legacy 3.0.12
spacy-loggers 1.0.4
Sphinx 3.5.4
sphinxcontrib-applehelp 1.0.4
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.1
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
spleeter 2.3.2
SQLAlchemy 2.0.16
sqlparse 0.4.4
srsly 2.4.6
statsmodels 0.13.5
sympy 1.11.1
tables 3.8.0
tabulate 0.8.10
tblib 2.0.0
tenacity 8.2.2
tensorboard 2.12.3
tensorboard-data-server 0.7.1
tensorflow 2.12.0
tensorflow-datasets 4.9.2
tensorflow-estimator 2.12.0
tensorflow-gcs-config 2.12.0
tensorflow-hub 0.13.0
tensorflow-io-gcs-filesystem 0.32.0
tensorflow-metadata 1.13.1
tensorflow-probability 0.20.1
tensorstore 0.1.38
termcolor 2.3.0
terminado 0.17.1
text-unidecode 1.3
textblob 0.17.1
tf-slim 1.1.0
thinc 8.1.10
threadpoolctl 3.1.0
tifffile 2023.4.12
tinycss2 1.2.1
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 2.0.1+cu118
torchaudio 2.0.2+cu118
torchdata 0.6.1
torchsummary 1.5.1
torchtext 0.15.2
torchvision 0.15.2+cu118
tornado 6.3.1
tqdm 4.65.0
traitlets 5.7.1
triton 2.0.0
tweepy 4.13.0
typer 0.3.2
types-setuptools 68.0.0.0
typing_extensions 4.6.3
tzlocal 5.0.1
uritemplate 4.1.1
urllib3 1.26.16
vega-datasets 0.9.0
wasabi 1.1.2
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.6.0
Werkzeug 2.3.6
wheel 0.40.0
widgetsnbextension 3.6.4
wordcloud 1.8.2.2
wrapt 1.14.1
xarray 2022.12.0
xarray-einstats 0.5.1
xgboost 1.7.6
xlrd 2.0.1
yarl 1.9.2
yellowbrick 1.5
yfinance 0.2.21
zict 3.0.0
zipp 3.15.0
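Most of the list above is the stock environment; only a handful of entries matter for the script in the next section. As a minimal sketch, the check below compares a few installed versions against the ones listed. Which packages to check is an assumption made here, and `PINNED` and `check_versions` are hypothetical names.

```python
from importlib import metadata

# Versions copied from the listing above; the selection is an assumption.
PINNED = {
    "spleeter": "2.3.2",
    "tensorflow": "2.12.0",
    "librosa": "0.8.1",
    "moviepy": "1.0.3",
    "numpy": "1.22.4",
    "ffmpeg-python": "0.2.0",
}


def check_versions(pinned):
    """Map each package to (wanted, installed or None, versions match)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        report[name] = (wanted, found, found == wanted)
    return report


for name, (wanted, found, ok) in check_versions(PINNED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: wanted {wanted}, found {found} ({status})")
```

A different version is not necessarily a problem; the report only shows where an environment deviates from the one used here.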
2. Usage
import os
from glob import glob
from pathlib import Path

from moviepy.editor import VideoFileClip, AudioFileClip
from spleeter.separator import Separator

# Load the pretrained two-stem model (vocals + accompaniment) once and reuse it
separator = Separator('spleeter:2stems')


# Extract the audio track of a video into an audio file
def extract_audio(video_path, audio_path):
    video = VideoFileClip(video_path)
    audio = video.audio
    audio.write_audiofile(audio_path)


# Separate the audio; the results are written to <vocals_path>/<file name>/
def separate_vocals(audio_path, vocals_path):
    separator.separate_to_file(audio_path, vocals_path)


# Remove the background sound by replacing the video's audio with the vocals
def remove_background(video_path, no_bg_path, vocals_path):
    vocals = AudioFileClip(vocals_path)
    video = VideoFileClip(video_path)
    video = video.set_audio(vocals)
    video.write_videofile(no_bg_path)


def main():
    # Paths of the video files
    video_paths = glob(os.path.join(r"E:\common_tools\wav2lip_tools\wav2lip_tools\data", '*.mp4'))
    save_path = ""
    Path(save_path).mkdir(parents=True, exist_ok=True)
    for video in video_paths:
        # Path of the extracted audio file
        audio_path = os.path.join(save_path, f"audio/{Path(video).stem}.wav")
        Path(os.path.join(save_path, "audio")).mkdir(parents=True, exist_ok=True)
        # Directory for the separated audio files
        vocals_path = os.path.join(save_path, "output_vocals")
        Path(vocals_path).mkdir(parents=True, exist_ok=True)
        # Path of the video with the background sound removed
        no_bg_path = os.path.join(save_path, f"output_no_bg/{Path(video).stem}.mp4")
        Path(os.path.join(save_path, "output_no_bg")).mkdir(parents=True, exist_ok=True)
        # Extract the audio
        extract_audio(video, audio_path)
        # Separate the vocals
        separate_vocals(audio_path, vocals_path)
        # Remove the background sound
        vocals_path_ = os.path.join(vocals_path, f"{Path(video).stem}/vocals.wav")
        remove_background(video, no_bg_path, vocals_path_)


if __name__ == "__main__":
    main()
Even on a CPU, separation is fast.
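To put a number on the speed, one option is to time a separation call and divide the audio duration by the elapsed time. The helper below is a minimal sketch; `real_time_factor` is a hypothetical name, and the callable passed in is whatever separation step should be measured.

```python
import time


def real_time_factor(separate, audio_seconds):
    """Run `separate()` once; return audio duration / wall-clock time.

    A value above 1 means faster than real time.
    """
    start = time.perf_counter()
    separate()
    elapsed = time.perf_counter() - start
    return audio_seconds / elapsed


# Stand-in workload so the sketch runs on its own: "process" 10 s of audio.
factor = real_time_factor(lambda: time.sleep(0.05), audio_seconds=10.0)
print(f"{factor:.0f}x real time")
```

With the script above, the callable could be `lambda: separate_vocals(audio_path, vocals_path)` and `audio_seconds` the duration of the extracted audio. The first call is likely to include model loading, so a second run gives a fairer figure.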