一文梳理ChatTTS的进阶用法,手把手带你实现个性化配音,音色、语速、停顿,口语,全搞定

news2024/11/24 6:22:46

前几天和大家分享了如何从0到1搭建一套语音交互系统。

其中,语音合成(TTS)是提升用户体验的关键所在。于是,上一篇接着和大家聊了聊:全网爆火的AI语音合成工具-ChatTTS,有人已经拿它赚到了第一桶金,送增强版整合包。

后台有小伙伴反应实测中发现了一些常见的问题,今天,单独开一篇关于ChatTTS的进阶教程,手把手带你实现如何固定音色、设置语速、添加停顿词、口头语、笑声等,以及超长文本生成背后的原理

全程干货满满,赶紧上车~

全文目录

# ChatTTS 简介
# ChatTTS 基础-环境准备
# ChatTTS 进阶-固定音色
# ChatTTS 进阶-固定语速
# ChatTTS 进阶-添加停顿
## 添加停顿词
## 添加笑声
## 添加口头语
# ChatTTS 进阶-超长文本语音生成

ChatTTS 简介

GitHub 仓库地址:https://github.com/2noise/ChatTTS

ChatTTS 是一个文本转语音的开源项目,短短2周左右的时间,在 GitHub 上已经斩获了 24.4k 的 Star,绝对的爆火级项目。
在这里插入图片描述
还没有体验过 ChatTTS 的小伙伴,出门右转看这篇👉::全网爆火的AI语音合成工具-ChatTTS,有人已经拿它赚到了第一桶金,送增强版整合包。

下面我们直接进入实操,先准备好本地环境,然后开启进阶用法。

ChatTTS 基础-环境准备

1. 安装环境

在自己本地,打开一个终端:

git clone https://github.com/2noise/ChatTTS
cd ChatTTS
pip install -r requirements.txt

2.下载模型

最后先下载模型到本地(代码中自动下载非常慢)。

**方式一:**从huggingface下载(太慢了,很有可能失败,不推荐

# 先安装需要用到的包
pip install huggingface-hub
pip install hf_transfer
conda install -c conda-forge pynini=2.1.5 && pip install WeTextProcessing
pip install nemo_text_processing

# 编写代码
import os
os.environ['HF_ENDPOINT']="https://hf-mirror.com"
os.environ['HF_HUB_ENABLE_HF_TRANSFER']="1"
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="2Noise/ChatTTS", # 模型名称 or 地址
    local_dir='../models/ChatTTS', # 保存模型到指定位置,供下次加载使用
    cache_dir='../models' # 缓存地址
    )

方式二:从modelscope上下载 (速度更快,强烈推荐

git clone https://www.modelscope.cn/henjicc/ChatTTS.git

3.小试牛刀-文本转语音

进行模型推理实现中文文本转语音,注意将 local_path 改为你自己下载好的模型地址。

import torch
import torchaudio
import ChatTTS

torch._dynamo.config.cache_size_limit = 64
torch._dynamo.config.suppress_errors = True
torch.set_float32_matmul_precision('high')

chat = ChatTTS.Chat()
# compile=True效果好,但是更慢
chat.load_models(source='local', local_path='/path/to/models/ChatTTS', compile=False)

texts = [
    "你好啊,我是猴哥,欢迎您关注 猴哥的AI知识库 公众号,下次更新不迷路",
    ]
wavs = chat.infer(texts)

torchaudio.save("output.wav", torch.from_numpy(wavs[0]), 24000)

如果没有报错,上述代码会将生成结果保存在当前目录,打开试试吧~

你会发现每次生成的音频音色是不一样的,这是因为 ChatTTS 每次都会随机选择一个音色进行合成。

如果想要固定音色,以及更多个性化设置,继续往下看~

ChatTTS 进阶-固定音色

在ChatTTS中,控制音色的主要是参数 spk_emb,让我们看下源码:

def sample_random_speaker(self, ):
  dim = self.pretrain_models['gpt'].gpt.layers[0].mlp.gate_proj.in_features
  # dim=768
  std, mean = self.pretrain_models['spk_stat'].chunk(2)
  # std.size:(768),mean.size(768)
  return torch.randn(dim, device=std.device) * std + mean

sample_random_speaker 返回的是一个 768 维的向量,也就是说如果你把这个 768 维的向量固定住,就可以获取一个固定的音色了。

为了节省大家抽卡获取音色的时间,这里猴哥梳理了两个有特点的音色。保存下来,直接取用~

音色一:男-铿锵有力:

'3.281,2.916,2.316,2.280,-0.884,0.723,-10.224,-0.266,2.000,4.442,1.427,-1.075,-3.027,0.438,1.224,1.043,-0.423,-3.808,0.740,-4.704,10.824,-0.362,-0.291,10.761,1.843,6.068,-4.458,5.142,4.183,4.333,-4.859,-1.444,0.901,3.424,-0.992,-1.645,0.230,0.818,-2.294,1.417,2.839,-0.533,-2.302,-1.379,-3.898,0.998,-0.196,14.249,1.641,-6.927,-1.283,6.005,-0.125,1.531,3.420,8.003,-0.202,-2.823,4.435,2.378,5.186,4.951,1.945,-2.100,1.644,1.921,-1.138,6.229,-5.203,3.610,-0.919,-5.326,2.130,-4.196,21.073,-0.012,11.642,-4.142,-0.581,-0.138,0.882,-1.235,2.377,6.930,3.499,-0.239,4.493,0.411,-2.122,-7.575,0.809,0.885,0.008,0.900,9.930,-3.983,-0.462,-0.699,-4.674,5.071,-3.361,-2.448,0.742,-3.553,-6.620,-0.570,2.859,-3.424,8.462,2.013,-0.204,6.866,-1.946,0.946,-0.440,-4.125,1.194,-2.351,5.368,-5.819,0.049,-0.918,-1.812,-2.288,4.262,12.440,2.084,-0.156,-5.392,0.106,3.279,4.475,0.139,1.006,5.619,0.716,-5.918,-0.774,-5.805,-7.207,-0.402,0.228,-1.422,4.098,1.244,-1.481,-0.672,-3.778,1.220,-0.947,0.807,-1.658,5.977,7.529,-1.070,-2.112,-2.946,-1.739,-2.284,-1.045,-0.442,-1.813,1.269,-2.797,0.321,0.155,-0.379,-2.554,-1.409,2.421,0.318,4.154,0.647,0.610,-3.516,4.489,3.286,3.552,-0.358,1.132,-2.382,-0.612,-5.613,-0.168,-0.680,1.723,-20.408,-2.504,5.870,1.631,4.703,-0.359,0.367,-1.661,-2.172,-9.155,-3.521,-1.058,4.984,-4.086,-3.736,8.959,-1.759,-1.117,17.161,1.497,1.160,2.355,2.584,-2.312,-2.436,0.787,-4.522,5.730,-2.144,1.276,-5.419,-5.749,8.496,5.873,-0.472,-1.497,1.162,8.649,0.326,-0.714,4.123,-3.431,-1.579,0.016,-4.171,-15.907,0.607,1.909,-3.616,-0.803,-0.046,1.836,5.704,-0.730,3.511,7.312,-0.066,0.972,0.199,1.777,-3.750,0.089,18.306,1.156,-2.320,1.280,-2.491,-2.765,-19.535,-1.302,-8.004,-1.833,0.941,-3.547,-0.741,-1.860,-1.626,-2.555,-0.086,0.221,7.494,8.971,-2.007,-2.635,-2.494,-1.794,-5.080,4.963,-1.599,-1.862,-0.225,0.518,2.853,1.657,-0.257,-1.165,-7.366,2.681,5.232,0.592,-2.189,1.606,-1.407,-2.102,-2.945,2.853,8.901,0.029,-5.533,2.786,-0.961,2.544,1.200,-1.171,1.578,-1.681,-0.955,-2.784,2.281,-3.195,2.811,-1.718,0.810,-0.603,-1.736,-1.373,-0.891,-5.330,-1.381,9.614,4.155,0.047,-1.053,3.823,1.010,0.042,-2.849,1.482,0.798,19.538,-12.677,3.608,-1.660,4.493,2.087,-0.906,1.919,1.318,-1.815,-1.287,9.502,0.361,-4.073,-0.589,-1.339,5.567,1.668,0.479,0.333,-26.911,-3.391,-1.020,-4.389,-4.838,2.512,1.478,-3.802,-1.676,-0.469,0.898,3.632,-1.620,0.432,-1.906,-1.674,-7.568,0.361,1.675,-1.164,-13.232,-1.971,-3.983,-2.921,8.597,-0.231,-0.449,0.392,4.953,-5.841,0.047,4.419,-5.355,-4.528,-1.681,-8.136,0.295,3.060,-7.177,-1.483,1.266,2.680,2.265,-8.495,-2.133,1.485,-1.022,-3.286,-1.811,-1.127,-4.633,0.473,6.305,-4.399,-0.452,1.234,-0.245,4.392,2.415,-2.063,3.707,-4.495,3.831,29.219,-11.790,-0.259,1.078,-7.773,1.709,-1.397,-0.067,0.768,2.663,-1.454,-3.022,-1.810,5.015,1.889,1.119,-1.413,-3.240,4.198,14.864,-2.950,1.651,-3.228,-1.095,-2.941,1.231,-8.138,-1.936,-2.160,5.722,-5.481,0.544,-0.786,-1.140,-1.677,2.252,0.031,-2.385,-1.336,3.705,-3.270,-0.495,2.222,1.482,-6.423,-0.509,-1.519,5.775,0.420,-2.140,-3.850,2.567,8.821,-6.976,-6.974,-1.999,-3.226,-1.778,-6.607,9.911,-9.480,1.611,-0.308,-9.211,-1.476,1.751,-0.466,-2.363,-2.358,-4.866,-1.147,-3.444,-6.753,-0.507,-1.245,-2.413,-1.962,2.347,-0.094,-0.745,0.668,2.071,-0.755,-2.450,6.657,7.899,-6.277,2.189,-2.127,2.291,6.095,0.927,-0.250,-1.662,4.031,6.098,-17.201,-9.840,-0.729,1.427,3.222,-0.208,2.380,1.103,1.147,4.051,2.980,0.371,4.143,-1.133,0.486,-0.588,-0.392,-1.507,-0.803,-6.217,-0.947,-5.456,-9.972,-1.221,-1.677,0.376,-0.066,-0.987,-1.144,2.161,-0.282,1.924,1.364,10.975,-6.569,-5.145,-1.047,-1.040,-6.781,-0.387,-21.836,0.683,3.180,-0.757,-0.025,3.418,0.626,1.140,-4.496,-2.867,-2.973,10.865,2.019,-6.465,9.939,-0.107,-2.146,4.367,0.148,-1.867,1.160,-13.273,7.023,0.068,-1.266,-4.770,-1.669,-0.643,1.833,-0.005,-3.790,3.982,-4.962,-0.175,2.481,-0.260,-1.224,1.529,1.225,5.921,5.203,2.303,-2.092,-0.826,5.299,-0.106,0.513,-3.272,1.970,-2.140,-0.033,-3.860,2.144,2.398,-1.478,4.514,1.553,0.901,-2.297,4.627,-0.360,-0.540,0.280,4.952,-2.520,-3.369,19.334,-0.363,0.924,-0.099,-1.544,1.592,-0.163,6.088,-2.863,2.072,-0.336,-1.897,4.755,-3.300,-6.699,-3.852,0.276,-5.298,-0.012,-6.344,3.159,9.078,-2.398,-1.005,-5.604,1.775,4.505,1.172,-3.949,-1.302,5.698,-0.228,-8.495,-4.654,2.533,-4.065,0.098,0.392,2.624,0.928,-0.316,-0.798,-1.814,-3.183,-3.962,-2.339,-2.051,3.898,-1.971,-2.940,8.407,0.673,2.367,7.541,3.822,-6.822,-2.951,-2.204,10.548,-0.255,-1.850,1.263,1.779,1.912,0.608,-1.325,2.220,2.637,1.699,-8.412,-1.397,-0.422,-4.485,3.247,0.382,7.712,-2.262,-1.700,-0.506,7.307,7.266,-0.498,-6.427,-5.879,14.950,0.991,3.057,10.976,-3.662,-0.781,-0.486,7.302,1.052,-3.824,-9.337,-0.783,3.696,-1.935,1.468,-2.889,-1.950,-1.511,2.164,2.791,0.533,-0.986,0.150,1.650,0.909,1.949,-4.902,1.626,0.463,2.622,-1.272,2.164,2.691,-4.966,-0.418,-0.016,-8.490,-3.339,5.198,-3.525,2.392,0.445,1.293,3.217,6.489,1.553,2.088,-1.272,-1.077,3.093,-7.780,4.537,0.824,-1.956,1.215,0.527,2.576,-3.828,2.519,3.089,-4.081,-2.068,-0.241,0.574,-0.245,-1.646,-0.643,3.113,0.861,9.045,-4.593,5.160,3.660,-1.495'

音色二:女-铿锵有力:

'-4.741,0.419,-3.355,3.652,-1.682,-1.254,9.719,1.436,0.871,12.334,-0.175,-2.653,-3.132,0.525,1.573,-0.351,0.030,-3.154,0.935,-0.111,-6.306,-1.840,-0.818,9.773,-1.842,-3.433,-6.200,-4.311,1.162,1.023,11.552,2.769,-2.408,-1.494,-1.143,12.412,0.832,-1.203,5.425,-1.481,0.737,-1.487,6.381,5.821,0.599,6.186,5.379,-2.141,0.697,5.005,-4.944,0.840,-4.974,0.531,-0.679,2.237,4.360,0.438,2.029,1.647,-2.247,-1.716,6.338,1.922,0.731,-2.077,0.707,4.959,-1.969,5.641,2.392,-0.953,0.574,1.061,-9.335,0.658,-0.466,4.813,1.383,-0.907,5.417,-7.383,-3.272,-1.727,2.056,1.996,2.313,-0.492,3.373,0.844,-8.175,-0.558,0.735,-0.921,8.387,-7.800,0.775,1.629,-6.029,0.709,-2.767,-0.534,2.035,2.396,2.278,2.584,3.040,-6.845,7.649,-2.812,-1.958,8.794,2.551,3.977,0.076,-2.073,-4.160,0.806,3.798,-1.968,-4.690,5.702,-4.376,-2.396,1.368,-0.707,4.930,6.926,1.655,4.423,-1.482,-3.670,2.988,-3.296,0.767,3.306,1.623,-3.604,-2.182,-1.480,-2.661,-1.515,-2.546,3.455,-3.500,-3.163,-1.376,-12.772,1.931,4.422,6.434,-0.386,-0.704,-2.720,2.177,-0.666,12.417,4.228,0.823,-1.740,1.285,-2.173,-4.285,-6.220,2.479,3.135,-2.790,1.395,0.946,-0.052,9.148,-2.802,-5.604,-1.884,1.796,-0.391,-1.499,0.661,-2.691,0.680,0.848,3.765,0.092,7.978,3.023,2.450,-15.073,5.077,3.269,2.715,-0.862,2.187,13.048,-7.028,-1.602,-6.784,-3.143,-1.703,1.001,-2.883,0.818,-4.012,4.455,-1.545,-14.483,-1.008,-3.995,2.366,3.961,1.254,-0.458,-1.175,2.027,1.830,2.682,0.131,-1.839,-28.123,-1.482,16.475,2.328,-13.377,-0.980,9.557,0.870,-3.266,-3.214,3.577,2.059,1.676,-0.621,-6.370,-2.842,0.054,-0.059,-3.179,3.182,3.411,4.419,-1.688,-0.663,-5.189,-5.542,-1.146,2.676,2.224,-5.519,6.069,24.349,2.509,4.799,0.024,-2.849,-1.192,-16.989,1.845,6.337,-1.936,-0.585,1.691,-3.564,0.931,0.223,4.314,-2.609,0.544,-1.931,3.604,1.248,-0.852,2.991,-1.499,-3.836,1.774,-0.744,0.824,7.597,-1.538,-0.009,0.494,-2.253,-1.293,-0.475,-3.816,8.165,0.285,-3.348,3.599,-4.959,-1.498,-1.492,-0.867,0.421,-2.191,-1.627,6.027,3.667,-21.459,2.594,-2.997,5.076,0.197,-3.305,3.998,1.642,-6.221,3.177,-3.344,5.457,0.671,-2.765,-0.447,1.080,2.504,1.809,1.144,2.752,0.081,-3.700,0.215,-2.199,3.647,1.977,1.326,3.086,34.789,-1.017,-14.257,-3.121,-0.568,-0.316,11.455,0.625,-6.517,-0.244,-8.490,9.220,0.068,-2.253,-1.485,3.372,2.002,-3.357,3.394,1.879,16.467,-2.271,1.377,-0.611,-5.875,1.004,12.487,2.204,0.115,-4.908,-6.992,-1.821,0.211,0.540,1.239,-2.488,-0.411,2.132,2.130,0.984,-10.669,-7.456,0.624,-0.357,7.948,2.150,-2.052,3.772,-4.367,-11.910,-2.094,3.987,-1.565,0.618,1.152,1.308,-0.807,1.212,-4.476,0.024,-6.449,-0.236,5.085,1.265,-0.586,-2.313,3.642,-0.766,3.626,6.524,-1.686,-2.524,-0.985,-6.501,-2.558,0.487,-0.662,-1.734,0.275,-9.230,-3.785,3.031,1.264,15.340,2.094,1.997,0.408,9.130,0.578,-2.239,-1.493,11.034,2.201,6.757,3.432,-4.133,-3.668,2.099,-6.798,-0.102,2.348,6.910,17.910,-0.779,4.389,1.432,-0.649,5.115,-1.064,3.580,4.129,-4.289,-2.387,-0.327,-1.975,-0.892,5.327,-3.908,3.639,-8.247,-1.876,-10.866,2.139,-3.932,-0.031,-1.444,0.567,-5.543,-2.906,1.399,-0.107,-3.044,-4.660,-1.235,-1.011,9.577,2.294,6.615,-1.279,-2.159,-3.050,-6.493,-7.282,-8.546,5.393,2.050,10.068,3.494,8.810,2.820,3.063,0.603,1.965,2.896,-3.049,7.106,-0.224,-1.016,2.531,-0.902,1.436,-1.843,1.129,6.746,-2.184,0.801,-0.965,-7.555,-18.409,6.176,-3.706,2.261,4.158,-0.928,2.164,-3.248,-4.892,-0.008,-0.521,7.931,-10.693,4.320,-0.841,4.446,-1.591,-0.702,4.075,3.323,-3.406,-1.198,-5.518,-0.036,-2.247,-2.638,2.160,-9.644,-3.858,2.402,-2.640,1.683,-0.961,-3.076,0.226,5.106,0.712,0.669,2.539,-4.340,-0.892,0.732,0.775,-2.757,4.365,-2.368,5.368,0.342,-0.655,0.240,0.775,3.686,-4.008,16.296,4.973,1.851,4.747,0.652,-2.117,6.470,2.189,-8.467,3.236,3.745,-1.332,3.583,-2.504,5.596,-2.440,0.995,-2.267,-3.322,3.490,1.156,1.716,0.669,-3.640,-1.709,5.055,6.265,-3.963,2.863,14.129,5.180,-3.590,0.393,0.234,-3.978,6.946,-0.521,1.925,-1.497,-0.283,0.895,-3.969,5.338,-1.808,-3.578,2.699,2.728,-0.895,-2.175,-2.717,2.574,4.571,1.131,2.187,3.620,-0.388,-3.685,0.979,2.731,-2.164,1.628,-1.006,-7.766,-11.033,-10.985,-2.413,-1.967,0.790,0.826,-1.623,-1.783,3.021,1.598,-0.931,-0.605,-1.684,1.408,-2.771,-2.354,5.564,-2.296,-4.774,-2.830,-5.149,2.731,-3.314,-1.002,3.522,3.235,-1.598,1.923,-2.755,-3.900,-3.519,-1.673,-2.049,-10.404,6.773,1.071,0.247,1.120,-0.794,2.187,-0.189,-5.591,4.361,1.772,1.067,1.895,-5.649,0.946,-2.834,-0.082,3.295,-7.659,-0.128,2.077,-1.638,0.301,-0.974,4.331,11.711,4.199,1.545,-3.236,-4.404,-1.333,0.623,1.414,-0.240,-0.816,-0.808,-1.382,0.632,-5.238,0.120,10.634,-2.026,1.702,-0.469,1.252,1.173,3.015,-8.798,1.633,-5.323,2.149,-6.481,11.635,3.072,5.642,5.252,4.702,-3.523,-0.594,4.150,1.392,0.554,-4.377,3.646,-0.884,1.468,0.779,2.372,-0.101,-5.702,0.539,-0.440,5.149,-0.011,-1.899,-1.349,-0.355,0.076,-0.100,-0.004,5.346,6.276,0.966,-3.138,-2.633,-3.124,3.606,-3.793,-3.332,2.359,-0.739,-3.301,-2.775,-0.491,3.283,-1.394,-1.883,1.203,1.097,2.233,2.170,-2.980,-15.800,-6.791,-0.175,-4.600,-3.840,-4.179,6.568,5.935,-0.431,4.623,4.601,-1.726,0.410,2.591,4.016,8.169,1.763,-3.058,-1.340,6.276,4.682,-0.089,1.301,-4.817'

将音色向量放到代码中,就可以实现生成固定音色的音频,代码奉上:

speaker_vector = ' xxx ' # 768维向量
speaker = torch.tensor([float(x) for x in speaker_vector.split(',')])
my_text = """你好啊,我是猴哥,欢迎您关注 猴哥的AI知识库 公众号,下次更新不迷路"""
params_infer_code = {'prompt':'[speed_2]', 'temperature':.1,'top_P':0.9, 'top_K':1, 'spk_emb' : speaker}
params_refine_text = {'prompt':'[oral_0][laugh_0][break_6]'}

wavs = chat.infer([my_text],
                  do_text_normalization=True,
                  params_refine_text=params_refine_text,
                  params_infer_code=params_infer_code)

torchaudio.save("output.wav", torch.from_numpy(wavs[0]), 24000)

ChatTTS 进阶-固定语速

有心的同学可能已经发现了,上述代码中多了两个参数。

没错,ChatTTS 中,语速是 speed 参数实现的,设置播放的语速,从慢到快,共有10个等级[speed_0]~[speed_9]。

params_infer_code = {'prompt':'[speed_2]'}

ChatTTS 进阶-添加停顿

ChatTTS 中,主要的停顿有三种,分别是:

  • 停顿词
  • 笑声
  • 口头语

添加停顿词

在ChatTTS中,停顿词主要有10个等级 [break_0]~[break_9] 。其中[break_0]是完全没有停顿,[break_9] 停顿最长。当然,模型会根据文本内容自行添加停顿,你也可以在文本中手动添加停顿 [uv_break]。

如果需要 ChatTTS 模型自动添加停顿,只需加上参数 refine_text_only=True

my_text="""不跟生活认输,不向岁月低头。如果风景不够好,就让我们多走几步。如果道路不好走,就让我们放慢脚步。只要还在路上,总能看到想要的风景,总能找到属于自己的方向。"""
params_infer_code = {'prompt':'[speed_4]', 'temperature':.1,'top_P':0.9, 'top_K':1, 'spk_emb' : girl_speaker}
params_refine_text = {'prompt':'[oral_0][laugh_0][break_6]'}
text = chat.infer([my_text], 
                          skip_refine_text=False,
                          refine_text_only=True,
                          params_refine_text=params_refine_text,
                          params_infer_code=params_infer_code
                          )
print("\n".join(text))

输出效果如下:

'不 跟 生 活 认 输 , 不 向 岁 月 低 头 [uv_break] 。 如 果 风 景 不 够 好 [uv_break] , 就 让 我 们 多 走 几 步 。 如 果 道 路 不 好 走 , 就 让 我 们 放 慢 脚 步 [uv_break] 。 只 要 还 在 路 上 , 总 能 看 到 想 要 的 风 景 [uv_break] , 总 能 找 到 属 于 自 己 的 方 向 。'

如果需要自行添加停顿,只需在 文本中添加 [uv_break]

示例如下:

my_text="不跟生活[uv_break]认输,不向岁月低头[laugh]。"

添加笑声

在ChatTTS中,添加笑声主要是通过 [laugh], 共有10个等级[laugh_0]~[laugh_9]。其中[laugh_0]是没有笑声,[laugh_9]笑声较多。当然,模型会根据文本自动添加笑声,也可以像上面的示例一样手动添加 [laugh].

添加口头语

口头语,就是:“那么,就是,嗯” 等语气词。

在ChatTTS中,添加口头语主要是通过 [oral],共有10个等级[oral_0]~[oral_9]。其中[oral-0]基本不添加口头语,[oral_9]添加最多口头语。

举个例子而言,你可以将下面这个参数放入模型试试看哈:

params_refine_text = {'prompt':'[oral_9][laugh_9][break_6]'}

ChatTTS 进阶-超长文本语音生成

目前 ChatTTS ,受模型限制,一次性只支持 30 秒以内的语音生成,当你一次生成文本过多时,模型只会截取前30秒的,之后的文本将被截断。

而要实现超长文本语音生成,只能通过过将文本拆分成多个句子,然后在合并音频的方式。

比如按行拆分,代码实现如下:

text = 
	"""
	如果本文对你有帮助,欢迎点赞收藏备用!
	猴哥一直在做 AI 领域的研发和探索,会陆续跟大家分享路上的思考和心得。
	最近开始运营一个公众号,旨在分享关于AI效率工具、自媒体副业的一切。用心做内容,不辜负每一份关注。
	新朋友欢迎关注 “猴哥的AI知识库” 公众号,下次更新不迷路。
	"""
params_infer_code = {'prompt':'[speed_2]', 'temperature':.1,'top_P':0.9, 'top_K':1, 'spk_emb' : speaker}
params_refine_text = {'prompt':'[oral_6][laugh_3][break_6]'}
# 将文本按照句号拆分为句子
inputs_text = text.split("\n")
wavs = chat.infer(inputs_text,
                  do_text_normalization=True,
                  params_refine_text=params_refine_text,
                  params_infer_code=params_infer_code)
# 合并音频
finally_wavs = torch.tensor(np.concatenate(wavs,axis=-1))
# 将输出的语音保存为音频文件
torchaudio.save("output.wav",finally_wavs, 24000)

写在最后

这是关于 ChatTTS 的第三篇文章,希望这些进阶用法,可以帮你轻松打造更个性化的 AI 语音助手。

如果本文对你有帮助,欢迎 点赞收藏 备用!

猴哥一直在做 AI 领域的研发和探索,会陆续跟大家分享路上的思考和心得。

最近开始运营一个公众号,旨在分享关于AI效率工具、自媒体副业的一切。用心做内容,不辜负每一份关注。

新朋友欢迎关注 “猴哥的AI知识库” 公众号,下次更新不迷路。

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/1834527.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

代理配置SQUID

目录 SQUID代理服务器配置 监听浏览器访问记录 拒绝访问配置 SQUID代理服务器配置 实验系统 windows 10 xxxxx Roucky_linux9.4 192.168.226.22 监听浏览器访问记录 1. 安装squid yum install squid -y 2. 编辑squid配置文件 vim /etc/squid…

等保一体机:多种防护机制,让等保合规简单高效!

自1994年国务院颁布《中华人民共和国计算机信息系统安全保护条例》规定计算机信息系统实行安全等级保护以来,等级保护工作经过了近25年的发展历程,成为了我国网络安全保护的重要举措之一。 2019年12月1日等保2.0正式开始实施,我国网络安全行业…

【大分享05】动态容差归档,打通不动产登记管理“最后一公里”

关注我们 - 数字罗塞塔计划 - 本篇是参加由电子文件管理推进联盟联合数字罗塞塔计划发起的“大分享”活动投稿文章,来自上海涵妍档案信息技术有限责任公司,作者:陈雪。 一、政策背景 在“互联网政务服务”的浪潮下,各级政府机构…

在低侧电流检测中使用单端放大器:误差源和布局技巧

低侧检测的主要优点是可以使用相对简单的配置来放大分流电阻器两端的电压。例如,通用运算放大器的同相配置可能是需要能够在消费市场领域竞争的成本敏感型电机控制应用的有效选择。 基于同相配置的电路图如图1所示。 图1。 然而,这种低成本解决方案可能…

2288. 价格减免

题目 给定一个字符串列表 sentence,表示一个句子,其中每个单词可以包含数字、小写字母和美元符号 $。如果单词的形式为美元符号后跟着一个非负实数,那么这个单词就表示一个价格。我们需要在价格的基础上减免给定的 discount%,并更…

2023数A题——WLAN网络信道接入机制建模

A题——WLAN网络信道接入机制建模 思路:该题主要考察的WLAN下退避机制建模仿真。 资料获取 问题1: 假设AP发送包的载荷长度为1500Bytes(1Bytes 8bits),PHY头时长为13.6μs,MAC头为30Bytes,MA…

上海科技博物馆超薄OLED柔性壁纸屏应用方案

产品:2组55寸OLED柔性屏2x1 特点:嵌入墙体,与空间装饰融入一体 用途:播放文物展示 一、项目背景 上海科技博物馆作为展示科技与文化的交汇点,一直致力于为观众提供沉浸式的参观体验。为了提升文物展示的现代化和科技感…

不可忽视的9条网页排版设计规则,你了解吗?

网页设计由95%的排版组成。网页排版设计使图形的放置栩栩如生,让用户保持愉悦,容易被用户视觉感知。在这个过程中,网页排版设计需要考虑很多因素:款式、大小、字体颜色等。此外,设计师通过网页排版,让文字增加设计的美…

Kafka 高性能 7 大秘诀之 Segment 消息存储机制的奥妙

《Kafka 高性能 7 大秘诀》第 4 篇,解密 kafka Segment 日志存储思想哲学以及如何将磁盘的随机读写变成顺序读写,提高磁盘读写速度。 Kafka 使用日志文件存储消息,每个 Partition 的消息被存储在多个 Segment 文件中,避免了单个文…

经典神经网络(11)VQ-VAE模型及其在MNIST数据集上的应用

经典神经网络(11)VQ-VAE模型及其在MNIST数据集上的应用 我们之前已经了解了PixelCNN模型。 经典神经网络(10)PixelCNN模型、Gated PixelCNN模型及其在MNIST数据集上的应用 今天,我们了解下DeepMind在2017年提出的一种基于离散隐变量(Discrete Latent va…

OneNote 作为恶意软件分发新渠道持续增长

目前,Office 文件已经默认禁用宏代码,攻击者开始转向利用其他微软的软件产品来进行恶意 Payload 投递。默认情况下,OneNote 应用也包含在 Office 2019 和 Microsoft 365 软件中,所以 OneNote 文件越来越受到攻击者的青睐。如果有人…

调度算法-内存页面置换算法

缺⻚异常(缺⻚中断) 与⼀般中断的主要区别在于: 缺⻚中断在指令执⾏「期间」产⽣和处理中断信号,⽽⼀般中断在⼀条指令执⾏「完成」后检查和处理中断信号。缺⻚中断返回到该指令的开始重新执⾏「该指令」,⽽⼀般中断返…

如何完美解决 Oracle Database 19c 安装程序 - 第7步(共8步)卡住,半小时都不动

🚀 如何完美解决 Oracle Database 19c 安装程序 - 第7步(共8步)卡住,半小时都不动 摘要 在安装 Oracle Database 19c 时,很多用户会在第7步(共8步)遇到卡住的问题,尤其是安装程序长…

ESP32蓝牙串口通讯

文章目录 一、前言二、代码三、运行 一、前言 ESP32支持经典蓝牙和低功耗蓝牙(BLE),经典蓝牙可在计算机上模拟出一个串口,使得ESP32可以以串口的方式和计算机通信。 二、代码 #include "BluetoothSerial.h"String device_name …

Upload-Labs-Linux1 使用 一句话木马

解题步骤&#xff1a; 1.新建一个php文件&#xff0c;编写内容&#xff1a; <?php eval($_REQUEST[123]) ?> 2.将编写好的php文件上传&#xff0c;但是发现被阻止&#xff0c;网站只能上传图片文件。 3.解决方法&#xff1a; 将php文件改为图片文件&#xff08;例…

目标检测顶会新成果!20个突破性方法,更高性能,更强理解与分析能力!

【目标检测】在近年来的深度学习领域中备受关注&#xff0c;它通过识别和定位图像中的目标对象&#xff0c;提升了模型在图像理解和分析方面的能力。目标检测技术在自动驾驶、安防监控和医疗影像分析等任务中取得了显著成果。其独特的方法和卓越的表现使其成为研究热点之一。 为…

面试经典150题

打家劫舍 class Solution { public:int rob(vector<int>& nums) {int n nums.size();if(n 1){return nums[0];}vector<int> dp(n, 0);dp[0] nums[0];//有一间房可以偷//有两间房可以偷if(nums[1] > nums[0]){dp[1] nums[1];}else{dp[1] nums[0];}for …

MySQL----InooDB行级锁、间隙锁

行级锁 行锁&#xff0c;也称为记录锁&#xff0c;顾名思义就是在记录上加的锁。 注意&#xff1a; InnoDB行锁是通过给索引上的索引项加锁来实现的&#xff0c;而不是给表的行记录加锁实现的&#xff0c;这就意味着只有通过索引条件检索数据&#xff0c;InnoDB才使用行级锁…

电商API接口是什么意思?有什么作用?

电商API接口是电子商务领域中一种技术解决方案&#xff0c;它允许不同的软件系统之间进行交互和数据交换。 在电商场景下&#xff0c;电商API接口可以实现的功能非常丰富&#xff0c;例如&#xff1a; 商品管理&#xff1a;获取商品列表、商品详情、搜索商品、上下架商品等&a…

vue页面前端初始化表格数据时报错TypeError: data.reduce is not a function

这是初始化表格数据时报的错 。 [Vue warn]: Invalid prop: type check failed for prop "data". Expected Array, got Object found in---> <ElTable> at packages/table/src/table.vue<List> at src/views/org/List.vue<Catalogue> at src/v…