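Calling the iFlytek Speech Synthesis API from UE C++ on Windows
- Environment setup
- Calling the iFlytek speech API
- Playing back the audio data
- Packaging the EXE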
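Environment setup
- Download the Windows C++ SDK for iFlytek speech synthesis; it contains the .lib library files and the .dll dynamic link libraries.
- Create a ThirdParty/msc directory under the UE project and put the .lib and .dll files in it.
- Add the library paths to [PROJECT].Build.cs.
- Create a UE C++ class ASpeech that inherits from AActor, and include the iFlytek headers in its .cpp file.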
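Directory structure
(Figure: project directory layout. The Build.cs paths used in this post expect the headers in ThirdParty/msc/Includes and the .lib files in ThirdParty/msc/Libraries.)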
Add the library paths in [PROJECT].Build.cs
// ThirdPartyPath is assumed to be defined elsewhere in this Build.cs and to point at the project's ThirdParty directory.
string MSCPath = Path.Combine(ThirdPartyPath, "msc/");
PublicIncludePaths.AddRange(new string[] { Path.Combine(MSCPath, "Includes") });
PublicSystemLibraryPaths.Add(Path.Combine(MSCPath, "Libraries"));
Include the library headers in Speech.cpp
#include "Speech.h"
#include "qtts.h"
#include "msp_cmn.h"
#include "msp_errors.h"
#include "Kismet/GameplayStatics.h"
using namespace Audio;
#ifdef _WIN64
#pragma comment(lib,"msc_x64.lib")//x64
#else
#pragma comment(lib,"msc.lib")//x86
#endif
Calling the iFlytek speech API
The iFlytek speech synthesis API mainly consists of the following functions:
MSPLogin
QTTSSessionBegin
QTTSTextPut
QTTSAudioGet
QTTSSessionEnd
MSPLogout
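MSPLogin and MSPLogout only need to be called once each, when the program starts and when it exits.
For each synthesis, call QTTSSessionBegin to open a session and QTTSTextPut to submit the text, then call QTTSAudioGet in a loop to keep fetching synthesized audio until all of the data has been received, and finally call QTTSSessionEnd to end the synthesis task.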
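The call order can be tried out without the SDK. In the minimal sketch below, `MockTts` is a hypothetical stand-in for a synthesis session (it is not the iFlytek API): it simply echoes the text bytes back in small chunks, the way QTTSAudioGet hands out partial data, and `Synthesize` mirrors the begin, put, poll-until-end, end sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a TTS session; NOT the iFlytek API.
// It "synthesizes" by returning the text bytes in chunks of 4.
class MockTts {
public:
    void SessionBegin() { open_ = true; pos_ = 0; text_.clear(); }
    void TextPut(const std::string& text) { assert(open_); text_ = text; }

    // Returns the next chunk (possibly empty) and sets `done` once all
    // data has been handed out, similar to a "data end" status flag.
    std::vector<unsigned char> AudioGet(bool& done) {
        assert(open_);
        const std::size_t chunkSize = 4;
        const std::size_t n = std::min(chunkSize, text_.size() - pos_);
        std::vector<unsigned char> chunk(text_.begin() + pos_,
                                         text_.begin() + pos_ + n);
        pos_ += n;
        done = (pos_ == text_.size());
        return chunk;
    }

    void SessionEnd() { open_ = false; }
    bool IsOpen() const { return open_; }

private:
    bool open_ = false;
    std::size_t pos_ = 0;
    std::string text_;
};

// One synthesis task: begin, put text, poll until done, end.
std::vector<unsigned char> Synthesize(MockTts& tts, const std::string& text) {
    std::vector<unsigned char> audio;
    tts.SessionBegin();
    tts.TextPut(text);
    bool done = false;
    while (!done) {
        const std::vector<unsigned char> chunk = tts.AudioGet(done);
        audio.insert(audio.end(), chunk.begin(), chunk.end());
    }
    tts.SessionEnd();
    return audio;
}
```

The actor in this post spreads the same loop over Tick calls, one QTTSAudioGet per frame, instead of blocking in a `while` loop.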
The code is as follows.
- Initialization
bool ASpeech::Init(const FString& params) {
    // Init TTS
    // MSPLogin arguments: user name, password, login parameters.
    // The user name and password can be obtained by registering at http://www.xfyun.cn
    int ret = MSPLogin(NULL, NULL, TCHAR_TO_UTF8(*params));
    if (MSP_SUCCESS != ret)
    {
        FCString::Sprintf(_debug_string_buff, TEXT("MSPLogin failed, error code: %d."), ret);
        ScreenMsg(_debug_string_buff);
        bInited = false;
    }
    else {
        bInited = true;
    }
    return bInited;
}
- Text to speech
bool ASpeech::Text2Speech(const FString& text, const FString& params)
{
    if (bGenerating) {
        ScreenMsg(TEXT("Failed: Audio generation in progress!"));
        return false;
    }
    if (text.IsEmpty() || params.IsEmpty()) {
        ScreenMsg(TEXT("Failed: Parameter cannot be empty!"));
        return false;
    }
    // Init params
    int ret = -1;
    sessionID = NULL;
    bGenerating = false;
    GenerateIndex = 0;
    DataLength = 0;
    audioData.SetNumUninitialized(0);
    bPlaying = false;
    PlayTime = UGameplayStatics::GetTimeSeconds(GWorld);
    DataTime = 0;
    /* Start synthesis */
    sessionID = QTTSSessionBegin(TCHAR_TO_UTF8(*params), &ret);
    if (MSP_SUCCESS != ret)
    {
        FCString::Sprintf(_debug_string_buff, TEXT("QTTSSessionBegin failed, error code: %d."), ret);
        ScreenMsg(_debug_string_buff);
        return false;
    }
    // Keep the converter object alive while its buffer is in use. The pointer
    // returned by TCHAR_TO_UTF8 belongs to a temporary that is destroyed at the
    // end of the statement, so it must not be stored in a variable.
    FTCHARToUTF8 utf8_text(*text);
    ret = QTTSTextPut(sessionID, (const char*)utf8_text.Get(), (unsigned int)utf8_text.Length(), NULL);
    if (MSP_SUCCESS != ret)
    {
        FCString::Sprintf(_debug_string_buff, TEXT("QTTSTextPut failed, error code: %d."), ret);
        ScreenMsg(_debug_string_buff);
        QTTSSessionEnd(sessionID, "TextPutError");
        sessionID = NULL;
        return false;
    }
    bGenerating = true;
    return true;
}
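The length handed to QTTSTextPut is the byte length of the UTF-8 text, which for Chinese text is larger than the number of characters. The standalone sketch below makes that difference visible; `EncodeUtf8` is a hypothetical helper written for this illustration, not UE or SDK code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode Unicode code points as UTF-8 bytes.
// Covers the range up to U+10FFFF; surrogate values are not rejected.
std::string EncodeUtf8(const std::u32string& codePoints) {
    std::string out;
    for (char32_t cp : codePoints) {
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {                 // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {         // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {       // 3 bytes: most CJK characters
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                         // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, the two characters U+8BAF U+98DE encode to six bytes, so the length argument for that text would be 6 rather than 2.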
- Fetching the audio data in the Tick loop
void ASpeech::Tick(float DeltaTime)
{
    Super::Tick(DeltaTime);
    /* Fetch synthesized audio */
    if (bGenerating && NULL != sessionID) {
        int ret = -1;
        unsigned int len = 0;
        int synth_status = MSP_TTS_FLAG_STILL_HAVE_DATA;
        const void* data = QTTSAudioGet(sessionID, &len, &synth_status, &ret);
        if (MSP_SUCCESS == ret) {
            if (NULL != data && len > 0) {
                const uint8* p_data = (const uint8*)data;
                audioData.Append(p_data, (int32)len); // keep a copy of everything received so far
                DataLength += len;
                // Play audio
                if (0 == GenerateIndex) { // first audio chunk received: start playback right away
                    if (NULL != AudioComponent) {
                        AudioComponent->Play();
                        bPlaying = true;
                        PlayTime = UGameplayStatics::GetTimeSeconds(GWorld);
                    }
                }
                ++GenerateIndex;
                payloadReceivedVoiceData(p_data, len); // load the data into the playback wave buffer
                OnAudioGet(); // notify Blueprint
            }
            if (MSP_TTS_FLAG_DATA_END == synth_status) {
                /* Synthesis finished */
                ret = QTTSSessionEnd(sessionID, "Normal");
                if (MSP_SUCCESS != ret)
                {
                    FCString::Sprintf(_debug_string_buff, TEXT("QTTSSessionEnd failed, error code: %d."), ret);
                    ScreenMsg(_debug_string_buff);
                }
                bGenerating = false;
                sessionID = NULL;
            }
        }
        else {
            /* Synthesis failed */
            FCString::Sprintf(_debug_string_buff, TEXT("QTTSAudioGet failed, error code: %d."), ret);
            ScreenMsg(_debug_string_buff);
            QTTSSessionEnd(sessionID, "AudioGetError");
            bGenerating = false;
            sessionID = NULL;
        }
    }
}
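- Uninitializing in EndPlay
void ASpeech::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
    Uninit();
    Super::EndPlay(EndPlayReason);
}
void ASpeech::Uninit() {
    // End a synthesis session that is still running before logging out.
    if (bGenerating && NULL != sessionID) {
        QTTSSessionEnd(sessionID, "Uninit");
        sessionID = NULL;
        bGenerating = false;
    }
    // Only log out if the login succeeded.
    if (bInited) {
        MSPLogout();
        bInited = false;
    }
}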
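Since MSPLogin and MSPLogout have to be paired, one option is to tie the pair to an object's lifetime. The sketch below is only an illustration of that idea: `LoginGuard` is a hypothetical class, the login and logout calls are passed in as callables so the example compiles without the SDK, and it assumes that a return value of 0 means success.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical RAII guard: runs `login` on construction and runs
// `logout` on destruction, but only if the login returned 0 (success).
class LoginGuard {
public:
    LoginGuard(const std::function<int()>& login, std::function<void()> logout)
        : logout_(std::move(logout)), loggedIn_(login() == 0) {}

    ~LoginGuard() {
        if (loggedIn_) {
            logout_();
        }
    }

    // Non-copyable so that logout runs exactly once.
    LoginGuard(const LoginGuard&) = delete;
    LoginGuard& operator=(const LoginGuard&) = delete;

    bool IsLoggedIn() const { return loggedIn_; }

private:
    std::function<void()> logout_;
    bool loggedIn_;
};
```

In an actor, the explicit Init and Uninit calls shown above serve the same purpose; the guard is mainly useful where the login should follow a C++ scope.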
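Playing back the audio data
The synthesized audio is raw PCM data, so it has to be loaded and played back through SoundWaveProcedural and AudioComponent.
- SoundWaveProcedural is the audio source; audio data can be loaded into it dynamically.
- AudioComponent is the playback component.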
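Because the data is raw PCM, the playback length follows directly from the byte count. The sketch below assumes 16-bit mono samples at 16000 Hz; the post only names `samples_per_sec` without giving its value, so treat these numbers as assumptions. `PcmDurationSeconds` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Duration in seconds of a raw PCM buffer.
// bytesPerSample defaults to 2, i.e. 16-bit samples (an assumption).
double PcmDurationSeconds(std::uint64_t byteCount, int sampleRate,
                          int numChannels, int bytesPerSample = 2) {
    assert(sampleRate > 0 && numChannels > 0 && bytesPerSample > 0);
    const double bytesPerSecond =
        static_cast<double>(sampleRate) * numChannels * bytesPerSample;
    return static_cast<double>(byteCount) / bytesPerSecond;
}
```

Under those assumptions, 32000 received bytes correspond to one second of audio, which gives a simple way to compare how much audio has arrived with how long playback has been running.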
Add the SoundWaveProcedural and AudioComponent to the Speech actor
1. Create the AudioComponent in the constructor
ASpeech::ASpeech()
    :
    SoundWaveProcedural(NULL),
    NumChannels(1),
    NumSamples(samples_per_sec),
    SampleRate(samples_per_sec)
{
    AudioComponent = CreateDefaultSubobject<UAudioComponent>(TEXT("Audio"));
    AudioComponent->SetupAttachment(GetRootComponent());
    // Set this actor to call Tick() every frame. You can turn this off to improve performance if you don't need it.
    PrimaryActorTick.bCanEverTick = true;
}
2. In BeginPlay, initialize the SoundWaveProcedural object and set it as the AudioComponent's sound source
void ASpeech::BeginPlay()
{
    Super::BeginPlay();
    // Init audio component
    AudioComponent->bAutoActivate = true;
    AudioComponent->bAlwaysPlay = true;
    AudioComponent->PitchMultiplier = 1.0f;
    AudioComponent->VolumeMultiplier = 1.0f;
    AudioComponent->bIsUISound = false;
    AudioComponent->AttenuationSettings = nullptr;
    AudioComponent->bOverrideAttenuation = false;
    AudioComponent->bAllowSpatialization = false;
    // Init sound wave procedural
    SoundWaveProcedural = NewObject<USoundWaveProcedural>();
    SoundWaveProcedural->SetSampleRate(SampleRate);
    SoundWaveProcedural->NumChannels = NumChannels;
    SoundWaveProcedural->Duration = INDEFINITELY_LOOPING_DURATION;
    SoundWaveProcedural->SoundGroup = SOUNDGROUP_Default;
    SoundWaveProcedural->bLooping = false;
    SoundWaveProcedural->bProcedural = true;
    SoundWaveProcedural->Pitch = 1.0f;
    SoundWaveProcedural->Volume = 1.0f;
    SoundWaveProcedural->AttenuationSettings = nullptr;
    SoundWaveProcedural->bDebug = true;
    SoundWaveProcedural->VirtualizationMode = EVirtualizationMode::PlayWhenSilent;
    // Set audio component source
    AudioComponent->SetSound(SoundWaveProcedural);
    Init(TEXT("appid = 55xxxx45, work_dir = ."));
}
3. When the Tick function receives data from QTTSAudioGet, load it into the SoundWaveProcedural object
void ASpeech::payloadReceivedVoiceData(const uint8 *Data, int32 DataSize)
{
    if (NULL == AudioComponent || NULL == SoundWaveProcedural) {
        ScreenMsg(TEXT("Error: The sound playback component is empty!"));
        return;
    }
    SoundWaveProcedural->QueueAudio(Data, DataSize);
}
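Conceptually, a procedural sound source behaves like a FIFO of PCM bytes: the game code queues chunks as they arrive and the audio side drains a fixed amount at a time. The sketch below models only that idea in plain C++. `PcmQueue` is a hypothetical class, not how USoundWaveProcedural is implemented, and padding with silence when the queue runs dry is an assumed policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical FIFO of PCM bytes shared by a producer and a consumer.
class PcmQueue {
public:
    // Producer side: append a chunk of received audio.
    void Queue(const std::uint8_t* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        buffer_.insert(buffer_.end(), data, data + size);
    }

    // Consumer side: copy up to `size` queued bytes into `out` and fill
    // the remainder with zeros (silence). Returns the number of real bytes.
    std::size_t Pull(std::uint8_t* out, std::size_t size) {
        const std::size_t n = std::min(size, buffer_.size());
        std::copy(buffer_.begin(), buffer_.begin() + n, out);
        buffer_.erase(buffer_.begin(), buffer_.begin() + n);
        std::fill(out + n, out + size, std::uint8_t{0});
        return n;
    }

    std::size_t Available() const { return buffer_.size(); }

private:
    std::deque<std::uint8_t> buffer_;
};
```

This is why playback can start as soon as the first chunk is queued: later chunks only need to arrive before the consumer has drained the earlier ones.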
4. Playback can start as soon as the first chunk of data is loaded
// Play audio
if (0 == GenerateIndex) { // first audio chunk received: start playback right away
    if (NULL != AudioComponent) {
        AudioComponent->Play();
        bPlaying = true;
        PlayTime = UGameplayStatics::GetTimeSeconds(GWorld);
    }
}
++GenerateIndex;
payloadReceivedVoiceData(p_data, len); // load the data into the playback wave buffer
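Packaging the EXE
When packaging the EXE, remember to copy the DLLs from ThirdParty into the output directory; the UE packaging step does not copy DLL files under ThirdParty automatically.
(Figure: location the DLL is copied to)
(Figure: screenshot of the running program)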