15 versions (5 stable)
Version | Release date |
---|---|
1.0.4 | Jun 12, 2024 |
1.0.2 | May 13, 2024 |
0.3.1 | Apr 24, 2024 |
0.2.3 | Dec 2, 2023 |
0.1.4 | Jun 18, 2021 |
#40 in Audio
75 downloads per month
455KB
10K SLoC
cognitive-services-speech-sdk-rs
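Rust bindings for the Microsoft Cognitive Services Speech SDK. A thin abstraction around the native C API, heavily inspired by the official Go library. Provides speech-to-text, text-to-speech, and Bot Framework dialog management capabilities.
Pull requests welcome!
Speech to text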
use cognitive_services_speech_sdk_rs as msspeech;
use log::*;
use std::env;
async fn speech_to_text() {
    let filename = env::var("WAVFILENAME").unwrap();
    let audio_config = msspeech::audio::AudioConfig::from_wav_file_input(&filename).unwrap();
    let speech_config = msspeech::speech::SpeechConfig::from_subscription(
        env::var("MSSubscriptionKey").unwrap(),
        env::var("MSServiceRegion").unwrap(),
    )
    .unwrap();
    let mut speech_recognizer =
        msspeech::speech::SpeechRecognizer::from_config(speech_config, audio_config).unwrap();

    speech_recognizer
        .set_session_started_cb(|event| info!("set_session_started_cb {:?}", event))
        .unwrap();
    speech_recognizer
        .set_session_stopped_cb(|event| info!("set_session_stopped_cb {:?}", event))
        .unwrap();
    speech_recognizer
        .set_speech_start_detected_cb(|event| info!("set_speech_start_detected_cb {:?}", event))
        .unwrap();
    speech_recognizer
        .set_speech_end_detected_cb(|event| info!("set_speech_end_detected_cb {:?}", event))
        .unwrap();
    speech_recognizer
        .set_recognizing_cb(|event| info!("set_recognizing_cb {:?}", event.result.text))
        .unwrap();
    speech_recognizer
        .set_recognized_cb(|event| info!("set_recognized_cb {:?}", event))
        .unwrap();
    speech_recognizer
        .set_canceled_cb(|event| info!("set_canceled_cb {:?}", event))
        .unwrap();

    let result = speech_recognizer.recognize_once_async().await.unwrap();
    info!("got recognition {:?}", result);
}
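The example above calls unwrap() on every env::var lookup, so a missing variable panics without naming which one was absent. A minimal sketch of a friendlier lookup, using only the standard library, might look like the following. The names required_env, MissingEnv, SpeechEnv, and load_speech_env are hypothetical helpers introduced for illustration and are not part of the crate.

```rust
use std::env;
use std::fmt;

/// Error that names the environment variable that could not be read.
/// (Hypothetical helper type, not part of cognitive-services-speech-sdk-rs.)
#[derive(Debug, PartialEq)]
struct MissingEnv(String);

impl fmt::Display for MissingEnv {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "environment variable {} is not set or is not valid Unicode",
            self.0
        )
    }
}

impl std::error::Error for MissingEnv {}

/// Reads a required variable and reports its name on failure.
fn required_env(name: &str) -> Result<String, MissingEnv> {
    env::var(name).map_err(|_| MissingEnv(name.to_string()))
}

/// The two credentials the examples read from the environment.
#[derive(Debug, PartialEq)]
struct SpeechEnv {
    subscription_key: String,
    region: String,
}

/// Collects both values, stopping at the first one that is missing.
fn load_speech_env() -> Result<SpeechEnv, MissingEnv> {
    Ok(SpeechEnv {
        subscription_key: required_env("MSSubscriptionKey")?,
        region: required_env("MSServiceRegion")?,
    })
}
```

The resulting subscription_key and region strings could then be passed to SpeechConfig::from_subscription in place of the inline env::var(...).unwrap() calls, with the error surfaced to the caller instead of a panic.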
Text to speech
use cognitive_services_speech_sdk_rs as msspeech;
use log::*;
use std::env;
async fn text_to_speech() {
    let pull_stream = msspeech::audio::PullAudioOutputStream::create_pull_stream().unwrap();
    let audio_config = msspeech::audio::AudioConfig::from_stream_output(&pull_stream).unwrap();
    let speech_config = msspeech::speech::SpeechConfig::from_subscription(
        env::var("MSSubscriptionKey").unwrap(),
        env::var("MSServiceRegion").unwrap(),
    )
    .unwrap();
    let mut speech_synthesizer =
        msspeech::speech::SpeechSynthesizer::from_config(speech_config, audio_config).unwrap();

    speech_synthesizer
        .set_synthesizer_started_cb(|event| info!("synthesizer_started_cb {:?}", event))
        .unwrap();
    speech_synthesizer
        .set_synthesizer_synthesizing_cb(|event| info!("synthesizer_synthesizing_cb {:?}", event))
        .unwrap();
    speech_synthesizer
        .set_synthesizer_completed_cb(|event| info!("synthesizer_completed_cb {:?}", event))
        .unwrap();
    speech_synthesizer
        .set_synthesizer_canceled_cb(|event| info!("synthesizer_canceled_cb {:?}", event))
        .unwrap();

    match speech_synthesizer.speak_text_async("Hello Rust!").await {
        Err(err) => error!("speak_text_async error {:?}", err),
        Ok(speech_audio_bytes) => {
            info!("speech_audio_bytes {:?}", speech_audio_bytes);
        }
    }
}
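For more information, see the integration tests (tests folder) and the samples (examples folder) on GitHub.
Build prerequisites
Building is currently supported on Windows, Linux, and macOS. The build uses Clang and the Microsoft Speech SDK shared libraries.
On Debian or Ubuntu, install the following prerequisites before running cargo build: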
sudo apt-get update
sudo apt-get install clang build-essential libssl1.0.0 libasound2 wget
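The build process generates Rust bindings for the Speech SDK native functions. These bindings are already prebuilt and stored in ffi/bindings.rs, so in most cases there is no need to regenerate them. Set the following variable to skip binding regeneration: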
export MS_COG_SVC_SPEECH_SKIP_BINDGEN=1
cargo build
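The build process downloads the MS Speech SDK into the target folder. From there you can copy it to another folder, for example ./SpeechSDK. The SDK library is linked dynamically, so when running the compiled binary the loader must be able to find it: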
Linux
export LD_LIBRARY_PATH=/Users/xxx/cognitive-services-speech-sdk-rs/SpeechSDK/macOS/sdk_output/MicrosoftCognitiveServicesSpeech.xcframework/macos-arm64_x86_64
macOS
export DYLD_FALLBACK_FRAMEWORK_PATH=/Users/xxx/cognitive-services-speech-sdk-rs/SpeechSDK/macOS/sdk_output/MicrosoftCognitiveServicesSpeech.xcframework/macos-arm64_x86_64
Windows (pointing at the SpeechSDK inside the target folder)
set PATH=%PATH%;"C:\Users\xxx\cognitive-services-speech-sdk-rs\target\debug\build\cognitive-services-speech-sdk-rs-b9c946c378fbb4f1\out\sdk_output\runtimes\win-x64\native"
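When the compiled binary is launched from another Rust program (a test harness or a small wrapper, say), the same per-platform variable can be set on the child process instead of in the shell. The following is a rough sketch under that assumption, using only the standard library. The functions loader_path_var, extended_value, and command_with_sdk are hypothetical names introduced here, and the mapping simply mirrors the three cases listed above.

```rust
use std::process::Command;

/// Maps an OS name (as reported by `std::env::consts::OS`) to the
/// environment variable used in the instructions above.
fn loader_path_var(os: &str) -> Option<&'static str> {
    match os {
        "linux" => Some("LD_LIBRARY_PATH"),
        "macos" => Some("DYLD_FALLBACK_FRAMEWORK_PATH"),
        "windows" => Some("PATH"),
        _ => None,
    }
}

/// Appends `sdk_dir` to the variable's current value, using `;` as the
/// separator on Windows and `:` elsewhere.
fn extended_value(os: &str, existing: Option<&str>, sdk_dir: &str) -> String {
    let sep = if os == "windows" { ";" } else { ":" };
    match existing {
        Some(cur) if !cur.is_empty() => format!("{cur}{sep}{sdk_dir}"),
        _ => sdk_dir.to_string(),
    }
}

/// Prepares (but does not spawn) a command for `binary` with the SDK
/// directory added to the loader search path of the current platform.
/// Returns `None` on platforms not covered by the instructions above.
fn command_with_sdk(binary: &str, sdk_dir: &str) -> Option<Command> {
    let os = std::env::consts::OS;
    let var = loader_path_var(os)?;
    let existing = std::env::var(var).ok();
    let mut cmd = Command::new(binary);
    cmd.env(var, extended_value(os, existing.as_deref(), sdk_dir));
    Some(cmd)
}
```

Calling command_with_sdk with the path of the compiled binary and the folder holding the SDK library returns a Command whose status() or spawn() can then be invoked. The SDK folder itself still has to be the one produced by the build or copied from it, as described above.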
How to build on macOS
We support the arm, aarch64, and x86_64 architectures on macOS.
Run the following command to build:
cargo build
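The Speech SDK library is linked dynamically both at build time and at run time. When running the application, use the following environment variable to point at a custom library location: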
export DYLD_FALLBACK_FRAMEWORK_PATH=/Users/xxx/cognitive-services-speech-sdk-rs/SpeechSDK/macOS/sdk_output/MicrosoftCognitiveServicesSpeech.xcframework/macos-arm64_x86_64
Then run your application that uses cognitive-services-speech-sdk-rs, or one of the examples, for instance:
cargo run --example recognizer
Dependencies
~3–12MB
~115K SLoC