2 unstable releases

0.2.1	Mar 1, 2024
0.2.0	~~Mar 1, 2024~~
0.1.0	Jan 24, 2024

#13 in Accessibility

Used in natural-tts

MIT/Apache

86KB
1.5K SLoC

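Description

This library is a wrapper around the MSEdge Read aloud function API. You can use it to synthesize text to speech with the many voices Microsoft provides.

Usage

You need a SpeechConfig to configure the voice used for text to speech.
A Voice converts directly into a SpeechConfig; call the get_voices_list function to fetch the available voices. Voice implements serde::Serialize and serde::Deserialize.
For example: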
```
use msedge_tts::voice::get_voices_list;
use msedge_tts::tts::SpeechConfig;

fn main() {
    let voices = get_voices_list().unwrap();
    // Rust convention is snake_case for local bindings.
    let speech_config = SpeechConfig::from(&voices[0]);
}
```
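You can also create a SpeechConfig yourself. Make sure you know the correct voice name and audio format.
Create a TTS Client or Stream. Both have sync and async versions. Examples for each appear in the synthesis sections below.

Synthesize text to speech.

Sync Client

Call the client function synthesize to synthesize text to speech. It returns a SynthesizedAudio, from which you can get audio_bytes and audio_metadata.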

```
use msedge_tts::{tts::client::connect, tts::SpeechConfig, voice::get_voices_list};

fn main() {
    let voices = get_voices_list().unwrap();
    for voice in &voices {
        if voice.name.contains("YunyangNeural") {
            let config = SpeechConfig::from(voice);
            let mut tts = connect().unwrap();
            let audio = tts
                .synthesize("Hello, World! 你好，世界！", &config)
                .unwrap();
            break;
        }
    }
}
```
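
The example above stops once it has a SynthesizedAudio. A minimal sketch of one possible next step, writing audio_bytes to a file, is shown below. So that the snippet compiles on its own, it uses a stand-in struct rather than the crate's type: only the audio_bytes field is modeled, its `Vec<u8>` type is an assumption, and `save_audio` is a hypothetical helper, not part of msedge_tts.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Stand-in for the crate's `SynthesizedAudio` (assumption: `audio_bytes`
/// holds the encoded audio as `Vec<u8>`; other fields are omitted).
pub struct SynthesizedAudio {
    pub audio_bytes: Vec<u8>,
}

/// Hypothetical helper: writes the raw audio bytes to `path` and returns
/// the number of bytes written.
pub fn save_audio(audio: &SynthesizedAudio, path: &Path) -> io::Result<usize> {
    fs::write(path, &audio.audio_bytes)?;
    Ok(audio.audio_bytes.len())
}
```

With the real crate, the same `fs::write` call would presumably be applied to the value returned by synthesize, using a file extension that matches the audio format in the SpeechConfig.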

Async Client

Call the client function synthesize and await it to synthesize text to speech. It returns a SynthesizedAudio, from which you can get audio_bytes and audio_metadata.

```
use msedge_tts::{tts::client::connect_async, tts::SpeechConfig, voice::get_voices_list_async};

fn main() {
    smol::block_on(async {
        let voices = get_voices_list_async().await.unwrap();
        for voice in &voices {
            if voice.name.contains("YunyangNeural") {
                let config = SpeechConfig::from(voice);
                let mut tts = connect_async().await.unwrap();
                let audio = tts
                    .synthesize("Hello, World! 你好，世界！", &config)
                    .await
                    .unwrap();
                break;
            }
        }
    });
}
```
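
Both client examples pick a voice by looping over the list and checking `voice.name.contains(...)`. The same lookup can be sketched with an iterator, which also makes the "no matching voice" case explicit. The `Voice` struct here is a stand-in with only the `name` field used in the examples, and `find_voice` is a hypothetical helper, not a function of the crate.

```rust
/// Stand-in for the crate's `Voice`; only the `name` field used in the
/// examples above is modeled.
pub struct Voice {
    pub name: String,
}

/// Hypothetical helper: returns the first voice whose name contains
/// `needle`, or `None` when nothing matches.
pub fn find_voice<'a>(voices: &'a [Voice], needle: &str) -> Option<&'a Voice> {
    voices.iter().find(|voice| voice.name.contains(needle))
}
```

Returning an `Option` lets the caller decide what to do when the requested voice is not in the list, instead of silently synthesizing nothing.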

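Sync Stream

Call the sender function send to synthesize text to speech. Call the reader function read to get the data.
read returns Option<SynthesizedResponse>; the response may be AudioBytes, AudioMetadata, or None. This is because the MSEdge Read aloud API returns multiple data segments, metadata, and other information sequentially.
Note: one send corresponds to multiple reads. The next send call blocks until there is no data left to read. read blocks until send has been called.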

```
use msedge_tts::{
    tts::stream::{msedge_tts_split, SynthesizedResponse},
    tts::SpeechConfig,
    voice::get_voices_list,
};
use std::{
    sync::{
        atomic::{AtomicBool, Ordering},
        Arc,
    },
    thread::spawn,
};

fn main() {
    let voices = get_voices_list().unwrap();
    for voice in &voices {
        if voice.name.contains("YunyangNeural") {
            let config = SpeechConfig::from(voice);
            let (mut sender, mut reader) = msedge_tts_split().unwrap();

            let signal = Arc::new(AtomicBool::new(false));
            let end = signal.clone();
            spawn(move || {
                sender.send("Hello, World! 你好，世界！", &config).unwrap();
                println!("synthesizing...1");
                sender.send("Hello, World! 你好，世界！", &config).unwrap();
                println!("synthesizing...2");
                sender.send("Hello, World! 你好，世界！", &config).unwrap();
                println!("synthesizing...3");
                sender.send("Hello, World! 你好，世界！", &config).unwrap();
                println!("synthesizing...4");
                end.store(true, Ordering::Relaxed);
            });

            loop {
                if signal.load(Ordering::Relaxed) && !reader.can_read() {
                    break;
                }
                let audio = reader.read().unwrap();
                if let Some(audio) = audio {
                    match audio {
                        SynthesizedResponse::AudioBytes(_) => {
                            println!("read bytes")
                        }
                        SynthesizedResponse::AudioMetadata(_) => {
                            println!("read metadata")
                        }
                    }
                } else {
                    println!("read None");
                }
            }
        }
    }
}
```
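
The read loop above only prints which kind of response arrived. A minimal sketch of how the results of successive read calls might be folded into one audio buffer follows. It uses a stand-in enum so that it compiles without the crate: the `Vec<u8>` payload of AudioBytes and the `String` payload of AudioMetadata are assumptions, and `collect_responses` and `Collected` are hypothetical names.

```rust
/// Stand-in for the crate's `SynthesizedResponse`; the payload types are
/// assumptions made for this sketch.
pub enum SynthesizedResponse {
    AudioBytes(Vec<u8>),
    AudioMetadata(String),
}

/// Totals gathered from a sequence of reads.
#[derive(Debug, Default, PartialEq)]
pub struct Collected {
    pub audio: Vec<u8>,
    pub metadata_count: usize,
    pub empty_reads: usize,
}

/// Hypothetical helper: concatenates every audio segment in arrival order
/// and counts metadata responses and `None` reads.
pub fn collect_responses<I>(reads: I) -> Collected
where
    I: IntoIterator<Item = Option<SynthesizedResponse>>,
{
    let mut out = Collected::default();
    for read in reads {
        match read {
            Some(SynthesizedResponse::AudioBytes(bytes)) => out.audio.extend_from_slice(&bytes),
            Some(SynthesizedResponse::AudioMetadata(_)) => out.metadata_count += 1,
            None => out.empty_reads += 1,
        }
    }
    out
}
```

In the streaming loop, the same match arms could append to a buffer directly after each read instead of collecting the responses first.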

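Async Stream

Call the sender's async function send to synthesize text to speech. Call the reader's async function read to get the data. As with the sync stream, read returns Option<SynthesizedResponse>, and send and read block under the same conditions.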

```
use msedge_tts::{
    tts::{
        stream::{msedge_tts_split_async, SynthesizedResponse},
        SpeechConfig,
    },
    voice::get_voices_list_async,
};
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

fn main() {
    smol::block_on(async {
        let voices = get_voices_list_async().await.unwrap();
        for voice in &voices {
            if voice.name.contains("YunyangNeural") {
                let config = SpeechConfig::from(voice);
                let (mut sender, mut reader) = msedge_tts_split_async().await.unwrap();

                let signal = Arc::new(AtomicBool::new(false));
                let end = signal.clone();
                smol::spawn(async move {
                    sender
                        .send("Hello, World! 你好，世界！", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...1");
                    sender
                        .send("Hello, World! 你好，世界！", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...2");
                    sender
                        .send("Hello, World! 你好，世界！", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...3");
                    sender
                        .send("Hello, World! 你好，世界！", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...4");
                    end.store(true, Ordering::Relaxed);
                })
                .detach();

                loop {
                    if signal.load(Ordering::Relaxed) && !reader.can_read().await {
                        break;
                    }
                    let audio = reader.read().await.unwrap();
                    if let Some(audio) = audio {
                        match audio {
                            SynthesizedResponse::AudioBytes(_) => {
                                println!("read bytes")
                            }
                            SynthesizedResponse::AudioMetadata(_) => {
                                println!("read metadata")
                            }
                        }
                    } else {
                        println!("read None");
                    }
                }
            }
        }
    });
}
```

See all examples.

Dependencies

~18–29MB
~498K SLoC