Apache-2.0

Rustpotter CLI

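A CLI for Rustpotter, an open-source wake word detector written in Rust

Description

This is a command-line client for the rustpotter library.

You can use it to record WAV samples, create rustpotter wake word files, and test them.

Installation

Pre-built executables for the supported platforms can be found in the "Assets" tab of each release.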

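Basic usage

Listing available audio input devices and formats

You can list the available audio sources with the devices command. Add the --configs option to also show the default and available recording formats for each source.

Each device and configuration has a numeric ID on its left. You can pass that ID to other commands (record and spot) to change their audio source and format.

On some systems the list of configurations is too long; you can filter it with the --max-channels parameter.

Here is an example run on macOS: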

$ rustpotter-cli devices -c -m 1
Audio hosts:
  - CoreAudio
Default input device:
  - MacBook Pro Microphone
Available Devices: 
0 - MacBook Pro Microphone
  Default input stream config:
      - Sample Rate: 48000, Channels: 1, Format: f32, Supported: true
  All supported input stream configs:
    0 - Sample Rate: 44100, Channels: 1, Format: f32, Supported: true
    1 - Sample Rate: 48000, Channels: 1, Format: f32, Supported: true
    2 - Sample Rate: 88200, Channels: 1, Format: f32, Supported: true
    3 - Sample Rate: 96000, Channels: 1, Format: f32, Supported: true
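
When a device lists many configurations, it can be handy to look up a configuration ID by sample rate instead of reading the list by eye. The following is a minimal sketch, assuming the output layout shown in the macOS example above; `config_index_for_rate` is a hypothetical helper, not part of rustpotter-cli, and it returns the first match only, so it does not distinguish between devices.

```bash
#!/usr/bin/env bash
# Sample output copied from the macOS example above. In practice you would
# pipe the real command output into the helper instead.
devices_output='Available Devices:
0 - MacBook Pro Microphone
  Default input stream config:
      - Sample Rate: 48000, Channels: 1, Format: f32, Supported: true
  All supported input stream configs:
    0 - Sample Rate: 44100, Channels: 1, Format: f32, Supported: true
    1 - Sample Rate: 48000, Channels: 1, Format: f32, Supported: true
    2 - Sample Rate: 88200, Channels: 1, Format: f32, Supported: true
    3 - Sample Rate: 96000, Channels: 1, Format: f32, Supported: true'

# config_index_for_rate RATE (hypothetical helper): read `devices` output on
# stdin and print the ID of the first numbered config line with that rate.
config_index_for_rate() {
  awk -v rate="$1" '
    # Numbered config lines look like "1 - Sample Rate: 48000, Channels: ..."
    $1 ~ /^[0-9]+$/ && $2 == "-" && $3 == "Sample" && $4 == "Rate:" {
      r = $5; sub(/,$/, "", r)          # strip the trailing comma
      if (r == rate) { print $1; exit }
    }'
}

printf '%s\n' "$devices_output" | config_index_for_rate 48000   # prints 1
```

The printed ID is the value you would then pass as the configuration ID to the record or spot commands.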

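Recording audio samples

The record command records audio samples. To use a different input device, pass the ID returned by the devices command with the --device-index parameter. To change the audio format, pass the configuration ID returned by the devices command with the --config-index option. Once it is running, press Ctrl + C to stop the recording.

Here is an example run on macOS: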

$ rustpotter-cli record good_morning.wav
Input device: MacBook Pro Microphone
Input device config: Sample Rate: 48000, Channels: 1, Format: f32
Begin recording...
Press 'Ctrl + c' to stop.
^CRecording good_morning.wav complete!

In bash, you can use something like the following to take several recordings quickly:

WAKEWORD="ok home"
WAKEWORD_FILENAME="${WAKEWORD// /_}"
# Take 10 recordings, waiting one second after each.
for i in {0..9}; do (rustpotter-cli record "${WAKEWORD_FILENAME}${i}.wav" && sleep 1); done

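Creating a wake word model

The train command creates a wake word model.

It requires a training folder and a test folder containing WAV recordings. Each recording is either tagged (its file name contains [label], where 'label' is the tag the network should predict for that audio segment) or untagged (equivalent to having [none] in the file name).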
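
To make the naming convention concrete, here is a minimal sketch of how a label could be read back from a file name. `label_of` and the example file names are hypothetical and only illustrate the convention described above; this is not how rustpotter itself parses names.

```bash
#!/usr/bin/env bash
# label_of FILE (hypothetical helper): print the text between the first "["
# and the following "]" in the file name, or "none" when there is no tag.
label_of() {
  local name=${1##*/}            # strip any leading directories
  case "$name" in
    *\[*\]*)
      name=${name#*\[}           # drop everything up to the first "["
      printf '%s\n' "${name%%\]*}"   # keep what precedes the next "]"
      ;;
    *) printf 'none\n' ;;
  esac
}

label_of "train/[ok_casa]sample1.wav"    # prints ok_casa
label_of "train/background_noise.wav"    # prints none
```

A helper like this can be used to count recordings per label before training, which helps spot a badly unbalanced set.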

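The size and CPU usage of a wake word model depend on the model type you choose and on the audio duration used for training, which is defined by the longest recording found in the training set.

Example run: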

$ rustpotter-cli train -t small --train-dir train.wav/train --test-dir train.wav/test --test-epochs 10 --epochs 2500 -l 0.017 trained-small.rpw 
Start training trained-small.rpw!
Model type: small.
Labels: ["none", "ok_casa"].
Training with 2042 records.
Testing with 119 records.
Training on 1950ms of audio.
  10 train loss:  0.12944 test acc: 90.76%
  20 train loss:  0.06484 test acc: 93.28%
  30 train loss:  0.04454 test acc: 94.12%
  40 train loss:  0.03361 test acc: 94.12%
  50 train loss:  0.02687 test acc: 94.12%
  60 train loss:  0.02227 test acc: 94.12%
  70 train loss:  0.01916 test acc: 94.12%
  80 train loss:  0.01681 test acc: 94.12%
  90 train loss:  0.01499 test acc: 94.12%
 100 train loss:  0.01354 test acc: 94.12%
 110 train loss:  0.01232 test acc: 94.96%
...  
 160 train loss:  0.00822 test acc: 94.96%
 170 train loss:  0.00766 test acc: 94.96%
 180 train loss:  0.00717 test acc: 95.80%
 190 train loss:  0.00673 test acc: 95.80%
...
 470 train loss:  0.00234 test acc: 95.80%
 480 train loss:  0.00229 test acc: 95.80%
 490 train loss:  0.00224 test acc: 96.64%
 500 train loss:  0.00219 test acc: 96.64%
...
1180 train loss:  0.00083 test acc: 96.64%
1190 train loss:  0.00082 test acc: 96.64%
1200 train loss:  0.00081 test acc: 97.48%
1210 train loss:  0.00081 test acc: 97.48%
...
2340 train loss:  0.00034 test acc: 97.48%
2350 train loss:  0.00034 test acc: 97.48%
2360 train loss:  0.00034 test acc: 98.32%
2370 train loss:  0.00033 test acc: 98.32%
...
2480 train loss:  0.00031 test acc: 98.32%
2490 train loss:  0.00031 test acc: 98.32%
2500 train loss:  0.00031 test acc: 98.32%
trained-small.rpw created!
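
Long runs produce a lot of progress lines, so it can help to extract the first epoch that reached the best test accuracy. The sketch below assumes the progress-line layout shown above and uses a few lines copied from that run; `best_epoch` is a hypothetical helper, not a rustpotter-cli feature.

```bash
#!/usr/bin/env bash
# A few progress lines copied from the example run above.
training_log='  10 train loss:  0.12944 test acc: 90.76%
 180 train loss:  0.00717 test acc: 95.80%
1200 train loss:  0.00081 test acc: 97.48%
2360 train loss:  0.00034 test acc: 98.32%
2500 train loss:  0.00031 test acc: 98.32%'

# best_epoch (hypothetical helper): read train output on stdin and print the
# first epoch that reached the highest test accuracy, followed by that value.
best_epoch() {
  awk '
    $2 == "train" && $3 == "loss:" && $5 == "test" && $6 == "acc:" {
      acc = $7; sub(/%$/, "", acc)                 # strip the percent sign
      if (acc + 0 > best) { best = acc + 0; epoch = $1 }
    }
    END { if (epoch != "") printf "%s %.2f\n", epoch, best }'
}

printf '%s\n' "$training_log" | best_epoch   # prints 2360 98.32
```

Comparing that epoch with the value given to --epochs gives a rough idea of how much of the run was spent after the accuracy stopped improving.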

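Note that even with the same training set, results may differ between runs because the weight initialization is not constant.

To get a reliable picture of the model's accuracy, do not share recordings between the training and test folders.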
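
One way to check for accidentally shared recordings is to compare file contents rather than names. This is a minimal sketch, assuming both folders hold .wav files directly; `shared_records` is a hypothetical helper, and it compares the CRC and size reported by `cksum`, so a match is a strong hint rather than proof (use `cmp` to confirm).

```bash
#!/usr/bin/env bash
# shared_records TRAIN_DIR TEST_DIR (hypothetical helper): print every test
# file whose cksum (CRC and byte count) equals that of some training file.
shared_records() {
  local train_dir=$1 test_dir=$2 f sum train_sums
  train_sums=$(for f in "$train_dir"/*.wav; do
    if [ -f "$f" ]; then cksum < "$f"; fi
  done)
  for f in "$test_dir"/*.wav; do
    [ -f "$f" ] || continue
    sum=$(cksum < "$f")
    if printf '%s\n' "$train_sums" | grep -qxF -- "$sum"; then
      printf '%s\n' "$f"
    fi
  done
}

# Demo with placeholder files in a temporary folder (not real audio).
demo=$(mktemp -d)
mkdir "$demo/train" "$demo/test"
printf 'aaaa' > "$demo/train/[ok_home]1.wav"
printf 'bbbb' > "$demo/train/noise1.wav"
cp "$demo/train/[ok_home]1.wav" "$demo/test/[ok_home]copy.wav"
printf 'cccc' > "$demo/test/noise2.wav"
shared_records "$demo/train" "$demo/test"   # lists only the copied test file
rm -rf "$demo"
```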

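Creating a wake word reference

The build command lets you create a wake word reference file from a few recordings.

This type of wake word needs fewer recordings to create, but its results are less consistent than those of a wake word model.

For example: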

rustpotter-cli build --model-name "ok home" --model-path ok_home.rpw ok_home1.wav ok_home2.wav

Here is an example run on macOS:

$ WAKEWORD="ok home"
$ WAKEWORD_FILENAME="${WAKEWORD// /_}"
$ rustpotter-cli build --model-name "$WAKEWORD" --model-path $WAKEWORD_FILENAME.rpw $WAKEWORD_FILENAME*.wav
ok_home1.wav: WavSpec { channels: 2, sample_rate: 44100, bits_per_sample: 32, sample_format: Float }
ok home created!

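Using a model

You can use the spot command to test a model in real time against the available audio inputs, or the test command to test it against an audio file. Both commands expose similar options, which makes it easy to switch between them.

This way you can record a sample, tune the options against that file, and then try the same options live.

Here is an example run on macOS: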

$ rustpotter-cli test -g --gain-ref 0.004 ok_home_test.rpw test_audio.wav
Testing file test_audio.wav against model ok_home_test.rpw!
Wakeword detection: [11:06:11] RustpotterDetection { name: "ok_home_test", avg_score: 0.0, score: 0.5261932, scores: {"ok_home1-bandpass1000_2000.wav": 0.5261932}, counter: 12, gain: 0.9 }
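
When tuning options it is often enough to look at one or two fields of each detection. The following sketch pulls a numeric field out of a detection line, assuming the line layout shown above; `detection_field` is a hypothetical helper and expects a plain field name such as score, counter, or gain.

```bash
#!/usr/bin/env bash
# Detection line copied from the example run above.
detection='Wakeword detection: [11:06:11] RustpotterDetection { name: "ok_home_test", avg_score: 0.0, score: 0.5261932, scores: {"ok_home1-bandpass1000_2000.wav": 0.5261932}, counter: 12, gain: 0.9 }'

# detection_field FIELD (hypothetical helper): read detection lines on stdin
# and print the numeric value of FIELD for each line that contains it.
# The field must be preceded by a space or "{" so "score" skips "avg_score".
detection_field() {
  sed -n "s/.*[{ ]$1: \([0-9.][0-9.]*\).*/\1/p"
}

printf '%s\n' "$detection" | detection_field score     # prints 0.5261932
printf '%s\n' "$detection" | detection_field counter   # prints 12
```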

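Relevant options for the spot and test commands include:

-d enables 'debug mode', so you can see partial detections.
-t sets the threshold (default 0.5).
-m 6 requires at least 6 frames with a positive score (compare with the counter field of the detection).
-e enables 'eager mode', so the detection is emitted as early as possible (as soon as the minimum number of positive scores is reached).
-g enables gain normalization. To debug gain normalization, use --debug-gain or look at the gain field of the detection.
--gain-ref changes the gain normalization reference (the default depends on the wake word and is printed at startup when --debug-gain is provided).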
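
Because test accepts the same options as spot, a recorded file can be replayed with several settings before going live. The sketch below sweeps a few threshold values; it assumes -t takes its value as a separate argument (as -m does above), reuses the file names from the earlier example, and defaults to printing the commands rather than running them. `sweep_thresholds` and `DRY_RUN` are hypothetical names.

```bash
#!/usr/bin/env bash
MODEL="ok_home_test.rpw"     # file names reused from the example above
AUDIO="test_audio.wav"
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = execute them

# sweep_thresholds (hypothetical helper): run or print one `test` invocation
# per threshold value, keeping the minimum number of positive frames fixed.
sweep_thresholds() {
  local t cmd
  for t in 0.4 0.5 0.6; do
    cmd=(rustpotter-cli test -t "$t" -m 6 "$MODEL" "$AUDIO")
    if [ "$DRY_RUN" = "1" ]; then
      printf '%s\n' "${cmd[*]}"   # show the command that would be run
    else
      "${cmd[@]}"
    fi
  done
}

sweep_thresholds
```

Once a threshold gives the expected detections on the file, the same -t and -m values can be tried with spot.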
