Spaces:
Sleeping
Sleeping
| # whisper.cpp/examples/vad-speech-segments | |
| This examples demonstrates how to use a VAD (Voice Activity Detection) model to | |
| segment an audio file into speech segments. | |
| ### Building the example | |
| The example can be built using the following command: | |
| ```console | |
| cmake -S . -B build | |
| cmake --build build -j8 --target vad-speech-segments | |
| ``` | |
| ### Running the example | |
| The examples can be run using the following command, which uses a model | |
| that we use internally for testing: | |
| ```console | |
| ./build/bin/vad-speech-segments \ | |
| -vad-model models/for-tests-silero-v5.1.2-ggml.bin \ | |
| --file samples/jfk.wav \ | |
| --no-prints | |
| Detected 5 speech segments: | |
| Speech segment 0: start = 0.29, end = 2.21 | |
| Speech segment 1: start = 3.30, end = 3.77 | |
| Speech segment 2: start = 4.00, end = 4.35 | |
| Speech segment 3: start = 5.38, end = 7.65 | |
| Speech segment 4: start = 8.16, end = 10.59 | |
| ``` | |
| To see more output from whisper.cpp remove the `--no-prints` argument. | |
| ### Command line options | |
| ```console | |
| ./build/bin/vad-speech-segments --help | |
| usage: ./build/bin/vad-speech-segments [options] file | |
| supported audio formats: flac, mp3, ogg, wav | |
| options: | |
| -h, --help [default] show this help message and exit | |
| -f FNAME, --file FNAME [ ] input audio file path | |
| -t N, --threads N [4 ] number of threads to use during computation | |
| -ug, --use-gpu [true ] use GPU | |
| -vm FNAME, --vad-model FNAME [ ] VAD model path | |
| -vt N, --vad-threshold N [0.50 ] VAD threshold for speech recognition | |
| -vspd N, --vad-min-speech-duration-ms N [250 ] VAD min speech duration (0.0-1.0) | |
| -vsd N, --vad-min-silence-duration-ms N [100 ] VAD min silence duration (to split segments) | |
| -vmsd N, --vad-max-speech-duration-s N [FLT_MAX] VAD max speech duration (auto-split longer) | |
| -vp N, --vad-speech-pad-ms N [30 ] VAD speech padding (extend segments) | |
| -vo N, --vad-samples-overlap N [0.10 ] VAD samples overlap (seconds between segments) | |
| -np, --no-prints [false ] do not print anything other than the results | |
| ``` | |