Spaces:
Running
Running
File size: 2,197 Bytes
dc59673 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 |
# whisper.cpp/examples/vad-speech-segments
This examples demonstrates how to use a VAD (Voice Activity Detection) model to
segment an audio file into speech segments.
### Building the example
The example can be built using the following command:
```console
cmake -S . -B build
cmake --build build -j8 --target vad-speech-segments
```
### Running the example
The examples can be run using the following command, which uses a model
that we use internally for testing:
```console
./build/bin/vad-speech-segments \
-vad-model models/for-tests-silero-v5.1.2-ggml.bin \
--file samples/jfk.wav \
--no-prints
Detected 5 speech segments:
Speech segment 0: start = 0.29, end = 2.21
Speech segment 1: start = 3.30, end = 3.77
Speech segment 2: start = 4.00, end = 4.35
Speech segment 3: start = 5.38, end = 7.65
Speech segment 4: start = 8.16, end = 10.59
```
To see more output from whisper.cpp remove the `--no-prints` argument.
### Command line options
```console
./build/bin/vad-speech-segments --help
usage: ./build/bin/vad-speech-segments [options] file
supported audio formats: flac, mp3, ogg, wav
options:
-h, --help [default] show this help message and exit
-f FNAME, --file FNAME [ ] input audio file path
-t N, --threads N [4 ] number of threads to use during computation
-ug, --use-gpu [true ] use GPU
-vm FNAME, --vad-model FNAME [ ] VAD model path
-vt N, --vad-threshold N [0.50 ] VAD threshold for speech recognition
-vspd N, --vad-min-speech-duration-ms N [250 ] VAD min speech duration (0.0-1.0)
-vsd N, --vad-min-silence-duration-ms N [100 ] VAD min silence duration (to split segments)
-vmsd N, --vad-max-speech-duration-s N [FLT_MAX] VAD max speech duration (auto-split longer)
-vp N, --vad-speech-pad-ms N [30 ] VAD speech padding (extend segments)
-vo N, --vad-samples-overlap N [0.10 ] VAD samples overlap (seconds between segments)
-np, --no-prints [false ] do not print anything other than the results
```
|