Commit bd9b418 by ggerganov · unverified · 1 parent: bfa259c

models : add instructions for using HF fine-tuned models

  1. models/README.md +21 -2
models/README.md CHANGED
@@ -41,5 +41,24 @@ https://huggingface.co/datasets/ggerganov/whisper.cpp/tree/main

  ## Model files for testing purposes

- The model files pefixed with `for-tests-` are empty (i.e. do not contain any weights) and are used by the CI for testing purposes.
- They are directly included in this repository for convenience and the Github Actions CI uses them to run various sanitizer tests.
+ The model files prefixed with `for-tests-` are empty (i.e. do not contain any weights) and are used by the CI for
+ testing purposes. They are directly included in this repository for convenience and the GitHub Actions CI uses them to
+ run various sanitizer tests.
+
+ ## Fine-tuned models
+
+ There are community efforts to create fine-tuned Whisper models using extra training data. For example, this
+ [blog post](https://huggingface.co/blog/fine-tune-whisper) describes a method for fine-tuning using the Hugging Face (HF)
+ Transformers implementation of Whisper. The produced models are in a slightly different format compared to the original
+ OpenAI format. To read the HF models you can use the [convert-h5-to-ggml.py](convert-h5-to-ggml.py) script like this:
+
+ ```
+ git clone https://github.com/openai/whisper
+ git clone https://github.com/ggerganov/whisper.cpp
+
+ # clone HF fine-tuned model (this is just an example)
+ git clone https://huggingface.co/openai/whisper-base.en
+
+ # convert the model to ggml
+ # arguments: <HF model dir> <OpenAI whisper repo dir> <output dir>
+ python3 ./whisper.cpp/models/convert-h5-to-ggml.py ./whisper-base.en/ ./whisper .
+ ```
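
The conversion step added in this commit takes three positional paths, and it is easy to pass a directory that was never cloned. A minimal Python sketch of a pre-flight check is given here. The helper names `build_convert_command` and `missing_inputs` are hypothetical and not part of whisper.cpp, and the argument order simply mirrors the README command rather than being verified against the script.

```python
import sys
from pathlib import Path

# Default script location, taken from the README command in this commit.
DEFAULT_SCRIPT = "./whisper.cpp/models/convert-h5-to-ggml.py"


def build_convert_command(model_dir, whisper_repo, out_dir, script=DEFAULT_SCRIPT):
    """Return the argv list for the conversion step.

    Assumed argument order (mirrors the README command):
    HF model directory, OpenAI whisper checkout, output directory.
    """
    return [sys.executable, str(script), str(model_dir), str(whisper_repo), str(out_dir)]


def missing_inputs(model_dir, whisper_repo):
    """Return the input directories that do not exist on disk."""
    candidates = (Path(model_dir), Path(whisper_repo))
    return [str(p) for p in candidates if not p.is_dir()]
```

If `missing_inputs("./whisper-base.en", "./whisper")` returns an empty list, the list from `build_convert_command("./whisper-base.en", "./whisper", ".")` could be passed to `subprocess.run(cmd, check=True)` to run the conversion.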