Stage-2 fine-tuning details
#69
by leestevennz - opened
Hi,
One of the listed key features of parakeet-tdt-0.6b-v2 is its robust performance on spoken numbers. The model card also mentions a stage-2 fine-tune for 2,500 steps on a 500-hour subset of nemo-asrset-3.0. I expect that fine-tuning on high-quality, human-labelled data contributes to the key features of robustness and number accuracy.
- Could you provide more details about this fine-tuning subset and the stage-2 process?
- Is there any possibility of accessing this 500-hour subset?
- Any recommendations for maintaining number transcription accuracy during custom fine-tuning?
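For context on the last question, the rough approach I've been sketching is to hold out a number-heavy subset of my training manifest as a validation probe, so digit transcription can be monitored while fine-tuning. This is just a sketch: the JSONL manifest layout with an `"audio_filepath"`/`"duration"`/`"text"` field per line is my assumption based on the usual NeMo convention, not something stated in the model card.

```python
import json
import re

# Assumption: NeMo-style JSONL manifest, one object per line with
# "audio_filepath", "duration", and "text" fields.
NUMBER_PATTERN = re.compile(
    r"\d|\b(?:zero|one|two|three|four|five|six|seven|eight|nine|ten|"
    r"eleven|twelve|thirteen|fourteen|fifteen|sixteen|seventeen|"
    r"eighteen|nineteen|twenty|thirty|forty|fifty|sixty|seventy|"
    r"eighty|ninety|hundred|thousand|million)\b",
    re.IGNORECASE,
)

def is_number_heavy(text: str) -> bool:
    """True if the transcript contains a digit or a spelled-out number word."""
    return bool(NUMBER_PATTERN.search(text))

def split_manifest(lines):
    """Partition manifest entries into (number_heavy, rest)."""
    heavy, rest = [], []
    for line in lines:
        entry = json.loads(line)
        (heavy if is_number_heavy(entry["text"]) else rest).append(entry)
    return heavy, rest

if __name__ == "__main__":
    sample = [
        '{"audio_filepath": "a.wav", "duration": 2.1, "text": "call me at 555 0123"}',
        '{"audio_filepath": "b.wav", "duration": 1.7, "text": "the weather is nice"}',
        '{"audio_filepath": "c.wav", "duration": 3.0, "text": "twenty five dollars"}',
    ]
    heavy, rest = split_manifest(sample)
    print(len(heavy), len(rest))  # 2 1
```

The idea is that WER on this number-heavy split should stay flat (or improve) during custom fine-tuning; a regression there would flag catastrophic forgetting of number transcription even if overall WER looks fine.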
Thanks in advance.