Stage-2 fine-tuning details

#69
by leestevennz - opened

Hi,

One of the listed key features of parakeet-tdt-0.6b-v2 is its robust performance on spoken numbers. The model card also mentions a stage-2 fine-tune for 2500 steps on a 500-hour subset of nemo-asrset-3.0. I expect that fine-tuning on high-quality, human-labelled data contributes to those key features of robustness and number accuracy.

  1. Could you provide more details about this fine-tuning subset and the stage-2 process?
  2. Is there any possibility of accessing this 500-hour subset?
  3. Any recommendations for maintaining number transcription accuracy during custom fine-tuning?
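For context on question 3, here is a rough sketch of the kind of thing I have been considering: mixing synthetic number-heavy utterances into my fine-tuning data so the model keeps seeing spelled-out numbers. The helper names and the manifest fields here are my own illustration (modelled on the common NeMo-style JSON-lines manifest), not something taken from the model card:

```python
import json

# Spoken-English word tables for a minimal 0-999 converter.
UNITS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_to_words(n: int) -> str:
    """Spell out an integer in 0-999 as spoken English words."""
    if n < 20:
        return UNITS[n]
    if n < 100:
        tens, rem = divmod(n, 10)
        return TENS[tens] + (" " + UNITS[rem] if rem else "")
    hundreds, rem = divmod(n, 100)
    words = UNITS[hundreds] + " hundred"
    return words + (" " + number_to_words(rem) if rem else "")

def manifest_entry(audio_path: str, duration: float, n: int) -> str:
    """One JSON line in a NeMo-style manifest, with the target
    transcript containing the number written out in words."""
    return json.dumps({"audio_filepath": audio_path,
                       "duration": duration,
                       "text": number_to_words(n)})
```

The idea would be to generate (or record) utterances like these and keep some fraction of them in every fine-tuning epoch, but I would appreciate guidance on whether that matches what was done in stage 2.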

Thanks in advance.
