The dataset is only 250 rows...

#1
by vgoklani - opened

How can you do a high quality fine-tune with only 250 rows?

This was a test to determine:
1 - would this reasoning dataset work on this Llama 3.3 model.
2 - would it induce Claude-style reasoning (which has a distinct fingerprint) WITHOUT "system prompt help".

This fine-tune was not designed to update the model's domain knowledge or improve the model beyond this.
This model would require more extensive training to bring it up to date and up to current SOTA standards.

How can you do a high quality fine-tune with only 250 rows?

With 250 rows you can get a high-quality "imitation" of the teacher model, though not real knowledge transfer.
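For a sense of scale, 250 rows translates into very few optimizer steps, which is enough to imprint a surface style but far too little to embed new knowledge. A back-of-the-envelope sketch (the epoch count and batch settings here are hypothetical, typical SFT defaults, not the settings actually used for this model):

```python
import math

# Rough estimate of total optimizer steps in a 250-row fine-tune.
# All hyperparameters below are illustrative assumptions.
rows = 250
epochs = 3
effective_batch = 8  # per-device batch size * gradient accumulation

steps_per_epoch = math.ceil(rows / effective_batch)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # → 32 96
```

Roughly a hundred gradient updates can shift a model's output style, but it is orders of magnitude short of the training needed to add or update domain knowledge.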
