---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
library_name: transformers
tags:
- reinforcement-learning
- text-generation-inference
- science
- code
- math
- finance
pipeline_tag: text-generation
---

![R1.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/BKHWttLe9Z8hJ-azW0b8i.png)

# **GCIRS-Reasoning-1.5B-R1**

> **GCIRS-Reasoning-1.5B-R1** is a **research-grade reasoning model** fine-tuned from **Qwen2.5-1.5B-Instruct**, focused on **non-fictional reasoning**, **factual consistency**, and **scientific depth**. Trained with reinforcement learning on the **Big Reasoning Traces** dataset from DeepSeek, this model is tailored for complex analytical tasks that demand scientific rigor in high-stakes and research environments.

> [!note]
> GGUF: [https://huggingface.co/prithivMLmods/GCIRS-Reasoning-1.5B-R1-GGUF](https://huggingface.co/prithivMLmods/GCIRS-Reasoning-1.5B-R1-GGUF)

---

## **Key Features**

1. **Reinforcement Learning on Big Reasoning Traces**
   Fine-tuned with **DeepSeek’s Big Reasoning Traces**, promoting clarity in multi-step reasoning, factual deduction, and long-form scientific argumentation.

2. **Research-Ready Scientific Fidelity**
   Designed for researchers, educators, and analysts, offering **reliable factual recall**, **logical structuring**, and precise step-by-step explanations.

3. **Structured Output in LaTeX, Markdown, and JSON**
   Supports technical documentation and publishing with seamless integration of **LaTeX equations**, **Markdown formatting**, and **JSON output** (see the structured-output sketch after the quickstart below).

4. **Multilingual Technical Reasoning**
   Effective across **20+ languages**, especially in **scientific**, **academic**, and **technical domains**.

5. **Efficient Inference**
   Despite its **1.5B-parameter scale**, it is optimized for **low-latency inference** on **modern GPUs** and in **research pipelines**.

---

## **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/GCIRS-Reasoning-1.5B-R1"

# Load with automatic dtype selection and device placement
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Explain the principle of entropy in thermodynamics with examples."

messages = [
    {"role": "system", "content": "You are a scientific reasoning assistant."},
    {"role": "user", "content": prompt}
]

# Build the model input using the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated text remains
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
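
---

## **Structured Output Example**

Key feature 3 above advertises JSON output. The snippet below is a minimal sketch of exercising that with the same Transformers setup as the quickstart: it requests a JSON answer and validates it with `json.loads`. The schema, system prompt, and token budget here are illustrative assumptions, not a format the model is documented to guarantee, which is why the parse is wrapped in a `try`/`except`.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/GCIRS-Reasoning-1.5B-R1"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Illustrative schema: not guaranteed by the model, so the reply is validated below.
messages = [
    {"role": "system", "content": "Answer only with JSON matching "
     '{"concept": str, "definition": str, "example": str}.'},
    {"role": "user", "content": "Define Gibbs free energy."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)[0]
reply = tokenizer.decode(output_ids[inputs.input_ids.shape[1]:], skip_special_tokens=True)

try:
    data = json.loads(reply)  # parse the structured answer
    print(data["concept"], "->", data["definition"])
except json.JSONDecodeError:
    print("Model returned non-JSON output:\n", reply)
```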
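
---

## **Quickstart with llama-cpp-python (GGUF)**

For the GGUF build linked in the note above, the following is a loading sketch using `llama-cpp-python`, a third-party runtime that this card does not itself document. The quantization filename pattern is an assumption; check the GGUF repo's file listing for the exact name.

```python
from llama_cpp import Llama

# Download and load a quantized build from the GGUF repo.
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/GCIRS-Reasoning-1.5B-R1-GGUF",
    filename="*Q5_K_M.gguf",  # assumed quant pattern; adjust to an actual file in the repo
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a scientific reasoning assistant."},
        {"role": "user", "content": "State the second law of thermodynamics."},
    ],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```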

---

## **Intended Use**

* Scientific and research-grade question answering
* Conceptual explanations in physics, biology, and chemistry
* Factual, non-fictional structured content generation
* Academic tutoring and reasoning assessment
* High-fidelity inference in low-latency research settings

## **Limitations**

* Not designed for casual chat or storytelling
* Performance may decline outside scientific/technical domains
* Limited creativity and abstract generalization
* Context limitations in extremely long research documents

## **References**

1. [Qwen2.5 Technical Report (2024)](https://arxiv.org/pdf/2412.15115)
2. [Big Reasoning Traces (DeepSeek Research)]()
3. [Reinforcement Learning with Human Feedback (RLHF)](https://arxiv.org/abs/1906.01749)