SaffalPoosh committed on
Commit 269d549 · verified · 1 Parent(s): fdd3c4b

Update README.md

Files changed (1):
  1. README.md +78 -22

README.md CHANGED
@@ -19,44 +19,58 @@ datasets:
 <!-- Provide a quick summary of what the model is/does. -->
 
 
-this is qlora adapter trained on the CPP coding tasks and its trained for reasoning based generation.
 
 
 
-```python
 
 example_problem = """
 A robot is situated at the top-left corner of an m x n grid. The robot can only move either down or right at any point in time. It wants to reach the bottom-right corner of the grid. Some cells in the grid are blocked by obstacles. How many unique paths can the robot take to reach the destination?
 Constraints:
 Time limit per test: 2.0 seconds
 Memory limit per test: 256.0 megabytes
 1 ≤ m, n ≤ 100
 Grid cells are either 0 (empty) or 1 (obstacle).
 Input Format:
 The first line contains two integers m and n — the dimensions of the grid.
 The next m lines each contain n integers (0 or 1) representing the grid.
 Output Format:
 Print a single integer — the number of unique paths.
 Example:
-```input
 3 3
 0 0 0
 0 1 0
 0 0 0
-```
 """
 from unsloth import FastLanguageModel
 from transformers import TextStreamer
 
 model_path = "SaffalPoosh/reasoning_cpp_llm"
-
 max_seq_length = 16000
 dtype = None
 load_in_4bit = True
 
-
-
-
 model, tokenizer = FastLanguageModel.from_pretrained(
     model_name=model_path,
     max_seq_length=max_seq_length,
@@ -65,40 +65,82 @@ model, tokenizer = FastLanguageModel.from_pretrained(
     local_files_only=False
 )
 
-# this will download the base model and then patch by applying the lora adapters
-
-
-from transformers import TextIteratorStreamer
 FastLanguageModel.for_inference(model)
-from threading import Thread
 # Prepare Input Data
 input_text = example_problem
 inputs = tokenizer(input_text, return_tensors="pt")
-inputs = {k:v.to("cuda") for k,v in inputs.items()}
 # Initialize the text streamer
 text_streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=False)
 
-# Perform Inference
-# _ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=8000)
 
-stream_catcher = Thread(target=model.generate, kwargs={**inputs, "do_sample": True, "streamer": text_streamer,
-# "eos_token_id": tokenizer.eos_token_id,
-
-"max_new_tokens": 10000})
 stream_catcher.start()
 
 with open("output.txt", "w") as f:
     for token in text_streamer:
         print(token, end="", flush=True)
         f.write(token)
-stream_catcher.join()
 
 ```
 
-the `output.txt` file shows the output of generation.
 <!-- Provide a quick summary of what the model is/does. -->
 
 
+# Model Card for SaffalPoosh/reasoning_cpp_llm
+
+<!-- Provide a quick summary of what the model is/does. -->
+
+This is a QLoRA adapter trained on C++ coding tasks and designed for reasoning-based code generation. The model specializes in solving algorithmic problems with step-by-step reasoning and generating optimized C++ solutions.
+
+## Example Usage
+
+### Problem Example
+
+```python
 example_problem = """
 A robot is situated at the top-left corner of an m x n grid. The robot can only move either down or right at any point in time. It wants to reach the bottom-right corner of the grid. Some cells in the grid are blocked by obstacles. How many unique paths can the robot take to reach the destination?
+
 Constraints:
 Time limit per test: 2.0 seconds
 Memory limit per test: 256.0 megabytes
 1 ≤ m, n ≤ 100
 Grid cells are either 0 (empty) or 1 (obstacle).
+
 Input Format:
 The first line contains two integers m and n — the dimensions of the grid.
 The next m lines each contain n integers (0 or 1) representing the grid.
+
 Output Format:
 Print a single integer — the number of unique paths.
+
 Example:
+Input:
 3 3
 0 0 0
 0 1 0
 0 0 0
 """
+```
+
+### Model Loading and Inference
+
+```python
 from unsloth import FastLanguageModel
 from transformers import TextStreamer
+from transformers import TextIteratorStreamer
+from threading import Thread
 
+# Model configuration
 model_path = "SaffalPoosh/reasoning_cpp_llm"
 max_seq_length = 16000
 dtype = None
 load_in_4bit = True
 
+# Load model and tokenizer
 model, tokenizer = FastLanguageModel.from_pretrained(
     model_name=model_path,
     max_seq_length=max_seq_length,
     local_files_only=False
 )
 
+# This will download the base model and then patch it by applying the LoRA adapters
 FastLanguageModel.for_inference(model)
+
 # Prepare Input Data
 input_text = example_problem
 inputs = tokenizer(input_text, return_tensors="pt")
+inputs = {k: v.to("cuda") for k, v in inputs.items()}
+
 # Initialize the text streamer
 text_streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=False)
 
+# Perform inference with streaming
+stream_catcher = Thread(
+    target=model.generate,
+    kwargs={
+        **inputs,
+        "do_sample": True,
+        "streamer": text_streamer,
+        "max_new_tokens": 10000
+    }
+)
 stream_catcher.start()
 
+# Stream output to console and file
 with open("output.txt", "w") as f:
     for token in text_streamer:
         print(token, end="", flush=True)
         f.write(token)
+stream_catcher.join()
 ```
 
+## Model Details
+
+- **Model Type**: QLoRA fine-tuned language model
+- **Base Model**: [Specify base model if known]
+- **Training Focus**: C++ algorithmic problem solving with reasoning
+- **Max Sequence Length**: 16,000 tokens
+- **Quantization**: 4-bit loading supported
+- **Hardware Requirements**: CUDA-compatible GPU recommended
 
+## Training Details
+
+- **Training Method**: QLoRA (Quantized Low-Rank Adaptation)
+- **Dataset**: C++ coding tasks with reasoning annotations
+- **Task Type**: Code generation with step-by-step reasoning
+- **Optimization**: Focused on algorithmic problem solving
+
+## Usage Notes
+
+- The model generates reasoning-based solutions for C++ programming problems
+- Supports streaming inference for real-time output
+- The `output.txt` file contains the complete generated solution
+- Designed to handle competitive-programming-style problems with constraints
+
+## Output Format
+
+The model typically generates:
+1. Problem analysis and reasoning
+2. Algorithm explanation
+3. Complete C++ implementation
+4. Time and space complexity analysis
+
+## Requirements
+
+```bash
+pip install unsloth transformers torch
+```
+
+## Hardware Requirements
+
+- **GPU**: CUDA-compatible GPU (recommended)
+- **Memory**: Sufficient VRAM for the 4-bit quantized model
+- **Storage**: Space for base model download and adapter weights
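For reference, the example problem in the model card (unique paths on a grid with obstacles) has a standard dynamic-programming solution. The sketch below is a plain-Python reference implementation useful for checking the model's generated C++ answer against; it is not model output. On the 3×3 sample grid with the center cell blocked, the answer is 2.

```python
def unique_paths(grid):
    """Count right/down paths from top-left to bottom-right, avoiding obstacle cells (1s)."""
    m, n = len(grid), len(grid[0])
    dp = [[0] * n for _ in range(m)]  # dp[i][j] = number of paths reaching cell (i, j)
    for i in range(m):
        for j in range(n):
            if grid[i][j] == 1:
                dp[i][j] = 0  # blocked cell: no path passes through
            elif i == 0 and j == 0:
                dp[i][j] = 1  # starting cell
            else:
                # Paths arrive only from the cell above or the cell to the left
                dp[i][j] = (dp[i - 1][j] if i > 0 else 0) + (dp[i][j - 1] if j > 0 else 0)
    return dp[m - 1][n - 1]

sample_grid = [[0, 0, 0],
               [0, 1, 0],
               [0, 0, 0]]
print(unique_paths(sample_grid))  # 2
```

The table fill is O(m·n) time and space, comfortably within the stated 2-second / 256 MB limits for m, n ≤ 100.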
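Since the generation written to `output.txt` mixes reasoning prose with code (per the Output Format section of the card), a small helper can pull out just the C++ implementation. This is a sketch that assumes the model wraps its solution in a ```cpp fence; that fence label is an assumption, not something the model card guarantees.

```python
import re

def extract_cpp(text):
    """Return the last ```cpp/```c++ fenced block in text, or None if absent.

    Assumes the model fences its C++ solution (an assumption about this
    adapter's output style); adjust the pattern if it uses another label.
    """
    blocks = re.findall(r"```(?:cpp|c\+\+)\s*\n(.*?)```", text, re.DOTALL)
    return blocks[-1].strip() if blocks else None

sample = (
    "Reasoning: fill a DP table over the grid...\n"
    "```cpp\n#include <iostream>\nint main() { return 0; }\n```\n"
)
print(extract_cpp(sample))  # prints the C++ source without the fences
```

Taking the last matching block favors the final implementation when the reasoning trace includes earlier partial snippets.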