Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

tokyotech-llm

Team

university

swallow-llm

Activity Feed Request to join this org

AI & ML interests

None defined yet.

tokyotech-llm 's collections 14

Apache-2.0 Open High Quality Math Corpus

tokyotech-llm/swallow-math-v2

Viewer • Updated Nov 6, 2025 • 17.4M • 5.11k • 25

Llama-3.1-Swallow-v0.5

tokyotech-llm/Llama-3.1-Swallow-8B-v0.5

8B • Updated Jul 1, 2025 • 4.02k • 9
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

Text Generation • 8B • Updated Jun 25, 2025 • 12.9k • • 18

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

tokyotech-llm/swallow-code

Viewer • Updated Jul 4, 2025 • 129M • 11k • 59
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0002500

Updated Jul 4, 2025 • 3
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0005000

8B • Updated Jul 4, 2025 • 2
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0007500

8B • Updated Jul 4, 2025 • 6

Llama-3.3-Swallow

tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4

Text Generation • 71B • Updated Jul 1, 2025 • 1.19k • • 12
tokyotech-llm/Llama-3.3-Swallow-70B-v0.4

Text Generation • 71B • Updated May 31, 2025 • 3.68k • 4
tokyotech-llm/edu-classifier

Text Classification • Updated Jan 30, 2025 • 196 • 13

Llama-3-Swallow

tokyotech-llm/Llama-3-Swallow-8B-v0.1

Text Generation • Updated Oct 8, 2024 • 284 • • 12
tokyotech-llm/Llama-3-Swallow-70B-v0.1

Text Generation • Updated Oct 8, 2024 • 12 • • 6
tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1

Text Generation • 8B • Updated Oct 8, 2024 • 15.9k • • 21
tokyotech-llm/Llama-3-Swallow-70B-Instruct-v0.1

Text Generation • 71B • Updated Oct 8, 2024 • 17 • • 7

Swallow-instruct

Swallow instruction tuning models

tokyotech-llm/Swallow-7b-instruct-hf

Text Generation • 7B • Updated Oct 8, 2024 • 4.22k • 44
tokyotech-llm/Swallow-13b-instruct-v0.1

Text Generation • 13B • Updated Oct 8, 2024 • 57 • 1
tokyotech-llm/Swallow-70b-instruct-v0.1

Text Generation • 69B • Updated Oct 8, 2024 • 3
tokyotech-llm/Swallow-7b-instruct-v0.1

Text Generation • 7B • Updated Oct 8, 2024 • 107 • 3

Swallow MX(Mixtral) models

tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1

Text Generation • 47B • Updated Aug 17, 2024 • 28 • 29

Apache-2.0 Open High Quality Code Corpus

tokyotech-llm/swallow-code-v2

Viewer • Updated Nov 8, 2025 • 147M • 184k • 28

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

tokyotech-llm/swallow-math

Viewer • Updated May 10, 2025 • 4.33M • 1.02k • 38
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0002500

8B • Updated May 7, 2025 • 3
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0005000

8B • Updated May 7, 2025 • 1
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0007500

8B • Updated May 7, 2025

Gemma-2-Swallow

tokyotech-llm/Gemma-2-Llama-Swallow-27b-pt-v0.1

Text Generation • 27B • Updated May 18, 2025 • 108 • 1
tokyotech-llm/Gemma-2-Llama-Swallow-9b-pt-v0.1

Text Generation • Updated May 18, 2025 • 2.1k • 1
tokyotech-llm/Gemma-2-Llama-Swallow-2b-pt-v0.1

Text Generation • Updated May 18, 2025 • 67
tokyotech-llm/Gemma-2-Llama-Swallow-2b-it-v0.1

Text Generation • Updated May 18, 2025 • 142 • 4

Llama-3.1-Swallow

tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

Text Generation • 8B • Updated Jun 25, 2025 • 12.9k • • 18
tokyotech-llm/Llama-3.1-Swallow-8B-v0.5

8B • Updated Jul 1, 2025 • 4.02k • 9
tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3

Text Generation • 71B • Updated Apr 2, 2025 • 419 • 14
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3

Text Generation • 8B • Updated Apr 2, 2025 • 3.46k • • 24

Continual Pre-Training from Llama 2

tokyotech-llm/Swallow-7b-hf

Text Generation • 7B • Updated Oct 8, 2024 • 619 • 17
tokyotech-llm/Swallow-7b-instruct-hf

Text Generation • 7B • Updated Oct 8, 2024 • 4.22k • 44
tokyotech-llm/Swallow-7b-instruct-v0.1

Text Generation • 7B • Updated Oct 8, 2024 • 107 • 3
tokyotech-llm/Swallow-7b-plus-hf

Text Generation • Updated Oct 8, 2024 • 1 • 8

Swallow MS/MX (Mistral/Mixtral) models

tokyotech-llm/Swallow-MS-7b-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 43 • 28
tokyotech-llm/Swallow-MS-7b-instruct-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 245 • 14
tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1

Text Generation • 47B • Updated Aug 17, 2024 • 28 • 29

Swallow-MS-instruct

tokyotech-llm/Swallow-MS-7b-instruct-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 245 • 14

Apache-2.0 Open High Quality Math Corpus

tokyotech-llm/swallow-math-v2

Viewer • Updated Nov 6, 2025 • 17.4M • 5.11k • 25

Apache-2.0 Open High Quality Code Corpus

tokyotech-llm/swallow-code-v2

Viewer • Updated Nov 8, 2025 • 147M • 184k • 28

Llama-3.1-Swallow-v0.5

tokyotech-llm/Llama-3.1-Swallow-8B-v0.5

8B • Updated Jul 1, 2025 • 4.02k • 9
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

Text Generation • 8B • Updated Jun 25, 2025 • 12.9k • • 18

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

tokyotech-llm/swallow-math

Viewer • Updated May 10, 2025 • 4.33M • 1.02k • 38
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0002500

8B • Updated May 7, 2025 • 3
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0005000

8B • Updated May 7, 2025 • 1
tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0007500

8B • Updated May 7, 2025

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

tokyotech-llm/swallow-code

Viewer • Updated Jul 4, 2025 • 129M • 11k • 59
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0002500

Updated Jul 4, 2025 • 3
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0005000

8B • Updated Jul 4, 2025 • 2
tokyotech-llm/Llama-3.1-8B-code-ablation-exp1-LR2.5e-5-MINLR2.5E-6-WD0.1-iter0007500

8B • Updated Jul 4, 2025 • 6

Gemma-2-Swallow

tokyotech-llm/Gemma-2-Llama-Swallow-27b-pt-v0.1

Text Generation • 27B • Updated May 18, 2025 • 108 • 1
tokyotech-llm/Gemma-2-Llama-Swallow-9b-pt-v0.1

Text Generation • Updated May 18, 2025 • 2.1k • 1
tokyotech-llm/Gemma-2-Llama-Swallow-2b-pt-v0.1

Text Generation • Updated May 18, 2025 • 67
tokyotech-llm/Gemma-2-Llama-Swallow-2b-it-v0.1

Text Generation • Updated May 18, 2025 • 142 • 4

Llama-3.3-Swallow

tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4

Text Generation • 71B • Updated Jul 1, 2025 • 1.19k • • 12
tokyotech-llm/Llama-3.3-Swallow-70B-v0.4

Text Generation • 71B • Updated May 31, 2025 • 3.68k • 4
tokyotech-llm/edu-classifier

Text Classification • Updated Jan 30, 2025 • 196 • 13

Llama-3.1-Swallow

tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

Text Generation • 8B • Updated Jun 25, 2025 • 12.9k • • 18
tokyotech-llm/Llama-3.1-Swallow-8B-v0.5

8B • Updated Jul 1, 2025 • 4.02k • 9
tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.3

Text Generation • 71B • Updated Apr 2, 2025 • 419 • 14
tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3

Text Generation • 8B • Updated Apr 2, 2025 • 3.46k • • 24

Llama-3-Swallow

tokyotech-llm/Llama-3-Swallow-8B-v0.1

Text Generation • Updated Oct 8, 2024 • 284 • • 12
tokyotech-llm/Llama-3-Swallow-70B-v0.1

Text Generation • Updated Oct 8, 2024 • 12 • • 6
tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1

Text Generation • 8B • Updated Oct 8, 2024 • 15.9k • • 21
tokyotech-llm/Llama-3-Swallow-70B-Instruct-v0.1

Text Generation • 71B • Updated Oct 8, 2024 • 17 • • 7

Continual Pre-Training from Llama 2

tokyotech-llm/Swallow-7b-hf

Text Generation • 7B • Updated Oct 8, 2024 • 619 • 17
tokyotech-llm/Swallow-7b-instruct-hf

Text Generation • 7B • Updated Oct 8, 2024 • 4.22k • 44
tokyotech-llm/Swallow-7b-instruct-v0.1

Text Generation • 7B • Updated Oct 8, 2024 • 107 • 3
tokyotech-llm/Swallow-7b-plus-hf

Text Generation • Updated Oct 8, 2024 • 1 • 8

Swallow-instruct

Swallow instruction tuning models

tokyotech-llm/Swallow-7b-instruct-hf

Text Generation • 7B • Updated Oct 8, 2024 • 4.22k • 44
tokyotech-llm/Swallow-13b-instruct-v0.1

Text Generation • 13B • Updated Oct 8, 2024 • 57 • 1
tokyotech-llm/Swallow-70b-instruct-v0.1

Text Generation • 69B • Updated Oct 8, 2024 • 3
tokyotech-llm/Swallow-7b-instruct-v0.1

Text Generation • 7B • Updated Oct 8, 2024 • 107 • 3

Swallow MS/MX (Mistral/Mixtral) models

tokyotech-llm/Swallow-MS-7b-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 43 • 28
tokyotech-llm/Swallow-MS-7b-instruct-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 245 • 14
tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1

Text Generation • 47B • Updated Aug 17, 2024 • 28 • 29

Swallow MX(Mixtral) models

tokyotech-llm/Swallow-MX-8x7b-NVE-v0.1

Text Generation • 47B • Updated Aug 17, 2024 • 28 • 29

Swallow-MS-instruct

tokyotech-llm/Swallow-MS-7b-instruct-v0.1

Text Generation • 7B • Updated Aug 17, 2024 • 245 • 14

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs