ModernBERT-base_en-tr_jobs

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unspecified dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the list):

  • Loss: 0.5874
  • Input tokens seen: 2,096,939,008 (≈2.1B)
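
The card does not state the training objective or task head. Since ModernBERT is an encoder and the reported losses look like language-modeling cross-entropy, the sketch below assumes a masked-language-modeling checkpoint; if the checkpoint carries a different head, swap the Auto class accordingly. The example sentence is illustrative only.

```python
# Minimal sketch: load the checkpoint for masked-LM inference.
# Assumption: the model was fine-tuned with an MLM objective
# (the card only reports a loss value, so this is a guess).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

repo_id = "avsolatorio/ModernBERT-base_en-tr_jobs"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForMaskedLM.from_pretrained(repo_id)
model.eval()

text = f"Looking for a senior {tokenizer.mask_token} engineer."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Take the top-5 predictions at the masked position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```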

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
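
A TrainingArguments sketch mirroring the values above; only these fields are grounded in the card, while everything else about the run (dataset, collator, objective) is undocumented. The output_dir name is illustrative.

```python
# Sketch only: reproduces the hyperparameters listed above.
# Dataset, data collator, and model head are not documented in the card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="ModernBERT-base_en-tr_jobs",  # illustrative name
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,  # ~3,200 of the ~32,000 total optimizer steps
    num_train_epochs=5,
)
```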

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 1.7876 | 0.0780 | 500 | 1.5507 | 32768000 |
| 1.4002 | 0.1559 | 1000 | 1.2782 | 65536000 |
| 1.1977 | 0.2339 | 1500 | 1.1295 | 98304000 |
| 1.0805 | 0.3118 | 2000 | 1.0334 | 131072000 |
| 1.0054 | 0.3898 | 2500 | 0.9656 | 163840000 |
| 0.9383 | 0.4677 | 3000 | 0.9139 | 196608000 |
| 0.8908 | 0.5457 | 3500 | 0.8726 | 229376000 |
| 0.8533 | 0.6236 | 4000 | 0.8391 | 262144000 |
| 0.8271 | 0.7016 | 4500 | 0.8152 | 294912000 |
| 0.8013 | 0.7795 | 5000 | 0.7912 | 327680000 |
| 0.7832 | 0.8575 | 5500 | 0.7777 | 360448000 |
| 0.7708 | 0.9355 | 6000 | 0.7603 | 393216000 |
| 0.7556 | 1.0134 | 6500 | 0.7493 | 425930752 |
| 0.7421 | 1.0914 | 7000 | 0.7356 | 458698752 |
| 0.7265 | 1.1693 | 7500 | 0.7273 | 491466752 |
| 0.7221 | 1.2473 | 8000 | 0.7175 | 524234752 |
| 0.7138 | 1.3252 | 8500 | 0.7069 | 557002752 |
| 0.7027 | 1.4032 | 9000 | 0.6992 | 589770752 |
| 0.6932 | 1.4811 | 9500 | 0.6904 | 622538752 |
| 0.6904 | 1.5591 | 10000 | 0.6859 | 655306752 |
| 0.6781 | 1.6370 | 10500 | 0.6802 | 688074752 |
| 0.6778 | 1.7150 | 11000 | 0.6757 | 720842752 |
| 0.6683 | 1.7930 | 11500 | 0.6693 | 753610752 |
| 0.6681 | 1.8709 | 12000 | 0.6666 | 786378752 |
| 0.6587 | 1.9489 | 12500 | 0.6604 | 819146752 |
| 0.6541 | 2.0268 | 13000 | 0.6542 | 851861504 |
| 0.6478 | 2.1048 | 13500 | 0.6518 | 884629504 |
| 0.6485 | 2.1827 | 14000 | 0.6468 | 917397504 |
| 0.6426 | 2.2607 | 14500 | 0.6437 | 950165504 |
| 0.6366 | 2.3386 | 15000 | 0.6407 | 982933504 |
| 0.6343 | 2.4166 | 15500 | 0.6366 | 1015701504 |
| 0.6300 | 2.4945 | 16000 | 0.6348 | 1048469504 |
| 0.6311 | 2.5725 | 16500 | 0.6315 | 1081237504 |
| 0.6260 | 2.6505 | 17000 | 0.6275 | 1114005504 |
| 0.6220 | 2.7284 | 17500 | 0.6250 | 1146773504 |
| 0.6224 | 2.8064 | 18000 | 0.6223 | 1179541504 |
| 0.6174 | 2.8843 | 18500 | 0.6195 | 1212309504 |
| 0.6110 | 2.9623 | 19000 | 0.6169 | 1245077504 |
| 0.6142 | 3.0402 | 19500 | 0.6174 | 1277792256 |
| 0.6074 | 3.1182 | 20000 | 0.6128 | 1310560256 |
| 0.6095 | 3.1961 | 20500 | 0.6109 | 1343328256 |
| 0.6037 | 3.2741 | 21000 | 0.6090 | 1376096256 |
| 0.6056 | 3.3520 | 21500 | 0.6073 | 1408864256 |
| 0.6008 | 3.4300 | 22000 | 0.6052 | 1441632256 |
| 0.6012 | 3.5080 | 22500 | 0.6050 | 1474400256 |
| 0.5967 | 3.5859 | 23000 | 0.6013 | 1507168256 |
| 0.5986 | 3.6639 | 23500 | 0.6013 | 1539936256 |
| 0.5990 | 3.7418 | 24000 | 0.5993 | 1572704256 |
| 0.5932 | 3.8198 | 24500 | 0.5992 | 1605472256 |
| 0.5910 | 3.8977 | 25000 | 0.5951 | 1638240256 |
| 0.5930 | 3.9757 | 25500 | 0.5960 | 1671008256 |
| 0.5897 | 4.0536 | 26000 | 0.5942 | 1703723008 |
| 0.5877 | 4.1316 | 26500 | 0.5928 | 1736491008 |
| 0.5892 | 4.2095 | 27000 | 0.5922 | 1769259008 |
| 0.5851 | 4.2875 | 27500 | 0.5918 | 1802027008 |
| 0.5838 | 4.3655 | 28000 | 0.5915 | 1834795008 |
| 0.5876 | 4.4434 | 28500 | 0.5900 | 1867563008 |
| 0.5797 | 4.5214 | 29000 | 0.5880 | 1900331008 |
| 0.5803 | 4.5993 | 29500 | 0.5897 | 1933099008 |
| 0.5833 | 4.6773 | 30000 | 0.5893 | 1965867008 |
| 0.5838 | 4.7552 | 30500 | 0.5864 | 1998635008 |
| 0.5846 | 4.8332 | 31000 | 0.5881 | 2031403008 |
| 0.5818 | 4.9111 | 31500 | 0.5866 | 2064171008 |
| 0.5826 | 4.9891 | 32000 | 0.5874 | 2096939008 |
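
If the reported losses are mean per-token cross-entropies (typical for Trainer language-modeling runs, though the card does not confirm it), the final validation loss corresponds to a perplexity of exp(0.5874) ≈ 1.80:

```python
# Assumption: eval loss is a mean per-token cross-entropy (not stated in the card).
import math

final_eval_loss = 0.5874
print(f"perplexity ≈ {math.exp(final_eval_loss):.3f}")  # ≈ 1.799
```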

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.0
  • Tokenizers 0.21.0
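
A small sketch for checking that a local environment matches these pins; note that the card's torch build (2.6.0+cu124) carries a CUDA-specific local version tag, so the comparison below strips it.

```python
# Compare installed package versions against the ones used for training.
from importlib.metadata import version

expected = {
    "transformers": "4.48.3",
    "torch": "2.6.0",  # card lists 2.6.0+cu124; "+cu124" is a CUDA build tag
    "datasets": "3.3.0",
    "tokenizers": "0.21.0",
}
for pkg, want in expected.items():
    have = version(pkg)
    status = "OK" if have.split("+")[0] == want else "MISMATCH"
    print(f"{pkg}: installed {have}, trained with {want} -> {status}")
```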