ModernBERT-base_en-tr_jobs

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unspecified dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the list):

  • Loss: 0.5874
  • Input tokens seen: 2,096,939,008 (≈2.1B)
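
The card does not state the training objective or task head. Since ModernBERT is an encoder and the reported losses look like language-modeling cross-entropy, the sketch below assumes a masked-language-modeling checkpoint; if the checkpoint carries a different head, swap the Auto class accordingly. The example sentence is illustrative only.

```python
# Minimal sketch: load the checkpoint for masked-LM inference.
# Assumption: the model was fine-tuned with an MLM objective
# (the card only reports a loss value, so this is a guess).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

repo_id = "avsolatorio/ModernBERT-base_en-tr_jobs"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForMaskedLM.from_pretrained(repo_id)
model.eval()

text = f"Looking for a senior {tokenizer.mask_token} engineer."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Take the top-5 predictions at the masked position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```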

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
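
A TrainingArguments sketch mirroring the values above; only these fields are grounded in the card, while everything else about the run (dataset, collator, objective) is undocumented. The output_dir name is illustrative.

```python
# Sketch only: reproduces the hyperparameters listed above.
# Dataset, data collator, and model head are not documented in the card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="ModernBERT-base_en-tr_jobs",  # illustrative name
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,  # ~3,200 of the ~32,000 total optimizer steps
    num_train_epochs=5,
)
```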

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 1.7876 | 0.0780 | 500 | 1.5507 | 32768000 |
| 1.4002 | 0.1559 | 1000 | 1.2782 | 65536000 |
| 1.1977 | 0.2339 | 1500 | 1.1295 | 98304000 |
| 1.0805 | 0.3118 | 2000 | 1.0334 | 131072000 |
| 1.0054 | 0.3898 | 2500 | 0.9656 | 163840000 |
| 0.9383 | 0.4677 | 3000 | 0.9139 | 196608000 |
| 0.8908 | 0.5457 | 3500 | 0.8726 | 229376000 |
| 0.8533 | 0.6236 | 4000 | 0.8391 | 262144000 |
| 0.8271 | 0.7016 | 4500 | 0.8152 | 294912000 |
| 0.8013 | 0.7795 | 5000 | 0.7912 | 327680000 |
| 0.7832 | 0.8575 | 5500 | 0.7777 | 360448000 |
| 0.7708 | 0.9355 | 6000 | 0.7603 | 393216000 |
| 0.7556 | 1.0134 | 6500 | 0.7493 | 425930752 |
| 0.7421 | 1.0914 | 7000 | 0.7356 | 458698752 |
| 0.7265 | 1.1693 | 7500 | 0.7273 | 491466752 |
| 0.7221 | 1.2473 | 8000 | 0.7175 | 524234752 |
| 0.7138 | 1.3252 | 8500 | 0.7069 | 557002752 |
| 0.7027 | 1.4032 | 9000 | 0.6992 | 589770752 |
| 0.6932 | 1.4811 | 9500 | 0.6904 | 622538752 |
| 0.6904 | 1.5591 | 10000 | 0.6859 | 655306752 |
| 0.6781 | 1.6370 | 10500 | 0.6802 | 688074752 |
| 0.6778 | 1.7150 | 11000 | 0.6757 | 720842752 |
| 0.6683 | 1.7930 | 11500 | 0.6693 | 753610752 |
| 0.6681 | 1.8709 | 12000 | 0.6666 | 786378752 |
| 0.6587 | 1.9489 | 12500 | 0.6604 | 819146752 |
| 0.6541 | 2.0268 | 13000 | 0.6542 | 851861504 |
| 0.6478 | 2.1048 | 13500 | 0.6518 | 884629504 |
| 0.6485 | 2.1827 | 14000 | 0.6468 | 917397504 |
| 0.6426 | 2.2607 | 14500 | 0.6437 | 950165504 |
| 0.6366 | 2.3386 | 15000 | 0.6407 | 982933504 |
| 0.6343 | 2.4166 | 15500 | 0.6366 | 1015701504 |
| 0.6300 | 2.4945 | 16000 | 0.6348 | 1048469504 |
| 0.6311 | 2.5725 | 16500 | 0.6315 | 1081237504 |
| 0.6260 | 2.6505 | 17000 | 0.6275 | 1114005504 |
| 0.6220 | 2.7284 | 17500 | 0.6250 | 1146773504 |
| 0.6224 | 2.8064 | 18000 | 0.6223 | 1179541504 |
| 0.6174 | 2.8843 | 18500 | 0.6195 | 1212309504 |
| 0.6110 | 2.9623 | 19000 | 0.6169 | 1245077504 |
| 0.6142 | 3.0402 | 19500 | 0.6174 | 1277792256 |
| 0.6074 | 3.1182 | 20000 | 0.6128 | 1310560256 |
| 0.6095 | 3.1961 | 20500 | 0.6109 | 1343328256 |
| 0.6037 | 3.2741 | 21000 | 0.6090 | 1376096256 |
| 0.6056 | 3.3520 | 21500 | 0.6073 | 1408864256 |
| 0.6008 | 3.4300 | 22000 | 0.6052 | 1441632256 |
| 0.6012 | 3.5080 | 22500 | 0.6050 | 1474400256 |
| 0.5967 | 3.5859 | 23000 | 0.6013 | 1507168256 |
| 0.5986 | 3.6639 | 23500 | 0.6013 | 1539936256 |
| 0.5990 | 3.7418 | 24000 | 0.5993 | 1572704256 |
| 0.5932 | 3.8198 | 24500 | 0.5992 | 1605472256 |
| 0.5910 | 3.8977 | 25000 | 0.5951 | 1638240256 |
| 0.5930 | 3.9757 | 25500 | 0.5960 | 1671008256 |
| 0.5897 | 4.0536 | 26000 | 0.5942 | 1703723008 |
| 0.5877 | 4.1316 | 26500 | 0.5928 | 1736491008 |
| 0.5892 | 4.2095 | 27000 | 0.5922 | 1769259008 |
| 0.5851 | 4.2875 | 27500 | 0.5918 | 1802027008 |
| 0.5838 | 4.3655 | 28000 | 0.5915 | 1834795008 |
| 0.5876 | 4.4434 | 28500 | 0.5900 | 1867563008 |
| 0.5797 | 4.5214 | 29000 | 0.5880 | 1900331008 |
| 0.5803 | 4.5993 | 29500 | 0.5897 | 1933099008 |
| 0.5833 | 4.6773 | 30000 | 0.5893 | 1965867008 |
| 0.5838 | 4.7552 | 30500 | 0.5864 | 1998635008 |
| 0.5846 | 4.8332 | 31000 | 0.5881 | 2031403008 |
| 0.5818 | 4.9111 | 31500 | 0.5866 | 2064171008 |
| 0.5826 | 4.9891 | 32000 | 0.5874 | 2096939008 |
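
If the reported losses are mean per-token cross-entropies (typical for Trainer language-modeling runs, though the card does not confirm it), the final validation loss corresponds to a perplexity of exp(0.5874) ≈ 1.80:

```python
# Assumption: eval loss is a mean per-token cross-entropy (not stated in the card).
import math

final_eval_loss = 0.5874
print(f"perplexity ≈ {math.exp(final_eval_loss):.3f}")  # ≈ 1.799
```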

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.0
  • Tokenizers 0.21.0
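
A small sketch for checking that a local environment matches these pins; note that the card's torch build (2.6.0+cu124) carries a CUDA-specific local version tag, so the comparison below strips it.

```python
# Compare installed package versions against the ones used for training.
from importlib.metadata import version

expected = {
    "transformers": "4.48.3",
    "torch": "2.6.0",  # card lists 2.6.0+cu124; "+cu124" is a CUDA build tag
    "datasets": "3.3.0",
    "tokenizers": "0.21.0",
}
for pkg, want in expected.items():
    have = version(pkg)
    status = "OK" if have.split("+")[0] == want else "MISMATCH"
    print(f"{pkg}: installed {have}, trained with {want} -> {status}")
```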