context size reporting inconsistency

#12
by amew0 - opened

The README.md says that this model has a context size of 32768 tokens, but tokenizer_config.json shows "model_max_length": 131072.
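For anyone who wants to check this on a local copy of the repo, here is a minimal sketch. It assumes tokenizer_config.json has been downloaded to disk; the helper names `read_model_max_length` and `compare_context_sizes` are made up for illustration and are not part of any library. It only reads the JSON file and compares the number against the one quoted from the README.

```python
import json
from pathlib import Path


def read_model_max_length(tokenizer_config_path):
    """Return the "model_max_length" value from a tokenizer_config.json file, or None if unset."""
    with Path(tokenizer_config_path).open(encoding="utf-8") as f:
        config = json.load(f)
    return config.get("model_max_length")


def compare_context_sizes(documented, configured):
    """Describe how the documented context size relates to the configured one."""
    if configured is None:
        return "tokenizer_config.json does not set model_max_length"
    if documented == configured:
        return "consistent"
    return f"mismatch: README says {documented}, tokenizer_config.json says {configured}"


# Values quoted in this thread; replace the second argument with
# read_model_max_length("path/to/tokenizer_config.json") for a local copy.
print(compare_context_sizes(32768, 131072))
```

This only shows that the two numbers differ. It does not say which one reflects the context length the model actually supports.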

The inconsistency in the reported context size may stem from how different input formats and prompt templates are handled. Has anyone else noticed differences in response length or stability when switching between cold and warm cache states? It would help to collect specific scenarios where the discrepancy shows up.
