whoisjones's Collections
MastermindEval
Updated Mar 7, 2025
Evaluating the reasoning capabilities of LLMs using the game of Mastermind (paper forthcoming).
flair/mastermind_35_mcq_random • Updated Mar 12, 2025 • 37.1k • 116
flair/mastermind_46_mcq_random • Updated Mar 12, 2025 • 36.1k • 92
flair/mastermind_46_mcq_close • Updated Mar 12, 2025 • 36.1k • 92
flair/mastermind_24_mcq_random • Updated Mar 12, 2025 • 30.4k • 95
flair/mastermind_24_mcq_close • Updated Mar 12, 2025 • 30.4k • 100
flair/mastermind_35_mcq_close • Updated May 29, 2025 • 37.1k • 124