whoisjones's Collections
MastermindEval

Updated Mar 7, 2025

Evaluating the reasoning capabilities of LLMs using the game of Mastermind (paper forthcoming).

  • flair/mastermind_35_mcq_random

    Viewer • Updated Mar 12, 2025 • 37.1k • 116

  • flair/mastermind_46_mcq_random

    Viewer • Updated Mar 12, 2025 • 36.1k • 92

  • flair/mastermind_46_mcq_close

    Viewer • Updated Mar 12, 2025 • 36.1k • 92

  • flair/mastermind_24_mcq_random

    Viewer • Updated Mar 12, 2025 • 30.4k • 95

  • flair/mastermind_24_mcq_close

    Viewer • Updated Mar 12, 2025 • 30.4k • 100

  • flair/mastermind_35_mcq_close

    Viewer • Updated May 29, 2025 • 37.1k • 124