Artificial Intelligence

Out of context: Reply #1632

  • 2,578 Responses
  • ********
    -2

    Models in Aidan Bench

    "Aidan Bench
    Some models feel competent despite under-scoring on benchmarks like MMLU, GPQA, MATH, or NIAH.

    Aidan Bench rewards:
    - Creativity
    - Reliability
    - Contextual attention
    - Instruction following
    Aidan Bench is weakly correlated with Lmsys, has no score ceiling, and aligns with real-world open-ended use."
