Artificial Intelligence
Out of context: Reply #1632
- 2,578 Responses
Models in Aidanbench
"Aidan Bench
Some models feel competent despite under-scoring on benchmarks like MMLU, GPQA, MATH, or NIAH. Aidan Bench rewards:
- Creativity
- Reliability
- Contextual attention
- Instruction following
Aidan Bench is weakly correlated with Lmsys, has no score ceiling, and aligns with real-world open-ended use."
