13
GPT MT Benchmark
cabreraalex
11
TruthfulQA
a13x
TruthfulQA (https://arxiv.org/abs/2109.07958) task in the Open-LLM-Leaderboard.
9
HellaSwag
HellaSwag (https://arxiv.org/abs/1905.07830) task in the Open-LLM-Leaderboard.
MMLU
MMLU (https://arxiv.org/abs/2009.03300) tasks in the Open-LLM-Leaderboard.