GPT MT Benchmark
The TruthfulQA (https://arxiv.org/abs/2109.07958) task in the Open LLM Leaderboard.
What does the Open LLM Leaderboard measure?
An investigation of the Open LLM Leaderboard and why you should double-check before using the top-ranked model.
GPT MT Benchmark Report
Explore how LLMs compare with dedicated translation models, particularly for low-resource languages.