MMLU
Massive Multitask Language Understanding
Dataset per il testing dell'accuratezza delle informazioni presenti all'interno del modello di linguaggio. E un test a risposta multipla.
Esempi di domande:
One of the reasons that the government discourages and regulates monopolies is that (A) producer surplus is lost and consumer surplus is gained. (B) monopoly prices ensure productive efficiency but cost society allocative efficiency. (C) monopoly firms do not engage in significant research and development. (D) consumer surplus is lost with higher prices and lower levels of output.
When you drop a ball from rest it accelerates downward at 9.8 m/s2. If you instead throw it downward assuming no air resistance its acceleration immediately after leaving your hand is (A) 9.8 m/s2 (B) more than 9.8 m/s2 (C) less than 9.8 m/s2 (D) Cannot say unless the speed of throw is given.