MMLU

From Wiki AI.
Revision as of 27 Feb 2024, 15:13, by Alesaccoia.

Massive Multitask Language Understanding (https://arxiv.org/pdf/2009.03300.pdf)

A dataset for testing how accurate the knowledge stored in a language model is. It is a multiple-choice test: each question comes with four candidate answers, labeled (A) to (D).

Example questions:

One of the reasons that the government discourages and regulates monopolies is that (A) producer surplus is lost and consumer surplus is gained. (B) monopoly prices ensure productive efficiency but cost society allocative efficiency. (C) monopoly firms do not engage in significant research and development. (D) consumer surplus is lost with higher prices and lower levels of output.

When you drop a ball from rest it accelerates downward at 9.8 m/s². If you instead throw it downward assuming no air resistance its acceleration immediately after leaving your hand is (A) 9.8 m/s² (B) more than 9.8 m/s² (C) less than 9.8 m/s² (D) Cannot say unless the speed of throw is given.
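
As a rough illustration of how a multiple-choice test like this can be scored, the sketch below formats a question as a prompt and computes accuracy as the fraction of predicted letters that match the reference letters. The names `Question`, `format_prompt`, and `accuracy` are hypothetical and not taken from any MMLU tooling. The page does not mark the correct answers, so the reference letters used here, (D) for the monopoly question and (A) for the ball question, are an assumption based on standard textbook reasoning.

```python
from dataclasses import dataclass

LETTERS = "ABCD"  # the four option labels used in the examples above


@dataclass
class Question:
    """Hypothetical container for one multiple-choice item."""
    stem: str
    options: list[str]  # expected to hold exactly four options
    answer: str         # reference letter, one of "A".."D"


def format_prompt(q: Question) -> str:
    """Render the question stem followed by its lettered options."""
    lines = [q.stem]
    for letter, option in zip(LETTERS, q.options):
        lines.append(f"({letter}) {option}")
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(questions: list[Question], predictions: list[str]) -> float:
    """Fraction of predictions whose letter matches the reference letter."""
    if not questions or len(questions) != len(predictions):
        raise ValueError("need one prediction per question, and at least one question")
    correct = sum(
        pred.strip().upper() == q.answer
        for q, pred in zip(questions, predictions)
    )
    return correct / len(questions)


questions = [
    Question(
        stem="One of the reasons that the government discourages and "
             "regulates monopolies is that",
        options=[
            "producer surplus is lost and consumer surplus is gained.",
            "monopoly prices ensure productive efficiency but cost society "
            "allocative efficiency.",
            "monopoly firms do not engage in significant research and development.",
            "consumer surplus is lost with higher prices and lower levels of output.",
        ],
        answer="D",  # assumption: not marked on this page
    ),
    Question(
        stem="When you drop a ball from rest it accelerates downward at "
             "9.8 m/s^2. If you instead throw it downward assuming no air "
             "resistance its acceleration immediately after leaving your hand is",
        options=[
            "9.8 m/s^2",
            "more than 9.8 m/s^2",
            "less than 9.8 m/s^2",
            "Cannot say unless the speed of throw is given.",
        ],
        answer="A",  # assumption: not marked on this page
    ),
]

# Made-up model outputs, used only to exercise the scoring function.
example_predictions = ["D", "B"]
example_score = accuracy(questions, example_predictions)
```

With the made-up predictions above, one of the two letters matches, so `example_score` is 0.5. In practice the predictions would come from the model being evaluated, given the text returned by `format_prompt`.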

Benchmark