Llama: differences between revisions

Revision as of 16:59, 27 May 2024

Llama
English Name	Large Language Model Meta AI
Abbreviation	LLaMA
Year Of Creation	2021
Current Version	3.0
URL	https://llama.meta.com
Publication	LLaMA: Open and Efficient Foundation Language Models
Publication URL

The first version was released with 7B, 13B, 33B, and 65B parameters.

The second version, released in July 2023, was trained on a new mix of publicly available data, 40% larger than the one used for the first version. The context length was doubled, and a new attention variant, grouped-query attention, was adopted. Models with 7B, 13B, and 70B parameters were released, along with Llama 2-Chat, a fine-tuned version optimized for conversational use cases.
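As a rough illustration of the idea behind grouped-query attention, the sketch below shows how several query heads can share one key/value head, and how that shrinks the key/value cache. It is a minimal sketch, not Meta's implementation: the function names `kv_head_for_query_head` and `kv_cache_bytes` are hypothetical, and the head counts, layer count, and sizes are small made-up values rather than the configuration of any released Llama model.

```python
def kv_head_for_query_head(query_head: int, n_query_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head shared by the given query head."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    # Consecutive query heads are grouped; each group reads one key/value head.
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """Estimate the key/value cache size for one sequence."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Hypothetical toy configuration (assumed values, for illustration only).
N_QUERY_HEADS = 8
N_KV_HEADS = 2

# Which key/value head each query head attends with.
mapping = [kv_head_for_query_head(h, N_QUERY_HEADS, N_KV_HEADS)
           for h in range(N_QUERY_HEADS)]

# Cache size with one key/value head per query head versus grouped heads.
full_cache = kv_cache_bytes(n_layers=4, n_kv_heads=N_QUERY_HEADS,
                            head_dim=64, context_len=4096)
grouped_cache = kv_cache_bytes(n_layers=4, n_kv_heads=N_KV_HEADS,
                               head_dim=64, context_len=4096)

print(mapping)
print(full_cache // grouped_cache)
```

In this toy setup, sharing each key/value head among four query heads cuts the cache by the same factor of four; setting `n_kv_heads` equal to `n_query_heads` recovers ordinary multi-head attention.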

Motivation for development

"The capabilities of LLMs are remarkable considering the seemingly straightforward nature of the training methodology. Auto-regressive transformers are pretrained on an extensive corpus of self-supervised data, followed by alignment with human preferences via techniques such as Reinforcement Learning with Human Feedback (RLHF). Although the training methodology is simple, high computational requirements have limited the development of LLMs to a few players. There have been public releases of pretrained LLMs (such as BLOOM (Scao et al., 2022), LLaMa-1 (Touvron et al., 2023), and Falcon (Penedo et al., 2023)) that match the performance of closed pretrained competitors like GPT-3 (Brown et al., 2020) and Chinchilla (Hoffmann et al., 2022), but none of these models are suitable substitutes for closed “product” LLMs, such as ChatGPT, BARD, and Claude. These closed product LLMs are heavily fine-tuned to align with human preferences, which greatly enhances their usability and safety. This step can require significant costs in compute and human annotation, and is often not transparent or easily reproducible, limiting progress within the community to advance AI alignment research." from "Llama 2: Open Foundation and Fine-Tuned Chat Models"

Links

Tutorial

Prompt Engineering With Llama 2

@@ Line 1: / Line 1: @@
-Name: [[Nome::Large Language Model Meta AI]]
+{{Template modello
+|NomeInglese=Large Language Model Meta AI
-Abbreviation: [[Sigla::LLaMA]]
+|Sigla=LLaMA
+|AnnoDiCreazione=2021
-Year of creation: [[AnnoDiCreazione::2023]]
+|VersioneCorrente=3.0
+|URL=https://llama.meta.com
-Current version: [[VersioneCorrente::LLaMa 2]]
+|Pubblicazione=LLaMA: Open and Efficient Foundation Language Models
+}}
-Year of creation of the current version: [[AnnoDiCreazioneVersioneCorrente::2023]]
-URLHomePage: [https://llama.meta.com HomePage Llama]
-Publication: [[Pubblicazione::LLaMA: Open and Efficient Foundation Language Models]]
-Based on: [[BasatoSu::Transformer (Architettura di Deep Learning)]]
-__SHOWFACTBOX__
 The first version was released with 7B, 13B, 33B, and 65B parameters.
@@ Line 25: / Line 16: @@
 "The capabilities of LLMs are remarkable considering the seemingly straightforward nature of the training methodology. Auto-regressive transformers are pretrained on an extensive corpus of self-supervised data, followed by alignment with human preferences via techniques such as Reinforcement Learning with Human Feedback (RLHF). Although the training methodology is simple, high computational requirements have limited the development of LLMs to a few players. There have been public releases of pretrained LLMs (such as BLOOM (Scao et al., 2022), LLaMa-1 (Touvron et al., 2023), and Falcon (Penedo et al., 2023)) that match the performance of closed pretrained competitors like GPT-3 (Brown et al., 2020) and Chinchilla (Hoffmann et al., 2022), but none of these models are suitable substitutes for closed “product” LLMs, such as ChatGPT, BARD, and Claude. These closed product LLMs are heavily fine-tuned to align with human preferences, which greatly enhances their usability and safety. This step can require significant costs in compute and human annotation, and is often not transparent or easily reproducible, limiting progress within the community to advance AI alignment research." from "Llama 2: Open Foundation and Fine-Tuned Chat Models"