Generating Sequences With Recurrent Neural Networks (2014)

{{template pubblicazione
|data=2013
|autori=Alex Graves
|URL=https://www.semanticscholar.org/paper/6471fd1cbc081fb3b7b5b14d6ab9eaaba02b5c17
|topic=Reti neurali ricorrenti
|citazioni=3857
}}
 
Paper by [[Alexander Graves]] on autoregressive [[Modello Generativo|generative models]] of text and handwriting sequences. In this paper the author runs into alignment problems between parts of the input that lie very far apart, problems later addressed by [[Attention (Machine Learning)|attention]] and the [[transformer]] architecture.<blockquote>In principle a large enough RNN should be sufficient to generate sequences of arbitrary complexity. In practice however, standard RNNs are unable to store information about past inputs for very long [15]. As well as diminishing their ability to model long-range structure, this ‘amnesia’ makes them prone to instability when generating sequences. The problem (common to all conditional generative models) is that if the network’s predictions are only based on the last few inputs, and these inputs were themselves predicted by the network, it has little opportunity to recover from past mistakes.</blockquote>
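
The quoted passage describes the core autoregressive loop: each symbol the network emits is fed back in as the next input, so once the hidden state has forgotten the more distant past, prediction errors compound. The sketch below is only an illustration of that loop, not the paper's model (Graves uses stacked LSTMs with skip connections and, for handwriting, a mixture-density output layer); <code>vocab_size</code>, <code>hidden_size</code> and the start token are hypothetical values.
<syntaxhighlight lang="python">
# Minimal sketch, assuming PyTorch: an untrained single-layer LSTM sampling
# autoregressively, i.e. each sampled symbol is fed back as the next input.
# Sizes and the start token are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

vocab_size, hidden_size = 64, 128          # hypothetical sizes

embed = nn.Embedding(vocab_size, hidden_size)
rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
head = nn.Linear(hidden_size, vocab_size)  # logits over the next symbol

def sample(start_token: int, length: int = 20):
    """Generate `length` symbols, feeding each prediction back as input."""
    token = torch.tensor([[start_token]])  # shape (batch=1, seq=1)
    state = None                           # (h, c): the network's only memory
    generated = [start_token]
    for _ in range(length):
        x = embed(token)                   # (1, 1, hidden_size)
        y, state = rnn(x, state)           # one step; hidden state carried over
        probs = torch.softmax(head(y[:, -1, :]), dim=-1)
        token = torch.multinomial(probs, num_samples=1)  # sample next symbol
        generated.append(token.item())
    return generated

print(sample(start_token=0))
</syntaxhighlight>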


=== Links ===
https://arxiv.org/pdf/1308.0850.pdf
[[Category:pubblicazione]]
