Generating Sequences With Recurrent Neural Networks (2014)
| Generating Sequences With Recurrent Neural Networks (2014) | |
| --- | --- |
| Date | 2013 |
| Authors | Alex Graves |
| URL | https://www.semanticscholar.org/paper/6471fd1cbc081fb3b7b5b14d6ab9eaaba02b5c17 |
| Topic | Recurrent neural networks |
| Citations | 3857 |
Paper by Alex Graves on autoregressive generative models of sequences of text and handwriting. In this paper the author runs into alignment problems between parts of the input that lie far apart, problems that would later be addressed by attention and the transformer architecture.
In principle a large enough RNN should be sufficient to generate sequences of arbitrary complexity. In practice however, standard RNNs are unable to store information about past inputs for very long [15]. As well as diminishing their ability to model long-range structure, this ‘amnesia’ makes them prone to instability when generating sequences. The problem (common to all conditional generative models) is that if the network’s predictions are only based on the last few inputs, and these inputs were themselves predicted by the network, it has little opportunity to recover from past mistakes.
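The failure mode described in this passage is easiest to see in the sampling loop itself: each generated token is fed back in as the next input, so an early mistake changes the conditioning for everything that follows. Below is a minimal, hypothetical PyTorch sketch of that loop with a single character-level LSTM; the names (`CharRNN`, `sample`) and the architecture details are illustrative assumptions, not the paper's exact setup (the paper uses deeper LSTM stacks with skip connections and, for handwriting, mixture density outputs).

```python
import torch
import torch.nn as nn


class CharRNN(nn.Module):
    """Minimal character-level LSTM: at each step it predicts a
    distribution over the next character given all previous ones."""

    def __init__(self, vocab_size, hidden_size=256, num_layers=1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)            # (batch, time, hidden)
        out, state = self.lstm(x, state)  # carry the recurrent state forward
        return self.head(out), state      # logits over the next character


@torch.no_grad()
def sample(model, start_token, steps, temperature=1.0):
    """Autoregressive generation: each sampled character is fed back
    as the next input, so early mistakes compound over time."""
    model.eval()
    token = torch.tensor([[start_token]])
    state, generated = None, [start_token]
    for _ in range(steps):
        logits, state = model(token, state)
        probs = torch.softmax(logits[:, -1] / temperature, dim=-1)
        token = torch.multinomial(probs, num_samples=1)
        generated.append(token.item())
    return generated


# Usage: the weights are untrained here, so the output is random noise,
# but the loop shows the mechanism the quoted passage is about.
model = CharRNN(vocab_size=128)
print(sample(model, start_token=0, steps=20))
```

With untrained weights the output is gibberish, but the loop makes the quoted point concrete: the network conditions only on its own previous samples, so a single bad sample degrades every later prediction, which is why long-range memory (and, later, attention) matters for stable generation.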