Generating Sequences With Recurrent Neural Networks (2013)

From Wiki AI.
Date: 2013
Authors: Alex Graves
URL: https://www.semanticscholar.org/paper/6471fd1cbc081fb3b7b5b14d6ab9eaaba02b5c17
Topic: Recurrent neural networks
Citations: 3857


Paper by Alex Graves on autoregressive generative models of text and handwriting sequences. In this paper the author runs up against the problem of aligning parts of the input that lie far apart, a problem that would later be solved by attention and by the transformer architecture.

In principle a large enough RNN should be sufficient to generate sequences of arbitrary complexity. In practice however, standard RNNs are unable to store information about past inputs for very long [15]. As well as diminishing their ability to model long-range structure, this ‘amnesia’ makes them prone to instability when generating sequences. The problem (common to all conditional generative models) is that if the network’s predictions are only based on the last few inputs, and these inputs were themselves predicted by the network, it has little opportunity to recover from past mistakes.

Links

https://arxiv.org/pdf/1308.0850.pdf