Recurrent Neural Networks (RNNs) are powerful neural network architectures for modeling sequences. LSTM (Long Short-Term Memory) based RNNs are particularly effective at capturing long-term dependencies in sequences, and even a bare-bones sequence-to-sequence (encoder-decoder) architecture performs well on tasks such as machine translation.
A typical sequence-to-sequence architecture consists of an encoder RNN and a decoder RNN. The encoder processes a source sequence and compresses it into a fixed-length vector, the context, which is usually the final state of the encoder RNN. The decoder then generates a target sequence token by token, conditioned on that context. Consider the following example, in which a source sequence in English is mapped to a target sequence in French.
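To make the encoder-decoder setup concrete, here is a minimal sketch in PyTorch. The class names, vocabulary sizes, and dimensions are illustrative assumptions, not code from this article; the point is only that the encoder's final state acts as the fixed-length context on which the decoder is conditioned.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, src_vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(src_vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        embedded = self.embedding(src)            # (batch, src_len, emb_dim)
        outputs, (h, c) = self.rnn(embedded)
        # The final LSTM state (h, c) serves as the fixed-length context.
        return h, c

class Decoder(nn.Module):
    def __init__(self, tgt_vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(tgt_vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab_size)

    def forward(self, tgt, context):              # tgt: (batch, tgt_len)
        embedded = self.embedding(tgt)
        # Initialize the decoder RNN with the encoder's context.
        outputs, _ = self.rnn(embedded, context)
        return self.out(outputs)                  # logits over target vocab

# Toy usage with random data and teacher forcing: encode the "English"
# source, then decode the "French" target conditioned on the context.
encoder = Encoder(src_vocab_size=1000)
decoder = Decoder(tgt_vocab_size=1200)
src = torch.randint(0, 1000, (2, 7))              # batch of 2 source sequences
tgt = torch.randint(0, 1200, (2, 9))              # batch of 2 target sequences
context = encoder(src)
logits = decoder(tgt, context)
print(logits.shape)                               # torch.Size([2, 9, 1200])
```

Note that in this vanilla setup the context has the same size regardless of how long the source sequence is, which is exactly the bottleneck that the attention mechanism discussed below addresses.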
Attention Mechanism: Benefits and Applications