A high-level introduction to the decoder part of the Transformer architecture: what is it, and when should you use it?

This video is part of the Hugging Face course: http://huggingface.co/course

Related videos:
– The Transformer architecture: https://youtu.be/H39Z_720T5s
– Encoder models: https://youtu.be/MUqNwgPjJvQ
– Encoder-Decoder models: https://youtu.be/0_4KEb08xrE

To understand what happens inside the Transformer network on a deeper level, we recommend the following blog posts by Jay Alammar:
– The Illustrated Transformer: https://jalammar.github.io/illustrated-transformer/
– The Illustrated GPT-2: https://jalammar.github.io/illustrated-gpt2/
– Understanding Attention: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

For a code-oriented perspective, we also recommend the following post:
– The Annotated Transformer, by Harvard NLP: https://nlp.seas.harvard.edu/2018/04/03/attention.html

Have a question? Check out the forums: https://discuss.huggingface.co/c/course/20
Subscribe to our newsletter: https://huggingface.curated.co/
