A general high-level introduction to the Transformer architecture.

This video is part of the Hugging Face course: http://huggingface.co/course

Related videos:
– Encoder models: https://youtu.be/MUqNwgPjJvQ
– Decoder models: https://youtu.be/d_ixlCubqQw
– Encoder-Decoder models: https://youtu.be/0_4KEb08xrE

To understand what happens inside the Transformer network on a deeper level, we recommend the following blog posts by Jay Alammar:
– The Illustrated Transformer: https://jalammar.github.io/illustrated-transformer/
– The Illustrated GPT-2: https://jalammar.github.io/illustrated-gpt2/
– Understanding Attention: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

Furthermore, for a code-oriented perspective, we recommend taking a look at the following post:
– The Annotated Transformer, by Harvard NLP: https://nlp.seas.harvard.edu/2018/04/03/attention.html

Have a question? Check out the forums: https://discuss.huggingface.co/c/course/20
Subscribe to our newsletter: https://huggingface.curated.co/
