Transformer models: Decoders

April 20, 2022

A general, high-level introduction to the Decoder part of the Transformer architecture: what it is and when you should use it.

This video is part of the Hugging Face course: http://huggingface.co/course

Related videos:
– The Transformer architecture: https://youtu.be/H39Z_720T5s
– Encoder models: https://youtu.be/MUqNwgPjJvQ
– Encoder-Decoder models: https://youtu.be/0_4KEb08xrE

To understand what happens inside the Transformer network on a deeper level, we recommend the following blog posts by Jay Alammar:
– The Illustrated Transformer: https://jalammar.github.io/illustrated-transformer/
– The Illustrated GPT-2: https://jalammar.github.io/illustrated-gpt2/
– Understanding Attention: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

Furthermore, for a code-oriented perspective, we recommend taking a look at the following post:
– The Annotated Transformer, by Harvard NLP: https://nlp.seas.harvard.edu/2018/04/03/attention.html

Have a question? Check out the forums: https://discuss.huggingface.co/c/course/20
Subscribe to our newsletter: https://huggingface.curated.co/
