The Transformer architecture

April 20, 2022

A general high-level introduction to the Transformer architecture.

This video is part of the Hugging Face course: http://huggingface.co/course

Related videos:
– Encoder models: https://youtu.be/MUqNwgPjJvQ
– Decoder models: https://youtu.be/d_ixlCubqQw
– Encoder-Decoder models: https://youtu.be/0_4KEb08xrE

To understand what happens inside the Transformer network on a deeper level, we recommend the following blog posts by Jay Alammar:
– The Illustrated Transformer: https://jalammar.github.io/illustrated-transformer/
– The Illustrated GPT-2: https://jalammar.github.io/illustrated-gpt2/
– Understanding Attention: https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

Furthermore, for a code-oriented perspective, we recommend taking a look at the following post:
– The Annotated Transformer, by Harvard NLP: https://nlp.seas.harvard.edu/2018/04/03/attention.html

Have a question? Check out the forums: https://discuss.huggingface.co/c/course/20
Subscribe to our newsletter: https://huggingface.curated.co/
