In episode 7, we’ll tackle two important pieces of BERT’s internal architecture: (1) the “Feed Forward Network”, which makes up the second half of the Encoder, and (2) the “Positional Encoding Vectors”, which allow BERT to incorporate information about the relative positions of the words in a sentence.
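
To make the feed-forward piece a bit more concrete, here is a minimal PyTorch sketch (not the episode’s code) of a position-wise feed-forward network, assuming the standard BERT-base sizes (hidden size 768, intermediate size 3072, GELU activation); the layer names are just illustrative.

import torch
import torch.nn as nn

class FeedForward(nn.Module):
    # The second half of each Encoder layer: the same two dense layers
    # are applied independently to every token position.
    def __init__(self, hidden_size=768, intermediate_size=3072):
        super().__init__()
        self.dense_in = nn.Linear(hidden_size, intermediate_size)   # expand 768 -> 3072
        self.act = nn.GELU()                                        # BERT uses the GELU activation
        self.dense_out = nn.Linear(intermediate_size, hidden_size)  # project back 3072 -> 768

    def forward(self, x):
        # x has shape [batch, seq_len, hidden_size].
        # (The real encoder also wraps this in a residual connection and
        # LayerNorm, omitted here to keep the sketch short.)
        return self.dense_out(self.act(self.dense_in(x)))

# One sentence of 8 tokens in, same shape out:
x = torch.randn(1, 8, 768)
print(FeedForward()(x).shape)  # torch.Size([1, 8, 768])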

==== Series Playlist ====
https://www.youtube.com/playlist?list=PLam9sigHPGwOBuH4_4fr-XvDbe5uneaf6

==== Updates ====
Sign up to hear about new content across my blog and channel: https://www.chrismccormick.ai/subscribe

==== References ====
Here is the blog post that I referenced which goes into more of the math behind positional encoding: https://kazemnejad.com/blog/transformer_architecture_positional_encoding/
Note that the author follows the paper’s definition of the functions (where the sine and cosine signals are interleaved), whereas Jay’s post follows the code implementation (where the sine and cosine signals are concatenated).
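
If you want to see the two layouts side by side, here is a minimal NumPy sketch (not taken from either post) of the sinusoidal functions they describe; the interleaved=True / interleaved=False switch is just my own way of flipping between the paper’s layout and the concatenated one, and each row contains exactly the same values either way, only in a different column order.

import numpy as np

def positional_encoding(max_len, d_model, interleaved=True):
    # PE(pos, 2i) = sin(pos / 10000^(2i / d_model)), PE(pos, 2i+1) = cos(same angle)
    pos = np.arange(max_len)[:, None]                   # (max_len, 1)
    i = np.arange(d_model // 2)[None, :]                # (1, d_model / 2)
    angles = pos / np.power(10000, (2 * i) / d_model)   # (max_len, d_model / 2)
    pe = np.zeros((max_len, d_model))
    if interleaved:
        pe[:, 0::2] = np.sin(angles)            # paper: even dimensions get sine,
        pe[:, 1::2] = np.cos(angles)            # odd dimensions get cosine
    else:
        pe[:, :d_model // 2] = np.sin(angles)   # code: first half is all sines,
        pe[:, d_model // 2:] = np.cos(angles)   # second half is all cosines
    return pe

pe_paper = positional_encoding(50, 128, interleaved=True)
pe_code = positional_encoding(50, 128, interleaved=False)
# Same values for each position, only the column ordering differs:
print(np.allclose(np.sort(pe_paper[5]), np.sort(pe_code[5])))  # True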
