Update: The BERT eBook is out! You can buy it from my site here: https://www.chrismccormick.ai/bert-ebook?utm_source=youtube&utm_medium=vid_desc&utm_campaign=bert_ebook&utm_content=vid4

In Episode 3 I’ll walk through how to fine-tune BERT on a sentence classification task. We’ll use the CoLA dataset from the GLUE benchmark as our example dataset.

I’ve split this episode into two videos. In part 1, we’ll look at the CoLA dataset and how to format it for use with BERT. In part 2, I’ll walk through the PyTorch code for performing the training. Both parts use the same notebook, linked below.

Part 2: https://www.youtube.com/watch?v=Hnvb9b7a_Ps&list=PLam9sigHPGwOBuH4_4fr-XvDbe5uneaf6

==== Notebook ====
The original Colab Notebook used in this video can be found here: https://colab.research.google.com/drive/1Y4o3jh3ZH70tl6mCd76vz_IxX23biCPP

I’ve also published an updated version:
https://colab.research.google.com/drive/1pTuQhug6Dhl9XalKB0zUGf4FIdYFlpcX
In this version:
– I’ve simplified the tokenization and input formatting by using the `tokenizer.encode_plus` function.
– I’ve added validation loss to the learning curve, to check for overfitting (thank you to Stas Bekman for contributing this!).
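For readers curious what `encode_plus` is doing under the hood, here is a minimal, illustrative sketch (plain Python, not the Hugging Face library) of the steps it bundles together: adding BERT’s special tokens, mapping tokens to vocabulary IDs, padding/truncating to a fixed length, and building an attention mask. The toy vocabulary and IDs below are assumptions for illustration only.

```python
# Illustrative sketch of the input formatting that Hugging Face's
# `tokenizer.encode_plus` handles for you (toy vocabulary, not real BERT IDs).

def format_for_bert(tokens, vocab, max_len=8):
    # Wrap the sentence in BERT's special tokens, truncating if needed.
    tokens = ["[CLS]"] + tokens[:max_len - 2] + ["[SEP]"]
    # Map tokens to vocabulary IDs (0 = [PAD] in this toy vocab).
    input_ids = [vocab[t] for t in tokens]
    # Attention mask: 1 for real tokens, 0 for padding.
    attention_mask = [1] * len(input_ids)
    # Pad both lists out to the fixed length.
    pad = max_len - len(input_ids)
    input_ids += [0] * pad
    attention_mask += [0] * pad
    return input_ids, attention_mask

vocab = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102,
         "the": 1996, "cat": 4937, "sat": 2938}
ids, mask = format_for_bert(["the", "cat", "sat"], vocab)
print(ids)   # [101, 1996, 4937, 2938, 102, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```

In the updated notebook, a single call to `tokenizer.encode_plus` replaces all of these manual steps.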

==== Updates ====
Sign up to hear about new content across my blog and channel! https://www.chrismccormick.ai/subscribe

==== More BERT Applications ====
The BERT Research series is an 8-episode series on understanding BERT and how it works. If you are interested in BERT applications, I’ve also published the following:
– Document Classification: https://youtu.be/_eSGWNqKeeY
– ALBERT: https://youtu.be/vsGN8WqwvKg
– Question Answering: https://youtu.be/l8ZYCvgGu0o
