Pre-tokenization is the second step of the tokenization pipeline, coming after normalization. But what does that mean? This video will tell you all about it.
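In short, pre-tokenization splits the normalized text into word-like pieces (for example on whitespace and punctuation) before the subword model runs, keeping track of each piece's character offsets. A minimal sketch of that idea in plain Python (the `pre_tokenize` function is hypothetical, not the actual 🤗 Tokenizers API):

```python
import re

def pre_tokenize(text):
    # Split text into word-like pieces: runs of word characters, or single
    # punctuation marks. Whitespace is dropped, but each piece keeps its
    # (start, end) character offsets into the original text.
    return [(m.group(), (m.start(), m.end()))
            for m in re.finditer(r"\w+|[^\w\s]", text)]

print(pre_tokenize("Hello, world!"))
# → [('Hello', (0, 5)), (',', (5, 6)), ('world', (7, 12)), ('!', (12, 13))]
```

Real pre-tokenizers (Whitespace, ByteLevel, Metaspace, …) follow the same pattern but differ in how they split and how they encode spaces; the video walks through those differences.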

This video is part of the Hugging Face course: http://huggingface.co/course
Open in colab to run the code samples:
https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/videos/pre_tokenization.ipynb

Related videos:
– What is normalization? https://youtu.be/4IIC2jI9CaU
– Training a new tokenizer: https://youtu.be/DJimQynXZsQ

Don’t have a Hugging Face account? Join now: http://huggingface.co/join
Have a question? Check out the forums: https://discuss.huggingface.co/c/course/20
Subscribe to our newsletter: https://huggingface.curated.co/
