Fast tokenizer superpowers

April 20, 2022

Fast tokenizers are not only fast: they can also map each token back to the word it came from and to the span of characters it covers in the raw text. This video explores these features.
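
To make the two mappings concrete, here is a minimal, library-free sketch. It is a toy illustration, not how Hugging Face tokenizers are implemented: `toy_tokenize` and `max_piece_len` are hypothetical names, and words are cut into fixed-length chunks rather than with a real subword algorithm. In the Transformers library, a fast tokenizer exposes the same kind of information through the `return_offsets_mapping=True` argument and the `word_ids()` method of the returned encoding.

```python
import re


def toy_tokenize(text, max_piece_len=4):
    """Split text into pieces, tracking character offsets and word indices.

    Hypothetical helper for illustration only: each word (or punctuation
    mark) is cut into chunks of at most `max_piece_len` characters.
    """
    tokens, offsets, word_ids = [], [], []
    # Each regex match is one "word": a run of word characters or one punctuation mark.
    for word_index, match in enumerate(re.finditer(r"\w+|[^\w\s]", text)):
        start, end = match.span()
        for piece_start in range(start, end, max_piece_len):
            piece_end = min(piece_start + max_piece_len, end)
            tokens.append(text[piece_start:piece_end])
            offsets.append((piece_start, piece_end))  # character span in the raw text
            word_ids.append(word_index)               # word this piece belongs to
    return tokens, offsets, word_ids


text = "Fast tokenizers rock!"
tokens, offsets, word_ids = toy_tokenize(text)

for token, (start, end), word_id in zip(tokens, offsets, word_ids):
    # Slicing the raw text with the offsets recovers the token exactly.
    print(f"{token!r} -> chars {start}:{end}, word {word_id}")
```

Because every piece carries its character span and word index, pieces that share a word index can be regrouped into the original word, and any piece can be located in the raw text by slicing with its offsets.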

This video is part of the Hugging Face course: http://huggingface.co/course
Open in Colab to run the code samples:
https://colab.research.google.com/github/huggingface/notebooks/blob/master/course/videos/offset_mapping.ipynb

Related videos:
– Why are fast tokenizers called fast? https://youtu.be/g8quOxoqhHQ
– Training a new tokenizer: https://youtu.be/DJimQynXZsQ

Don’t have a Hugging Face account? Join now: http://huggingface.co/join
Have a question? Check out the forums: https://discuss.huggingface.co/c/course/20
Subscribe to our newsletter: https://huggingface.curated.co/
