If you always wanted to know how to integrate both text and images in a single MULTIMODAL Transformer, then this is the video for you!
Multimodality🔥 + Transformers💪
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Content:
* 00:00 Multimodality and Multimodal Transformers
* 02:08 ViLBERT
* 02:39 How does ViLBERT work?
* 05:49 How is ViLBERT trained?

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🎞️ Useful: Ms. Coffee Bean explained the Transformer here https://youtu.be/FWFA4DGuzSc

📄 ViLBERT paper — Lu, Jiasen, Dhruv Batra, Devi Parikh, and Stefan Lee. "ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks." In Advances in Neural Information Processing Systems, pp. 13-23. 2019. http://papers.nips.cc/paper/8297-vilbert-pretraining-task-agnostic-visiolinguistic-representations-for-vision-and-language-tasks.pdf

📄 For even more similar architectures, check out the multi-modal section of https://github.com/tomohideshibata/BERT-related-papers

💻 With code available at https://github.com/facebookresearch/vilbert-multi-task
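
If you just want the gist before browsing the full repo: below is a minimal, illustrative PyTorch sketch of ViLBERT's central idea, the co-attentional transformer block, where the text stream's queries attend to image region features and vice versa. All dimensions, layer details, and names here are simplified assumptions for illustration, not the official implementation (see the repo above for that).

import torch
import torch.nn as nn

class CoAttentionBlock(nn.Module):
    # Toy two-stream co-attention block in the spirit of ViLBERT:
    # each stream's queries attend to the OTHER stream's keys/values.
    def __init__(self, dim=768, heads=12):
        super().__init__()
        self.txt_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.txt_norm = nn.LayerNorm(dim)
        self.img_norm = nn.LayerNorm(dim)

    def forward(self, txt, img):
        # Text queries attend to image keys/values, and vice versa.
        txt_out, _ = self.txt_attn(query=txt, key=img, value=img)
        img_out, _ = self.img_attn(query=img, key=txt, value=txt)
        # Residual connection + layer norm, as in a standard Transformer layer.
        return self.txt_norm(txt + txt_out), self.img_norm(img + img_out)

# Usage: a batch of 2 captions (20 tokens) and 2 images (36 region features,
# e.g. from a Faster R-CNN detector, as in the paper)
txt = torch.randn(2, 20, 768)
img = torch.randn(2, 36, 768)
block = CoAttentionBlock()
txt, img = block(txt, img)
print(txt.shape, img.shape)  # torch.Size([2, 20, 768]) torch.Size([2, 36, 768])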

🔗 Links:
YouTube: https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #MsCoffeeBean #multimodal #multimodality #ViLBERT #MachineLearning #AI #research

Video and thumbnails contain emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0
