The Cohere For AI community's Interactive Reading Group was pleased to welcome Michael Tschannen to present his work on "Image-and-Language Understanding from Pixels Only."

Abstract: Multimodal models often consist of many task- and modality-specific pieces and training procedures. For example, CLIP trains independent text and image towers via a contrastive loss. We explore an additional unification: the use of a pure pixel-based model to perform image, text, and multimodal tasks. Our model is trained with contrastive loss alone, so we call it CLIP-Pixels Only (CLIPPO). CLIPPO uses a single encoder that processes both regular images and text rendered as images. CLIPPO performs image-based tasks such as retrieval and zero-shot image classification almost as well as CLIP, with half the number of parameters and no text-specific tower or embedding. When trained jointly via image-text contrastive learning and next-sentence contrastive learning, CLIPPO performs well on natural language understanding tasks without any word-level loss, outperforming prior pixel-based work.
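
To make the core idea concrete, here is a minimal sketch (not the authors' code) of CLIPPO-style contrastive training: a single shared encoder embeds both regular images and text rendered as images, and a symmetric InfoNCE loss pulls matching pairs together. The `SharedEncoder` class and the way text images are produced are stand-in assumptions for illustration only; the actual model uses a vision transformer and a proper text-rendering pipeline.

```python
# Hedged sketch of CLIPPO-style training, NOT the paper's implementation.
# Assumptions: `SharedEncoder` is a toy stand-in for the single ViT, and the
# "text images" below are random tensors standing in for rendered sentences.
import torch
import torch.nn.functional as F

class SharedEncoder(torch.nn.Module):
    """Stand-in for the single encoder CLIPPO applies to all pixel inputs."""
    def __init__(self, dim=512):
        super().__init__()
        self.proj = torch.nn.Linear(3 * 224 * 224, dim)  # toy, patch-free encoder

    def forward(self, pixels):                # pixels: (B, 3, 224, 224)
        return self.proj(pixels.flatten(1))   # (B, dim) embeddings

def clippo_contrastive_loss(encoder, images, text_images, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss using one shared encoder for both modalities."""
    z_img = F.normalize(encoder(images), dim=-1)
    z_txt = F.normalize(encoder(text_images), dim=-1)  # same weights as for images
    logits = z_img @ z_txt.t() / temperature            # (B, B) similarity matrix
    targets = torch.arange(images.size(0))
    # Matching image/text pairs lie on the diagonal; classify in both directions.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Toy usage with random tensors in place of real batches.
enc = SharedEncoder()
imgs = torch.randn(8, 3, 224, 224)
txt_imgs = torch.randn(8, 3, 224, 224)  # in practice: sentences rasterized onto a canvas
loss = clippo_contrastive_loss(enc, imgs, txt_imgs)
loss.backward()
```

The key design point the sketch highlights is that no text-specific tower or token embedding table appears anywhere: both modalities pass through the same pixel encoder, which is what halves the parameter count relative to a two-tower CLIP setup.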

Bio: Michael Tschannen is a Research Scientist at Google Research Zurich (Brain Team), broadly interested in multimodal representation learning. Before that, he worked on computer vision R&D at Apple Zurich for two years and spent a year as a postdoc at Google Research Zurich exploring topics in unsupervised representation learning, generative models, and neural compression. He completed his PhD at ETH Zurich in late 2018. Prior to that, he obtained an MSc from ETH Zurich and a BSc from EPFL, both in Electrical Engineering and Information Technology.

Website: https://mitscha.github.io/

Paper: https://arxiv.org/abs/2212.08045
