Cohere For AI – Community Talks: Vaishaal Shankar

August 28, 2023


Join C4AI’s Computer Vision Group as they welcome Vaishaal Shankar for a presentation on “DataComp”.

Speaker information: Vaishaal Shankar is a researcher in the machine learning research group at Apple. For roughly the past year, Vaishaal and collaborators from UW, Tel Aviv University, Columbia, and elsewhere have been working on a benchmark for measuring the performance of datasets as opposed to models. They find that, in the recent landscape of large-scale models (for vision, language, speech, etc.), massive datasets (over 1B examples or 1T tokens) have been crucial to the performance of models such as GPT, CLIP, and Flamingo.

Event information: Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Common Crawl. Participants in our benchmark design new filtering techniques or curate new data sources and then evaluate their new dataset by running our standardized CLIP training code and testing the resulting model on 38 downstream test sets. Our benchmark consists of multiple compute scales spanning four orders of magnitude, which enables the study of scaling trends and makes the benchmark accessible to researchers with varying resources. Our baseline experiments show that the DataComp workflow leads to better training sets. In particular, our best baseline, DataComp-1B, enables training a CLIP ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet, outperforming OpenAI’s CLIP ViT-L/14 by 3.7 percentage points while using the same training procedure and compute.
