The capabilities of multimodal AI | Gemini Demo

December 5, 2023

Our natively multimodal AI model Gemini is capable of reasoning across text, images, audio, video and code. Here are some favorite moments with Gemini. Learn more and try the model: https://deepmind.google/gemini

Explore Gemini: https://goo.gle/how-its-made-gemini

For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.

Subscribe to our Channel: https://www.youtube.com/google
Tweet with us on X: https://twitter.com/google
Follow us on Instagram: https://www.instagram.com/google
Join us on Facebook: https://www.facebook.com/Google

0:00 Intro
0:19 Multimodal Dialogue
1:32 Multilinguality
2:04 Game Creation
2:31 Visual Puzzles
3:17 Making Connections
3:39 Image & Text Generation
4:06 Logic & Spatial Reasoning
4:55 Translating Visuals
5:27 Cultural Understanding
