The Cerebras Software Platform (CSoft) makes it easy to train large-scale Transformer-style natural language processing (NLP) models on a single Cerebras CS-2 system. How easy? Let Natalia show you, using GPT-3 XL 1.3B and GPT-3 6.7B parameter models as examples. The same codebase and the same command to launch a training job — just different model configurations. There is no need to distribute training across multiple conventional devices and no complicated hybrid 3D parallelism. Switch from 1.3B to 6.7B to 13B to 20B model training just by changing a few parameters in a configuration file.
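
To give a flavor of what "changing a few parameters" means, here is an illustrative sketch of two model configurations. The field names below are hypothetical, not the actual Cerebras Model Zoo schema; the architectural dimensions (layers, hidden size, heads) follow the published GPT-3 paper's 1.3B and 6.7B variants.

```yaml
# Illustrative only — field names are assumptions, not the real CSoft config schema.
# GPT-3 XL (1.3B parameters), per the GPT-3 paper:
model:
  num_hidden_layers: 24
  hidden_size: 2048
  num_heads: 24

# GPT-3 6.7B — same training code, same launch command,
# only these dimensions change:
model:
  num_hidden_layers: 32
  hidden_size: 4096
  num_heads: 32
```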

Learn more: https://www.cerebras.net/blog/training-multi-billion-parameter-models-on-a-single-cerebras-system-is-easy
https://www.cerebras.net/product-software/

#ai #deeplearning #gptj #gpt3 #artificialintelligence #NLP #naturallanguageprocessing
