I cover five other papers, including WizardCoder, Scaling Data-Constrained Language Models (on how training for more epochs could stretch limited data), TinyStories, and more, to give context to the results, and I end with what I think timelines might be and how public messaging could be targeted.
With extracts from Sarah Constantin in Asterisk, Carl Shulman on the Dwarkesh Patel podcast, Andrej Karpathy, Jack Clark (co-founder of Anthropic), and Ronen Eldan, co-author of both the Textbooks and TinyStories papers, I hope you get something from this one. And yes, the title of the paper isn’t the best.
Textbooks Paper: https://arxiv.org/pdf/2306.11644.pdf
Karpathy Tweet: https://twitter.com/karpathy/status/1671587087542530049
TinyStories: https://arxiv.org/pdf/2305.07759.pdf
GPT-4 Self-Repair: https://arxiv.org/pdf/2306.09896.pdf
Yao Fu Tweet on Emergent Self-Repair: https://twitter.com/Francis_YAO_/status/1670618013089820674
WizardCoder: https://arxiv.org/pdf/2306.08568.pdf
Evol-Instruct (WizardLM) paper: https://arxiv.org/pdf/2304.12244.pdf
Scaling Data Constrained Language Models: https://arxiv.org/pdf/2305.16264.pdf
Sarah Constantin, Asterisk Magazine: https://asteriskmag.com/issues/03/the-transistor-cliff
Jack Clark Tweet: https://twitter.com/jackclarkSF/status/1673369486869811201
Carl Shulman, Intelligence Explosion, Dwarkesh Patel: https://www.youtube.com/watch?v=_kRg-ZP1vQc
LLMs and BDTs, Oxford: https://arxiv.org/ftp/arxiv/papers/2306/2306.13952.pdf
HumanEval: https://arxiv.org/pdf/2107.03374v2.pdf
Decoder Piece (if anyone wants to know, I think George Hotz is super-naïve on safety): https://the-decoder.com/gpt-4-is-1-76-trillion-parameters-in-size-and-relies-on-30-year-old-technology/#google_vignette
https://www.patreon.com/AIExplained