Welcome Angie Chen
Title: Learning from Natural Language Feedback

Abstract: The ability of pre-trained large language models (LLMs) to use natural language feedback at inference time is an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF). ILF requires only a small amount of human-written feedback during training and does not require feedback at test time, making it both user-friendly and sample-efficient. We further show that ILF can be viewed as a form of minimizing the KL divergence to the ground-truth distribution, and we demonstrate proofs of concept on summarization and neural program synthesis tasks. For summarization, ILF improves a GPT-3 model's summarization performance to be comparable to human quality, outperforming fine-tuning on human-written summaries. For code generation, ILF improves a CodeGen-Mono 6.1B model's pass@1 rate by 38% relative (10% absolute) on the Mostly Basic Python Problems (MBPP) benchmark, outperforming both fine-tuning on MBPP and fine-tuning on repaired programs written by humans. Overall, our results suggest that learning from human-written natural language feedback is both more effective and more sample-efficient than training exclusively on demonstrations for improving an LLM's performance on a variety of tasks.
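
At a high level, ILF turns language feedback into training data rather than relying on it at inference time. Below is a minimal Python sketch of one ILF round, assuming hypothetical helper callables (sample_fn, feedback_fn, refine_fn, accept_fn, finetune_fn) that stand in for model sampling, human feedback collection, feedback-conditioned refinement, filtering, and supervised fine-tuning. It illustrates the general recipe described in the abstract, not the authors' exact implementation.

    # Minimal sketch of one ILF round; the helper callables are hypothetical
    # stand-ins, not the authors' API.
    from typing import Callable, List, Tuple

    def ilf_round(
        prompts: List[str],
        sample_fn: Callable[[str], str],            # model samples an initial output
        feedback_fn: Callable[[str, str], str],     # human writes language feedback
        refine_fn: Callable[[str, str, str], str],  # model refines output given feedback
        accept_fn: Callable[[str, str], bool],      # quality filter on refinements
        finetune_fn: Callable[[List[Tuple[str, str]]], None],
    ) -> None:
        """One round of Imitation learning from Language Feedback (ILF)."""
        dataset: List[Tuple[str, str]] = []
        for prompt in prompts:
            draft = sample_fn(prompt)                      # 1. sample an initial output
            feedback = feedback_fn(prompt, draft)          # 2. collect human-written feedback
            refined = refine_fn(prompt, draft, feedback)   # 3. refine conditioned on feedback
            if accept_fn(prompt, refined):                 # 4. keep only good refinements
                dataset.append((prompt, refined))
        finetune_fn(dataset)                               # 5. fine-tune on the refinements

In the code-generation setting (the MBPP experiments), accept_fn would plausibly correspond to checking the refined program against the task's unit tests, so the model is fine-tuned only on refinements that actually fix the original errors.
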
Papers covered
1. Improving Code Generation by Training with Natural Language Feedback – https://arxiv.org/abs/2303.16749

2. Training Language Models with Language Feedback – https://arxiv.org/abs/2204.14146

Bio: Angelica Chen is a rising 4th-year PhD student at NYU advised by Kyunghyun Cho. She is broadly interested in two directions of research – (1) online training of LLMs via human feedback and evolutionary strategies, and (2) understanding the learning strategies and training dynamics of LLMs. She previously interned at Google Brain and Google Research, applying evolutionary algorithms to LLMs for neural architecture search and streaming disfluency detection. Prior to NYU, she worked at Google Search on developing neural semantic parsing algorithms and with the Seung Lab at Princeton University on ML applications to healthcare.
