This talk discusses the vision of developing methods that can efficiently update pre-trained models with new knowledge while preventing forgetting of past knowledge. As a step toward efficient continual pre-training, we examine the effect of different warm-up strategies and of replay when continuing to pre-train models on new data and new languages.
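As a rough illustration of the two ingredients mentioned above, the sketch below shows a re-warmed learning-rate schedule and replay-mixed batches for continual pre-training. The function names, schedule shape, and replay fraction are illustrative assumptions, not the specific setup studied in the talk.

```python
import math
import random


def rewarmed_lr(step, total_steps, max_lr=3e-4, min_lr=3e-5, warmup_steps=1000):
    """Re-warm the learning rate linearly, then decay it with a cosine schedule.

    A common choice when continuing pre-training: the LR is ramped back up
    from near zero instead of resuming at the old, already-decayed value.
    """
    if step < warmup_steps:
        return max_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))


def mixed_batch(new_data, old_data, batch_size=32, replay_fraction=0.05):
    """Build a batch mostly from the new corpus, with a small replayed share
    of examples from the original pre-training data to reduce forgetting."""
    n_replay = int(batch_size * replay_fraction)
    batch = random.sample(new_data, batch_size - n_replay)
    batch += random.sample(old_data, n_replay)
    random.shuffle(batch)
    return batch
```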