Open-source LLMs and embedding models have typically had much smaller context windows than their proprietary counterparts, but techniques for extending context, such as RoPE scaling and Self-Extend, are quickly closing the gap. Nomic has released a new open-source, long-context embedding model with an 8k-token context window (using RoPE), strong performance on several benchmarks (outperforming OpenAI's ada-002), and support for running locally as well as via an API. Here, we show how to build a long-context RAG app from scratch using open-source components: Nomic's new 8k-context-window embeddings and Mistral-instruct's 32k context window (via Ollama).
Cookbook – https://github.com/langchain-ai/langchain/blob/master/cookbook/nomic_embedding_rag.ipynb