This demo shows how to design and build an enterprise-grade retrieval-augmented generation (RAG) pipeline using NVIDIA AI Foundation models. With NVIDIA AI Foundation endpoints, all embedding and generation tasks run as hosted API calls, removing the need for dedicated GPUs.
Step-by-step guide:
0:31 – Experiment with Open-Source LLMs on NVIDIA AI Foundation Endpoints
1:11 – Overview of RAG Pipeline Components: Custom Data Loader, Text Embedding Model, Vector Database, LLM
1:30 – Use the LangChain Connector
1:47 – Generate an API Key for NGC
2:03 – Build the Chat UI
2:13 – Add Custom Data Connector
2:25 – Access the Text Embedding Model with API Calls
2:34 – Deploy the Vector Database to Index Embeddings
2:40 – Create or Load a Vector Store
2:51 – Use FAISS Library to Store Chunks
2:55 – Connect Your RAG Pipeline Using Streamlit
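The steps above can be sketched in a few lines of LangChain. This is a minimal, hedged example, not the demo's exact code: it assumes the `langchain-nvidia-ai-endpoints` and `langchain-community` packages are installed, that an NGC API key is exported as `NVIDIA_API_KEY`, and that a file named `my_docs.txt` stands in for the custom data source; the model name passed to `ChatNVIDIA` is illustrative and may differ from the one used in the video.

```python
import os

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Naive fixed-size chunker with overlap (a stand-in for a LangChain text splitter)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def build_pipeline(chunks: list[str]):
    # Requires the optional packages named in the lead-in.
    from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
    from langchain_community.vectorstores import FAISS

    embedder = NVIDIAEmbeddings()              # text-embedding model via API call, no local GPU
    store = FAISS.from_texts(chunks, embedder) # index the chunk embeddings in a FAISS store
    llm = ChatNVIDIA(model="mixtral_8x7b")     # open LLM served by an AI Foundation endpoint
    return store, llm

def answer(question: str, store, llm, k: int = 4) -> str:
    # Retrieve the top-k chunks and ground the LLM's answer in them.
    context = "\n\n".join(d.page_content for d in store.similarity_search(question, k=k))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return llm.invoke(prompt).content

if __name__ == "__main__" and os.environ.get("NVIDIA_API_KEY"):
    store, llm = build_pipeline(chunk_text(open("my_docs.txt").read()))
    print(answer("What does the document cover?", store, llm))
```

In the demo this retrieval-and-answer loop is wrapped in a Streamlit chat UI, which would call `answer()` for each user message.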
Developer resources:
▫️ Deploy and test NVIDIA generative AI pipeline examples on GitHub: https://nvda.ws/41gNtfJ
▫️ Read the Mastering LLM Techniques series: https://resources.nvidia.com/en-us-large-language-models
▫️ Deploy, test, and extend this RAG application example on GitHub: https://nvda.ws/495rDP1
▫️ Join the NVIDIA Developer Program: https://nvda.ws/3OhiXfl
▫️ Read and subscribe to the NVIDIA Technical Blog: https://nvda.ws/3XHae9F
#largelanguagemodels #retrievalaugmentedgeneration #llm #rag #generativeai #langchain #aichatbot #ragtutorial #chatbotapp #buildingchatbots