Live code review: Pinecone Vercel starter template and Retrieval Augmented Generation (RAG). Part #2
Timestamps – so you can jump to what’s most interesting to you:
00:00 Continuing discussion around the recursive crawler
00:08 GitHub Copilot and the tasks it excels at
00:50 What do we do with the HTML we extract? How the seeder works
01:51 The different types of document splitters you can use (see the splitter sketch after the timestamps)
02:32 embedDocument and how it works (sketched below)
03:11 Why do we split documents when working with a vector database?
04:00 Problems that occur if you don’t split documents
05:32 Proper chunking improves relevance
06:50 You still need to tweak and experiment with your chunk parameters
07:00 Chunked upserts (sketched below)
08:36 Chat endpoint – how we use the context at runtime
09:20 Injecting context into LLM prompts (sketched below)
10:24 Is there a measurable difference in where you put the context in the prompt?
11:12 Reviewing the end-to-end RAG workflow
13:05 LLMs have conditioned us to be okay with responses taking time (being pretty slow!)
13:45 Cool UX anecdote around what humans consider too long
17:00 You have an opportunity to associate chunks with metadata
19:05 UI cards – selecting one to show it was used as context in the response
21:22 How we make it visually clear which chunks and context were used in the LLM’s response
23:45 Auditability and why it matters
24:30 Testing the live app
29:55 Outro chat – Thursday AI sessions on Twitter Spaces
30:24 Reviewing the GitHub project – this is all open source!
30:59 Inaugural stream conclusion
31:55 Vim / VS Code / Cursor AI IDE discussion
32:23 Setting up devtools on macOS
33:12 Upcoming stream ideas – Image search / Pokémon search
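For the document splitter discussion (01:51): a minimal sketch of chunking a crawled page before embedding, assuming LangChain's RecursiveCharacterTextSplitter. The chunk size, overlap, and metadata shape are illustrative defaults, not necessarily what the template's seeder uses.

```ts
// Minimal chunking sketch – assumes LangChain JS; the template's seeder may use
// a different splitter (e.g. a Markdown-aware one) or different parameters.
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

async function splitPage(url: string, pageContent: string) {
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 300,   // characters per chunk – tune per corpus
    chunkOverlap: 20, // small overlap so ideas aren't cut mid-sentence
  });

  // createDocuments lets us attach per-chunk metadata (here, the source URL)
  return splitter.createDocuments([pageContent], [{ url }]);
}
```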
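For embedDocument (02:32): a hedged sketch of turning one chunk into a Pinecone record with the OpenAI embeddings API. The function name, model choice, and metadata fields are assumptions for illustration; keeping the raw chunk text and source URL in metadata is what later lets the UI surface which chunks were used as context (17:00).

```ts
// Illustrative embedDocument – not the template's exact code.
import OpenAI from "openai";
import { createHash } from "crypto";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function embedDocument(text: string, url: string) {
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002", // assumed embedding model
    input: text.replace(/\n/g, " "),
  });

  return {
    // Hash the chunk text so re-seeding the same page stays idempotent
    id: createHash("md5").update(text).digest("hex"),
    values: res.data[0].embedding,
    metadata: { url, text }, // keep the raw chunk so the UI can show it later
  };
}
```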
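For chunked upserts (07:00): a sketch of writing records to Pinecone in batches so a large seed run stays within request-size limits. The batch size and index name are assumptions.

```ts
// Batched upsert sketch – assumes the @pinecone-database/pinecone Node SDK (v2+).
import { Pinecone } from "@pinecone-database/pinecone";

type VectorRecord = {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
};

async function chunkedUpsert(records: VectorRecord[], batchSize = 100) {
  const pc = new Pinecone(); // reads PINECONE_API_KEY from the environment
  const index = pc.index("my-rag-index"); // hypothetical index name

  // Upsert in slices rather than one giant request
  for (let i = 0; i < records.length; i += batchSize) {
    await index.upsert(records.slice(i, i + batchSize));
  }
}
```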
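For the chat endpoint and context injection (08:36–09:20): a sketch of the retrieval step at query time – embed the question, query Pinecone, and place the matched chunks ahead of the question in the prompt. Index name, topK, and prompt wording are illustrative; in the starter the chat route then streams the completion back to the client.

```ts
// Retrieval + prompt assembly sketch – names and prompt text are illustrative.
import OpenAI from "openai";
import { Pinecone } from "@pinecone-database/pinecone";

const openai = new OpenAI();
const pc = new Pinecone();

async function buildPrompt(question: string) {
  // 1. Embed the question with the same model used at seed time
  const { data } = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: question,
  });

  // 2. Pull the nearest chunks (and their metadata) from the index
  const results = await pc.index("my-rag-index").query({
    vector: data[0].embedding,
    topK: 3,
    includeMetadata: true,
  });

  const context = (results.matches ?? [])
    .map((m) => (m.metadata as { text?: string })?.text ?? "")
    .join("\n---\n");

  // 3. Put the retrieved context before the question in the prompt
  return `Answer using only the context below.\n\nCONTEXT:\n${context}\n\nQUESTION:\n${question}`;
}
```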