Discussion about this post

Finn Tropy:

Fantastic article, Jenny!

I'm currently working on a RAG project at my 9-to-5, testing different chunking and indexing strategies. Small, overlapping chunks improved retrieval specificity, as you said.

I started exploring MCP to gain more control over database-sourced content and built a simple MCP server prototype over the weekend. It was fun to see how the small qwen3:32b model, running locally on Ollama, was able to figure out SQL queries from the database schema and my vague prompt. It even fixed its own broken queries, retrying until it retrieved the data I requested. It took me a few hours to get this multi-step looping process, where the LLM self-corrects on error, working correctly. Now my mind is racing with ways to apply this idea to other problems. Self-improving learning loops would be like rocket fuel for problem-solving.
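That self-correcting loop can be sketched roughly like this. A minimal illustration only: `ask_model` is a stand-in for the real Ollama/qwen3 call (not shown), and the table, schema string, and retry limit are made-up examples. The key idea is feeding the database error message back to the model on each retry.

```python
import sqlite3

def ask_model(schema, request, error=None, bad_sql=None):
    # Stand-in for the real LLM call (e.g. qwen3 via Ollama).
    # For illustration: first attempt returns a broken query; once it
    # sees the error message, it returns a corrected one.
    if error is None:
        return "SELECT nmae FROM users"          # deliberate typo
    return "SELECT name FROM users"              # "self-corrected" query

def query_with_self_correction(conn, schema, request, max_tries=3):
    """Ask the model for SQL; on failure, feed the error back and retry."""
    error = sql = None
    for _ in range(max_tries):
        sql = ask_model(schema, request, error, sql)
        try:
            return conn.execute(sql).fetchall()  # success: return the rows
        except sqlite3.Error as exc:
            error = str(exc)                     # loop again with the error text
    raise RuntimeError(f"gave up after {max_tries} tries: {error}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('Ada')")
rows = query_with_self_correction(conn, "users(name TEXT)", "list all user names")
print(rows)  # [('Ada',)]
```

In the real version the model sees the schema, the request, its last query, and the error, so the loop terminates either when a query executes or when the retry budget runs out.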

The ability for an LLM to call different tools using MCP (Model Context Protocol) opens up a new path for combining AI capabilities with other content types. I used the FastMCP library by Jeremiah Lowin - worth checking out.

I started using Cursor a few days ago and connected it to my Obsidian vault, and got a very similar experience to what you described.

Thanks for pointing out how RAG is embedded everywhere. I can see that, too, after building one myself. Looks like we are on similar paths.

Joel Salinas:

What a breakdown of RAG, Jenny! You definitely sparked some ideas I will be experimenting with. Thank you! As always, awesome work. 🙌🙌

24 more comments...
