20 Comments
Mark Williams:

Check out Professor W. Edwards Deming, the father of quality: Plan-Do-Check-Act. Or, as someone once said to me, the ultimate framework.

Finn Tropy:

PDCA is a framework I know very well from my past roles working with quality management and ISO 9001 certification teams.

Jenny Ouyang:

Finn, this is a fascinating breakdown of the AI learning loop!

Really appreciate you sharing the details; it's not easy to define each step so clearly and capture all the nuances. Also appreciate the GitHub and paper resources.

Definitely bookmarking this for when I need a deeper dive. Thanks again! 🙌

Finn Tropy:

Hey Jenny, it's really fun to play with these locally hosted LLMs and come up with experiments like this one.

I was reviewing my Obsidian vault and found some old material (like the 4FA framework), which gave me the idea for a self-improving AI Note generator.

I reused some of the code from the MCP server I tested the previous weekend to build this loop that self-corrects on SQL errors, and a few hours later I had a Note factory running 😀.

I love building silly little experiments like this - not particularly useful but great for discovery and learning new ideas.

Jenny Ouyang:

Yes I love this too! Silly but really fun to experiment!

So… self-correct on SQL errors? How did that even happen? Did you provide the schema, make it actually execute the query, then read the error output for correction?

Also, what kind of DB are you talking about? I'd be shocked if you told me it's a vector DB 😅

Finn Tropy:

And BTW, PostgreSQL has vector support for similarity search - see https://www.postgresql.org/about/news/pgvector-070-released-2852/

Jenny Ouyang:

Oh yes, good catch on that!

Finn Tropy:

I built an MCP server connected to a PostgreSQL database. Using FastMCP, I added tools and resources, such as schemas and example queries for notes, and pointed Ollama at a qwen3:32b model that supports "tool_calls". The tool returned the execution result, or on a query error it threw an exception whose message was sent back to the model. I added a loop to retry on exceptions at least three times.

I was pretty surprised to see this tiny qwen3:32b model figure out the correct queries after receiving the error messages from its "tool_calls". I think giving the database schema upfront as part of the tool description helped. The sample queries were also helpful in this experiment.
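
Roughly, the loop had this shape (a minimal sketch, not the exact code; the notes schema, the example question, and the connection string are made up for illustration):

    # Simplified sketch of the self-correcting SQL loop.
    # Assumes ollama-python >= 0.4 and a local PostgreSQL database;
    # the notes table and its columns are invented for this example.
    import ollama
    import psycopg2

    conn = psycopg2.connect("dbname=notes")  # hypothetical connection string

    def run_query(sql: str) -> str:
        """Execute SQL and return rows as text; raises psycopg2.Error on bad SQL."""
        with conn.cursor() as cur:
            cur.execute(sql)
            return str(cur.fetchall())

    tools = [{
        "type": "function",
        "function": {
            "name": "run_query",
            # Giving the schema upfront in the tool description seemed to help.
            "description": ("Run a SQL query against the notes database. "
                            "Schema: notes(id, title, body, created_at)."),
            "parameters": {
                "type": "object",
                "properties": {"sql": {"type": "string"}},
                "required": ["sql"],
            },
        },
    }]

    messages = [{"role": "user", "content": "How many notes mention PDCA?"}]

    for attempt in range(3):  # retry rounds: errors are fed back to the model
        response = ollama.chat(model="qwen3:32b", messages=messages, tools=tools)
        messages.append(response.message)
        if not response.message.tool_calls:
            print(response.message.content)  # model answered without the tool
            break
        for call in response.message.tool_calls:
            try:
                result = run_query(call.function.arguments["sql"])
            except psycopg2.Error as exc:
                conn.rollback()
                result = f"SQL error: {exc}"  # model sees the error and can retry
            messages.append({"role": "tool", "name": "run_query", "content": result})

The key point is that a failed query doesn't end the conversation: the exception text goes back as a tool message, and the model gets another attempt.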

Jenny Ouyang:

That was seriously impressive—amazing to see a small local model pull that off! And great to know that Qwen3-32B supports tool_calls. Now I have to try it out!

Finn Tropy:

Let me know what you find out!

Gary Coulton:

Does it allow you to make changes? Or does it refuse to accept them?

Dr. Jane Bormeister:

Fascinating experiment, Finn! I use AI as a thinking partner, but the curation, the voice, the choice of what matters - that still feels deeply human. Your experiment shows AI can mimic structure, but can it replicate genuine insight? The question is whether AI can choose from all notes which thoughts are worth sharing.

Finn Tropy:

Thanks Jane! I also use AI as a thinking partner, brainstorming some crazy ideas and exploring the patterns it has learned from the Internet, trying to understand how things work.

Creating these insights is something I love to work on, but it's a slow and painful process for my aging brain. And you are pointing out the very question I have been asking myself - which thoughts are worth sharing? And how should we value those thoughts?

AI can definitely mimic structure and learn to improve those patterns, either by enhancing the prompt or performing meta-learning, which I don't fully understand yet.

On June 12th, an MIT research team published this paper: https://arxiv.org/html/2506.10943v1. It describes Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. There is also a Wired article about this new method: https://www.wired.com/story/this-ai-model-never-stops-learning/. In the paper you can find "Algorithm 1 Self-Adapting LLMs (SEAL): Self-Edit Reinforcement Learning Loop", which looks very similar to what I was attempting to build.
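
In very rough terms, my reading of that loop is something like this (a toy paraphrase, not the paper's implementation; every function here is a dummy stand-in):

    # Loose paraphrase of SEAL's self-edit loop, compressed into a toy example.
    # See the paper for the real method; these functions are stand-ins.
    import random

    def generate_self_edit(checkpoint):
        # Stand-in: the LLM writes its own finetuning data / update directives.
        return f"self-edit derived from checkpoint {checkpoint}"

    def finetune(checkpoint, self_edit):
        return checkpoint + 1  # stand-in: applying the self-edit gives a new checkpoint

    def evaluate(checkpoint):
        return random.random()  # stand-in: downstream task performance

    checkpoint, best_score = 0, evaluate(0)
    for _ in range(5):
        self_edit = generate_self_edit(checkpoint)   # model proposes its own update
        candidate = finetune(checkpoint, self_edit)  # apply it
        score = evaluate(candidate)                  # reward: did the edit help?
        if score > best_score:                       # reinforce edits that helped
            checkpoint, best_score = candidate, score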

I saw the paper this morning while reading the news. It is remarkable how well Google and other players pick up these small signals: the paper showed up in my news feed right after I had researched this topic yesterday.

Dr. Jane Bormeister:

Fascinating timing with the MIT paper – it’s striking how closely your experiment aligns with current research.

But it also makes me wonder: SEAL can optimize structure, but can it sense meaning?

Deciding which thoughts are worth sharing feels like something rooted in human judgment.

Maybe that’s where our so-called “aging brains” still bring something essential to the table…

Yana G.Y.:

Finn, I think that's great. While reading, I was thinking I could actually do that kind of loop with agents. What do you think?

Finn Tropy:

You could do loops using AI client software like Claude, Cursor, or VS Code. They all have MCP capabilities to use different tools and resources.

I'm using Cursor right now for a project, and it is fun to watch it write code, test it, find errors, and backtrack to try different approaches until it comes up with a working solution.

So nearly all the elements of looping are there. Once we give it the curiosity to explore on its own, this self-improving loop will be closed, and rapid improvement can follow.

Julie Diebolt Price:

While I didn't understand everything you said, Finn, I believe I could make this work with some tutoring. Not that I want to, but I won't be left behind because of ignorance.

Finn Tropy:

Hi Julie, I apologize for the technical jargon. It may be interesting to some of my readers, but it doesn't effectively convey the deeper ideas and concepts.

As an engineer, I have built software that "learns" and adjusts based on external feedback. A simple temperature controller has this same "loop" structure: a sensor measures the environment temperature, and an algorithm calculates the difference from the target value and sends a command to a heater or cooling unit. This process keeps looping to maintain the temperature at the set value. It is possible to add more "intelligence", such as predicting future values, to make it more accurate.
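
In code, that whole idea fits in a few lines (a toy simulation with made-up numbers, no real hardware):

    # Toy simulation of the thermostat "loop" described above.
    target = 21.0        # desired temperature in °C
    temperature = 15.0   # simulated room temperature

    for step in range(10):
        error = target - temperature       # sensor reading vs. set point
        heater_power = 0.5 * error         # simple proportional correction
        temperature += 0.8 * heater_power  # simulated effect on the room
        print(f"step {step}: {temperature:.2f} °C")

Each pass through the loop shrinks the gap to the target, which is the same self-correcting pattern as the Note generator.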

When a toddler learns to ride a bicycle, they must learn to maintain balance by sensing acceleration while pedaling and making corrections by turning the handlebars. A similar "loop" is running in their brain, and after a while, their brain learns and adjusts, until riding becomes easy.

Training AI to learn how to write effective notes requires a similar "loop"—by creating a note, getting critique (or responses from readers), and adjusting accordingly. That is what my little experiment was about.

Julie Diebolt Price:

Very cool!

Finn Tropy:

This current version just loops and doesn't ask for feedback or approval. It writes intermediate notes and critique to files, so I can check the progress. It's quite slow; one loop takes about 11 minutes on my laptop.
