Richard Golian

1995-born. Charles University alum. Head of Performance at Mixit. 10+ years in marketing and data.


Training an AI Agent That Learns Between Sessions

Richard Golian · 880 reads
Hi, I am Richard. On this blog, I share thoughts, personal stories — and what I am working on. I hope this article brings you some value.

The goal I set myself

I wanted to build an agent that does not just assist. One that acts.

The idea was straightforward: configure automations to retrieve data, let the agent analyse what it finds, have it propose next steps, send those proposals somewhere for review, and through that feedback loop — gradually improve. At a certain point, once its proposed steps consistently matched what I considered good decisions, it would stop waiting for approval and start executing on its own.
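As a sketch, that hand-over from review to autonomy can be expressed as a simple gate: the agent may act on its own only once its recent proposals have been approved at a high enough rate over a long enough track record. The threshold and window below are illustrative values, not numbers from my actual setup:

```python
APPROVAL_THRESHOLD = 0.95   # illustrative: share of proposals approved unchanged
MIN_TRACK_RECORD = 30       # illustrative: reviewed proposals needed before autonomy

def may_act_autonomously(history: list[bool]) -> bool:
    """history holds one entry per reviewed proposal:
    True where the reviewer approved it without changes."""
    if len(history) < MIN_TRACK_RECORD:
        return False                      # not enough evidence yet
    recent = history[-MIN_TRACK_RECORD:]  # judge only the recent track record
    return sum(recent) / len(recent) >= APPROVAL_THRESHOLD
```

The point of the gate is that autonomy is never configured in; it is a function of the agent's own reviewed history.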

Not a chatbot. Not a co-pilot. An autonomous system that earns its authority through demonstrated accuracy.

That was the goal. I wrote about parts of it in my previous article on local AI models. This is the next chapter.

WHAT I BUILT — AND WHAT IT COULD NOT DO

The first version was simple by design. But the interesting part was not what it did. The interesting part was what it could not do.

The agent runs on a schedule. It retrieves data, analyses it, and sends a report to Slack. To make sure the output was consistent, I created a schema — an approved format the agent checks itself against before sending anything. If something does not match, it corrects itself. It loops until the output passes. If something prevents it from completing the process — such as a failed LLM call — it does not send a degraded output. It sends an alert to Slack instead.
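The loop can be sketched like this. The schema check, the function names, and the retry count are simplified stand-ins, not my actual configuration:

```python
REQUIRED_KEYS = {"summary", "metrics", "proposed_actions"}  # hypothetical schema

def validate_report(report: dict) -> list[str]:
    """Return a list of schema violations; an empty list means the report passes."""
    errors = []
    missing = REQUIRED_KEYS - report.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if not isinstance(report.get("proposed_actions", []), list):
        errors.append("proposed_actions must be a list")
    return errors

def run_with_self_correction(generate, send_report, send_alert, max_retries=3):
    """Loop until the generated report passes the schema.
    On failure, alert instead of sending a degraded output."""
    feedback = None
    for attempt in range(max_retries):
        try:
            report = generate(feedback)       # e.g. an LLM call
        except Exception as exc:              # a failed call never becomes output
            send_alert(f"run failed: {exc}")
            return None
        errors = validate_report(report)
        if not errors:
            send_report(report)               # e.g. post to Slack
            return report
        feedback = f"attempt {attempt + 1} violated schema: {errors}"
    send_alert(f"gave up after {max_retries} attempts: {feedback}")
    return None
```

The key property is the asymmetry: valid output goes to the report channel, everything else goes to the alert channel, and nothing in between ever ships.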

I also added positive examples. Approved outputs from previous runs that the agent can reference when producing the next one.

This felt like a solid system. And for a while, I thought it was.

THE THING THAT KEPT BOTHERING ME

Every session starts from zero.

The schema is there. The examples are there. But the agent does not know what it struggled with yesterday. It does not know which rule it keeps violating. It does not know what it has already figured out.

And that changes everything.

The self-correction loop works within a single session. Between sessions, nothing accumulates. So the inconsistency I was seeing was not a configuration problem. It was not a prompting problem.

The problem was not technical. It was structural.

SELF-CORRECTION VS SELF-IMPROVEMENT

This is where I realised something important.

Self-correction means the agent catches its own errors before sending output. It happens inside one run, against a fixed schema. The session ends, and whatever the agent learned — disappears.

Self-improvement means the agent builds something across runs. Each session leaves a trace that the next session can use. Errors become rules. Rules become context. Context shapes the next output before generation even starts.

The first is a quality filter. The second is something closer to learning.

And this distinction is not just about AI agents. It is the difference between systems that repeat and systems that evolve. Between people who fix mistakes and people who stop making the same ones. Most organisations have self-correction. Very few have genuine self-improvement. The mechanism looks similar from the outside. The architecture underneath is completely different.

What I had was a good quality filter. What I was missing was the accumulation layer underneath it.

DOES CLAUDE CODE ALREADY HAVE PERSISTENT MEMORY?

This is a fair question — and one I had to work through myself.

Claude Code has a file called CLAUDE.md. It loads automatically at the start of every session. When you tell the agent to remember something for future runs, it can write it there. And next time, it will be there. That is real persistence. It is not an illusion.

So when Claude Code confirms it will remember something — it is not lying.

The problem is what "there" actually means in practice.


Summary

I wanted to build an autonomous AI agent that improves over time — not just one that corrects itself within a single session. The distinction is between self-correction and self-improvement. Claude Code's built-in memory has limits for agents that run daily. A structured memory layer changes what is possible.

Common questions about this topic

What is the difference between AI self-correction and self-improvement?
Self-correction means the agent catches errors within a single session — checking output against a schema and looping until it passes. When the session ends, everything learned is lost. Self-improvement means the agent builds knowledge across sessions: errors become rules, rules become context, and context shapes future output before generation even starts. In the article, this distinction is identified as the critical gap in current AI agent architectures — and the key to building systems that genuinely evolve.
What is CLAUDE.md and what are its limitations for AI agents?
CLAUDE.md is a file that Claude Code loads automatically at the start of every session, providing persistent memory across runs. When the agent is told to remember something, it writes to this file. The persistence is real — not an illusion. However, in the article, the limitation is identified: CLAUDE.md is a static, unstructured file. It does not organise itself, distinguish relevant from outdated entries, or manage its own growth. For an agent running daily over weeks, the file becomes noise rather than signal.
Why does every AI agent session start from zero?
Because current AI models have no built-in mechanism for accumulating experience between sessions. The context window is populated fresh each time. In the article, this is identified as the structural — not technical — problem: the agent does not know what it struggled with yesterday, which rules it keeps violating, or what it has already figured out. The self-correction loop works within a session. Between sessions, nothing persists unless explicitly stored.
What is a structured memory layer for AI agents?
A structured memory layer sits alongside static memory files and organises accumulated experience into categories the agent can reference selectively. Instead of loading everything into the context window every time, the agent retrieves only what is relevant to the current task. In the article, this is the solution being built: a system where errors become rules, rules become context, and the agent's behaviour improves measurably across sessions rather than resetting each time.
Can you run autonomous AI agents locally?
Yes, but with significant limitations. In the article and its predecessor on local AI models, the setup used a Mac Mini with Ollama and n8n. Basic pipelines worked: data retrieval, simple analysis, Slack alerts. But the context window limitations of local models made complex analysis impossible. For autonomous agents that need to process real-world data volumes and maintain quality over time, cloud models with larger context windows proved necessary.
What does it take to build an AI agent that earns autonomy?
In the article, the principle is that autonomy must be earned through demonstrated accuracy — not granted by default. The architecture starts with human review of every proposed action. As the agent's proposals consistently match good decisions, it gradually gains permission to act independently. This requires not just good single-session performance but genuine improvement over time — which is why the structured memory layer is essential. Without cross-session learning, the agent cannot build the track record needed to justify autonomous action.
Richard Golian

If you have any thoughts, questions, or feedback, feel free to drop me a message at mail@richardgolian.com.

NEWSLETTER
What I write about, what I am working on, what I learned.
Sent the first Sunday of the month. Unsubscribe anytime.

Related articles

Building an AI Stock Market Prediction System That Grades Itself

I am building an AI system to predict the S&P 500. It runs on my own machine, uses free public data — yfinance, FRED, the Shiller dataset — and grades every forecast against reality. This series documents the build itself: the decisions, the methodology, the mistakes. What I will eventually share from the running system is a separate question, and an honest one.

26 April 2026·612 reads
AI sales forecast: 9 traps so far

Yesterday I could not tear myself away from the computer. When I lifted my head, it was half past eight in the evening. I had been sitting alone upstairs for about three hours.

25 April 2026·585 reads
What AI Hides From You

Before you can teach AI to understand anything, you need to see what it is hiding from you.

11 April 2026·672 reads

More articles

Where the Money Goes When AI Takes the Work

Prague, 13 May 2026. On my way to work I started thinking about something that stayed with me for days. If most routine work on a computer disappears in the next ten years, and a large share of repetitive manual work disappears with it, what happens to the flow of money? Who pays whom for what? Which economic layers will exist, how large will they be, and what relationships will run between them? This is the six-layer map I sketched as an answer.

15 May 2026·102 reads
Will AI take my job?

Will AI take my job? A certified Google trainer told me in June 2024 that my profession would cease to exist. Twenty-two months later, my job title has not changed — but ninety percent of what I do during the day is different. I have delegated more of my thinking to AI agents than I thought possible. I am not afraid. This is why, and what it means for anyone asking the same question.

23 April 2026·366 reads
€50,000 Quote vs. Two Hours with Claude Code

One hour. Fifty-five minutes. That is how long it took to build what a Czech software firm had quoted at over €50,000. I built it with Claude Code. Not a prototype. Not a proof of concept. A working tool — the one the company actually needed. By the evening of the same day, it was running on staging. This is not about Claude Code. It is about what Claude Code exposes.

18 April 2026·722 reads
Is AI Making Us Dumber?

I have conducted roughly one hundred and fifty practical interviews over the past four years. Fifty for data specialist roles. A hundred for advertising and performance marketing specialists. Almost every one of them involved sitting down with a candidate over a practical task — something close to a real problem we actually need to solve at the company. Not theory. Not trivia. Applied problem-solving. Over time, I started noticing a pattern.

14 April 2026·673 reads
When Your AI Agent Joins the Team

The moment other people needed access to it, the problem changed completely. It was no longer about whether the agent could learn. It was about who gets to teach it.

8 April 2026·827 reads
Local AI Model Limitations: Why I Switched from Ollama to Claude for Autonomous Agents

This is what I learned about local vs cloud AI, and why I switched to Claude Code.

3 April 2026·1 477 reads
Slovakia's Economy in 2026

What happened — and how can it be reversed?

28 March 2026·1 342 reads
Full AI agents or fully offline.

Four days in Catalonia. No computer, no AI, almost no social media. I bought this notebook so that I could write down what I would think about, and what I would come across and learn on the trip.

10 May 2026·323 reads