Richard Golian

1995-born. Charles University alum. Head of Performance at Mixit. 10+ years in marketing and data.

Training an AI Agent That Learns Between Sessions

How AI agents learn between sessions
Richard Golian · 997 reads
Hi, I am Richard. On this blog, I share thoughts, personal stories, findings — and what I am working on. I hope this article brings you some value.

THE GOAL I SET MYSELF

I wanted to build an agent that does not just assist. One that acts.

The idea was straightforward: configure automations to retrieve data, let the agent analyse what it finds, have it propose next steps, send those proposals somewhere for review, and, through that feedback loop, gradually improve. At a certain point, once its proposed steps consistently matched what I considered good decisions, it would stop waiting for approval and start executing on its own.

Not a chatbot. Not a co-pilot. An autonomous system that earns its authority through demonstrated accuracy.
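To make "earns its authority" concrete, here is a minimal sketch of such a gate. It is an illustration only, not the system described in this article: the class name `AutonomyGate`, the window of 20 reviews and the 95% threshold are all assumptions chosen for the example.

```python
from collections import deque


class AutonomyGate:
    """Tracks recent review outcomes and decides whether the agent may act alone."""

    def __init__(self, window: int = 20, threshold: float = 0.95):
        # window and threshold are illustrative values, not taken from the article.
        self.threshold = threshold
        self.outcomes = deque(maxlen=window)  # True means the reviewer approved

    def record(self, approved: bool) -> None:
        """Store the reviewer's verdict on one proposed step."""
        self.outcomes.append(approved)

    def approval_rate(self) -> float:
        """Share of approved proposals in the current window (0.0 if empty)."""
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def may_act_autonomously(self) -> bool:
        # Require a full window so a few early approvals do not unlock autonomy.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.approval_rate() >= self.threshold
```

Because the window slides, a run of rejected proposals pulls the agent back under review; in this sketch authority is never granted permanently.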

That was the goal. I wrote about parts of it in my previous article on local AI models. This is the next chapter.

WHAT I BUILT — AND WHAT IT COULD NOT DO

The first version was simple by design. But the interesting part was not what it did. The interesting part was what it could not do.

The agent runs on a schedule. It retrieves data, analyses it, and sends a report to Slack. To make sure the output was consistent, I created a schema — an approved format the agent checks itself against before sending anything. If something does not match, it corrects itself. It loops until the output passes. If something prevents it from completing the process — such as a failed LLM call — it does not send a degraded output. It sends an alert to Slack instead.
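The loop described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the schema fields in `REQUIRED_FIELDS` are hypothetical, `generate`, `send_report` and `send_alert` are placeholder callables standing in for the LLM call and the Slack integration, and the `max_attempts` cap is my addition so the sketch cannot loop forever.

```python
from typing import Callable

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"summary": str, "findings": list, "next_steps": list}


def validate(report: dict) -> list[str]:
    """Return a list of schema violations; an empty list means the report passes."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in report:
            errors.append(f"missing field: {field}")
        elif not isinstance(report[field], expected):
            errors.append(f"{field} must be {expected.__name__}")
    return errors


def run_session(
    generate: Callable[[list[str]], dict],
    send_report: Callable[[dict], None],
    send_alert: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Generate, validate and retry; send an alert instead of a degraded report."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        try:
            # The previous violations are passed back so the agent can correct itself.
            report = generate(feedback)
        except Exception as exc:  # e.g. a failed LLM call
            send_alert(f"Run aborted: {exc}")
            return False
        feedback = validate(report)
        if not feedback:
            send_report(report)
            return True
    send_alert(f"Report failed validation after {max_attempts} attempts: {feedback}")
    return False
```

Note that `feedback` lives only inside `run_session`. Once the function returns, the list of violations is gone.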

I also added positive examples. Approved outputs from previous runs that the agent can reference when producing the next one.

This felt like a solid system. And for a while, I thought it was.

THE THING THAT KEPT BOTHERING ME

Every session starts from zero.

The schema is there. The examples are there. But the agent does not know what it struggled with yesterday. It does not know which rule it keeps violating. It does not know what it has already figured out.

And that changes everything.

The self-correction loop works within a single session. Between sessions, nothing accumulates. So the inconsistency I was seeing was not a configuration problem. It was not a prompting problem.

The problem was not technical. It was structural.

SELF-CORRECTION VS SELF-IMPROVEMENT

This is where I realised something important.

Self-correction means the agent catches its own errors before sending output. It happens inside one run, against a fixed schema. When the session ends, whatever the agent learned disappears.

Self-improvement means the agent builds something across runs. Each session leaves a trace that the next session can use. Errors become rules. Rules become context. Context shapes the next output before generation even starts.
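One way to picture "errors become rules, rules become context" is a small file that outlives the session. The sketch below is an assumption-laden illustration, not the implementation behind this article: the JSON file layout, the function names and the `min_count` threshold are all hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_counts(path: Path) -> Counter:
    """Read how often each violation has been seen in earlier sessions."""
    if path.exists():
        return Counter(json.loads(path.read_text(encoding="utf-8")))
    return Counter()


def record_session(path: Path, violations: list[str]) -> None:
    """At the end of a run, add this session's violations to the stored counts."""
    counts = load_counts(path)
    counts.update(violations)
    path.write_text(json.dumps(counts, indent=2), encoding="utf-8")


def rules_for_prompt(path: Path, min_count: int = 2) -> list[str]:
    """Turn repeated violations into rules to prepend to the next session's prompt."""
    counts = load_counts(path)
    return [
        f"Avoid: {violation} (seen {n} times)"
        for violation, n in counts.most_common()
        if n >= min_count
    ]
```

A violation seen once stays a data point. Only when it repeats does it become a rule that shapes the next output before generation starts.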

The first is a quality filter. The second is something closer to learning.

And this distinction is not just about AI agents. It is the difference between systems that repeat and systems that evolve. Between people who fix mistakes and people who stop making the same ones. Most organisations have self-correction. Very few have genuine self-improvement. The mechanism looks similar from the outside. The architecture underneath is completely different.

What I had was a good quality filter. What I was missing was the accumulation layer underneath it.

DOES CLAUDE CODE ALREADY HAVE PERSISTENT MEMORY?

This is a fair question — and one I had to work through myself.

Claude Code has a file called CLAUDE.md. It loads automatically at the start of every session. When you tell the agent to remember something for future runs, it can write it there. And next time, it will be there. That is real persistence. It is not an illusion.

So when Claude Code confirms it will remember something, it is not lying.

The problem is what "there" actually means in practice.

Summary

I wanted to build an autonomous AI agent that improves over time — not just one that corrects itself within a single session. The distinction is between self-correction and self-improvement. Claude Code's built-in memory has limits for agents that run daily. A structured memory layer changes what is possible.

Common questions on this article's topic

What is the difference between AI self-correction and self-improvement?
Self-correction means the agent catches errors within a single session — checking output against a schema and looping until it passes. When the session ends, everything learned is lost. Self-improvement means the agent builds knowledge across sessions: errors become rules, rules become context, and context shapes future output before generation even starts. In the article, this distinction is identified as the critical gap in current AI agent architectures — and the key to building systems that genuinely evolve.
What is CLAUDE.md and what are its limitations for AI agents?
CLAUDE.md is a file that Claude Code loads automatically at the start of every session, providing persistent memory across runs. When the agent is told to remember something, it writes to this file. The persistence is real — not an illusion. However, in the article, the limitation is identified: CLAUDE.md is a static, unstructured file. It does not organise itself, distinguish relevant from outdated entries, or manage its own growth. For an agent running daily over weeks, the file becomes noise rather than signal.
Why does every AI agent session start from zero?
Because current AI models have no built-in mechanism for accumulating experience between sessions. The context window is populated fresh each time. In the article, this is identified as the structural — not technical — problem: the agent does not know what it struggled with yesterday, which rules it keeps violating, or what it has already figured out. The self-correction loop works within a session. Between sessions, nothing persists unless explicitly stored.
What is a structured memory layer for AI agents?
A structured memory layer sits alongside static memory files and organises accumulated experience into categories the agent can reference selectively. Instead of loading everything into the context window every time, the agent retrieves only what is relevant to the current task. In the article, this is the solution being built: a system where errors become rules, rules become context, and the agent's behaviour improves measurably across sessions rather than resetting each time.
Can you run autonomous AI agents locally?
Yes, but with significant limitations. In the article and its predecessor on local AI models, the setup used a Mac Mini with Ollama and n8n. Basic pipelines worked: data retrieval, simple analysis, Slack alerts. But the context window limitations of local models made complex analysis impossible. For autonomous agents that need to process real-world data volumes and maintain quality over time, cloud models with larger context windows proved necessary.
What does it take to build an AI agent that earns autonomy?
In the article, the principle is that autonomy must be earned through demonstrated accuracy — not granted by default. The architecture starts with human review of every proposed action. As the agent's proposals consistently match good decisions, it gradually gains permission to act independently. This requires not just good single-session performance but genuine improvement over time — which is why the structured memory layer is essential. Without cross-session learning, the agent cannot build the track record needed to justify autonomous action.
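The structured memory layer mentioned in the answers above, where the agent retrieves only what is relevant to the current task, could look roughly like this. It is a sketch under assumptions: the category names, the `hits` counter and the `limit` parameter are hypothetical and are not confirmed details of the system in the article.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    category: str  # e.g. "formatting" or "data_quality" (hypothetical categories)
    rule: str
    hits: int = 1  # how many sessions have confirmed this rule


@dataclass
class StructuredMemory:
    entries: list[MemoryEntry] = field(default_factory=list)

    def add(self, category: str, rule: str) -> None:
        """Store a rule, or bump its counter if it is already known."""
        for entry in self.entries:
            if entry.category == category and entry.rule == rule:
                entry.hits += 1
                return
        self.entries.append(MemoryEntry(category, rule))

    def relevant(self, categories: set[str], limit: int = 5) -> list[str]:
        """Return only the rules for the given categories, most confirmed first."""
        matches = [e for e in self.entries if e.category in categories]
        matches.sort(key=lambda e: e.hits, reverse=True)
        return [e.rule for e in matches[:limit]]
```

The contrast with a single flat file is the `relevant` call: the context window receives a handful of rules selected for the task, not everything the agent has ever written down.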
If you have any thoughts, questions, or feedback, feel free to drop me a message at mail@richardgolian.com.
