Local AI Model Limitations: Why I Switched from Ollama to Claude for Autonomous Agents
I have been writing about AI since early 2023. Over that time, I have watched it change how I code, how I think about content, and how I think about the future of work.
This is a story about going one level deeper — from using AI as a tool to trying to build something autonomous on top of it. It did not work the way I expected.
WHY I TRIED RUNNING AI LOCALLY
Before I had any real experience with it, local AI seemed like the most interesting move I could make. Not just because of flexibility or security — although both mattered — but because it felt like the most honest way to approach the technology.
In the middle of everything happening around AI, actually running a model locally, configuring it, connecting it to data, and seeing where it breaks felt fundamentally different from using a polished cloud interface. It felt like the difference between using a tool and understanding how that tool actually works.
At the same time, I was not approaching it as a purely technical experiment. I had a clear use case in mind from the beginning.
The first area I wanted to apply this to was SEO. SEO is a documented, relatively exact discipline. It has structure, rules, patterns, and measurable outcomes. In theory, that makes it ideal for automation. An agent can scan hundreds of subpages in minutes, identify structural issues, and detect missing elements. If it also has access to search trend data, it can produce meaningful content recommendations.
That is not an abstract idea. That is a real workflow with clear business value.
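To make the structural part of that workflow concrete, here is a minimal sketch of a single-page check. It is not the pipeline I built. The class name `SeoAudit`, the function `audit_page`, and the four rules it applies (title, meta description, exactly one h1, image alt text) are illustrative assumptions, and it uses only Python's standard library.

```python
from html.parser import HTMLParser


class SeoAudit(HTMLParser):
    """Collects a few structural signals from one HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.has_title = False
        self.has_meta_description = False
        self.h1_count = 0
        self.images_without_alt = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self.has_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            # Count the description only if it has non-empty content.
            if (attributes.get("content") or "").strip():
                self.has_meta_description = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not (attributes.get("alt") or "").strip():
            self.images_without_alt += 1


def audit_page(html: str) -> list[str]:
    """Return a list of human-readable issues found in the page."""
    parser = SeoAudit()
    parser.feed(html)
    issues = []
    if not parser.has_title:
        issues.append("missing <title>")
    if not parser.has_meta_description:
        issues.append("missing meta description")
    if parser.h1_count != 1:
        issues.append(f"expected exactly one <h1>, found {parser.h1_count}")
    if parser.images_without_alt:
        issues.append(f"{parser.images_without_alt} image(s) without alt text")
    return issues
```

Checks like these are deterministic and cheap, so they do not need a model at all. The model only becomes necessary at the next step: weighing hundreds of such findings against each other and against trend data.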
The broader vision was more ambitious. I wanted to build an agent that retrieves data based on configured automations, proposes steps based on what it finds, sends those proposals somewhere for review, and through that feedback loop gradually improves. At a certain point, once its proposed steps consistently match what I consider good decisions, it would start executing those actions autonomously.
Not just assisting. Acting.
That was the goal.
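The hand-over from "proposing" to "acting" can be pictured as a simple gate. The sketch below is an assumption about how such a rule might look, not something I had running: `AutonomyGate`, the window of 20 reviews, and the 90% threshold are all hypothetical.

```python
from collections import deque


class AutonomyGate:
    """Tracks reviewer verdicts and decides when the agent may act alone."""

    def __init__(self, window: int = 20, threshold: float = 0.9):
        # Both values are illustrative assumptions, not tuned numbers.
        self.verdicts = deque(maxlen=window)
        self.threshold = threshold

    def record(self, approved: bool) -> None:
        """Store whether a reviewer accepted the agent's proposed step."""
        self.verdicts.append(approved)

    def approval_rate(self) -> float:
        """Share of approved proposals within the current window."""
        if not self.verdicts:
            return 0.0
        return sum(self.verdicts) / len(self.verdicts)

    def may_act_autonomously(self) -> bool:
        # Require a full window so a few early approvals are not enough.
        window_full = len(self.verdicts) == self.verdicts.maxlen
        return window_full and self.approval_rate() >= self.threshold
```

Because the window slides, a run of rejected proposals pulls the agent back into review mode without any extra logic.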
MAC MINI, OLLAMA, N8N
The setup itself was straightforward. I used a Mac Mini, ran a local model through Ollama, and handled basic orchestration via n8n.
Getting Ollama running was surprisingly simple. Much simpler than I expected. Within a short time, I had a model up, responding, and behaving like a chatbot. From a purely technical perspective, the barrier to entry was low.
Within a few hours, I had a basic pipeline in place. The model could retrieve data and run a basic marketing analysis, and I had a clear path toward automating Slack alerts based on the output. At that stage, everything felt promising. The system was working, and it was working locally.
What I did not yet fully understand was how quickly I would run into its limits.
Then I tested it on representative sample data designed to simulate real-world conditions.
THE CONTEXT WINDOW
This is where the real limitation became obvious.
The model could handle a few pages of text. It could process a small table or a dataset of a few kilobytes. Within that range, it behaved in a way that looked functional.
But the moment I gave it representative SEO data — the kind of volume you actually need to analyse if you want meaningful output — the system broke down.
It processed what fit into its context window and ignored the rest. It produced output that, on the surface, looked structured, but when you looked closer, it had almost no value. It would pick up a number somewhere in the data and repeat it back. It did not combine signals. It did not prioritise correctly. It did not understand relationships across the dataset.
And the reason was simple. It could not see enough of it.
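A rough way to see this coming is to estimate whether the input fits before sending it. The sketch below assumes about four characters per token, which is only a rule of thumb (real tokenizers differ by model and language). The function names and the 512-token output reserve are hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real tokenizers vary by model and language."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(text: str, context_tokens: int, reserved_for_output: int = 512) -> bool:
    """True if the prompt plausibly fits, leaving room for the model's answer."""
    return estimate_tokens(text) <= context_tokens - reserved_for_output


def split_into_chunks(
    rows: list[str], context_tokens: int, reserved_for_output: int = 512
) -> list[list[str]]:
    """Group rows into chunks that each stay within the estimated budget."""
    budget = context_tokens - reserved_for_output
    chunks, current, used = [], [], 0
    for row in rows:
        cost = estimate_tokens(row)
        # Start a new chunk when adding this row would exceed the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Chunking makes the overflow visible instead of silent, but it does not restore the missing view: each chunk is still analysed in isolation, so signals that only appear across the whole dataset stay invisible.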
I noticed this immediately during the first real analysis. The quality of the output was roughly comparable to what cloud models were producing in 2023. That is not a criticism of the model itself. It is a reflection of the constraints.
The problem was not configuration. It was not prompting. It was not lack of effort.
The hardware determined which model I could run. And the model I could run simply could not hold the amount of information required for the task.
WHAT AUTONOMOUS ACTUALLY MEANS
At this point, it became clear what "autonomous" actually requires in practice — and where the system was falling short.
An autonomous agent is not just a loop that calls a model repeatedly. It requires the ability to reason across a large amount of context, maintain coherence across multiple steps, and produce outputs that are precise enough to act on without constant supervision.
That means it needs to hold not just the current input, but the accumulated state of the entire workflow. What data was retrieved, what actions were proposed, what decisions were made, what failed, what succeeded, and what the overall objective is.
This is where the limitation becomes structural.
A model with a constrained context window cannot maintain that state. It cannot connect decisions across time. It cannot evaluate its own outputs in a meaningful way because it lacks visibility into the full process.
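The accumulated state described above can be pictured as a small data structure. This is a minimal sketch under my own assumptions: `WorkflowState`, `Step`, and the three outcome labels are hypothetical names, not part of any framework.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str
    outcome: str  # assumed labels: "proposed", "succeeded", or "failed"
    note: str = ""


@dataclass
class WorkflowState:
    objective: str
    retrieved: list[str] = field(default_factory=list)
    steps: list[Step] = field(default_factory=list)

    def record(self, action: str, outcome: str, note: str = "") -> None:
        """Append one proposed or executed step to the history."""
        self.steps.append(Step(action, outcome, note))

    def failures(self) -> list[Step]:
        """Steps that failed, so later decisions can take them into account."""
        return [s for s in self.steps if s.outcome == "failed"]

    def as_prompt(self) -> str:
        """Serialise the whole state into text the model would have to read."""
        lines = [f"Objective: {self.objective}"]
        lines += [f"Data: {item}" for item in self.retrieved]
        lines += [f"Step: {s.action} -> {s.outcome} {s.note}".rstrip() for s in self.steps]
        return "\n".join(lines)
```

The output of `as_prompt()` grows with every step. All of it has to fit into the context window alongside the actual data, which is exactly where a small local model runs out of room.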
The vision of the system was not the problem.
The infrastructure underneath it was.
SWITCHING TO CLAUDE CODE
At that point, I moved to a cloud-based solution and started working with Claude Code from Anthropic.
