Richard Golian

1995-born. Charles University alum. Head of Performance at Mixit. 10+ years in marketing and data.


Local AI Model Limitations: Why I Switched from Ollama to Claude for Autonomous Agents

Local AI agent: setup, limits, lessons
Richard Golian · 1 805 reads
Hi, I am Richard. On this blog, I share thoughts, personal stories, findings — and what I am working on. I hope this article brings you some value.

I have been writing about AI since early 2023. Over that time, I have watched it change how I code, how I think about content, and how I think about the future of work.

This is a story about going one level deeper — from using AI as a tool to trying to build something autonomous on top of it. It did not work the way I expected.

WHY I TRIED RUNNING AI LOCALLY

Before I had any real experience with it, local AI seemed like the most interesting move I could make. Not just because of flexibility or security — although both mattered — but because it felt like the most honest way to approach the technology.

In the middle of everything happening around AI, actually running a model locally, configuring it, connecting it to data, and seeing where it breaks felt fundamentally different from using a polished cloud interface. It felt like the difference between using a tool and understanding how that tool actually works.

At the same time, I was not approaching it as a purely technical experiment. I had a clear use case in mind from the beginning.

The first area I wanted to apply this to was SEO. SEO is a documented, relatively exact discipline. It has structure, rules, patterns, and measurable outcomes. In theory, that makes it ideal for automation. An agent can scan hundreds of subpages in minutes, identify structural issues, and detect missing elements. If it also has access to search trend data, it can produce meaningful content recommendations.
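To make "identify structural issues, detect missing elements" concrete, here is a minimal sketch of a per-page check using only the Python standard library. The three checks (title, meta description, h1 count), the function names, and the sample pages are illustrative assumptions, not the checks from the actual project.

```python
from html.parser import HTMLParser


class SeoAudit(HTMLParser):
    """Collects the few structural signals this sketch checks for."""

    def __init__(self):
        super().__init__()
        self.has_title = False
        self.has_meta_description = False
        self.h1_count = 0

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "title":
            self.has_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attr_map.get("name") or "").lower() == "description":
            # Count the description only if it carries non-empty content.
            if (attr_map.get("content") or "").strip():
                self.has_meta_description = True


def audit_page(html):
    """Return a list of structural issues found in one page's HTML."""
    parser = SeoAudit()
    parser.feed(html)
    issues = []
    if not parser.has_title:
        issues.append("missing <title>")
    if not parser.has_meta_description:
        issues.append("missing meta description")
    if parser.h1_count == 0:
        issues.append("missing <h1>")
    elif parser.h1_count > 1:
        issues.append("multiple <h1> elements")
    return issues


# Hypothetical sample pages standing in for crawled subpages.
pages = {
    "/good": "<html><head><title>Example</title>"
             '<meta name="description" content="A short summary."></head>'
             "<body><h1>Example</h1></body></html>",
    "/bad": "<html><head></head><body><h1>A</h1><h1>B</h1></body></html>",
}
report = {url: audit_page(html) for url, html in pages.items()}
```

Checks like these are deterministic and cheap. The model only becomes necessary at the next step, when the findings from hundreds of pages have to be weighed against each other.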

That is not an abstract idea. That is a real workflow with clear business value.

The broader vision was more ambitious. I wanted to build an agent that retrieves data based on configured automations, proposes steps based on what it finds, sends those proposals somewhere for review, and through that feedback loop gradually improves. At a certain point, once its proposed steps consistently match what I consider good decisions, it would start executing those actions autonomously.

Not just assisting. Acting.

That was the goal.
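The review loop described above, where the agent proposes, a human reviews, and autonomy is unlocked once proposals consistently match, could be gated roughly like this. It is a sketch under assumptions: the class name, the rolling window, and the threshold values are hypothetical and were not part of the original setup.

```python
from collections import deque


class AutonomyGate:
    """Tracks how often reviewed proposals were approved as proposed."""

    def __init__(self, window=50, threshold=0.95):
        # window and threshold are illustrative values, not from the article.
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record_review(self, approved):
        """Store the outcome of one human review (True = approved as proposed)."""
        self.results.append(bool(approved))

    def approval_rate(self):
        """Share of approved proposals within the current window."""
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)

    def may_act_autonomously(self):
        # Require a full window so a few early approvals do not unlock autonomy.
        full = len(self.results) == self.results.maxlen
        return full and self.approval_rate() >= self.threshold


gate = AutonomyGate(window=5, threshold=0.8)
for outcome in [True, True, False, True, True]:
    gate.record_review(outcome)
```

Because the window is rolling, a run of rejected proposals pushes the agent back into review mode instead of leaving it permanently trusted.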

MAC MINI, OLLAMA, N8N

The setup itself was straightforward. I used a Mac Mini, ran a local model through Ollama, and handled basic orchestration via n8n.

Getting Ollama running was surprisingly simple. Much simpler than I expected. Within a short time, I had a model up, responding, and behaving like a chatbot. From a purely technical perspective, the barrier to entry was low.

Within a few hours, I had a basic pipeline in place. The model was able to retrieve data, run a basic marketing analysis, and I had a clear path toward automating alerts into Slack based on the output. At that stage, everything felt promising. The system was working, and it was working locally.
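The orchestration here was done in n8n, so the following Python is only an illustration of the same flow in code: send a prompt to the local Ollama HTTP endpoint, then forward the answer to a Slack incoming webhook. The model name, the prompt wording, the `num_ctx` value, and the function names are assumptions.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_generate_payload(prompt, model="llama3.1", num_ctx=4096):
    """Request body for a single non-streaming completion."""
    return {
        "model": model,                   # assumption: a model already pulled locally
        "prompt": prompt,
        "stream": False,                  # one JSON object instead of a token stream
        "options": {"num_ctx": num_ctx},  # context window size in tokens
    }


def post_json(url, payload, timeout=120):
    """POST a JSON body and return the raw response body as text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def analyse_and_alert(data_summary, slack_webhook_url):
    """Ask the local model for an analysis, then forward it to Slack."""
    prompt = f"Analyse this marketing data and list anomalies:\n{data_summary}"
    body = post_json(OLLAMA_URL, build_generate_payload(prompt))
    analysis = json.loads(body)["response"]
    # Slack incoming webhooks accept a JSON body with a "text" field.
    post_json(slack_webhook_url, {"text": analysis})
    return analysis


# Building the payload needs no running server; analyse_and_alert does.
payload = build_generate_payload("ping")
```

Calling `analyse_and_alert` requires a running Ollama instance and a real webhook URL, so it is defined here but not invoked.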

What I did not yet fully understand was how quickly I would run into its limits.

Then I tested it on representative sample data designed to simulate real-world conditions.

THE CONTEXT WINDOW

This is where the real limitation became obvious.

The model could handle a few pages of text. It could process a small table, or a dataset of a few kilobytes. Within that range, it behaved in a way that looked functional.

But the moment I gave it representative SEO data — the kind of volume you actually need to analyse if you want meaningful output — the system broke down.

It processed what fit into its context window and ignored the rest. It produced output that, on the surface, looked structured, but when you looked closer, it had almost no value. It would pick up a number somewhere in the data and repeat it back. It did not combine signals. It did not prioritise correctly. It did not understand relationships across the dataset.

And the reason was simple. It could not see enough of it.
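A rough way to see how little of a dataset a small context window covers is to estimate tokens per row and count how many rows fit. The sketch below assumes about four characters per token, a 4,096-token window, and a synthetic 5,000-row export. All three are illustrative assumptions, not measurements from the actual setup.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; real tokenizers vary by model and language."""
    return max(1, len(text) // chars_per_token)


def visible_rows(rows, context_tokens, reserved_tokens=0):
    """Return the leading rows that fit into the token budget, in order."""
    budget = context_tokens - reserved_tokens
    kept, used = [], 0
    for row in rows:
        cost = estimate_tokens(row)
        if used + cost > budget:
            break  # everything from here on is never seen by the model
        kept.append(row)
        used += cost
    return kept


# Hypothetical SEO export: 5,000 rows of "url,clicks,impressions,position".
rows = [f"/product/{i:04d},120,3400,7.2" for i in range(5000)]

# Reserve part of the window for instructions and the model's answer.
seen = visible_rows(rows, context_tokens=4096, reserved_tokens=512)
coverage = len(seen) / len(rows)
```

Under these assumptions the model sees 597 of 5,000 rows, roughly 12 percent of the data. Any pattern that only shows up across the remaining rows is invisible to it.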

I noticed this immediately during the first real analysis. The quality of the output was roughly comparable to what cloud models were producing in 2023. That is not a criticism of the model itself. It is a reflection of the constraints.

The problem was not configuration. It was not prompting. It was not lack of effort.

The hardware determined which model I could run. And the model I could run simply could not hold the amount of information required for the task.

WHAT AUTONOMOUS ACTUALLY MEANS

At this point, it became clear what "autonomous" actually requires in practice — and where the system was falling short.

An autonomous agent is not just a loop that calls a model repeatedly. It requires the ability to reason across a large amount of context, maintain coherence across multiple steps, and produce outputs that are precise enough to act on without constant supervision.

That means it needs to hold not just the current input, but the accumulated state of the entire workflow. What data was retrieved, what actions were proposed, what decisions were made, what failed, what succeeded, and what the overall objective is.

This is where the limitation becomes structural.

A model with a constrained context window cannot maintain that state. It cannot connect decisions across time. It cannot evaluate its own outputs in a meaningful way because it lacks visibility into the full process.
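The accumulated state listed above can be pictured as one record that is serialised into every model call. The field names and the prompt layout in this sketch are hypothetical. The point is only that the text grows with every step of the workflow.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    """Everything the agent must carry from step to step (hypothetical shape)."""
    objective: str
    retrieved: list = field(default_factory=list)
    proposals: list = field(default_factory=list)
    decisions: list = field(default_factory=list)
    failures: list = field(default_factory=list)

    def to_prompt(self):
        """Serialise the full state into the text sent with the next model call."""
        sections = [
            ("OBJECTIVE", [self.objective]),
            ("RETRIEVED", self.retrieved),
            ("PROPOSALS", self.proposals),
            ("DECISIONS", self.decisions),
            ("FAILURES", self.failures),
        ]
        return "\n".join(
            f"{name}:\n" + "\n".join(f"- {item}" for item in items)
            for name, items in sections
        )


state = WorkflowState(objective="Improve organic traffic to product pages")
sizes = []
for step in range(3):
    # Each step adds to the history that later steps must still be able to see.
    state.retrieved.append(f"crawl batch {step}: 200 pages")
    state.proposals.append(f"rewrite titles in batch {step}")
    state.decisions.append(f"batch {step}: approved")
    sizes.append(len(state.to_prompt()))
```

`sizes` increases with every step. Once the serialised state no longer fits into the context window, earlier decisions have to be dropped or summarised, and that is where coherence across steps is lost.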

The vision of the system was not the problem.

The infrastructure underneath it was.

SWITCHING TO CLAUDE CODE

At that point, I moved to a cloud-based solution and started working with Claude Code from Anthropic.


Summary

I tried building an autonomous AI agent locally — Mac Mini, Ollama, n8n. The context window limitations made meaningful analysis impossible. This is what I learned about local vs cloud AI, and why I switched to Claude Code.

Common questions on this article's topic

What is the difference between running AI locally and using cloud AI?
Local AI runs on your own hardware — giving you full control over data and no recurring API costs, but with significant limitations in processing power and context window size. Cloud AI (like Claude or GPT-4) runs on remote servers with far larger models, longer context windows, and better reasoning capabilities, but requires sending data externally and paying per usage. In the article, local AI was initially chosen for privacy and control, but its limitations forced a switch to cloud.
What is a context window and why does it matter?
The context window is the amount of text an AI model can process in a single interaction — analogous to how much of a document it can see at once. Local models typically have much smaller context windows than cloud models. In the article, this was the critical limitation: when given real-world SEO data volumes, the local model could only process what fit in its window and ignored the rest, producing output that looked structured but had almost no analytical value.
What is Ollama and how easy is it to set up?
Ollama is an open-source tool that allows users to run large language models locally on their own hardware. In the article, setup is described as surprisingly simple — within a short time, a model was running and responding on a Mac Mini. The barrier to entry was low from a technical perspective. The problems emerged only when the model was tasked with processing real-world data volumes that exceeded its context window capacity.
Can local AI models handle real business data analysis?
In the article, the answer is not yet — at least not for complex, multi-dimensional analysis. The local model could handle small datasets and simple queries. But when given representative SEO data at production scale, it broke down: processing only what fit in its context window, picking up isolated numbers without understanding relationships, and producing output comparable to cloud models from 2023. The gap between local and cloud capability remains significant.
What is an autonomous AI agent?
An autonomous AI agent is a system that retrieves data, proposes actions based on what it finds, learns from feedback, and eventually executes decisions independently. In the article, the goal was to build such an agent for SEO: scanning subpages, identifying issues, proposing content recommendations, and gradually improving through a feedback loop until it could act without human intervention. The vision was not just AI assisting — but AI acting.
Should developers start with local AI or cloud AI?
In the article, starting locally provided valuable hands-on understanding of how models actually work — the difference between using a polished interface and understanding the underlying technology. However, for production use cases requiring complex reasoning and large data volumes, cloud AI was necessary. The practical recommendation is: experiment locally to build understanding, but use cloud models for real business applications where quality and context capacity matter.

If you have any thoughts, questions, or feedback, feel free to drop me a message at mail@richardgolian.com.
