AI sales forecast: 9 traps so far
Yesterday I could not tear myself away from the computer. When I lifted my head, it was half past eight in the evening. I had been sitting alone upstairs for about three hours.
I was teaching an AI agent that can work independently with data and code. The task: a short-term sales forecast — a predictive view of incoming orders and revenue.
The plan was simple.
Give the agent the data, the campaigns, the context, and let it forecast orders for the next thirty days, every morning. And teach it to understand why the number lands where it lands on a given day.
I decided to build this more robustly than this particular output strictly requires. The reason is broader than one prediction. Once the agent understands what revenue is made of — the tail of a season, a short-term push, an unexpected outage, the effect of overlapping campaigns — a whole field of possibilities opens up for what else I can put it to work on.
One thing was clear from the start. Throwing a pile of numbers at the agent is not enough. For the result to be usable, it has to understand the connections between them. It has to answer "if that seasonal campaign were not running, what would the chart look like?" It has to say "we are expecting this mid-month peak because a retention campaign ends in two weeks." It has to answer what-if questions and return believable simulations.
The goal is clear.
Another step toward the state where your AI agent joins the team. Get the agent to a level where someone else can say "fine, you take this over, I will do something else". It is not easy.
THE NAIVE FIRST VERSION
The starting position was this. The data warehouse keeps daily order aggregates. The project management tool stores campaigns with tags, start and end dates, types. The marketing plan provides year-on-year growth assumptions.
I gave it to the agent and it produced a formula:
baseline(2026-D) = actual(2025-D, weekday-aligned)
forecast(2026-D) = baseline × growth_target × campaign_multiplier
It pulled the multiplier (the number you multiply the baseline by to reflect a campaign's impact) from history: a day at the peak of a particular campaign historically showed some multiple of revenue compared with the state when no campaign was running. Seasonal holidays got a different multiple.
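In code, the first version boils down to very little. This is my own minimal sketch of the formula above, not the agent's actual implementation; the dictionary of daily actuals and the parameter names are illustrative assumptions:

```python
from datetime import date, timedelta

def naive_forecast(
    actual_2025: dict[date, float],  # daily revenue actuals from last year
    growth_target: float,            # year-on-year growth assumption, e.g. 1.12
    campaign_multiplier: float,      # historical lift of campaigns active that day
    day_2026: date,
) -> float:
    """forecast(2026-D) = baseline(2026-D) x growth_target x campaign_multiplier."""
    # Weekday-aligned baseline: 52 weeks back lands on the same weekday.
    aligned = day_2026 - timedelta(days=364)
    baseline = actual_2025[aligned]
    return baseline * growth_target * campaign_multiplier
```

Everything interesting hides inside `campaign_multiplier`, which is exactly where the first trap was waiting.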
At first glance it looked decent. Close enough to be worth tuning. I started building a dashboard so I could visualise the result while tuning it.
When I asked it to explain the logic and visualise the data, a several-hour battle began.
ROUND 1 — WHY WERE MY PROFILE MULTIPLIERS LYING?
The model flagged one of the campaigns as the strongest. That was wrong.
I wrote to it: "That is completely off. This thematic week is one of the weaker ones. The other campaign running in parallel has a much bigger impact."
The problem was in the baseline (the reference state against which campaign impact is measured). The multiplier was being computed as the ratio of (median of days when the campaign ran) to (median of other days). But "other days" included other parallel campaigns. The baseline was artificially inflated. The lift attribution (the increment in revenue assigned to a campaign) was distorted in both directions — some campaigns overstated, others understated, depending on which other campaigns happened to be running during their inactive days. In overlap periods — which is most of the year — the attribution was completely off.
After I objected, the agent rewrote the baseline definition to "median of days when no push campaign was running". But the result was not suitable as a starting point for analysis. There were few clean days. For some markets and weekdays I did not even have five examples. Campaigns overlap almost continuously.
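The flaw is easy to reproduce. Here is a rough sketch of both baseline definitions side by side; the data shape and field names are my illustration, not the agent's code:

```python
import statistics

def campaign_multiplier(days: list[dict], campaign: str, clean_baseline: bool) -> float:
    """days: [{'revenue': float, 'campaigns': set[str]}, ...] per calendar day."""
    on = [d["revenue"] for d in days if campaign in d["campaigns"]]
    if clean_baseline:
        # Corrected baseline: only days with no push campaign at all.
        # Problem: when campaigns overlap almost continuously, few days qualify.
        off = [d["revenue"] for d in days if not d["campaigns"]]
    else:
        # Naive baseline: all other days, including days when OTHER campaigns ran.
        # Those campaigns inflate the reference, so the measured lift is distorted.
        off = [d["revenue"] for d in days if campaign not in d["campaigns"]]
    return statistics.median(on) / statistics.median(off)
```

With a handful of synthetic days where a parallel campaign lifts the "off" days, the naive multiplier comes out well below the clean one, which is the distortion described above.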
ROUND 2 — WHY IS AD ATTRIBUTION ONLY THE TIP OF THE ICEBERG?
Then came the attempt to add more context. Measurable campaign impact via ad attribution (assigning orders to a specific ad) — conversions from the ad platforms.
Again, the agent could not interpret it correctly.
I wrote to it: "But you did not account for consent rate. How many people refuse cookies."
Through the ad platforms only a portion of orders gets matched. The rest goes through non-consent customers who refused cookies — they do not show up in the ad platforms, but they do in the order records. The agent knew about this gap, but did not include it in the prediction method.
After recalculation the campaign numbers rose to more realistic values.
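The simplest version of that recalculation is to scale platform-attributed orders up by the consent rate. A sketch under a strong assumption I am making explicit here: that consenting and non-consenting customers convert at similar rates.

```python
def consent_adjusted(attributed_orders: float, consent_rate: float) -> float:
    """Estimate total campaign-driven orders from platform-attributed ones,
    assuming non-consenting customers behave like consenting ones."""
    if not 0.0 < consent_rate <= 1.0:
        raise ValueError("consent_rate must be in (0, 1]")
    return attributed_orders / consent_rate
```

With a 60% consent rate, 600 attributed orders imply roughly 1,000 campaign-driven orders in the order records.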
And straight away we hit another layer.
ROUND 3 — HOW DOES ON-SITE COMMUNICATION CHANGE CONVERSION RATES?
I wrote to it: "A campaign does not have impact only through ads. When a campaign appears on the website, conversion lifts for everyone who arrives, not just clicks from ads. Including those from search referrals, direct, and so on."
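One way to express that objection is to split a campaign's total impact into two parts: the (consent-adjusted) orders the ads drive directly, plus the extra orders from everyone else converting slightly better while the campaign is visible on the site. This decomposition is my sketch of the idea, not the agent's final model, and every parameter name is an assumption:

```python
def campaign_impact(
    ad_orders_adjusted: float,  # consent-adjusted orders attributed to the ads
    other_sessions: float,      # sessions from search, direct, email, ...
    base_cr: float,             # conversion rate with no campaign on site
    onsite_cr_lift: float,      # multiplicative lift while the campaign is visible
) -> float:
    """Total incremental orders = ad-driven orders + the on-site conversion
    lift applied to all non-ad traffic that lands on the site anyway."""
    onsite_extra = other_sessions * base_cr * (onsite_cr_lift - 1.0)
    return ad_orders_adjusted + onsite_extra
```

Even a modest on-site lift matters: 50,000 non-ad sessions at a 2% base conversion rate and a 10% lift add about 100 orders on top of the ad-attributed ones.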
