Your AI bill just became personal

In partnership with

2026 State of AEO Report

A year ago, most marketers weren't thinking about AI search. Now it's one of the fastest moving channels in the industry and nobody has a playbook yet.

So we built one. We surveyed hundreds of marketers to find out how they're approaching answer engine optimization, where they're investing, what's actually working, and what isn't.

The result is the 2026 State of AEO Report. Real data. Real strategies. A clear picture of where AI search is headed and how to get ahead of it.

Download the free report

iPrompt

THE AI NEWSLETTER THAT TURNS NEWS INTO ACTION

ISSUE #138 WEDNESDAY · 3 JUNE 2026

THE HOOK

One developer ran a single AI coding session on Monday and burned through 8% of his monthly budget in two hours. He wasn't alone: that morning GitHub moved 4.7 million subscribers onto pay-per-token billing, and the bills started landing. Hours later, Microsoft shipped its own models to cut what that work costs it to run. The week AI got expensive is the week the bill became yours to read. Are you reading it?

AI NEWS ROUNDUP

This week in AI

1 GitHub Copilot started charging by the token. On 1 June, all 4.7 million paid Copilot subscribers moved from a flat fee to usage-based “AI Credits.” Code completions stay free, but anything agentic is now metered. Power users on Reddit and X projected bills rising 10× to 50× — one estimated his jumping from $50 to $3,000 a month. The old fallback to a cheaper model is gone. TechCrunch →

2 Microsoft shipped its own AI stack at Build. Hours after the billing switch, Satya Nadella unveiled seven in-house MAI models. Microsoft says MAI-Thinking-1, its first reasoning model, was trained without OpenAI data; MAI-Code-1-Flash, an efficiency-tuned coder it says uses 60% fewer tokens on hard tasks, is now live in the Copilot model picker. The point isn't that it beats GPT — it's that Microsoft now owns a coder it doesn't pay OpenAI to run. Microsoft →

3 Even Alphabet is borrowing to afford AI. Google's parent is raising $80bn in equity — including a $10bn private placement from Berkshire Hathaway — to fund compute it says demand is already outstripping. The reason your bill is going up sits here: running AI is so expensive that the richest firms in tech are raising money to keep the lights on. That cost flows downhill to you. CNBC →

4 The same billing change shipped a kill switch for your bill. Buried under the backlash: GitHub now lets orgs and enterprises set a hard user-level budget — flip on “stop usage when the limit is reached” and a surprise bill becomes impossible. One enthusiastic engineer running agents overnight can no longer drain the team's pool. The metering era has an off switch; most people haven't found it. GitHub →

OUR ANGLE

🔭 The same week the bill arrived, the discount went on sale

The fact: two stories, one day. Monday GitHub made the cost of AI personal — every token now has a price tag, and the safety net that quietly downgraded you to a cheaper model is gone. Tuesday Microsoft shipped cheaper models it owns outright. Same week, same problem, two directions.

What I infer: Microsoft is solving one problem from both ends. Internal numbers reported by Ed Zitron suggest GitHub's weekly cost of running Copilot had nearly doubled since January — flat-rate pricing was bleeding money because agents burn tokens like water. So it passes the cost to users (token billing) and cuts the cost itself (its own models, no OpenAI margin on top). And it won't stop at code — the same meter is coming for every heavy AI workflow, whether that's a research agent, a writing tool or an analysis copilot. The free trial is over because the trial was a subsidy, and the subsidy stopped paying.

My bet — against the herd: here's the part nobody's pricing in. Everyone's racing to meter — Copilot, Cursor and Claude Code have all converged on usage-based pricing. That makes the bill unpredictable, and unpredictability breeds anxiety. So by Q1 2027, at least one serious AI tool will bring back a true flat-rate “all-you-can-eat” tier and sell it on exactly that: no surprises. The economics now allow it — cheaper owned models (Microsoft's whole MAI play) plus quiet guardrails like budget routing to a budget model, smaller default context, and capped agent runs let a vendor offer a fixed price and still protect margin. When everyone zigs to the meter, predictable pricing becomes the premium product. If you think metering is permanent, reply and tell me why.

Go deeper: the metering era — and the flat-rate comeback nobody's pricing in →

THE THREE SPECIALS

Do · Use · Understand

🎯 PROMPT OF THE WEEK

The Token-Cost Auditor

Run this before your next AI bill lands. It turns a vague “am I overspending?” into a ranked list of which workflows to cap, batch, or move to a cheaper model — the exact thinking GitHub just forced on every Copilot user.

You are a cost-engineering analyst for AI tools.

I'll describe how I use an AI assistant in a typical

week. For each task, estimate relative token cost

(LOW / MEDIUM / HIGH) based on:

- input size (how much context I paste)

- output size (how long the response is)

- how often I re-run or iterate

Then give me:

- The 2 tasks burning the most tokens, and why

- For each: cap it, batch it, or downgrade the

model? Pick one and justify it.

- One task I should STOP sending to a frontier

model and route to a cheaper one with no real

quality loss.

- One habit (pasting whole files, no system

prompt, re-running blindly) that inflates my

bill - and the fix.

MY WEEKLY AI USAGE:

[list 6-10 tasks: what you ask, how big, how often]

Why it works: Token billing punishes three invisible habits — oversized context, runaway output, and blind re-runs. Naming the cost per task turns an abstract invoice into a queue of fixable decisions. Most of your spend hides in two or three workflows.

Where to be careful: Estimates are directional, not your real bill. Use them to rank, then check the actual usage dashboard for your top two before you change anything.

Tested on: Claude Opus 4.8 and Gemini 3.5 Flash — both gave usable triage.

🛠️ TOOL OF THE WEEK

OpenRouter

One API key, every model — so you route each task to the cheapest model that does the job.

★★★★ / 5

Use if: you pay for AI by the token and want to stop sending cheap tasks to expensive models. Skip if: you're locked into one vendor's IDE and never touch an API.

OpenRouter sits in front of hundreds of models — OpenAI, Anthropic, Google, with Microsoft's new MAI family announced for the platform this week — behind a single endpoint. You see live per-token prices side by side and switch models with one line. Set a cheap default, escalate to a frontier model only when a task needs it. That's the metering-era discipline in a single tool.

Describe it to a colleague: “It's a price-comparison switchboard for AI models — same code, cheapest engine that works.”

Live pricing makes the cost of each model visible before you commit a request.

Best use case: route your highest-volume, lowest-stakes task to a budget model this week and watch the spend drop. OpenRouter →

💡 TIP OF THE WEEK

This won't help if you only autocomplete — but if you run agents, your context window is now a line item.

Treat context like cash, not like memory

Under token billing, every token you feed the model — input, output, and cached — is metered. The instinct to “paste the whole repo so it has context” is now the instinct that drains your credits fastest. Three habits that cut spend without cutting quality:

Trim the input. Paste the function and its callers, not the file. The model rarely needs the other 800 lines.

Cap the output. Ask for the diff, not the rewritten file. “Show only changed lines” can halve a response.

Stop blind re-runs. Each “try again” is a fresh bill. Refine the prompt once instead of rolling the dice five times.

Why it works: Cost scales with tokens, and tokens scale with context. The flat-rate era rewarded dumping everything in; the metered era rewards precision. Same answers, a fraction of the spend.

Where it doesn't apply: genuinely cross-file reasoning — a refactor that touches ten files needs to see them. Pay for the context that earns its keep; cut the context you paste out of habit.

YOUR MOVE

One action. Reply by Friday.

This week in three lines:

The free-trial era of AI is over — GitHub Copilot now meters tokens, and agentic bills are jumping 10× to 50× for the people who use it most.

Microsoft shipped its own MAI models the same week — owning the cheapest stack is how a metered business survives, and it cuts the OpenAI dependency at the same time.

Every token you feed an AI is now a line item on a bill. Trim what you paste in, cap what you ask for back, and stop the blind re-runs.

Your one action: run the Token-Cost Auditor on your real week, then change one thing — cap a task, batch it, or route it to a cheaper model. Reply and tell me what you changed. That's the move that matters — I read every response, and the readers who actually act are the names I start to recognise.

Optional, after you've replied: there's a deep dive on the flat-rate comeback, and a forward link for anyone whose AI bill just doubled. Both are entirely optional — the reply is the only thing that matters this week.

—

R. Lauritsen

EDITOR · IPROMPT

P.S. Hit reply first — then, if you want the longer argument, the deep dive names the company I think is most likely to bring flat-rate back. But I'd rather hear what you changed than have you read it.

Forward iPrompt → For someone whose AI bill just doubled overnight.

Share the newsletter

The metering era — and the flat-rate comeback nobody’s pricing in

This week AI stopped being free to try. The major tools are converging on pay-per-token billing, and the bills are landing. But when a market races to the meter, it leaves something behind —

IPrompt - Your weekly AI intelligence brief • René Lauritsen

The largest IPO in history is coming. Where will all that liquid money go?

SpaceX just filed for an IPO valued at up to $1.75 trillion. When that much capital becomes liquid all at once, where it goes next is the big question.

Meanwhile, spring art auctions in NY cleared $2.5 billion, with 15+ new artist records.

Prized, physical assets with fixed and scarce supply. When the ultra-wealthy get liquid, it’s one of the markets they reach for to diversify.

Masterworks lets you into that art market without needing the nine figures. Its members invest in shares of blue-chip artwork by artists like Banksy, Basquiat and Warhol.

The track record to-date?

$1.3B deployed across 500+ artworks
29 sales to date
Net annualized returns like 16.5%, 17.6%, and 17.8%, not including those unsold*

Skip the waitlist to join Masterworks.

^{*Investing involves risk. Past performance is not indicative of future returns. See important disclosures at}^{masterworks.com/cd}^.

Your AI bill just became personal

2026 State of AEO Report

The largest IPO in history is coming. Where will all that liquid money go?

Recommended for you

IBM just told you who pays for AI

The Donor Economy: who's actually paying for the AI build-out

Google drew the new map. OpenAI isn't on it. A new agent standard launched

The rails, not the models — the bet enterprise AI just started making

$29bn walked in. A credit rating walked out. Same week, same trade, opposite verdicts. The build-out just met its bill.

Who pays for the machines? The balance-sheet mechanics of a $750bn build-out

Smarter by
Wednesday

Your AI bill just became personal

Sponsor:

2026 State of AEO Report

The largest IPO in history is coming. Where will all that liquid money go?

Recommended for you

IBM just told you who pays for AI

The Donor Economy: who's actually paying for the AI build-out

Google drew the new map. OpenAI isn't on it. A new agent standard launched

The rails, not the models — the bet enterprise AI just started making

$29bn walked in. A credit rating walked out. Same week, same trade, opposite verdicts. The build-out just met its bill.

Who pays for the machines? The balance-sheet mechanics of a $750bn build-out

Smarter byWednesday

Smarter by
Wednesday