Sourced issue — exhibit data from the June 10, 2026 client workbook (Morgan Stanley Research, Byrd); anchored to the disclosed IREN–Microsoft lease
WAGGS TOKEN RATE OBSERVER
Interest rates priced money. Token rates price minds.
Vol. I · No. 004 · Tuesday, June 30, 2026 · Byline: Waggs
Output tokens (Azure-class): $10–15/M
Blackwell DC net margin: 58%
Revenue per MWh consumed: $3,386
Cost of that MWh: $100

Does anyone actually make money selling tokens?

Yes — the factory floor is wildly profitable, and getting more so. A real, contracted Microsoft data center clears a ~58% net margin selling inference on Blackwell chips. The same shell on Rubin makes 78%; on Feynman, 90%. The losses in AI live upstream, in training and in model-company P&Ls — not in the factory. A megawatt-hour bought for $100 leaves as $3,400 of tokens.
10 min read · 4 exhibits · 3 signals

For once we are not modeling a hypothetical. In a disclosed transaction, IREN — a bitcoin miner turned landlord — agreed to build a 200 MW (critical IT) data center, buy $5.8 billion of NVIDIA chips, and lease the whole thing to Microsoft for $9.7 billion over five years. One contract, two businesses: IREN earns a financed ~40% equity IRR being the landlord; Microsoft takes the chips and runs what the analyst calls an intelligence factory — capital in, tokens out. This issue walks the factory floor with a cost sheet, because the token rate (the price of a million units of machine thought) only means something against the cost of manufacturing one.

The framework: depreciation is the COGS of cognition. Electricity, the thing every headline worries about, is 8.1% of the factory's annual cost. The chip is everything else. A Blackwell server burns $10k of power and $40k of depreciation a year. So the token rate is not really a power price or a labor price — it is an amortization schedule with a markup. Keep that and every number below follows.

Exhibit 1
Anatomy of a 100 MW factory: where $3.24 billion goes
The GPUs are 48% of total capex — and with servers, networking and cabling, the silicon complex is ~74%. The building itself is a $13.5M-per-MW afterthought wrapped around a $19M-per-MW machine.
Workbook tab: "DC Example Math B100" — 52,209 GPUs at $30k; cooling $1,000/kW; shell at $500/sqft; networking/transceivers/DCI as multiples of physical overhead; $32.4M per MW all-in.
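The capex shares in Exhibit 1 can be checked from the quoted inputs alone. A minimal sketch in Python, using only the figures cited above; the function and variable names are illustrative and are not taken from the workbook.

```python
# Figures quoted in Exhibit 1 (workbook tab "DC Example Math B100").
GPU_COUNT = 52_209        # GPUs in the 100 MW example
GPU_UNIT_PRICE = 30_000   # dollars per GPU
TOTAL_CAPEX = 3.24e9      # dollars, all-in
SITE_MW = 100             # megawatts


def gpu_capex_share(gpu_count, unit_price, total_capex):
    """Fraction of total capex spent on GPUs alone."""
    return gpu_count * unit_price / total_capex


def capex_per_mw(total_capex, site_mw):
    """All-in capex per megawatt, in dollars."""
    return total_capex / site_mw


gpu_share = gpu_capex_share(GPU_COUNT, GPU_UNIT_PRICE, TOTAL_CAPEX)
per_mw = capex_per_mw(TOTAL_CAPEX, SITE_MW)

print(f"GPU spend:    ${GPU_COUNT * GPU_UNIT_PRICE / 1e9:.2f}B")
print(f"GPU share:    {gpu_share:.1%}")
print(f"Capex per MW: ${per_mw / 1e6:.1f}M")
```

The GPU line alone comes to about $1.57B, which is the 48% share and the $32.4M per MW quoted in the exhibit.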

Now the revenue side, and a crucial honesty the workbook deserves credit for: it prices tokens twice. Once under theoretically perfect conditions — every FLOP utilized, which yields comic numbers ($41B of revenue from one site, 95% margins) and is labeled "not realistic" by the analyst himself — and once in the real world, where measured throughput (SemiAnalysis: ~10,000 actual tokens/sec/GPU on Llama-70B against ~70,700 theoretical) implies the fleet converts only 14.1% of its paper FLOPs into shipped tokens. The honest case still prints money: serving GPT-4o-class queries at $2.50/M in, $10/M out, the 135,743-GPU Microsoft site produces $5.8 billion of annual revenue at a 58% net margin.

Exhibit 2
Spec sheet vs. factory floor — the 14.1% haircut
Six-sevenths of the theoretical token output never exists. Anyone underwriting AI paper off spec-sheet FLOPs is financing tokens that will never be minted — the 2026 version of counting railroad land grants as track.
Workbook tab: "Intel Factory Blackwell" — theoretical vs. GPU-seconds-per-query method; inference efficiency 14.1% per SemiAnalysis / NVIDIA InferenceMAX.
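The haircut is a single division. A short sketch, assuming only the two throughput figures quoted above (names are illustrative):

```python
# Throughput figures quoted in the text (per GPU, Llama-70B).
MEASURED_TOKENS_PER_SEC = 10_000      # approximate measured rate
THEORETICAL_TOKENS_PER_SEC = 70_700   # approximate spec-sheet rate


def inference_efficiency(measured, theoretical):
    """Share of theoretical token output that is actually produced."""
    return measured / theoretical


efficiency = inference_efficiency(MEASURED_TOKENS_PER_SEC,
                                  THEORETICAL_TOKENS_PER_SEC)
lost_share = 1 - efficiency

print(f"Real-world efficiency: {efficiency:.1%}")
print(f"Output never produced: {lost_share:.1%} (six-sevenths is {6 / 7:.1%})")
```

Both inputs are rounded in the text, so the result should be read as roughly 14%, not as a precise constant.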

Here is the part that should reorganize how you think about NVIDIA's roadmap. Hold the building constant — same 260 MW, same $100/MWh power, same Microsoft — and swap generations. The chips get fewer (135,743 Blackwells → 84,839 Rubins → 53,025 Feynmans), the power draw per box rises, and the economics go vertical:

Exhibit 3
The same factory, three generations of machine
Each generation roughly doubles the revenue of the same watt and adds 12 to 19 points of margin (58.3% to 77.7% to 89.8%). This is why token prices can fall 97% and the factories get more profitable, and why a 2026 Blackwell hall is a melting asset the day Rubin ships.
Workbook tabs: "Intel Factory Blackwell / Rubin / Feynman" — real-world (GPU-seconds) method, low-end model scenario; net margin from token sales: 58.3% / 77.7% / 89.8%.
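The step changes between generations can be read off the GPU counts and margins quoted in Exhibit 3. The sketch below is a rough cross-check, not the workbook's method. The implied revenue multiple rests on an assumption that is mine, not the analyst's: that the site's annual cost is the same in every generation, so the revenue multiple is just the ratio of cost shares.

```python
# GPU counts and net margins quoted in the text and Exhibit 3.
GENERATIONS = {
    "Blackwell": {"gpus": 135_743, "net_margin": 0.583},
    "Rubin": {"gpus": 84_839, "net_margin": 0.777},
    "Feynman": {"gpus": 53_025, "net_margin": 0.898},
}


def step_changes(generations):
    """Compare each generation with the one before it."""
    names = list(generations)
    rows = []
    for prev, cur in zip(names, names[1:]):
        p, c = generations[prev], generations[cur]
        rows.append({
            "step": f"{prev} to {cur}",
            "gpu_count_ratio": c["gpus"] / p["gpus"],
            "margin_points_added": (c["net_margin"] - p["net_margin"]) * 100,
            # ASSUMPTION (not from the workbook): annual cost is unchanged,
            # so revenue scales as the inverse of the cost share.
            "implied_revenue_multiple":
                (1 - p["net_margin"]) / (1 - c["net_margin"]),
        })
    return rows


steps = step_changes(GENERATIONS)
for row in steps:
    print(f"{row['step']}: {row['gpu_count_ratio']:.3f}x GPUs, "
          f"+{row['margin_points_added']:.1f} margin points, "
          f"~{row['implied_revenue_multiple']:.2f}x revenue if cost is flat")
```

Under the flat-cost assumption the multiples come out near 1.9x and 2.2x, consistent with "roughly doubles"; if chip costs differ by generation, the true multiples would differ as well.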

And the landlord's side of the same contract: IREN puts up $1.36B of equity (after $2.5B of 7% chip debt and a $1.94B Microsoft prepayment), collects $1.94B a year, and even assuming the chips are worth only 19 cents on the dollar after five years, clears a ~39.5% pre-tax equity IRR — rising to ~45% when the same trade is run on Rubin. The implied rent is $1.63 per GPU-hour. Both sides of one piece of paper, and both sides win at current token prices. That is what a genuinely scarce input looks like; it is also what a top looks like, which is the entire art of this cycle.
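The landlord's headline figures follow from the deal terms quoted above. A minimal sketch, assuming rent accrues evenly over the five years and every GPU is billed for all 8,760 hours of a year (my assumption, labeled in the code). It does not attempt the IRR, because the text does not give the timing of the cash flows or the debt schedule.

```python
# Deal terms as quoted in the text.
CHIP_COST = 5.8e9        # dollars of NVIDIA chips
CHIP_DEBT = 2.5e9        # dollars of chip debt
PREPAYMENT = 1.94e9      # dollars prepaid by Microsoft
LEASE_TOTAL = 9.7e9      # dollars over the lease term
LEASE_YEARS = 5
GPU_COUNT = 135_743
HOURS_PER_YEAR = 8_760   # ASSUMPTION: every hour of a 365-day year is billed


def net_equity(chip_cost, debt, prepayment):
    """Equity left to fund after debt and the tenant's prepayment."""
    return chip_cost - debt - prepayment


def annual_rent(lease_total, years):
    """Average rent per year, assuming level payments."""
    return lease_total / years


def rent_per_gpu_hour(rent_per_year, gpu_count, hours_per_year):
    """Implied rent per GPU-hour."""
    return rent_per_year / (gpu_count * hours_per_year)


equity = net_equity(CHIP_COST, CHIP_DEBT, PREPAYMENT)
rent = annual_rent(LEASE_TOTAL, LEASE_YEARS)
gpu_hour_rent = rent_per_gpu_hour(rent, GPU_COUNT, HOURS_PER_YEAR)

print(f"Net equity:        ${equity / 1e9:.2f}B")
print(f"Annual rent:       ${rent / 1e9:.2f}B")
print(f"Rent per GPU-hour: ${gpu_hour_rent:.2f}")
```

These reproduce the $1.36B of equity, $1.94B of annual rent, and $1.63 per GPU-hour cited in the paragraph.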

Exhibit 4
One contract, two businesses: the IREN–Microsoft machine
 | IREN (the landlord) | Microsoft (the factory)
Puts in | $5.8B of chips; $1.36B net equity after debt + prepayment | $9.7B of lease payments over 5 years
Gets out | $1.94B/yr of rent + 19% chip residual | 135,743 B100s' worth of token production
Unit price | $1.63 per GPU-hour | $2.50–3.00/M input, $10–15/M output tokens
Annual economics | Margin on lease $1.43B (yrs 1–2) | $5.8B revenue on $2.1B all-in cost
The return | ~39.5% pre-tax equity IRR (45% on Rubin) | ~58% net margin; $3,386 revenue per $100 MWh
The risk held | Chip residual value; refinancing | Token-price deflation; utilization below 75%
The lease splits the scissors from No. 001: IREN holds the depreciation risk, Microsoft holds the token-price risk. Both are currently paid; only one of them owns the customer.
Workbook tabs: "Intel Factory Blackwell/Rubin" Section I–II; deal terms per IREN's public disclosure (link in workbook and footer).
The other side — steelmanned

The 58–90% margins rest on assumptions that all lean friendly. 75% utilization is the big one — factories that run at 40% halve the answer. The pricing inputs ($10–15/M output) are list prices in a market where frontier rates have already fallen ~97% in three years; the model holds price fixed while costs fall, the most dangerous extrapolation in commodity history. The 14.1% efficiency figure is one benchmark on one open-source model. And mixture-of-experts architectures (the workbook's own Mistral column: 41B active parameters against GPT-4o's 1,700B) can produce 40x more tokens per FLOP — meaning the "same query" may soon be served from one-fortieth the silicon, by whoever chooses to start the price war.

If tokens reprice to marginal cost the way kerosene and steamboat fares did (No. 002), the factory margin compresses toward a utility return — and the $32M-per-MW capital stack is suddenly being amortized against utility revenues. The factory always makes money last; the question is whether it makes back the factory.
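The utilization caveat can be made concrete with a toy sensitivity. This is a sketch under two simplifying assumptions of my own, not the workbook's model: revenue scales linearly with utilization, and total cost stays fixed (reasonable only insofar as depreciation dominates and power is a small share). It starts from the quoted 58% margin at 75% utilization.

```python
BASE_UTILIZATION = 0.75   # utilization assumed in the text
BASE_NET_MARGIN = 0.58    # Blackwell net margin quoted in the text


def margin_at_utilization(utilization,
                          base_utilization=BASE_UTILIZATION,
                          base_margin=BASE_NET_MARGIN):
    """Net margin at a given utilization.

    ASSUMPTIONS (hypothetical, not from the workbook): revenue is
    proportional to utilization and total cost does not change.
    """
    if utilization <= 0:
        raise ValueError("utilization must be positive")
    base_cost_share = 1 - base_margin
    cost_share = base_cost_share * base_utilization / utilization
    return 1 - cost_share


def breakeven_utilization(base_utilization=BASE_UTILIZATION,
                          base_margin=BASE_NET_MARGIN):
    """Utilization at which the margin reaches zero under the same assumptions."""
    return base_utilization * (1 - base_margin)


for u in (0.75, 0.60, 0.40):
    print(f"utilization {u:.0%}: net margin {margin_at_utilization(u):.1%}")
print(f"breakeven utilization: {breakeven_utilization():.1%}")
```

On these assumptions a factory at 40% utilization earns about 53% of the base-case revenue and its margin drops from 58% to roughly 21%. Letting power cost fall with utilization would soften the drop slightly.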

Signals to watch
The token rate itself

Margins survive deflation only if volume outruns it. Watch frontier-class output pricing on Azure/OpenAI/Anthropic rate cards against the 22.7x annual usage growth in the OpenRouter data.

Trigger: frontier output < $5/M while Blackwell is still the workhorse
GPU-hour spot vs. contract

The IREN lease implies $1.63/GPU-hr for Blackwells. If spot rental rates slip materially below contracted rates, every neocloud lease in the deal sheet is above market.

Trigger: B-series spot < $1.30/GPU-hr sustained
The efficiency number

14.1% real-world inference efficiency is the margin's denominator. Better batching and speculative decoding raise it — bullish for factories, brutal for chip demand forecasts.

Trigger: InferenceMAX-style results > 25% — halve the GW forecast
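The arithmetic behind the efficiency trigger, as a rough sketch: if tokens per GPU scale linearly with real-world efficiency and token demand is held fixed (both assumptions of mine for illustration), the GPU fleet needed for the same output shrinks in proportion.

```python
CURRENT_EFFICIENCY = 0.141   # real-world inference efficiency quoted in the text
TRIGGER_EFFICIENCY = 0.25    # the trigger level named above


def gpu_requirement_ratio(current_efficiency, new_efficiency):
    """GPUs needed for the same token volume, relative to today.

    ASSUMPTION (hypothetical): tokens per GPU scale linearly with
    efficiency and total token demand is unchanged.
    """
    if new_efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return current_efficiency / new_efficiency


ratio = gpu_requirement_ratio(CURRENT_EFFICIENCY, TRIGGER_EFFICIENCY)
print(f"GPUs needed at 25% efficiency: {ratio:.1%} of today's fleet")
```

At the trigger level the same token volume needs a little over half the silicon, which is the sense in which the forecast roughly halves unless demand grows to fill the gap.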
The coda

A 90% margin on synthetic thought is the closest legal thing to seigniorage — and like all seigniorage it is really a tax on trust, paid by people who would rather buy cognition than admit they need it. Nobody who calls the API wants tokens. They want what the token stands in for: the report finished, the question answered, the feeling of having thought without the hours of thinking. The factory sells relief from the oldest scarcity, and prices it like electricity because that is the only cost anyone can see.

The Wedgwood lesson from the archive applies: when the cost of a luxury collapses, the margin migrates from making the thing to knowing what to do with it. The factories will keep their spread for a while — scarcity and contracts see to that. The durable fortune goes to whoever stands between the token and the customer and convinces them the thinking was theirs all along.

Waggs