<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The AI Realist]]></title><description><![CDATA[Practical AI for builders, operators, and investors.]]></description><link>https://www.airealist.ai</link><image><url>https://substackcdn.com/image/fetch/$s_!u6cR!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924ecf6b-2ddb-4f24-a3bd-89ae62c7c1dc_800x800.png</url><title>The AI Realist</title><link>https://www.airealist.ai</link></image><generator>Substack</generator><lastBuildDate>Tue, 26 May 2026 07:18:43 GMT</lastBuildDate><atom:link href="https://www.airealist.ai/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Julien Simon]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[julsimon@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[julsimon@substack.com]]></itunes:email><itunes:name><![CDATA[Julien Simon]]></itunes:name></itunes:owner><itunes:author><![CDATA[Julien Simon]]></itunes:author><googleplay:owner><![CDATA[julsimon@substack.com]]></googleplay:owner><googleplay:email><![CDATA[julsimon@substack.com]]></googleplay:email><googleplay:author><![CDATA[Julien Simon]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Huawei Can’t Buy EUV. It Says It Doesn’t Need To.]]></title><description><![CDATA[A new &#8220;scaling law&#8221; reframes the chip race from space to time. The export-control wall was built to deny the first. 
It has no answer for the second.]]></description><link>https://www.airealist.ai/p/huawei-cant-buy-euv-it-says-it-doesnt</link><guid isPermaLink="false">https://www.airealist.ai/p/huawei-cant-buy-euv-it-says-it-doesnt</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Mon, 25 May 2026 11:16:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!uI1I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uI1I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uI1I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 424w, https://substackcdn.com/image/fetch/$s_!uI1I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 848w, https://substackcdn.com/image/fetch/$s_!uI1I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 1272w, https://substackcdn.com/image/fetch/$s_!uI1I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uI1I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png" width="1264" height="848" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:848,&quot;width&quot;:1264,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2678340,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/199173698?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uI1I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 424w, https://substackcdn.com/image/fetch/$s_!uI1I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 848w, https://substackcdn.com/image/fetch/$s_!uI1I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 1272w, https://substackcdn.com/image/fetch/$s_!uI1I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24bb21d6-3fdd-4ab7-a6e2-5522dbfef869_1264x848.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>On Monday in Shanghai, He Tingbo &#8212; president of Huawei&#8217;s semiconductor business and chair of its Scientist Committee &#8212; stood in front of a room at the IEEE International Symposium on Circuits and Systems and told the industry it had spent six decades optimizing for the wrong thing.[1] Her keynote, &#8220;New Semiconductor Path in Practice,&#8221; proposed retiring the principle that has organized the entire business since 1965: shrink the transistor, double the count, repeat. 
In its place, she offered the &#8220;Tau (&#964;) Scaling Law&#8221; &#8212; already nicknamed &#8220;Her&#8217;s Law&#8221; by her peers &#8212; which optimizes not for how small a transistor is but for how fast a signal moves through the chip.[2] &#8220;I used to think it may take us 10 years,&#8221; she told the room, &#8220;but six years, we are here.&#8221;[1]</p><p>The French wire that crossed my desk called it &#8220;un nouveau mode de fabrication de puces&#8221; &#8212; a new way of manufacturing chips.[3] It is not. That distinction is the entire story, and getting it wrong is how a reader ends up either over- or under-pricing what just happened.</p><p>Huawei did not announce a manufacturing breakthrough. It announced a <em>design</em> breakthrough, executed on the manufacturing it already has. The company is process-constrained: its chips lean on SMIC&#8217;s roughly 7-nanometre-class nodes, several generations behind the 3nm-class processes feeding Apple, Qualcomm, and AMD &#8212; and behind the 2nm that TSMC is now ramping.[4] What it claims to have built is a way to extract frontier-class transistor density from trailing-edge fabrication &#8212; by changing the layout, not the lithography.</p><p>The mechanism is called <strong>LogicFolding</strong>. In a conventional layout, logic blocks sprawl across a mostly flat plane. The limiting factor is increasingly not how fast the transistors switch, but how long it takes a signal to cross the long, resistive wires between them &#8212; a delay that caps the clock and wastes energy driving the interconnect. 
LogicFolding &#8220;folds&#8221; the logic &#8212; expanding the layout from one layer to two &#8212; pulling critical paths closer together, shortening the wiring, cutting propagation delay, and packing more transistors into the same footprint.[5] Huawei says the fall 2026 Kirin gains 53.5% in transistor density, to 238 million transistors per square millimeter, alongside a 40% jump in performance-core power efficiency and a 3.1GHz top clock.[6] That density figure sits, on paper, near Intel&#8217;s 18A and TSMC&#8217;s 3nm.[7] The phone is only the showcase: Huawei frames the same time-scaling logic as running up through its UnifiedBus interconnect to the AI clusters, where it is trying to displace Nvidia.[8]</p><p>This is neither vaporware nor a triumph &#8212; it is a genuine engineering idea aimed at a real bottleneck. The honest steelman comes from Omdia&#8217;s semiconductor research director, He Hui, who calls it a shift from node-driven scaling to &#8220;system-level efficiency scaling&#8221; &#8212; in his view, a credible way to wring more performance out of constrained lithography.[9] Interconnect delay is genuinely the dominant frontier problem, and stacking silicon to address it is not new: HBM has stacked memory since 2015, and TSMC and Intel have stacked finished dies with SoIC and Foveros. What Huawei claims is harder and less proven &#8212; folding a single logic block&#8217;s own gates across two bonded tiers so signals take a short vertical hop instead of a long planar route. That is logic-on-logic at the cell level, the territory the whole industry has been circling for a decade, because the thermal and yield problems are brutal. The difference from the rivals&#8217; version is that theirs still rides on leading-edge fabs Huawei cannot buy.</p><p>But density on a slide is not density at competitive yield, power, and thermals. 
The most important sentence published all day came from Paul Triolo of DGA Group: a stacked or folded design can produce genuine density gains, he said, but it &#8220;does not mean Huawei has solved&#8221; the yield, power, thermal, and device-performance problems of true 1.4nm-class manufacturing.[10] Counterpoint&#8217;s Neil Shah was blunter on the strategic point: this &#8220;parallel semiconductor path is still unproven at scale.&#8221;[11] And the headline number &#8212; a transistor density &#8220;equivalent to&#8221; a 1.4nm process &#8212; is not a 2026 result. It is a 2031 projection, unaccompanied by any independent performance data.[12] Stacking buys you density. It does not automatically buy you the efficiency that makes density useful at the frontier.</p><p>So strip the projection away and look at what actually shipped: a strategic reframe. The export-control regime was architected on the premise of manufacturing. Deny China extreme ultraviolet lithography &#8212; the ASML machines no Chinese firm can legally buy &#8212; cap it at 7nm, and the density frontier stays out of reach.[13] The wall is real, and it works against the thing it was built to stop. What it cannot do is stop Huawei from deciding that the frontier is no longer defined by the dimension the wall measures. If the goal is signal-propagation time rather than transistor pitch, then a control regime denominated in nanometres is policing a metric the target has stopped competing on. Even Triolo, who doubts the manufacturing claim, reads the move this way: Huawei is &#8220;turning an engineering strategy into a quasi-&#8216;law&#8217;&#8221; &#8212; shorten wires, stack logic, co-design the whole system.[10]</p><p>The reframe does not entirely escape the wall. A folded design still has to be finalized for production on EDA software, where America&#8217;s Synopsys and Cadence dominate, and fabricated on SMIC&#8217;s constrained base node. 
Huawei now claims home-grown design tools, but domestic EDA at the leading edge is unproven &#8212; and Washington showed in 2025 that it can switch the EDA tap off at will, before it relented.[13] The dependency is real. It simply no longer sits where the lithography rules are pointed.</p><p>The market saw the same thing even where the engineering is unproven: SMIC shares rose 7.6% on the news.[14] And the competitive backdrop sharpens it &#8212; last week, Nvidia&#8217;s Jensen Huang told CNBC his company had &#8220;largely conceded&#8221; China&#8217;s AI chip market to Huawei.[15] The Tau Law is a flag planted in the ground that Nvidia is vacating.</p><p>None of this means Huawei has closed the gap. It almost certainly has not, and the skeptics may be entirely right that folding logic across bonded tiers hits a thermal-and-yield ceiling well short of the 2031 target. But the bet is now legible, and it is falsifiable on a clock. The first checkpoint is this autumn, when the new Kirin ships and an independent teardown can confirm or puncture the density claim. The test is not whether the design works on a slide but whether Huawei can build it in volume without the chips failing &#8212; the gap, in stacked designs, where ambition usually dies.[16] The second checkpoint is 2031. If either one lands at competitive efficiency without EUV, Washington is left writing its rules in a unit that no longer measures the race.</p><p>The wall was built to keep China from making the transistors smaller. Huawei&#8217;s answer is to stop trying.</p><div><hr></div><h3>Notes</h3><p>[1]: He Tingbo delivered the keynote &#8220;New Semiconductor Path in Practice&#8221; at the 2026 IEEE International Symposium on Circuits and Systems (ISCAS), Shanghai, May 25, 2026. 
Huawei newsroom, &#8220;HUAWEI Presents the Tau (&#964;) Scaling Law, Enabling Breakthroughs in Transistor Density and System Performance,&#8221; May 25, 2026, <a href="https://www.huawei.com/en/news/2026/5/ieee-iscas-tau-scaling">huawei.com</a>. Vendor-primary source. The &#8220;I used to think it may take us 10 years, but six years we are here&#8221; remark, and a teaser that Huawei would &#8220;bring the surprise&#8221; before winter 2026, are reported from the keynote by BusinessToday, <a href="https://www.businesstoday.in/technology/artificial-intelligence/story/huawei-unveils-new-chip-architecture-claims-path-to-1-4nm-equivalent-processors-by-2031-533106-2026-05-25">businesstoday.in</a>.</p><p>[2]: The principle &#8220;proposes replacing geometric scaling with time (&#964;) scaling as a new guiding principle for the evolution of both semiconductors and electronic systems.&#8221; Huawei newsroom, ibid. The &#8220;Her&#8217;s Law&#8221; nickname (a play on He Tingbo&#8217;s surname and the convention of naming foundational laws after their originators, as with Moore&#8217;s Law) is reported by the South China Morning Post, &#8220;Huawei unveils new scaling law and tech that narrows gap with TSMC, Samsung,&#8221; May 25, 2026, <a href="https://www.scmp.com/tech/article/3354710/huawei-unveils-new-scaling-law-and-tech-can-develop-14-nm-equivalent-chips-2031">scmp.com</a>. &#964; (tau) is the time constant engineers use to describe how quickly signals propagate through a circuit.</p><p>[3]: &#8220;Huawei a d&#233;velopp&#233; un nouveau mode de fabrication de puces,&#8221; Boursorama (reproducing an AFP wire), May 25, 2026, <a href="https://www.boursorama.com/bourse/actualites/huawei-a-developpe-un-nouveau-mode-de-fabrication-de-puces-b3d3e0de58f1842bb8b0c0e62231667f">boursorama.com</a>. 
The &#8220;fabrication&#8221; framing is the error this piece corrects.</p><p>[4]: On the process gap: &#8220;analysts say China remains behind global leaders in the most advanced process technology,&#8221; with Huawei&#8217;s chips produced on SMIC&#8217;s 7nm-class node versus TSMC&#8217;s 2nm. Reuters, &#8220;China&#8217;s Huawei reveals chip design breakthrough amid US sanctions,&#8221; May 25, 2026, <a href="https://www.rappler.com/technology/huawei-chip-design-breakthrough-may-25-2026/">reuters via rappler.com</a>. The Kirin 9030 (Mate 80 Pro Max) was built by SMIC on an &#8220;N+3&#8221; process, a scaled evolution of its 7nm node and still behind TSMC and Samsung, per a TechInsights teardown reported by the South China Morning Post, <a href="https://tech.yahoo.com/computing/articles/huaweis-kirin-9030-processor-shows-093000039.html">scmp via tech.yahoo.com</a>.</p><p>[5]: LogicFolding &#8220;would shorten wiring inside chips and considerably improve performance&#8221;; Reuters, op. cit. (rappler.com). Huawei&#8217;s own description: the architecture &#8220;can be used to continuously compress signal propagation delay and steadily improve transistor density.&#8221; Huawei newsroom, op. cit. CNBC reported that &#8220;Huawei&#8217;s new chip architecture expands the layout from one layer to two,&#8221; per He Tingbo. CNBC, &#8220;Huawei plans new smartphone chips this fall,&#8221; May 25, 2026, <a href="https://www.cnbc.com/2026/05/25/huawei-chip-logicfolding-semiconductor-nvidia-china.html">cnbc.com</a>.</p><p>[6]: Per-metric figures (vs. a conventional SoC): +53.5% transistor density to 238 MTr/mm&#178;, +40% P-core power efficiency, and +12.7% max clock frequency to 3.1GHz. 
These figures appear in He Tingbo&#8217;s ISCAS presentation slides as relayed by trade press; they are not stated in Huawei&#8217;s official press release, which carries only the &#964; Law framework, the &#8220;381 chips&#8221; and &#8220;Fall 2026 Kirin&#8221; claims, and the 2031 target (Huawei newsroom, op. cit.). Slide figures via FoneArena, &#8220;HUAWEI presents Tau (&#964;) Scaling Law,&#8221; May 25, 2026, <a href="https://www.fonearena.com/blog/483567/huawei-tau-scaling-law.html">fonearena.com</a>, and Huawei Central, <a href="https://www.huaweicentral.com/huawei-kirin-2026-chip/">huaweicentral.com</a>. Vendor-claimed presentation data; no independent verification as of publication.</p><p>[7]: The 238 MTr/mm&#178; figure has been described as roughly comparable to Intel&#8217;s 18A and TSMC&#8217;s 3nm-class density. The comparison is density-only and does not establish equivalent power, yield, or performance; nor is it specified whether the figure is logic-only or SRAM-inclusive, which materially affects any cross-foundry comparison. Treat as vendor-claimed slide data pending an independent teardown of the shipping Kirin.</p><p>[8]: Huawei describes the &#964; Scaling Law operating &#8220;at the system level&#8221; by &#8220;redefining interconnect protocols for computing systems with UnifiedBus to achieve unified memory addressing and native memory semantics for SuperPoDs,&#8221; reducing system communication latency. Huawei newsroom, op. cit. (<a href="https://www.huawei.com/en/news/2026/5/ieee-iscas-tau-scaling">huawei.com</a>). 
This situates the phone-level LogicFolding claim within Huawei&#8217;s broader AI-cluster ambition.</p><p>[9]: He Hui, director of semiconductor research at Omdia, quoted in Reuters via Rappler, &#8220;China&#8217;s Huawei reveals chip design breakthrough amid US sanctions,&#8221; May 25, 2026, <a href="https://www.rappler.com/technology/huawei-chip-design-breakthrough-may-25-2026/">rappler.com</a>.</p><p>[10]: Paul Triolo, head of technology, Asia and Americas, DGA Group, quoted in CNBC, &#8220;Huawei plans new smartphone chips this fall as rivalry with Nvidia and Apple heats up,&#8221; May 25, 2026, <a href="https://www.cnbc.com/2026/05/25/huawei-chip-logicfolding-semiconductor-nvidia-china.html">cnbc.com</a>. Full quotes: &#8220;A stacked/folded design can produce effective density gains, but it does not mean Huawei has solved the full process, yield, power, thermal, and device-performance problems associated with true 1.4 nm-class manufacturing&#8221;; and separately, &#8220;Huawei is turning an engineering strategy into a quasi-&#8216;law,&#8217;&#8221; which Triolo characterized as &#8220;more a systems-level optimization doctrine: shorten wires, stack logic, improve memory semantics, and co-design chips, packages, software, and clusters.&#8221;</p><p>[11]: Neil Shah, vice president of research, Counterpoint Research, quoted in CNBC, ibid.</p><p>[12]: &#8220;By 2031, the high-end chips HUAWEI designs based on the &#964; Scaling Law are expected to feature a transistor density that is equivalent to 14 &#197; (1.4 nm) processes.&#8221; Huawei newsroom, op. cit. 
&#8220;Although Huawei did not provide independent performance data, the target is significant because 1.4 nm is expected to be close to the global frontier for advanced chipmaking around the end of the decade.&#8221; Reuters via Investing.com, <a href="https://www.investing.com/news/economy-news/huawei-proposes-new-path-for-chip-development-amid-us-sanctions-4708270">investing.com</a>.</p><p>[13]: China &#8220;is widely seen as unlikely to reach that level through conventional manufacturing alone because Washington has restricted its access to advanced lithography tools and other key semiconductor technologies.&#8221; Reuters, op. cit. ASML has never shipped an EUV machine to China, and there is no credible domestic alternative &#8212; the binding reason SMIC sits several generations behind TSMC and Samsung; see TheNextWeb, &#8220;Huawei unveils &#8216;Tau Scaling Law&#8217; as China&#8217;s workaround,&#8221; May 25, 2026, <a href="https://thenextweb.com/news/huawei-tau-scaling-law-chip-sanctions">thenextweb.com</a>. On the EDA dependency as a demonstrated lever: US BIS ordered Synopsys, Cadence, and Siemens EDA to halt China sales in late May 2025, then rescinded the restriction on July 2, 2025; the three firms hold roughly 80% of China&#8217;s EDA market. US Commerce/BIS via Network World, July 3, 2025, <a href="https://www.networkworld.com/article/4016826/us-lets-china-buy-semiconductor-design-software-again-2.html">networkworld.com</a>; EE Times, <a href="https://www.eetimes.com/u-s-restricts-eda-software-sales-to-china/">eetimes.com</a>. The episode established that the tool dependency is a switch that can be activated, even though it is not active as of publication. 
At the keynote, He Tingbo claimed Huawei had spent six years building domestic capabilities &#8220;including electronic design automation (EDA) tools and chip design methodologies&#8221;; BusinessToday, May 25, 2026, <a href="https://www.businesstoday.in/technology/artificial-intelligence/story/huawei-unveils-new-chip-architecture-claims-path-to-1-4nm-equivalent-processors-by-2031-533106-2026-05-25">businesstoday.in</a>. Domestic EDA at leading-edge nodes remains commercially unproven.</p><p>[14]: SMIC shares rose 7.6% on Monday following the LogicFolding announcement. South China Morning Post, op. cit.; Reuters via Rappler, op. cit.</p><p>[15]: Jensen Huang told CNBC the company had &#8220;largely conceded&#8221; China&#8217;s AI chip market to Huawei. CNBC, op. cit.; corroborated in Modern Diplomacy, <a href="https://moderndiplomacy.eu/2026/05/25/huawei-unveils-major-chip-design-breakthrough-as-china-pushes-past-us-sanctions/">moderndiplomacy.eu</a>.</p><p>[16]: LogicFolding is described as cell-level &#8220;folding&#8221; &#8212; distributing a single logic block&#8217;s gates across two vertically bonded wafer tiers connected by hybrid bonding &#8212; rather than the die-to-die stacking used by HBM or by TSMC&#8217;s SoIC and Intel&#8217;s Foveros. One technical reconstruction puts the Kirin 2026 hybrid-bonding pitch at ~1.5&#181;m (versus TSMC SoIC at &lt;15&#181;m and Intel Foveros at ~25&#181;m TSV pitch), with connection density scaling roughly as the inverse square of the bonding pitch; the same analysis back-calculates the density gain as 155&#8594;238 MTr/mm&#178;. These are independent analyst figures, not Huawei-published data: GlobalSemiResearch, &#8220;Huawei&#8217;s Tau Scaling Law: A Technical Deep Dive,&#8221; May 25, 2026, <a href="https://globalsemiresearch.substack.com/p/huaweis-tau-scaling-law-a-technical">globalsemiresearch.substack.com</a>; pitch comparisons via SemiAnalysis, <a href="https://semianalysis.com/2025/02/05/iedm2024/">semianalysis.com</a>. 
The thermal and yield penalties of logic-on-logic stacking are long-documented; see Semiconductor Engineering, &#8220;Stacking Logic On Logic.&#8221;</p>]]></content:encoded></item><item><title><![CDATA[Lobby, Levy, Legislate]]></title><description><![CDATA[How Mistral is trying to convert a perishable contact list into permanent law.]]></description><link>https://www.airealist.ai/p/lobby-levy-legislate</link><guid isPermaLink="false">https://www.airealist.ai/p/lobby-levy-legislate</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Fri, 22 May 2026 06:51:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VJKZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VJKZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VJKZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!VJKZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!VJKZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!VJKZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VJKZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1833262,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/198744415?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VJKZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!VJKZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!VJKZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 
1272w, https://substackcdn.com/image/fetch/$s_!VJKZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21134e91-9eeb-4971-b1c4-012b1b9a1c35_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p style="text-align: justify;">On May 12, in a near-empty hearing room of the French National Assembly, Arthur Mensch did something that tells you everything about how Mistral actually competes.</p><p style="text-align: justify;">He didn&#8217;t talk about his models. 
He warned the deputies about someone else&#8217;s &#8212; and the deputies, mostly, hadn&#8217;t bothered to come. He delivered his warning about the fate of European civilization to a scattering of empty benches for ninety minutes, with the cameras running. <a href="https://videos.assemblee-nationale.fr/video.18888392_6a0330a9d4404.vulnerabilites-systemiques-dans-le-secteur-du-numerique--m-arthur-mensch-cofondateur-et-dg-de-mis-12-mai-2026">[1]</a></p><p style="text-align: justify;">The warning itself: Anthropic &#8212; the American lab that sits ahead of Mistral at the frontier, and whose restricted-access Claude Mythos Preview model can autonomously hunt down and exploit software vulnerabilities <a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/">[2]</a> &#8212; had been circling the French defense establishment, offering to scan the army&#8217;s code bases. Mensch&#8217;s counsel to the Republic was to keep them out. Letting a foreign model that deep into French defense, he argued, would create a dependency that is &#8220;hard to unwind.&#8221; <a href="https://the-decoder.com/mistral-ceo-arthur-mensch-warns-france-against-letting-anthropics-mythos-scan-military-code-bases/">[3]</a></p><p style="text-align: justify;">There&#8217;s a real security argument buried in there, and Mensch made the fair version of it himself: you might reasonably not want <em>any</em> vulnerability-hunting model &#8212; foreign or domestic &#8212; crawling through your defense code, and he conceded in the same breath that Mistral&#8217;s own models or Chinese ones could find the same flaws. 
<a href="https://the-decoder.com/mistral-ceo-arthur-mensch-warns-france-against-letting-anthropics-mythos-scan-military-code-bases/">[3]</a> But notice how neatly the sovereignty case lands on the one outcome that also protects Mistral&#8217;s existing contract with the French armed forces &#8212; and that it asks France to turn away the strongest defensive tool on the market during the worst run of data breaches in its history, a point we&#8217;ll come back to.</p><p style="text-align: justify;">That convenience &#8212; a principled-sounding argument that happens, every time, to favor Mistral &#8212; is the thread running through everything Mensch has said and done this spring.</p><p>Once you pull it, the whole confusing season snaps into focus.</p><h2>The eight days that looked like hypocrisy</h2><p style="text-align: justify;">Here is the sequence that led French Twitter to call Mensch a hypocrite.</p><p style="text-align: justify;"><strong>May 7.</strong> Brussels agrees to the &#8220;Digital Omnibus,&#8221; delaying the AI Act&#8217;s high-risk obligations by sixteen months &#8212; from August 2026 to December 2027. Mistral is among the industry voices that lobbied for the slowdown. <a href="https://www.consilium.europa.eu/en/press/press-releases/2026/05/07/artificial-intelligence-council-and-parliament-agree-to-simplify-and-streamline-rules/">[4]</a></p><p style="text-align: justify;"><strong>May 12.</strong> Five days after winning that delay, Mensch sits before the National Assembly and warns that Europe over-regulates, that it has &#8220;heavy regulation and a fragmented market,&#8221; that the stack of GDPR, copyright rules, and the AI Act is an &#8220;<em>empilement</em>&#8221; &#8212; a pile-up. 
<a href="https://www.lejdd.fr/Societe/fiscalite-energie-dependance-le-patron-de-mistral-ai-alerte-sur-les-faiblesses-de-leurope-174012">[5]</a> Europe, he says, has two years to build its own AI infrastructure or become America&#8217;s &#8220;vassal state.&#8221; <a href="https://videos.assemblee-nationale.fr/video.18888392_6a0330a9d4404.vulnerabilites-systemiques-dans-le-secteur-du-numerique--m-arthur-mensch-cofondateur-et-dg-de-mis-12-mai-2026">[6]</a></p><p style="text-align: justify;"><strong>May 15.</strong> It&#8217;s the front page of the <em>Journal du Dimanche</em>.</p><p style="text-align: justify;">A man lobbies to weaken a regulation, wins, and then days later complains that Europe is over-regulated &#8212; and lands the front page doing it. The hypocrisy reading writes itself. It&#8217;s also wrong, or at least lazy. Mensch isn&#8217;t confused. He&#8217;s running a press strategy with a clock on it, and the clock matters more than the contradiction.</p><p style="text-align: justify;">And the empty room is the proof. If your goal is to persuade legislators, you care whether legislators are in the seats. If your goal is the front page, the seats are set dressing, and the camera is the audience. Mensch wasn&#8217;t talking to the handful of deputies who showed up. He was talking, through them, to the <em>JDD</em>, to Bercy &#8212; the Finance Ministry &#8212; to the &#201;lys&#233;e, and to every procurement officer in France who would read about it three days later. </p><p>The benches were empty because, on some level, everyone involved understood the room wasn&#8217;t the point.</p><p style="text-align: justify;">Then there&#8217;s the third move people keep filing separately. In March, in the <em>Financial Times</em>, Mensch proposed a 1&#8211;1.5% levy on the European revenues of <em>all</em> AI providers &#8212; including the American and Chinese ones &#8212; to fund a European cultural pot. A tax on AI, from the CEO of an AI company. 
<a href="https://www.itpro.com/technology/artificial-intelligence/mistral-ceo-calls-for-ai-cultural-levy">[7]</a> And back in September, asked about France&#8217;s proposed Zucman wealth tax, he offered warm words &#8212; &#8220;at the risk of disappointing the polemicists, I&#8217;m rather convinced we need more fiscal justice in France&#8221; &#8212; while making clear in the same sentence that he could not and would not pay it himself. <a href="https://www.boursorama.com/bourse/actualites/taxe-zucman-le-patron-de-la-start-up-francaise-mistral-demande-plus-de-justice-fiscale-tout-en-preservant-la-competitivite-de-la-france-6a03e1bdae232eed24bf289d20c2890f">[8]</a></p><p style="text-align: justify;">Pro-tax, anti-tax, pro-rules, anti-rules. It looks incoherent. It isn&#8217;t. Every one of these positions serves the same end.</p><h2>Mistral isn&#8217;t winning on capability</h2><p style="text-align: justify;">Start with the thing the sovereignty conversation is engineered to make you forget: Mistral does not make the best models, and its own strategy quietly concedes the point.</p><p style="text-align: justify;">This isn&#8217;t a knock on the engineering. Mistral&#8217;s latest flagship, Medium 3.5, is a genuinely strong model: a dense 128-billion-parameter system that posts 77.6% on SWE-Bench Verified, a respected coding benchmark, and undercuts the closed frontier models on price by roughly half. <a href="https://huggingface.co/mistralai/Mistral-Medium-3.5-128B">[9]</a> Mistral does benchmark it against the frontier leaders &#8212; and that comparison is exactly the tell. 
On SWE-Bench, it lands about two points <em>behind</em> Claude Sonnet 4.6 (77.6% versus 79.6%); the pitch is not &#8220;we win,&#8221; it&#8217;s &#8220;we come close and cost less, and you can run the weights yourself.&#8221; <a href="https://techsifted.com/posts/mistral-medium-3-5-review-2026/">[10]</a> That is a deliberate, coherent position &#8212; near-frontier at a fraction of the price &#8212; and it is, by design, a second-place pitch. The pace-setting systems on reasoning and agents remain American.</p><p style="text-align: justify;">Which is exactly the point. If your strategy is to be the affordable, open, sovereign alternative rather than the best model in the world &#8212; and the capex math says it has to be; Mistral&#8217;s roughly $400M in annual recurring revenue, with &#8364;1bn targeted for the year <a href="https://www.maddyness.com/uk/2026/01/23/mistral-ai-on-track-to-reach-one-billion-euros-in-revenue-by-2026/">[11]</a>, sits against OpenAI&#8217;s $20bn-plus annualized run-rate as of late 2025 <a href="https://finance.yahoo.com/news/openai-cfo-says-annualized-revenue-173519097.html">[12]</a> &#8212; then &#8220;best model&#8221; can never be your moat. You need a different one.</p><h2>So what is the moat?</h2><p style="text-align: justify;">The fashionable answer is &#8220;sovereignty.&#8221; And there&#8217;s a real version of that argument. Europe genuinely should worry about routing every critical digital service through American infrastructure governed by American law. The CLOUD Act is real. Dependency is real. Mensch is not wrong that a continent with no domestic frontier lab has no leverage.</p><p style="text-align: justify;">But watch what happens when you ask <em>which kind</em> of sovereignty actually protects Mistral, and you find the real answer is none of the ones he names.</p><p style="text-align: justify;">It isn&#8217;t <strong>technical sovereignty</strong> &#8212; data-stays-in-Europe. 
Microsoft, AWS, and OpenAI are all racing to offer EU data residency. That checkbox is being commoditized by the quarter.</p><p style="text-align: justify;">It isn&#8217;t <strong>legal sovereignty</strong> &#8212; open weights you can self-host. Llama and Qwen are open too. A French integrator could run them on a French cloud under French law and undercut Mistral on price tomorrow.</p><p style="text-align: justify;">It isn&#8217;t <strong>corporate sovereignty</strong>, either &#8212; the cleanest version of Mensch&#8217;s own case. He told the Assembly that US investors hold less than 30% of Mistral and that the founders keep strategic control, aiming for a European listing. <a href="https://the-decoder.com/mistral-ceo-arthur-mensch-warns-france-against-letting-anthropics-mythos-scan-military-code-bases/">[13]</a> That&#8217;s true, and it does distinguish Mistral from a Microsoft-funded OpenAI. But a European cap table is a governance fact, not a competitive one. It tells you who controls the company; it tells you nothing about why a customer would choose the product over a cheaper, equally European-hosted open model. Ownership is a moat against acquisition, not against competition.</p><p style="text-align: justify;">The moat that&#8217;s left, when you subtract the ones that don&#8217;t hold, is the customer list. And the customer list gives the game away.</p><h2>Mistral was born inside the Rolodex</h2><p style="text-align: justify;">Before we read that list, rewind to where it came from. It&#8217;s tempting to picture three research &#8220;kids&#8221; who somehow assembled a blue-chip roster of backers from scratch. That gets them backward. Guillaume Lample and Timoth&#233;e Lacroix were core authors of Meta&#8217;s LLaMA; Arthur Mensch came from DeepMind with his name on Chinchilla and RETRO. 
In the frenzied weeks after ChatGPT, they were arguably the most bankable large-model team in Europe &#8212; and all three had met years earlier at &#201;cole Polytechnique, the <em>grande &#233;cole</em> that functions as the spine of the French establishment. <a href="https://sifted.eu/articles/mistral-openai-rival-105m-news">[14]</a></p><p style="text-align: justify;">So they didn&#8217;t pitch their way in; the network reached out and pulled them through it. The bridge was the founders of Alan, the French insurtech unicorn: Jean-Charles Samuelian-Werve and Charles Gorintin, who introduced the team around, talked Lightspeed into leading, and worked the phones to fill the round. Gorintin and C&#233;dric O &#8212; Macron&#8217;s former minister for digital &#8212; signed on as founding advisors. <a href="https://techcrunch.com/2025/01/27/alans-founder-role-in-mistrals-origin-story/">[15]</a> The result was &#8364;105M one month after incorporation, before a single product: the largest seed in European history. <a href="https://techcrunch.com/2023/06/13/frances-mistral-ai-blows-in-with-a-113m-seed-round-at-a-260m-valuation-to-take-on-openai/">[16]</a></p><p style="text-align: justify;">And look who was already in that first cheque: Bpifrance &#8212; the French state&#8217;s own investment bank &#8212; Xavier Niel, and the shipping billionaire Rodolphe Saad&#233;, whose CMA-CGM would two years later become Mistral&#8217;s marquee customer. The customer-patron was present at the founding. So was the political bridge, represented by C&#233;dric O, Macron&#8217;s campaign treasurer, <a href="https://www.airealist.ai/p/when-bureaucrats-pick-fights-with">whose story I&#8217;ve told before</a>.</p><p style="text-align: justify;">The point isn&#8217;t that any of this was secret. It&#8217;s that none of it was. Mistral didn&#8217;t earn its way to the French establishment; it was incorporated into it. 
Which is why the customer list reads the way it does.</p><p style="text-align: justify;">There&#8217;s a darker reading here for anyone following the question of why national AI ecosystems succeed or fail. The usual diagnosis is exclusion &#8212; the most significant builders are outsiders whom the system pushed away or who had to be imported. France is the inverse. Its champion was built by the consummate insiders the system produces by design &#8212; Polytechnique, DeepMind, the right dinners &#8212; and the system&#8217;s reward for producing them was to wire them straight into state procurement. The failure mode here isn&#8217;t a talent the country couldn&#8217;t keep. It&#8217;s the opposite: a talent the country captured so completely that the product never had to compete.</p><h2>For now, the Rolodex is the product</h2><p style="text-align: justify;">When Mensch defends Mistral&#8217;s traction, he names the same flagship customers: France Travail, CMA-CGM, Stellantis, and TotalEnergies. This spring, he added the Caisse des D&#233;p&#244;ts, the French state investment bank. <a href="https://www.caissedesdepots.fr/eclairage/actualites/souverainete-numerique-le-groupe-caisse-des-depots-sadjoint-les-services-de-mistral-ai">[19]</a> Read that list not as wins but as buying decisions:</p><p style="text-align: justify;"><strong>France Travail</strong> is a French government agency. <strong>TotalEnergies</strong> is a French strategic asset whose CEO doesn&#8217;t sneeze without an &#201;lys&#233;e check-in. <a href="https://totalenergies.com/news/press-releases/totalenergies-collaborate-mistral-ai-increase-application-artificial">[20]</a> <strong>Stellantis</strong> carries the French state&#8217;s industrial legacy through its old stake in PSA, the Peugeot-Citro&#235;n group that merged into Stellantis. 
<a href="https://www.stellantis.com/en/news/press-releases/2025/october/stellantis-and-mistral-ai-expand-their-collaboration-to-accelerate-enterprise-wide-ai-adoption">[21]</a> <strong>Caisse des D&#233;p&#244;ts</strong> <em>is</em> a French state-owned institution. And <strong>CMA-CGM</strong> is owned by Rodolphe Saad&#233; &#8212; the shipping billionaire whom Macron meets on his trips to Marseille, and who assembled BFM-TV, RMC, <em>La Provence</em>, <em>La Tribune,</em> and Brut into one of the largest newsrooms in France. <a href="https://www.cmacgm-group.com/en/news-media/cma-cgm-completes-acquisition-altice-media">[22]</a> When <em>La Provence</em> ran a 2024 front page the &#201;lys&#233;e disliked, Saad&#233; suspended its editorial director; the journalists&#8217; union called it political pressure. <a href="https://www.ozap.com/actu/-la-provence-rodolphe-saade-met-a-pied-le-directeur-de-la-redaction-apres-la-publication-d-une-une-qui-aurait-deplu-a-l-actionnaire/643058">[23]</a></p><p style="text-align: justify;">The CMA-CGM deal is the one to study. In April 2025, Saad&#233;&#8217;s group <em>invested</em> &#8364;100M into Mistral and signed a five-year, $110M service contract &#8212; investor and customer, same party, the same circular structure now standard at hyperscaler scale. <a href="https://www.maritime-executive.com/corporate/cma-cgm-group-new-custom-designed-ai-solutions-from-mistral-ai">[24]</a> And here&#8217;s the part that should end the &#8220;Mistral is winning enterprise on merit&#8221; story for good: the same CMA-CGM had already signed a separate $150M AI deal with Google. <a href="https://www.prnewswire.com/news-releases/cma-cgm-embarks-on-a-strategic-partnership-with-google-to-deploy-ai-across-all-shipping-logistics-and-media-activities-302200249.html">[25]</a> Saad&#233; bought Mistral the press release and Google the workload.</p><p style="text-align: justify;">Now hold the strongest version of the counterargument. 
At the same hearing, Mensch noted that 70% of Mistral&#8217;s revenue is non-French &#8212; proof, he argued, of a genuine export champion rather than a subsidized domestic pet. <a href="https://angelo-lima.fr/en/arthur-mensch-mistral-ai-national-assembly-hearing-en/">[26]</a> Take the number at face value. It doesn&#8217;t touch the argument because the moat was never part of the revenue base. Wherever that 70% lives &#8212; cross-border API calls, seat licenses, partner channels &#8212; it isn&#8217;t the source of the political moat. That lives in the <em>flagship</em> names, the lighthouse customers Mensch puts on the slide to validate the company to the next investor and the next government. And that list, the one that does the political and fundraising work, is almost entirely the French political-industrial complex. No Volkswagen. No Siemens. No Maersk. No ING. No Telef&#243;nica. No European reference customer of consequence outside the French orbit. Mistral may sell tokens to Europe. It anchors its credibility to the French permanent state.</p><p style="text-align: justify;">That&#8217;s the moat. Not sovereignty &#8212; <em>the President&#8217;s contact list.</em> Mistral built a product that the most important buyers were always going to choose, and those buyers are a circle who have lunch together.</p><h2>The game: legislate the Rolodex before it expires</h2><p style="text-align: justify;">A contact list is a fragile asset. Macron is term-limited; 2027 is coming; a contact list does not survive a change of government. So the genius &#8212; and it is genius, in a cold way &#8212; of Mensch&#8217;s spring is that every policy move converts a perishable relationship into durable law. And the conversion is self-reinforcing: each rule that raises a rival&#8217;s cost buys time, and that time is spent deepening the very relationships the rule protects, which in turn supply the political capital to write the next rule. 
The relationship becomes the law that governs it.</p><p style="text-align: justify;">Read the spring&#8217;s moves as one design, and they line up. The 1&#8211;1.5% levy on all AI providers&#8217; European revenue raises rivals&#8217; operating costs in Europe and falls hardest on high-revenue American companies. A public-procurement &#8220;European preference&#8221; would codify that European <em>equity</em> beats European <em>hosting</em> &#8212; turning the Rolodex itself into a rule. The foundation-model carve-out from the AI Act trims Mistral&#8217;s own compliance bill while leaving the application-layer obligations that mostly bite US deployers. The Digital Omnibus delay buys sixteen months before any of it starts charging rent. The &#8220;vassal state&#8221; rhetoric inoculates against the obvious objection &#8212; that the French state is overpaying for a domestically preferred model. And the warning to keep the army off Anthropic&#8217;s Claude Mythos defends the single highest-value contract on the list from a stronger rival. Six moves, one direction.</p><p style="text-align: justify;">Sam Altman wants to be the CEO of an AI company. Arthur Mensch wants to be the CEO of an AI <em>market</em> &#8212; and the difference is that markets are made of rules, and rules can be written. While Altman lobbies <em>against</em> regulation, Mensch lobbies <em>for the right regulation</em>: the kind his competitors can satisfy only at a cost they can&#8217;t bear, and he doesn&#8217;t pay.</p><p style="text-align: justify;">It&#8217;s the most sophisticated regulatory game in AI right now. Calling it hypocrisy misses how good it is.</p><h2>And you&#8217;re the one paying for it</h2><p style="text-align: justify;">A strategy this elegant still sends someone an invoice. Several someones.</p><p style="text-align: justify;">French taxpayers buy a domestic-preferred model so a French logo can sit on the contract. 
European consumers may soon pay a levy of up to 1.5% that, as input taxes typically do, flows at least partly downstream into prices. The French army, if Mensch gets his way, runs on the home-team model rather than the strongest available tool, because the strongest is American. And every European who wants the AI Act&#8217;s high-risk protections has to wait until December 2027 for them, courtesy of a delay sold as a boost to competitiveness.</p><p style="text-align: justify;">None of that is sovereignty. It&#8217;s a subsidy &#8212; routed through procurement and regulation instead of a line item, paid to one company, and narrated in the language of national dignity so that questioning it feels unpatriotic.</p><h2>The bill comes due in breaches</h2><p style="text-align: justify;">And the steepest cost isn&#8217;t measured in euros. France is, right now, among the most cyberattacked countries in the world, and the diagnosed root cause is a &#8220;remediation gap&#8221; &#8212; institutions that keep finding vulnerabilities and keep failing to patch them in time. <a href="https://cybernews.com/security/france-cyberattacks-wave-reasons-cnil/">[27]</a> The identity-document agency ANTS leaked up to 19 million passport and license records; <a href="https://cybernews.com/security/ants-hack-france-19-million-records-id-agency-breach/">[28]</a> much of the spree was carried out by teenagers. <a href="https://therecord.media/french-hacker-cyberattacks-arrest">[29]</a> And the marquee name on Mistral&#8217;s own customer list, France Travail, suffered the largest data breach in French history &#8212; tens of millions of job-seekers exposed and a &#8364;5M regulator fine in January 2026. <a href="https://www.cnil.fr/en/data-breach-5million-fine-france-travail">[30]</a> Sovereignty delivered the French logo on the contract. It did not deliver security.</p><p style="text-align: justify;">Which is what makes the Mythos warning go sour. 
A remediation gap is exactly what a frontier vulnerability-hunting model closes &#8212; and within days of the hearing, Bloomberg reported that Mistral is building its own such model for European banks shut out of Mythos. <a href="https://www.bloomberg.com/news/articles/2026-05-13/mistral-developing-new-ai-model-for-banks-lacking-mythos-access">[31]</a> So the warning isn&#8217;t &#8220;keep dangerous vulnerability-hunters out of France.&#8221; It&#8217;s &#8220;keep the <em>American</em> one out, while we build and sell ours.&#8221; There is a strong security case for not allowing any foreign model to crawl through the defense code. But weigh it honestly: a country hemorrhaging data because it can&#8217;t find its own holes fast enough is being counseled to refuse the best tool for finding them. Denying France the Mythos audits may be the larger sovereignty risk, and the sovereignty argument and the product roadmap turn out to be the same document.</p><h2>The bet, and how we&#8217;ll know</h2><p style="text-align: justify;">Here&#8217;s the falsifiable part &#8212; the thing to actually watch, rather than the rhetoric to argue about.</p><p style="text-align: justify;">If Mistral&#8217;s strategy is sound, the policy moat buys enough time for the product to close on the frontier <em>and</em> for the company to break out of the Macron orbit into real, arm&#8217;s-length European enterprise demand. So the test is simple: <strong>by the middle of 2027, does Mistral&#8217;s flagship customer list contain names that aren&#8217;t tied to the French state or Macron&#8217;s circle?</strong> A Volkswagen. A Siemens. 
A bank in Milan or Madrid that chose Mistral in a competitive bake-off and paid full freight.</p><p style="text-align: justify;">If yes, Mensch will have pulled off one of the great industrial strategy plays of the decade &#8212; using the rulebook to buy time to build a real business.</p><p style="text-align: justify;">If no &#8212; if in two years the list is still France Travail and friends &#8212; then the policy moat was never a bridge to a product. It was the product. And policy moats have a half-life measured in election cycles. A non-Macroniste &#201;lys&#233;e could simply stop steering the contracts. Or Brussels could decide that France&#8217;s domestic procurement, routed so reliably to one favored national champion, is a selective advantage that runs into EU state-aid rules. That is a separate exposure from the &#8220;European preference&#8221; now being drafted at the EU level &#8212; which, awkwardly for the sovereignty story, is a French-led project Mistral wants. Either way, the moment the political weather changes, the whole structure reprices overnight.</p><p style="text-align: justify;">Mensch says Europe has two years to avoid becoming America&#8217;s vassal. He may be right. But Europe should be careful not to mistake one clever founder&#8217;s moat for a continent&#8217;s sovereignty &#8212; and careful, too, about who exactly it&#8217;s being asked to be sovereign <em>for.</em></p><p style="text-align: justify;"><em>Mensch</em>, in German, means &#8220;man.&#8221; In Yiddish, it came to mean something better &#8212; a person of integrity, someone who does the right thing. Watching Monsieur Mensch this spring, the open question isn&#8217;t whether he&#8217;s brilliant. 
It&#8217;s the kind of mensch France thinks it&#8217;s buying.</p><div><hr></div><h2>Notes</h2><p>[1] Assembl&#233;e nationale (official video) &#8212; <em><a href="https://videos.assemblee-nationale.fr/video.18888392_6a0330a9d4404.vulnerabilites-systemiques-dans-le-secteur-du-numerique--m-arthur-mensch-cofondateur-et-dg-de-mis-12-mai-2026">Vuln&#233;rabilit&#233;s syst&#233;miques dans le secteur du num&#233;rique: audition de M. Arthur Mensch</a></em> (May 12, 2026; the near-empty room is visible on the official feed)</p><p>[2] Fortune &#8212; <em><a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/">Anthropic says it&#8217;s testing &#8220;Mythos,&#8221; a powerful new AI model representing a &#8220;step change&#8221; in capabilities</a></em> (the model&#8217;s full name per Anthropic&#8217;s April 7, 2026 system card is &#8220;Claude Mythos Preview&#8221;)</p><p>[3] The Decoder &#8212; <em><a href="https://the-decoder.com/mistral-ceo-arthur-mensch-warns-france-against-letting-anthropics-mythos-scan-military-code-bases/">Mistral CEO Arthur Mensch warns France against letting Anthropic&#8217;s Mythos scan military code bases</a></em> (also the source for Mensch&#8217;s concession that Mistral&#8217;s or Chinese models could find the same vulnerabilities)</p><p>[4] Council of the EU &#8212; <em><a href="https://www.consilium.europa.eu/en/press/press-releases/2026/05/07/artificial-intelligence-council-and-parliament-agree-to-simplify-and-streamline-rules/">Artificial intelligence: Council and Parliament agree to simplify and streamline rules</a></em> (official press release, May 7, 2026)</p><p>[5] Le JDD &#8212; <em><a href="https://www.lejdd.fr/Societe/fiscalite-energie-dependance-le-patron-de-mistral-ai-alerte-sur-les-faiblesses-de-leurope-174012">Fiscalit&#233;, &#233;nergie, d&#233;pendance: le patron de Mistral AI alerte sur les faiblesses de 
l&#8217;Europe</a></em> (May 15, 2026)</p><p>[6] Assembl&#233;e nationale (official video) &#8212; <em><a href="https://videos.assemblee-nationale.fr/video.18888392_6a0330a9d4404.vulnerabilites-systemiques-dans-le-secteur-du-numerique--m-arthur-mensch-cofondateur-et-dg-de-mis-12-mai-2026">audition de M. Arthur Mensch, &#8220;vassal state&#8221; / two-year warning</a></em> (same hearing, May 12, 2026)</p><p>[7] IT Pro &#8212; <em><a href="https://www.itpro.com/technology/artificial-intelligence/mistral-ceo-calls-for-ai-cultural-levy">Mistral CEO calls for AI cultural levy</a></em> (reporting Mensch&#8217;s Financial Times op-ed, March 20, 2026)</p><p>[8] Boursorama (AFP) &#8212; <em><a href="https://www.boursorama.com/bourse/actualites/taxe-zucman-le-patron-de-la-start-up-francaise-mistral-demande-plus-de-justice-fiscale-tout-en-preservant-la-competitivite-de-la-france-6a03e1bdae232eed24bf289d20c2890f">Taxe Zucman: le patron de Mistral demande &#8220;plus de justice fiscale&#8221; tout en pr&#233;servant la comp&#233;titivit&#233; de la France</a></em></p><p>[9] Mistral AI / Hugging Face &#8212; <em><a href="https://huggingface.co/mistralai/Mistral-Medium-3.5-128B">Mistral-Medium-3.5 model card</a></em> (official benchmarks: dense 128B parameters; 77.6% SWE-Bench Verified)</p><p>[10] TechSifted &#8212; <em><a href="https://techsifted.com/posts/mistral-medium-3-5-review-2026/">Mistral Medium 3.5 review</a></em> (independent comparison: 77.6% vs. 
Claude Sonnet 4.6&#8217;s 79.6% on SWE-Bench Verified; ~half the per-token price; open weights under modified MIT terms)</p><p>[11] Maddyness UK &#8212; <em><a href="https://www.maddyness.com/uk/2026/01/23/mistral-ai-on-track-to-reach-one-billion-euros-in-revenue-by-2026/">Mistral AI on track to reach one billion euros in revenue by 2026</a></em> (Mensch at Davos; ~$400M ARR, &#8364;1bn a target for the year, not booked revenue)</p><p>[12] Reuters (via Yahoo Finance) &#8212; <em><a href="https://finance.yahoo.com/news/openai-cfo-says-annualized-revenue-173519097.html">OpenAI CFO says annualized revenue crosses $20 billion in 2025</a></em> (ARR and &#8220;annualized run-rate&#8221; are adjacent but not identical metrics; the orders of magnitude make the comparison regardless)</p><p>[13] The Decoder &#8212; <em><a href="https://the-decoder.com/mistral-ceo-arthur-mensch-warns-france-against-letting-anthropics-mythos-scan-military-code-bases/">Mistral CEO Arthur Mensch warns France&#8230;</a></em> (Mensch&#8217;s statement that US investors hold under 30% and founders retain strategic control)</p><p>[14] Sifted &#8212; <em><a href="https://sifted.eu/articles/mistral-openai-rival-105m-news">Meta and DeepMind alumni raise &#8364;105m seed round to build OpenAI rival Mistral</a></em> (founders&#8217; LLaMA / DeepMind pedigree; &#201;cole Polytechnique)</p><p>[15] TechCrunch &#8212; <em><a href="https://techcrunch.com/2025/01/27/alans-founder-role-in-mistrals-origin-story/">Alan&#8217;s founder role in Mistral&#8217;s origin story</a></em> (Samuelian-Werve and Gorintin as connectors; C&#233;dric O founding advisor)</p><p>[16] TechCrunch &#8212; <em><a href="https://techcrunch.com/2023/06/13/frances-mistral-ai-blows-in-with-a-113m-seed-round-at-a-260m-valuation-to-take-on-openai/">France&#8217;s Mistral AI blows in with a $113M seed round at a $260M valuation</a></em> (full investor roster incl. 
Bpifrance, Niel, Saad&#233;, Schmidt)</p><p>[19] Caisse des D&#233;p&#244;ts (official) &#8212; <em><a href="https://www.caissedesdepots.fr/eclairage/actualites/souverainete-numerique-le-groupe-caisse-des-depots-sadjoint-les-services-de-mistral-ai">Souverainet&#233; num&#233;rique: le groupe Caisse des D&#233;p&#244;ts s&#8217;adjoint les services de Mistral AI</a></em> (May 2026)</p><p>[20] TotalEnergies (official) &#8212; <em><a href="https://totalenergies.com/news/press-releases/totalenergies-collaborate-mistral-ai-increase-application-artificial">TotalEnergies to collaborate with Mistral AI to increase the application of AI in its multi-energy strategy</a></em></p><p>[21] Stellantis (official) &#8212; <em><a href="https://www.stellantis.com/en/news/press-releases/2025/october/stellantis-and-mistral-ai-expand-their-collaboration-to-accelerate-enterprise-wide-ai-adoption">Stellantis and Mistral AI expand their collaboration to accelerate enterprise-wide AI adoption</a></em> (Oct 2025)</p><p>[22] CMA CGM Group (official) &#8212; <em><a href="https://www.cmacgm-group.com/en/news-media/cma-cgm-completes-acquisition-altice-media">CMA CGM completes acquisition of Altice Media</a></em> (BFM-TV, RMC; group also owns La Provence, La Tribune, Corse Matin)</p><p>[23] Purem&#233;dias / Ozap &#8212; <em><a href="https://www.ozap.com/actu/-la-provence-rodolphe-saade-met-a-pied-le-directeur-de-la-redaction-apres-la-publication-d-une-une-qui-aurait-deplu-a-l-actionnaire/643058">&#8220;La Provence&#8221;: Rodolphe Saad&#233; met &#224; pied le directeur de la r&#233;daction apr&#232;s une Une qui aurait d&#233;plu &#224; l&#8217;actionnaire</a></em> (March 2024; &#8220;political pressure&#8221; is the journalists&#8217; union&#8217;s characterization)</p><p>[24] The Maritime Executive &#8212; <em><a href="https://www.maritime-executive.com/corporate/cma-cgm-group-new-custom-designed-ai-solutions-from-mistral-ai">CMA CGM Group: new custom-designed AI solutions from Mistral 
AI</a></em> (&#8364;100M investment + five-year $110M contract, April 2025; figures as reported in mixed currencies)</p><p>[25] CMA CGM / PR Newswire (official) &#8212; <em><a href="https://www.prnewswire.com/news-releases/cma-cgm-embarks-on-a-strategic-partnership-with-google-to-deploy-ai-across-all-shipping-logistics-and-media-activities-302200249.html">CMA CGM embarks on a strategic partnership with Google to deploy AI across all shipping, logistics, and media activities</a></em> (the separate $150M Google deal, 2024 &#8212; predating the Mistral deal)</p><p>[26] Angelo Lima (hearing analysis, cross-checked to the Assembl&#233;e nationale feed) &#8212; <em><a href="https://angelo-lima.fr/en/arthur-mensch-mistral-ai-national-assembly-hearing-en/">What Arthur Mensch told the French National Assembly</a></em> (Mensch&#8217;s claim that 70% of Mistral revenue is non-French; secondary analysis of the official hearing)</p><p>[27] Cybernews &#8212; <em><a href="https://cybernews.com/security/france-cyberattacks-wave-reasons-cnil/">Experts warn France &#8220;operationally paralyzed&#8221; as cyberattacks mount in 2026</a></em> (single-source characterization; &#8220;among the most-attacked&#8221; softened from &#8220;second-most&#8221; pending an ANSSI/CNIL primary)</p><p>[28] Cybernews &#8212; <em><a href="https://cybernews.com/security/ants-hack-france-19-million-records-id-agency-breach/">ANTS hack: 19 million records exposed in French ID agency breach</a></em> (April 2026)</p><p>[29] The Record &#8212; <em><a href="https://therecord.media/french-hacker-cyberattacks-arrest">French police arrest suspected hacker behind dozens of data breaches</a></em> (HexDex, 21, ~100 breaches incl. 
sports federations, Education Ministry, SIA)</p><p>[30] CNIL (official) &#8212; <em><a href="https://www.cnil.fr/en/data-breach-5million-fine-france-travail">Data breach: France Travail fined &#8364;5 million</a></em> (Jan 22, 2026)</p><p>[31] Bloomberg &#8212; <em><a href="https://www.bloomberg.com/news/articles/2026-05-13/mistral-developing-new-ai-model-for-banks-lacking-mythos-access">European Banks Explore Mistral AI&#8217;s Alternative to Anthropic&#8217;s Mythos Model</a></em> (May 13, 2026; Mistral developing its own vulnerability-detection model for European banks shut out of Mythos)</p>]]></content:encoded></item><item><title><![CDATA[Zero for Three]]></title><description><![CDATA[Strikeout. Three rare earth tests for Beijing. The summit answered no to each. November 10 is on the calendar.]]></description><link>https://www.airealist.ai/p/zero-for-three</link><guid isPermaLink="false">https://www.airealist.ai/p/zero-for-three</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Tue, 19 May 2026 17:35:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_sQx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_sQx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_sQx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 424w, 
https://substackcdn.com/image/fetch/$s_!_sQx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 848w, https://substackcdn.com/image/fetch/$s_!_sQx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 1272w, https://substackcdn.com/image/fetch/$s_!_sQx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_sQx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png" width="1264" height="848" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:848,&quot;width&quot;:1264,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2514724,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/198442527?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_sQx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 
424w, https://substackcdn.com/image/fetch/$s_!_sQx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 848w, https://substackcdn.com/image/fetch/$s_!_sQx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 1272w, https://substackcdn.com/image/fetch/$s_!_sQx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd15c42fb-c4ee-4780-9e5b-4d11f1fbd70d_1264x848.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The Trump-Xi summit in Beijing on May 14-15 was supposed to trade chips for rare earths. Washington would ease restrictions on advanced AI accelerators heading to China. Beijing would open the licensing gate on the rare earths and functional materials that feed every semiconductor fab on the planet. Two interlocking grips, mutually released.</p><p>No deal.</p><p>On May 15, the US Trade Representative sat down with Bloomberg Television. Asked whether semiconductor export controls had come up during the summit that had just concluded, Jamieson Greer answered without hedging: &#8220;This was not a major topic of discussion at the bilateral meeting. We did not talk about chip export controls at the meeting.&#8221;[1]</p><p>One side of the trade is denied at the principal level.</p><p>Last week, this newsletter previewed the summit with three specific tests.[2] </p><ol><li><p>An exemption from China&#8217;s case-by-case licensing for the rare earths used in advanced AI chips (sub-14-nanometer logic, 256-layer memory). </p></li><li><p>A Chinese commitment to replace case-by-case review with blanket export approvals for the functional materials flowing through every semiconductor fab: polishing slurries, sputtering targets, and non-military magnets. </p></li><li><p>A mutual rollback of the October 2025 rule under which Beijing can require an export license for any foreign-made product anywhere in the world that contains more than 0.1% Chinese-origin rare earth content.</p></li></ol><p>Three tests. Three strikes. The dependency the trailer described &#8212; rare earths as the layer beneath chips, cloud, and models &#8212; survived the summit intact.</p><h2>The two readouts</h2><p>The White House Fact Sheet of May 17 announced that China would &#8220;address U.S. 
concerns regarding supply chain shortages related to rare earths and other critical minerals, including yttrium, scandium, neodymium, and indium,&#8221; and would &#8220;address U.S. concerns regarding prohibitions or restrictions on the sale of rare earth production and processing equipment and technologies.&#8221;[3] The verbs are &#8220;address&#8221; and &#8220;concerns.&#8221; That is the language of diplomatic intention. It is not the language of a regulatory commitment.</p><p>The Chinese readouts said nothing about rare earths.</p><p>Xi Jinping&#8217;s statement, issued by the Ministry of Foreign Affairs on May 14, covered &#8220;strategic stability,&#8221; agricultural trade, and Taiwan.[4] The MOFCOM follow-up on May 17 discussed tariff reductions and announced two new bilateral bodies: a US-China Board of Trade and a US-China Board of Investment.[5] CNBC noted the gap on May 18: &#8220;The Chinese statement also did not mention rare earths, while the U.S. said China would address rare earth shortages.&#8221;[6]</p><p>This is a pattern, not an accident. At Busan in October 2025, the White House announced that China had committed to &#8220;issue general licenses valid for exports of rare earths, gallium, germanium, antimony, and graphite for the benefit of U.S. end users and their suppliers around the world.&#8221; Beijing never confirmed that framing in writing. The gate has stayed narrow. The EU Chamber of Commerce in China reported that MOFCOM approved fewer than 15% of rare-earth license applications submitted by EU firms in 2025, leading to seven production stoppages in August and 46 expected in September.[7] As of December 2025, three Chinese exporters held streamlined general licenses: JL Mag Rare Earth, Ningbo Yunsheng, and Beijing Zhong Ke San Huan.[8]</p><div class="callout-block" data-callout="true"><p>When one side announces a concession that the other side does not acknowledge, the announcement is not a commitment. 
It is a press release on one side and regulatory silence on the other. </p></div><p>Beijing has now declined twice, at Busan and at Beijing, to confirm in writing the rare-earth language the White House has put out. The licensing gate has stayed closed throughout.</p><h2>Chips off the table</h2><p>The first two tests required leader-level negotiated outcomes on chip-specific carveouts. </p><ul><li><p>Test 1: an exemption from MOFCOM&#8217;s case-by-case licensing for the rare earths used in advanced AI chips. </p></li><li><p>Test 2: a Chinese commitment to replace case-by-case review with blanket export approvals for functional semiconductor materials.</p></li></ul><p>Greer&#8217;s answer closes the path by which either could have happened. If chip export controls were not discussed at the leader level, no carveout for sub-14-nanometer chips was negotiated. No blanket-approval commitment was extracted. The &#8220;address U.S. concerns&#8221; language in the fact sheet is an aspiration. The rules in force are still MOFCOM Notice 61 of October 9, 2025, with the case-by-case review for sub-14-nanometer logic and 256-layer memory currently sleeping under Notice 70.[9] Neither was rescinded. Neither was modified. Neither was discussed.</p><p>Jensen Huang said the quiet part out loud at a Citadel Securities event two weeks before the summit: &#8220;In China, we have now dropped to zero. Conceding an entire market the size of China probably does not make a lot of strategic sense, so I think that has already largely backfired.&#8221;[10] The remark concerned the H200, the AI accelerator Nvidia is licensed by the US to ship to roughly ten approved Chinese firms. Among the buyers are Alibaba, Tencent, ByteDance, and JD.com; among the distributors, Lenovo and Foxconn. 
Each shipment carries a 25% remittance to the US Treasury and must physically transit US territory for inspection.[11] Trump told reporters aboard Air Force One that the Chinese firms &#8220;chose not to&#8221; buy &#8220;because they want to develop their own.&#8221;[12]</p><p>Three days after returning from Beijing, Huang reversed course. Asked at a Dell event in San Francisco on May 18 whether the Chinese market would reopen to Nvidia, he answered: &#8220;My sense is that over time, the market will open.&#8221;[13] Reuters framed the H200 file more pointedly: &#8220;Nvidia has received licenses from the U.S. government to sell its H200 chips but has not received approval from Chinese officials who are fostering China&#8217;s own chip suppliers.&#8221; </p><p>Huawei Ascend, Cambricon, and Biren are not waiting for Nvidia&#8217;s return. They are closing the gap while the H200 file sits in MOFCOM&#8217;s review queue. </p><div class="callout-block" data-callout="true"><p>Beijing has no need to bargain for chips it is increasingly producing itself. Rare earths are what buy Beijing the time.</p></div><p>The gate runs both ways. Beijing can stop delivery of US-approved chips through State Council guidance just as Washington can stop delivery of advanced GPUs through Commerce Department rules. The coercion stack the trailer described is not one-sided. Markets priced it within hours. Nvidia closed at $225.04 on May 15, down 4.20%, erasing roughly $170 billion of market value intraday.[14]</p><h2>The cliff</h2><p>The third test asked whether the summit would rescind China&#8217;s October 2025 extraterritorial 0.1% rule, the provision under MOFCOM Notice 61 that lets Beijing require an export license for any foreign-made product anywhere in the world that contains more than 0.1% Chinese-origin rare earth content. The rule was suspended on November 7, 2025, under MOFCOM Notice 70. 
The suspension expires November 10, 2026.[15]</p><p>The summit produced no announcement closing this cliff. No language in the White House fact sheet addresses Notice 61 specifically. No Chinese regulatory action followed the summit. The cliff remains live and is scheduled to re-arm automatically in six months.</p><p>This is the deepest finding of the summit. Beijing&#8217;s most powerful tool was not given up. It was not contested. It was not even discussed at the leader level, by the USTR&#8217;s own account. It was left in place, suspended on a calendar timer. November 10 is when the paper rules catch up to the practice on the ground. MOFCOM has held approval rates below 15% throughout the suspension; the licensing gate has been closing in operation while sleeping on paper. The cliff is when the paper wakes up.</p><p>The next signal arrives in September, when Xi is scheduled to visit Washington during United Nations General Assembly week. If he arrives without a renewed suspension already in writing, the cliff becomes the central deal of the cycle. APEC in Shenzhen follows in November, two weeks after the suspension expires. The summit calendar has been arranged around the regulatory calendar, not against it.</p><h2>What markets read</h2><p>Lynas Rare Earths fell from A$19.90 on May 13 to A$17.95 on May 15, a 9.8% decline in two trading sessions.[16] MP Materials rallied to $61.27 on May 15, then dropped 7.5% to $56.67 on May 18 as the &#8220;tactical truce&#8221; reading took hold.[17] The supply-side equities priced the same conclusion the Greer interview made plain: nothing had moved underneath. The rally that built into the summit was unwound by what the summit failed to produce.</p><p>The ex-China spot prices tell the same story. 
Terbium oxide averaged $1,140 per kilogram FOB in late April; dysprosium oxide averaged $292 per kilogram.[18] Inside China, the same materials cleared at roughly $895 and $125 per kilogram, respectively &#8212; the Western buyer pays a quarter more for terbium and more than double for dysprosium. That spread is the cost of the licensing gate, what the marginal Western buyer pays when MOFCOM approves fewer than 15% of applications. It did not collapse during summit week. It widened.</p><div class="callout-block" data-callout="true"><p>The Western-buyer premium is the price the supply chain pays for the dependency the trailer described, and the summit confirmed that price is staying in place at least through 2028.</p></div><p>The supply-side response continues on its own timeline, indifferent to the summit. MP Materials begins commissioning heavy rare earth separation at Mountain Pass in mid-2026, targeting 200 metric tons per year of dysprosium and terbium combined.[19] Lynas continues its Malaysian expansion. Iluka&#8217;s Eneabba refinery is now targeted for 2027 commissioning, slipping from earlier 2026 guidance.[20] Combined Western heavy rare earth capacity at full ramp is on the order of 600 metric tons per year by 2028, a fraction of the heavy rare earth content embedded in the 58,000 tons of permanent magnets China exported in 2024 alone.[21]</p><p>The summit did not change any of these timelines. It did not need to. The diversification is happening regardless. The cliff is on the calendar regardless.</p><h2>What the summit settled</h2><p>The trailer argued that rare earths form the fourth layer of the AI infrastructure coercion stack, under chips, cloud, and models. The Beijing summit tested that argument against three concrete questions. Each question required a specific regulatory action that would have shown leader-level willingness to ease the rare earth grip. 
None of the three occurred.</p><p>The summit produced agricultural commitments, Boeing aircraft orders, beef market access restoration, and two new bilateral talking shops. These are real diplomatic outputs. A lower geopolitical temperature reduces tail risk; the next confrontation is postponed. But these are the deliverables of a managed-stability summit, not of a rebalancing of the underlying dependence. Greer&#8217;s sentence confirms the boundary: leader-level discussions did not reach the rules that hold the rare earth grip in place. Working-level talks may continue. Without principal-level direction, MOFCOM has no political cover to dismantle the rules it issued under leader-level authority in October 2025. </p><div class="callout-block" data-callout="true"><p>The two readouts confirm the consequence: what one side announces is not what the other side will enforce.</p></div><p>Six months from now, on November 10, 2026, MOFCOM Notice 70 expires. Either Beijing extends the suspension before that date, in writing, or the extraterritorial 0.1% rule re-arms automatically. The two scheduled summit appearances &#8212; Xi in Washington in September, Trump and Xi at APEC Shenzhen in November &#8212; are the venues where that decision will be made.</p><p>Last week, this newsletter set three tests for the Beijing summit. The summit returned each test unchanged. The exposure the trailer described was not negotiated away. It was scheduled forward.</p><p>The chip war happens in press releases. 
The war underneath happens on the regulatory calendar.</p><div><hr></div><h3>Notes</h3><p>[1] Jamieson Greer, US Trade Representative, Bloomberg Television interview, May 15, 2026, as reported by Reuters: <a href="https://finance.yahoo.com/sectors/technology/articles/chip-export-controls-not-major-014050965.html">&#8220;Chip export controls not major topic in China talks, US trade rep Greer tells Bloomberg News&#8221;</a>.</p><p>[2] <a href="https://www.airealist.ai/p/below-the-silicon">&#8220;Below the Silicon&#8221;</a>, The AI Realist, May 13, 2026.</p><p>[3] <a href="https://www.whitehouse.gov/fact-sheets/2026/05/fact-sheet-president-donald-j-trump-secures-historic-deals-with-china-delivering-for-american-workers-farmers-and-industry/">&#8220;Fact Sheet: President Donald J. Trump Secures Historic Deals with China, Delivering for American Workers, Farmers, and Industry&#8221;</a>, The White House, May 17, 2026.</p><p>[4] <a href="https://www.fmprc.gov.cn/eng/xw/zyxw/202605/t20260514_11910330.html">&#8220;President Xi Jinping Holds Talks with U.S. President Donald J. 
Trump&#8221;</a>, Ministry of Foreign Affairs of the People&#8217;s Republic of China, May 14, 2026.</p><p>[5] <a href="https://www.cnbc.com/2026/05/18/us-china-announce-deals-after-trump-xi-summit.html">&#8220;White House touts deals on soybeans and rare earths after Trump-Xi summit, while China talks up tariff cuts&#8221;</a>, CNBC, May 18, 2026.</p><p>[6] Ibid.</p><p>[7] <a href="https://www.iss.europa.eu/publications/commentary/false-sense-security-european-complacency-rare-earths-wrong-answer-us-china">&#8220;False sense of security: European complacency on rare earths is the wrong answer to the US-China trade truce&#8221;</a>, European Union Institute for Security Studies, citing EU Chamber of Commerce in China data, accessed May 2026.</p><p>[8] <a href="https://www.mining.com/china-issues-first-batch-of-streamlined-rare-earth-licences/">&#8220;China issues first batch of streamlined rare earth licences&#8221;</a>, Mining.com, December 2, 2025.</p><p>[9] MOFCOM Notice 61 of October 9, 2025; MOFCOM Notice 70 of November 7, 2025, suspending the extraterritorial provisions until November 10, 2026. 
Analysis: Pillsbury Winthrop Shaw Pittman, <a href="https://www.pillsburylaw.com/en/news-and-insights/china-suspends-export-controls-certain-critical-minerals-related-items.html">&#8220;China Suspends Export Controls on Certain Critical Minerals and Related Items&#8221;</a>; Clark Hill, <a href="https://www.clarkhill.com/news-events/news/china-hits-pause-on-rare-earth-export-controls-and-what-it-means-for-supply-chains/">&#8220;China Hits &#8216;Pause&#8217; on Rare-Earth Export Controls and What it Means for Supply Chains&#8221;</a>.</p><p>[10] Jensen Huang, remarks at Citadel Securities event, early May 2026, as reported by Tom&#8217;s Hardware: <a href="https://www.tomshardware.com/tech-industry/trump-says-china-is-blocking-h200-purchases">&#8220;Trump says China is blocking Nvidia H200 purchases despite US approval &#8212; says country &#8216;chose not to&#8217; sanction purchases, pushing homegrown chips instead&#8221;</a>.</p><p>[11] H200 framework details per Implicator, <a href="https://www.implicator.ai/nvidia-h200-deliveries-to-china-remain-stalled-after-trump-xi-summit/">&#8220;Nvidia H200 China Deliveries Stalled After Trump-Xi Summit&#8221;</a>, May 2026.</p><p>[12] Trump remarks aboard Air Force One, May 15, 2026, as reported by Tom&#8217;s Hardware (op. cit.).</p><p>[13] <a href="https://finance.yahoo.com/sectors/technology/articles/nvidia-ceo-says-he-believes-china-market-will-open-over-time-185612332.html">&#8220;Nvidia CEO says he believes China market will open over time&#8221;</a>, Reuters, San Francisco, May 18, 2026 (Bloomberg Television interview at Dell event).</p><p>[14] <a href="https://tradersunion.com/news/financial-news/show/2060925-nvidia-slides-4-20percent-today-to/">&#8220;Delayed Chinese approval for H200 chips sends Nvidia stock down 4.20%&#8221;</a>, Traders Union, May 15, 2026, citing Google Finance.</p><p>[15] Pillsbury Winthrop Shaw Pittman, op. cit.; Clark Hill, op. cit.; MOFCOM Announcement No. 
70 of 2025.</p><p>[16] Lynas Rare Earths (ASX: LYC) close prices per ASX official data, accessed via <a href="https://stockanalysis.com/quote/asx/LYC/">StockAnalysis.com</a>.</p><p>[17] MP Materials (NYSE: MP) close prices per <a href="https://www.morningstar.com/stocks/xnys/mp/quote">Morningstar</a>; <a href="https://rareearthexchanges.com/news/lynas-tumbles-as-trump-xi-truce-lifts-false-calm-over-rare-earths/">&#8220;Lynas Tumbles as &#8216;Trump&#8211;Xi Truce&#8217; Lifts False Calm Over Rare Earths&#8221;</a>, Rare Earth Exchanges, May 18, 2026.</p><p>[18] Rare earth FOB spot price data per Rare Earth Exchanges market reports, May 2026; <a href="https://rare-earth-mining.com/rare-earth-market-outlook-may-2026/">&#8220;Rare Earth Market Outlook May 2026: Prices Fall&#8221;</a>, Rare-earth-mining.com.</p><p>[19] <a href="https://investors.mpmaterials.com/investor-news/news-details/2025/MP-Materials-Reports-Third-Quarter-2025-Results/default.aspx">MP Materials Q3 2025 earnings release</a>; <a href="https://news.bloomberglaw.com/federal-contracting/pentagon-backed-mp-materials-to-start-rare-earths-plant-in-2026">&#8220;Pentagon-Backed MP Materials to Start Rare Earths Plant in 2026&#8221;</a>, Bloomberg, November 6, 2025.</p><p>[20] Iluka Resources Q1 2026 financial reporting; <a href="https://www.iluka.com/media/14mesbwj/6dec24-eneabba-rare-earths-positive-outcome-of-funding-discussions.pdf">&#8220;Eneabba Rare Earths Refinery Funding Update&#8221;</a>, Iluka Resources ASX release, December 6, 2024.</p><p>[21] <a href="https://www.iea.org/reports/global-critical-minerals-outlook-2025">IEA, &#8220;Global Critical Minerals Outlook 2025&#8221;</a>, October 2025, citing 2024 Chinese permanent magnet export volumes.</p>]]></content:encoded></item><item><title><![CDATA[Where the HALEU bet actually pays]]></title><description><![CDATA[Capacity Factor &#8212; Post 2 of 6 in a series on US nuclear fuel cycle 
equities.]]></description><link>https://www.airealist.ai/p/where-the-haleu-bet-actually-pays</link><guid isPermaLink="false">https://www.airealist.ai/p/where-the-haleu-bet-actually-pays</guid><dc:creator><![CDATA[Dante]]></dc:creator><pubDate>Sat, 16 May 2026 01:05:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bq_f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In Post 1, I argued that the tightest knot in the US nuclear fuel cycle is HALEU enrichment &#8212; high-assay low-enriched uranium, the 5&#8211;19.75% U-235 fuel that every advanced reactor in the US needs for its first core. There is no commercial Western HALEU supply at scale. Until 2024, it all came from Russia.</p><p>There are exactly two US-listed names with direct HALEU exposure. One of them is the obvious pick &#8212; funded by the Department of Energy, ~80% institutionally owned, up 200% in the last twelve months. The other is a $700M micro-cap whose enrichment subsidiary you&#8217;ve probably never heard of.</p>
<p>I think the smaller one is the better trade today. Not because the bigger name is bad (it isn&#8217;t), but because the market has already priced its bull case, and the smaller name is the only fundamentally cheap HALEU option in the US public market.</p><p>Here&#8217;s the work.</p><h1>The two listed names</h1><p><strong>Centrus Energy (NYSE: LEU)</strong> is the obvious name. It is the only US-owned commercial enricher and one of three companies funded by the DOE&#8217;s January 2026 $2.7B HALEU and LEU enrichment award. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y7Ub!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 424w, https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 848w, https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 1272w, https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png" width="822" height="604" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:604,&quot;width&quot;:822,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TradingView chart&quot;,&quot;title&quot;:&quot;TradingView chart&quot;,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="TradingView chart" title="TradingView chart" srcset="https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 424w, https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 848w, https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 1272w, https://substackcdn.com/image/fetch/$s_!Y7Ub!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9abb73a5-2338-4652-88ec-9cd69d45b43c_822x604.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"></div></div></div></a><figcaption class="image-caption">Created with <a href="https://tradingview.com">TradingView</a></figcaption></figure></div><p>Centrus&#8217;s American Centrifuge Operating subsidiary received $900M for HALEU production at Piketon, Ohio, and produced the first ~900 kg of US-origin HALEU. Centrus raised 2026 revenue guidance to <strong>$450&#8211;500M</strong> in its <a href="https://www.prnewswire.com/news-releases/centrus-reports-first-quarter-2026-results-302763250.html">Q1-26 print</a> and is sitting on a $3.9B contracted backlog. Market cap <strong>$4.37B</strong>.</p><p><strong>ASP Isotopes (NASDAQ: ASPI)</strong> is the optionality name. 
Its core business is laser-based isotope enrichment for medical (Mo-99 path), semiconductor (Silicon-28), and pharmaceutical applications. The HALEU exposure is via a wholly owned subsidiary, <strong>Quantum Leap Energy (QLE)</strong>, which holds a long-term HALEU offtake agreement with TerraPower plus a $22M conditional loan, and which signed a <a href="https://www.stocktitan.net/news/ASPI/">non-binding MOU in March 2026</a> with a major US nuclear power operator for HALEU, LEU+, uranium conversion, and deconversion services. Market cap <strong>$690M</strong>, against <strong>$333M of cash</strong> as of December 2025.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ycr_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ycr_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 424w, https://substackcdn.com/image/fetch/$s_!Ycr_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 848w, https://substackcdn.com/image/fetch/$s_!Ycr_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 1272w, https://substackcdn.com/image/fetch/$s_!Ycr_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ycr_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png" width="822" height="604" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e98d483-f27f-4512-bc12-a22845893afa_822x604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:604,&quot;width&quot;:822,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;TradingView chart&quot;,&quot;title&quot;:&quot;TradingView chart&quot;,&quot;type&quot;:&quot;image/jpg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="TradingView chart" title="TradingView chart" srcset="https://substackcdn.com/image/fetch/$s_!Ycr_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 424w, https://substackcdn.com/image/fetch/$s_!Ycr_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 848w, https://substackcdn.com/image/fetch/$s_!Ycr_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 1272w, https://substackcdn.com/image/fetch/$s_!Ycr_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e98d483-f27f-4512-bc12-a22845893afa_822x604.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft 
pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Created with <a href="https://tradingview.com">TradingView</a></figcaption></figure></div><h1>The numbers side-by-side:</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bq_f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!bq_f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 424w, https://substackcdn.com/image/fetch/$s_!bq_f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 848w, https://substackcdn.com/image/fetch/$s_!bq_f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 1272w, https://substackcdn.com/image/fetch/$s_!bq_f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bq_f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png" width="960" height="853" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:853,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:117774,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dante126.substack.com/i/196717594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!bq_f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 424w, https://substackcdn.com/image/fetch/$s_!bq_f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 848w, https://substackcdn.com/image/fetch/$s_!bq_f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 1272w, https://substackcdn.com/image/fetch/$s_!bq_f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa17eb0c4-3b13-49c3-9e0d-9de2cbb418d1_960x853.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you only look at the top three rows, Centrus is obviously the better company. It has revenue, it has guidance, it has DOE money, and the chart has only gone up. ASP Isotopes is small, loss-making, and the stock has gone nowhere for a year while the rest of the nuclear thematic has rallied.</p><p>But the bottom three rows are where the trade actually lives.</p><p><strong>Where the value actually is</strong></p><p>I built probability-weighted scenario DCFs on both names.</p><p><strong>Centrus scenarios (my P-weights):</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Og_s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Og_s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 424w, https://substackcdn.com/image/fetch/$s_!Og_s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 848w, https://substackcdn.com/image/fetch/$s_!Og_s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Og_s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Og_s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png" width="1310" height="200" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:200,&quot;width&quot;:1310,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50962,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dante126.substack.com/i/196717594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Og_s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 424w, https://substackcdn.com/image/fetch/$s_!Og_s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 848w, 
https://substackcdn.com/image/fetch/$s_!Og_s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 1272w, https://substackcdn.com/image/fetch/$s_!Og_s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ff6ca8c-5901-4cf1-9dad-3c33ca4764ec_1310x200.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Probability-weighted fair value: <strong>~$169/share</strong>. Current price <strong>$222</strong>. That gap doesn&#8217;t mean Centrus is overvalued in any absolute sense &#8212; it means the market is pricing the bull-case outcome at roughly 60&#8211;70% probability, versus my 30%. Either I&#8217;m wrong about the probabilities, or the market is paying ahead of execution. Both can be true at once.</p><p><strong>ASP Isotopes scenarios:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nD5I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nD5I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 424w, https://substackcdn.com/image/fetch/$s_!nD5I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 848w, 
https://substackcdn.com/image/fetch/$s_!nD5I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 1272w, https://substackcdn.com/image/fetch/$s_!nD5I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nD5I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png" width="1398" height="240" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c960d83e-5a37-410e-93d8-33627da373b9_1398x240.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:240,&quot;width&quot;:1398,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:67674,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dante126.substack.com/i/196717594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!nD5I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 424w, 
https://substackcdn.com/image/fetch/$s_!nD5I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 848w, https://substackcdn.com/image/fetch/$s_!nD5I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 1272w, https://substackcdn.com/image/fetch/$s_!nD5I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc960d83e-5a37-410e-93d8-33627da373b9_1398x240.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Weighted operating NPV ~$0.74B, plus $333M cash on hand &#8594; <strong>fair equity ~$1.07B</strong>. Current market cap <strong>$690M</strong>. That&#8217;s a +55% asymmetric setup, with the cash backstop providing a soft floor.</p><p>Three observations from running these numbers.</p><p><strong>First, the asymmetry runs in opposite directions.</strong> Centrus&#8217;s bear case is &#8211;80% from current; ASPI&#8217;s bear case is &#8211;78%. That looks similar. But ASPI&#8217;s bear case still leaves you with $150M of NPV against $333M of cash, so the actual equity floor is higher than the bear-NPV number suggests. Centrus&#8217;s bear case has no cash floor &#8212; it&#8217;s a working business, and the bear case is operational impairment. The shape of the downside is different even when the magnitude is similar.</p><p><strong>Second, insider ownership is doing real work.</strong> ASPI&#8217;s 13.5% insider ownership versus Centrus&#8217;s 3.3% (and NuScale&#8217;s 0.4%) is the kind of management-alignment signal that tends to matter at exactly the inflection moment ASPI is approaching &#8212; the 2026 commercial-shipment year. 
Founders who own a meaningful stake tend not to crater the share price with the first dilutive raise.</p><p><strong>Third, the TerraPower offtake is third-party validation that the market hasn&#8217;t internalized.</strong> TerraPower is privately held, well-funded, and has every incentive to source HALEU from the most credible producer it can find &#8212; including, in theory, from Centrus directly. The fact that TerraPower committed offtake terms and a $22M conditional loan to QLE specifically tells you the market is too pessimistic on QLE&#8217;s technical credibility.</p><p><strong>The catalysts most people aren&#8217;t tracking</strong></p><p>Both names have a thick catalyst calendar through the end of 2027. The two that matter most for getting positioning right are very specific.</p><p>For <strong>Centrus</strong>, the binary is the <strong>Q2-26 print in August</strong>, where management will disclose Piketon HALEU production cadence as a kg/month run-rate. The current implied schedule has Piketon ramping toward roughly 6 metric tons per year of HALEU output by 2028. That implies a run-rate around 80 kg/month at maturity. If the August print shows the production cadence tracking below ~40 kg/month &#8212; half the implied path &#8212; the bear case activates fast and the multiple compresses with it. If it tracks at or above 60 kg/month, the bull case stays alive, and the stock probably runs further before consolidating.</p><p>For <strong>ASPI</strong>, the binary is <strong>QLE&#8217;s first HALEU pilot output disclosure</strong>, expected in Q4 2026. This is the cleanest existence proof the market has been waiting for. If QLE produces enriched material on schedule, the bull-case probability re-weights upward &#8212; and at a $690M market cap, the re-rating math is significant. 
If QLE misses by more than two quarters, the bear-case probability dominates, and the cash floor becomes the only thing holding the stock up.</p><p>Two other dates worth flagging:</p><ul><li><p><strong>The Russia uranium waiver expiry in 2027</strong>, under the Prohibiting Russian Uranium Imports Act, is a structurally positive catalyst for both names, but more so for Centrus, which loses Russian LEU revenue but gains tighter pricing on its US-domestic enrichment.</p></li><li><p><strong>Centrus&#8217;s first commercial HALEU shipments</strong>, targeted for Q2 2027, are the bull-case proof point. If those land on schedule with TerraPower, X-energy, or Kairos as the first counterparty, Centrus becomes harder to fade.</p></li></ul><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:8605191,&quot;name&quot;:&quot;Dante&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!KOFz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d887e17-2655-4364-b265-9806e9906ff3_449x449.webp&quot;,&quot;base_url&quot;:&quot;https://dante126.substack.com&quot;,&quot;hero_text&quot;:&quot;All opinions are mine, not financial advice.&quot;,&quot;author_name&quot;:&quot;Dante&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://dante126.substack.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!KOFz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d887e17-2655-4364-b265-9806e9906ff3_449x449.webp" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span 
class="embedded-publication-name">Dante</span><div class="embedded-publication-hero-text">All opinions are mine, not financial advice.</div></a><form class="embedded-publication-subscribe" method="GET" action="https://dante126.substack.com/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div><h1> What this means for stock-picking</h1><p>1. <strong>Centrus is the right structural answer to the wrong question.</strong> &#8220;Which name has the most HALEU exposure?&#8221; gets you to LEU. &#8220;Which HALEU name offers asymmetric upside at current prices?&#8221; gets you to ASPI. Both questions are valid, but only one is a trade.</p><p>2. <strong>The size discount is doing the lifting.</strong> ASPI is small enough that institutional ownership hasn&#8217;t yet crowded out the asymmetry &#8212; 51.6% versus Centrus&#8217;s 79.3%. The same business at $4B of market cap would already be priced in line with Centrus.</p><p>3. <strong>Optionality positions need explicit exits.</strong> I would hard exit if QLE produces no enriched HALEU material by year-end 2026, or if TerraPower offtake terms are publicly restructured downward. The cash backstop makes the position survivable; the falsification triggers make it disciplined.</p><p>4. <strong>Centrus is a buy on pullback, not a buy on chase.</strong> I would build a position below $165, where the implied bull-case probability falls into a range that matches my analytical view. Above $200, the math doesn&#8217;t work even on aggressive assumptions.</p><p>5. 
<strong>Both names will be revisited together every quarter.</strong> The catalyst structure is interlocking &#8212; Centrus&#8217;s Piketon cadence and ASPI&#8217;s QLE pilot are the two existence proofs that determine whether US-owned HALEU is real or theoretical. Watching only one of them gives you half the signal.</p><h1><strong>What&#8217;s coming</strong></h1><ul><li><p><strong>Post 3 &#8212; </strong><em><strong>Conversion: the bottleneck nobody can play directly</strong></em><strong>. </strong>Why ConverDyn / Solstice&#8217;s Metropolis Works is the single tightest commercial chokepoint in the chain, and how an HON Advanced Materials spin (rumored, not confirmed) would unlock the cleanest pure-play if it ever lists.</p></li><li><p><strong>Post 4 &#8212; </strong><em><strong>Picks-and-shovels</strong></em><strong>.</strong> The mid-cap that passes the 4-variable filter in two of seven segments simultaneously, and why I think it&#8217;s a satellite rather than a core position, despite that.</p></li><li><p><strong>Post 5 &#8212; </strong><em><strong>SMR demand</strong></em><strong>. </strong>Why I think the post-CFPP-cancellation reset on NuScale is more advanced than the market recognizes &#8212; and why that doesn&#8217;t yet mean I&#8217;m long.</p></li><li><p><strong>Post 6 &#8212; </strong><em><strong>The book</strong></em><strong>. 
</strong>Five names, position-sized, with explicit falsification triggers for each.</p></li></ul><p>Subscribe if you want this in your inbox over the next few weeks.</p><p><strong>Further reading</strong></p><ul><li><p>DOE &#8212; <em><a href="https://www.energy.gov/articles/us-department-energy-awards-27-billion-restore-american-uranium-enrichment">Awards $2.7B to restore American uranium enrichment</a></em> (Jan 6, 2026)</p></li><li><p>Centrus Energy &#8212; <em><a href="https://www.prnewswire.com/news-releases/centrus-reports-first-quarter-2026-results-302763250.html">Q1-26 results press release</a></em></p></li><li><p>ASP Isotopes &#8212; <em><a href="https://www.stocktitan.net/sec-filings/ASPI/10-q-asp-isotopes-inc-quarterly-earnings-report-b8eb89c3dea3.html">Q3-25 10-Q via SEC EDGAR / StockTitan summary</a></em></p></li><li><p>World Nuclear News &#8212; <em><a href="https://www.world-nuclear-news.org/articles/us-enrichment-funding-reactions">US enrichment funding recipients flesh out plans</a></em></p></li><li><p>ANS Nuclear Newswire &#8212; <em><a href="https://www.ans.org/news/article-7652/doe-awards-27b-for-haleu-and-leu-enrichment/">DOE awards $2.7B for HALEU and LEU enrichment</a></em></p></li><li><p>Prohibiting Russian Uranium Imports Act (<a href="https://www.congress.gov/bill/118th-congress-house-bill/1042">P.L. 118-62</a>, May 2024)</p></li></ul><hr><p><em>Capacity Factor is a six-part series on US nuclear fuel-cycle equities.</em></p><p>Thanks for reading! Subscribe for free to receive new posts and support my work.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.airealist.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The AI Realist! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Below the Silicon]]></title><description><![CDATA[Trump flies to Beijing on Wednesday. Here&#8217;s the recipe his hosts control, element by element.]]></description><link>https://www.airealist.ai/p/below-the-silicon</link><guid isPermaLink="false">https://www.airealist.ai/p/below-the-silicon</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Tue, 12 May 2026 04:55:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!pXSO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pXSO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pXSO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 424w, https://substackcdn.com/image/fetch/$s_!pXSO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 
848w, https://substackcdn.com/image/fetch/$s_!pXSO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 1272w, https://substackcdn.com/image/fetch/$s_!pXSO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pXSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png" width="1264" height="848" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:848,&quot;width&quot;:1264,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2167475,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/197210359?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pXSO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 424w, 
https://substackcdn.com/image/fetch/$s_!pXSO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 848w, https://substackcdn.com/image/fetch/$s_!pXSO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 1272w, https://substackcdn.com/image/fetch/$s_!pXSO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f7e9811-3ebd-4f6a-9897-e56ba731f5c2_1264x848.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Inside a TSMC fab in Taiwan, at this moment, an Nvidia Blackwell die is being polished flat to within fractions of a nanometer. A few meters away, a lithography scanner is exposing the next wafer with extreme-ultraviolet light generated by vaporizing tin droplets with a 30-kilowatt laser, 50,000 times a second. This is the most precise industrial process in human history.</p><p>It runs on rare earth elements. China mines 70% of the world&#8217;s supply and refines 91% of it. America refines less than 1%.[1]</p><p>On Wednesday, the President of the United States flies to Beijing to negotiate continued access.</p><h2>The recipe</h2><p>The first step is the polish. Before a chip can be patterned, the silicon wafer has to be made flat to within fractions of an atom across an area the size of a dinner plate. This is done with a slurry of fine abrasive particles. The abrasive is <strong>cerium oxide</strong>, a rare-earth compound made almost entirely in China.[2]</p><p>Next comes lithography: the printing of the chip&#8217;s pattern onto the wafer using extreme-ultraviolet light. Three rare earths appear in the light path. <strong>Erbium</strong> is doped into the optical fibers that amplify the laser&#8217;s pulse. <strong>Terbium</strong> forms a special crystal &#8212; terbium gallium garnet &#8212; that lets light through one direction and blocks it in the other, protecting the laser from its own reflections. <strong>Thulium</strong> will be used in the next generation of these lasers, currently under development at Lawrence Livermore National Laboratory. The lithography machines that will use them are built by ASML &#8212; the Dutch company that supplies every advanced fab in the world.[3]</p><p>Between exposures, a metrology laser checks that the pattern came out right. 
The crystal at its heart is often Nd:YAG &#8212; <strong>yttrium aluminum garnet</strong>, doped with <strong>neodymium</strong>.[4] After exposure, the patterned features are etched into the silicon by corrosive fluorine and chlorine plasmas. To survive the plasma, the etch chamber is lined with <strong>yttrium oxide</strong>.[5]</p><p>Then comes deposition: laying down the metal films that become the chip&#8217;s wiring. The way to deposit a metal film is to put a solid block of it (a &#8220;sputtering target&#8221;) into a vacuum chamber and knock atoms off it with ions. Most sputtering targets are pure metals &#8212; copper, tungsten, titanium &#8212; but the targets that lay down the high-performance dielectric layers and certain barrier materials contain rare earths, most often <strong>yttrium</strong> and <strong>lanthanum</strong>.[6] Finally, the chip is packaged. In a modern AI accelerator, packaging means stacking multiple silicon dies into high-bandwidth memory &#8212; the &#8220;HBM&#8221; the industry talks about &#8212; and bonding them with millions of microscopic copper joints. Between each bonding step, the surfaces are polished flat again, with the same <strong>cerium oxide</strong> slurry that started the process.</p><p>Then the chip leaves the fab. It arrives at a hyperscaler datacenter on a server board cooled by spinning fans. The motors driving those fans are made from neodymium magnets &#8212; alloys of iron, boron, and <strong>neodymium</strong>, almost always with a small percentage of <strong>dysprosium</strong> or <strong>terbium</strong> added to keep them magnetic at high temperatures.[7] The same magnets power the hard drives, the liquid-cooling pumps that keep modern GPU racks from melting, and every motorized actuator in the rack.</p><p>Behind the chip, the tools that fabricated it run on the same chemistry. 
Every lithography scanner, every ion implanter, every etch tool &#8212; the precision motors are all <strong>neodymium</strong> magnets, with the highest-performance versions in fab equipment carrying up to ten percent <strong>dysprosium</strong> by weight.[8] The magnetic bearings on many cleanroom and vacuum pumps are <strong>neodymium</strong> too. So are the robotic arms that move wafers between tools.</p><p>A modern AI accelerator is, in material terms, a tightly packed assembly of silicon, copper, and rare-earth elements. The silicon and copper have multiple commercial sources. The rare earths do not. Substitutes exist for some uses but perform worse &#8212; there is no commercial alternative to cerium oxide at advanced lithography nodes, and no replacement for the heavy rare earths in high-temperature magnets.</p><h2>The dependency</h2><p>China controls roughly 70% of global rare earth mining, 91% of separation and refining, and 94% of the world&#8217;s strongest permanent magnets &#8212; the kind used in motors, generators, and precision equipment.[9] The geological deposits that yield commercial quantities of the heavy rare earths used in those magnets &#8212; dysprosium and terbium &#8212; are a specific type of clay-bound ore (geologists call them ion-adsorption clays), found in commercial concentrations only in southern China and northern Myanmar. Together, they account for more than 99% of the world&#8217;s heavy rare-earth feedstock, with Myanmar production largely flowing into Chinese refineries.[10] </p><p>Last year, every gram of terbium America imported came from China. So did every gram of holmium, and every gram of lutetium. Net U.S. import reliance on heavy rare earths is 100%; the small share nominally sourced from third-country processors in Estonia, Japan, and Malaysia is itself derived from Chinese feedstock.[11]</p><p>This is the layer beneath the chip war. 
&#8220;Access, Disable, Destroy&#8221; mapped a three-switch model of AI infrastructure coercion: chips at the silicon layer, cloud at the infrastructure layer, models at the application layer.[12] The materials layer sits beneath all three. China has commercial and diplomatic reasons not to embargo rare earths outright &#8212; its producers want the revenue, and a formal cutoff would accelerate Western diversification. </p><p>The leverage operates instead through individual export approvals: China&#8217;s Ministry of Commerce (MOFCOM) requires a case-by-case license for any shipment of rare earths destined for advanced semiconductors. The trigger categories are logic chips at process nodes below 14 nanometers (every AI accelerator made today) and memory stacked with more than 256 layers (the high-bandwidth memory inside those accelerators). This licensing regime remains active throughout the November 2025 suspension.[13] A single review can stall a shipment indefinitely, even without a formal export ban. Diversification at the binding constraint takes time that the AI capex cycle does not have: industry estimates place full onshoring of heavy rare-earth refining at 5 to 7 years.[14]</p><h2>The response</h2><p>On February 2, 2026, Donald Trump announced Project Vault &#8212; a $12 billion strategic reserve of rare earth elements, modeled on the Strategic Petroleum Reserve that has insulated the United States against oil shocks since the 1970s. The signal: the administration now treats rare earth dependency as a national security exposure on par with energy security. 
The structure is a $10 billion, 15-year loan from the Export-Import Bank, plus roughly $1.7 billion of private capital, with procurement handled by three commodities trading houses.[15] They buy imported oxides and metals on behalf of civilian-sector manufacturers, who can draw down their allocations in a disruption and replenish them when supply normalizes.[16] At blended heavy rare earth prices &#8212; terbium oxide at $1,010 per kilogram, dysprosium at $239 &#8212; $12 billion is a serious buffer against price spikes and short interruptions.</p><p>It does not address the binding constraint. The United States has no commercial-scale heavy rare earth separation capability operating today.[17] MP Materials&#8217; Mountain Pass heavy rare earth circuit, backed by a $150 million Department of War loan, targets 200 metric tons per year of dysprosium and terbium production from mid-2026.[18] Lynas, the only commercial-scale producer of separated heavy rare earths outside China, is expanding its Malaysia facility to a full suite of heavy rare earths within two years.[19] Combined Western capacity at full ramp is on the order of 600 metric tons per year of dysprosium and terbium by 2028 &#8212; a fraction of the heavy rare-earth content embedded in the 58,000 tons of permanent magnets China exported in 2024 alone.[20]</p><p>What Project Vault stockpiles is what comes out of the country it was designed to protect against. The reserve relocates the dependency one step upstream &#8212; from end-use to inventory &#8212; without changing the upstream geography. Meanwhile, the chokepoint is moving. In March 2026, Shenzhen launched a state-coordinated R&amp;D program for domestic rare-earth-based polishing slurries &#8212; the same cerium oxide chemistry the wafer polish opens with, currently dominated by U.S. 
and Japanese suppliers.[21] The pattern is consistent: control raw materials upstream, control separation in the middle, and as Western capacity catches up at the upstream layers, move downstream into the higher-margin functional materials. Each Western response addresses a layer that the chokepoint has already moved past.</p><h2>What to watch on Friday</h2><p>The summit will produce announcements. Boeing purchases. Agricultural commitments. A bilateral Board of Trade. Possibly an extension of the November 2025 suspension beyond the November 10, 2026 expiry, framed as continued de-escalation.[22] None of these alters the materials layer.</p><p>Three things would. First, an exemption from MOFCOM&#8217;s case-by-case licensing for the rare earths used in advanced AI chips &#8212; the sub-14-nanometer logic and 256-layer memory categories now requiring individual Chinese approval. This would dissolve the most direct chokepoint. Second, a commitment to blanket licenses rather than per-shipment review for the functional materials flowing through semiconductor manufacturing: polishing slurries, sputtering targets, and non-military magnets. That would turn managed dependency into something predictable. Third, a mutual rollback of China&#8217;s October 2025 extraterritorial rule, which lets Beijing license any foreign-made product anywhere in the world that contains more than 0.1% Chinese-origin rare earths. That rule is currently suspended; rescinding it would close the November cliff rather than postpone it.</p><p>None of these is on the agenda that the U.S. Trade Representative previewed in April.[23] The summit is one whose success is measured by the absence of breakdown, not by the resolution of substance.</p><p>Every Blackwell, every MI300, every TPU, every Trainium, every HBM stack from Samsung and SK Hynix carries this recipe inside it. The rare earths are extracted from Chinese land. 
The chips are built by TSMC on Chinese land &#8212; or so they say.</p><p>Beijing claims both halves as Chinese. It controls only one. By Friday, the President will have negotiated with the half it controls. Taiwan, where the chips are made, will be the silence in the room.</p><div><hr></div><h3>Notes</h3><p>[1] Mining figures: <a href="https://pubs.usgs.gov/periodicals/mcs2025/mcs2025-rare-earths.pdf">U.S. Geological Survey, </a><em><a href="https://pubs.usgs.gov/periodicals/mcs2025/mcs2025-rare-earths.pdf">Mineral Commodity Summaries 2025: Rare Earths</a></em><a href="https://pubs.usgs.gov/periodicals/mcs2025/mcs2025-rare-earths.pdf">, January 2025</a>. China mined 270,000 metric tons of REO equivalent in 2024, accounting for 69.2% of the world total (390,000 tons); the United States mined 45,000 tons. Refining figures: <a href="https://www.iea.org/commentaries/with-new-export-controls-on-critical-minerals-supply-concentration-risks-become-reality">International Energy Agency, &#8220;With new export controls on critical minerals, supply concentration risks become reality,&#8221; October 9, 2025</a>. China = 91% of global rare earth separation and refining; 94% of sintered permanent magnet production. U.S. domestic production of refined rare earth compounds and metals in 2024 was approximately 1,300 tons (USGS) &#8212; roughly 0.3% of global production. Most U.S.-mined concentrate is exported for refining elsewhere, principally to China.</p><p>[2] Cerium oxide is the dominant abrasive in chemical-mechanical planarization slurries used for advanced-node silicon wafer polishing; its abrasive properties at sub-nanometer scales are not matched by available substitutes. 
Chinese mining accounts for the majority of global cerium supply, and Chinese separation accounts for the overwhelming majority of refined cerium oxide production.</p><p>[3] <a href="https://www.llnl.gov/article/52226/llnl-selected-lead-next-gen-extreme-ultraviolet-lithography-research">Lawrence Livermore National Laboratory, &#8220;LLNL selected to lead next-gen extreme ultraviolet lithography research,&#8221; December 23, 2024</a>. Erbium-doped fiber amplifiers are standard in the seed-laser stages of EUV light-source pre-pulse generation. Terbium gallium garnet (TGG) is the standard material for Faraday optical isolators in DUV and short-wavelength laser systems, including those used in lithography, metrology, and inspection. Thulium-doped yttrium lithium fluoride is a candidate gain material for next-generation high-numerical-aperture EUV sources.</p><p>[4] Neodymium-doped yttrium aluminum garnet (Nd:YAG) is a long-established laser crystal used in fab metrology, alignment, inspection, and certain marking applications. See <a href="https://vimaterial.de/en/rare-earth-materials-strategic-materials/">Vimaterial industry overview, &#8220;Rare earth materials for a brighter future,&#8221; February 26, 2026</a>.</p><p>[5] Yttrium oxide ceramic coatings are standard for plasma etch chamber liners due to their resistance to fluorine and chlorine plasma chemistries; they reduce particle contamination and extend chamber service intervals. See industry technical literature on plasma etch chamber materials.</p><p>[6] Sputtering targets composed of rare-earth metals and oxides are used in physical vapor deposition of barrier layers, electrodes, and functional thin films in semiconductor manufacturing. Yttrium, gadolinium, and other rare earths appear across multiple deposition recipes.</p><p>[7] Standard NdFeB permanent magnet formulations contain 1&#8211;3% dysprosium or terbium for elevated-temperature applications. 
Industry-standard composition; see also USGS MCS 2026, <em>Rare Earths (Heavy)</em> chapter.</p><p>[8] Higher-performance NdFeB grades used in precision-motion applications (semiconductor manufacturing equipment, certain medical devices, defense applications) can contain heavy rare-earth content of up to approximately 10% by mass, depending on temperature and demagnetization-resistance requirements.</p><p>[9] Mining share: USGS MCS 2025, op. cit. (China 270,000 / world 390,000 = 69.2% in 2024). Refining and magnet shares: IEA, op. cit. (91% separation, 94% sintered permanent magnets).</p><p>[10] <a href="https://payneinstitute.mines.edu/explainer-on-the-mp-materials-department-of-defense-partnership/">Payne Institute for Public Policy (Colorado School of Mines), &#8220;Explainer on the MP Materials&#8211;Department of War Partnership,&#8221; August 2025</a>. The principal global sources of separated heavy rare earths, such as dysprosium and terbium, are ion-adsorption clay (IAC) mining operations; the only notable IAC operations in the world are in China and Myanmar (&gt;99%), with the Myanmar production typically flowing into Chinese separation facilities.</p><p>[11] <a href="https://pubs.usgs.gov/periodicals/mcs2026/mcs2026-rare-earths-heavy.pdf">U.S. Geological Survey, </a><em><a href="https://pubs.usgs.gov/periodicals/mcs2026/mcs2026-rare-earths-heavy.pdf">Mineral Commodity Summaries 2026: Rare Earths (Heavy)</a></em><a href="https://pubs.usgs.gov/periodicals/mcs2026/mcs2026-rare-earths-heavy.pdf">, February 2026</a>. US heavy rare-earth imports in 2025: 100 metric tons of compounds and metals. Net import reliance 100% across 2021&#8211;2025. 
Terbium imports 100% from China; holmium 100% from China; lutetium 100% from China (including Hong Kong); ytterbium 86% from China.</p><p>[12] <a href="https://www.airealist.ai/">&#8220;Access, Disable, Destroy,&#8221; The AI Realist</a>.</p><p>[13] <a href="https://www.whitecase.com/insight-alert/china-imposes-extraterritorial-jurisdiction-and-50-rule-export-controls-rare-earth">White &amp; Case LLP, &#8220;China imposes extraterritorial jurisdiction and a 50% Rule for export controls on rare earth elements and other items,&#8221; October 2025</a>. Article 4 of MOFCOM Notification 61/2025 imposes a case-by-case review for memory chips at 256-layer and above and logic at 14 nanometer and below, plus production and testing equipment. <a href="https://carraglobe.com/china-rare-earth-export-controls-2026/">Carra Globe, &#8220;China Rare Earth Export Controls 2026,&#8221; May 2026</a>: case-by-case review remains active during the November 2025 suspension. MOFCOM original text: <a href="https://cset.georgetown.edu/publication/mofcom-notice-2025-61/">Center for Security and Emerging Technology translation of Notice No. 61</a>.</p><p>[14] Discovery Alert/industry analyst commentary, November 2025, citing industry consensus on heavy rare earth separation onshoring timelines. <em>(B-tier; consistent with multiple industry sources but no single A-tier confirmation.)</em></p><p>[15] <a href="https://www.pbs.org/newshour/politics/watch-trump-announces-plan-for-rare-earth-elements-strategic-reserve">PBS NewsHour / AP wire, &#8220;WATCH: Trump announces plan for rare earth elements strategic reserve,&#8221; February 2, 2026</a>; <a href="https://fortune.com/2026/02/03/project-vault-critical-minerals-stockpile-rare-earths-first-step-break-china-chokehold/">Fortune, &#8220;New &#8216;Project Vault&#8217; critical minerals stockpile is &#8216;first step of many&#8217;,&#8221; February 3, 2026</a>. 
Procurement firms named: Hartree Partners, Mercuria, Traxys.</p><p>[16] <a href="https://www.questmetals.com/blog/project-vault-12-billion-critical-mineral-stockpile">Quest Metals industry analysis, &#8220;Project Vault: $12 Billion Critical Mineral Stockpile,&#8221; February 5, 2026</a>, describing draw-down and replenishment structure.</p><p>[17] <a href="https://rareearthexchanges.com/news/project-vault-america-wants-a-strategic-minerals-reserve-but-can-it-stockpile-what-it-still-cant-produce/">Rare Earth Exchanges, &#8220;Project Vault: America Wants a Strategic Minerals Reserve &#8212; But Can It Stockpile What It Still Can&#8217;t Produce?,&#8221; May 2026</a>.</p><p>[18] <a href="https://investors.mpmaterials.com/investor-news/news-details/2025/MP-Materials-Reports-Third-Quarter-2025-Results/default.aspx">MP Materials Q3 2025 earnings release, November 6, 2025</a>; USGS MCS 2026, <em>Rare Earths (Heavy)</em> chapter, citing $150 million Department of War loan in August 2025.</p><p>[19] Lynas Rare Earths Q3 FY2026 results; <a href="https://www.argusmedia.com/en/news-and-insights/latest-market-news/2748095-lynas-rare-earth-output-rises-in-3q">Argus Media, &#8220;Lynas rare earth output rises in 3Q,&#8221; November 3, 2025</a>; <a href="https://rareearthexchanges.com/news/lynas-doubles-down-on-heavy-rare-earths-as-the-wests-only-scaled-separation-powerhouse/">Rare Earth Exchanges, &#8220;Lynas Doubles Down on Heavy Rare Earths,&#8221; February 25, 2026</a>.</p><p>[20] IEA, op. cit. China exported 58,000 tons of rare earth magnets in 2024.</p><p>[21] <a href="https://rareearthexchanges.com/news/china-targets-chipmaking-bottleneck-rare-earth-polishing-project-launches-in-shenzhen/">Rare Earth Exchanges, &#8220;China Targets Chipmaking Bottleneck: Rare Earth Polishing Project Launches in Shenzhen,&#8221; March 19, 2026</a>. 
<em>(B-tier source; project is announced state R&amp;D, not yet commercial-scale; treat as directional signal.)</em></p><p>[22] <a href="https://www.brookings.edu/articles/what-will-happen-when-trump-meets-xi/">Brookings, &#8220;What will happen when Trump meets Xi?,&#8221; May 5, 2026</a>; <a href="https://www.pakistantoday.com.pk/2026/05/08/where-are-the-flash-points-in-next-weeks-trump-xi-talks">Pakistan Today, &#8220;Trump-Xi talks to focus on trade, Iran and Taiwan,&#8221; May 8, 2026</a>.</p><p>[23] <a href="https://www.washingtontimes.com/news/2026/apr/20/chinese-fentanyl-exports-lock-rare-earths-top-trumps-agenda-summit-xi/">Washington Times, &#8220;Chinese fentanyl exports, lock on rare earths to top Trump&#8217;s agenda at summit with Xi,&#8221; April 20, 2026</a>, citing USTR Jamieson Greer testimony to House Appropriations subcommittee.</p>]]></content:encoded></item><item><title><![CDATA[Where the Uranium bottlenecks actually are]]></title><description><![CDATA[Capacity Factor &#8212; Post 1 of 6 in a series on US nuclear fuel cycle equities.]]></description><link>https://www.airealist.ai/p/where-the-uranium-bottlenecks-actually</link><guid isPermaLink="false">https://www.airealist.ai/p/where-the-uranium-bottlenecks-actually</guid><dc:creator><![CDATA[Dante]]></dc:creator><pubDate>Sat, 09 May 2026 23:28:27 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UhRb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Energy is the critical bottleneck for AI infrastructure today. In <em><a href="https://www.airealist.ai/p/the-half-life-of-a-press-release">The Half-Life of a Press Release</a></em>, we examined recent Small Modular Reactor hyperscaler announcements and their critical dependence on nuclear fuel enrichment. 
In this piece, we will focus on American companies operating in this field.</p><p>In May 2026, McKinsey published <a href="https://www.mckinsey.com/industries/electric-power-and-natural-gas/our-insights/understanding-domestic-nuclear-fuel-production-options-in-the-united-states">this report</a> [1] on the US domestic nuclear fuel cycle that put a number on the rebuild: <strong>$105&#8211;170 billion of capex through 2050</strong>, split across mining, conversion, enrichment, fabrication, and reprocessing.</p><p>That&#8217;s a useful frame, but it&#8217;s not the investable number. The investable number is which one or two segments will absorb more than half of the new awards in the next 36 months, because the rest of the chain cannot move without them.</p><p>This is the first in a six-part series on US-listed nuclear-fuel-cycle equities. I screened 22 names against four filters &#8212; small/mid-cap, off all-time-high, accelerating fundamentals, and early narrative &#8212; and by the end of the series, I&#8217;ll be down to a five-name long book.</p><p>But before any of that, you have to understand where the bottlenecks actually are. 
They are not where most of the public conversation says they are.</p><h1><strong>The five segments and what they cost</strong></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UKOA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UKOA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 424w, https://substackcdn.com/image/fetch/$s_!UKOA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 848w, https://substackcdn.com/image/fetch/$s_!UKOA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 1272w, https://substackcdn.com/image/fetch/$s_!UKOA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UKOA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png" width="846" height="632" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:632,&quot;width&quot;:846,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:48301,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://dante126.substack.com/i/196714213?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!UKOA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 424w, https://substackcdn.com/image/fetch/$s_!UKOA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 848w, https://substackcdn.com/image/fetch/$s_!UKOA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 1272w, https://substackcdn.com/image/fetch/$s_!UKOA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bbb39a4-a091-4ec8-8a54-577ce8f47a7a_846x632.png 1456w" sizes="100vw" loading="lazy" fetchpriority="high"></picture></div></a></figure></div><p>The fuel cycle decomposes into five sequential nodes plus two adjacencies (reactors and waste/storage). 
Here&#8217;s the McKinsey capex stack:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bb7J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bb7J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 424w, https://substackcdn.com/image/fetch/$s_!Bb7J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 848w, https://substackcdn.com/image/fetch/$s_!Bb7J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 1272w, https://substackcdn.com/image/fetch/$s_!Bb7J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bb7J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png" width="1234" height="300" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:300,&quot;width&quot;:1234,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:72017,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dante126.substack.com/i/196714213?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Bb7J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 424w, https://substackcdn.com/image/fetch/$s_!Bb7J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 848w, https://substackcdn.com/image/fetch/$s_!Bb7J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 1272w, https://substackcdn.com/image/fetch/$s_!Bb7J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F790fdd88-0120-45c3-ae0e-0eaf62aed7e1_1234x300.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>If you read those numbers naively, reprocessing is the biggest opportunity. It isn&#8217;t. 
Commercial reprocessing has been effectively blocked in the US since Jimmy Carter&#8217;s 1977 executive order [2] and remains uninvestable on any horizon shorter than a decade. The capex range is wide because it&#8217;s a greenfield-risk number for a thing that probably won&#8217;t get built before 2040.</p><p>Mining looks underweighted at $15&#8211;20B. It is, but globally there is no shortage of uranium-producing capacity. Kazatomprom alone supplies roughly 40% of global production at low cost [3]. Adding US mining is a national-security argument, not a global-capacity argument. The investable angle in mining is uranium-spot beta plus US-specific permitting and ramp execution &#8212; not ground-up mine economics.</p><p>The interesting numbers are conversion and enrichment.</p><h1><strong>Where the bottleneck actually is</strong></h1><p>I&#8217;d score the seven nodes like this for severity over the next decade. Severity scale: 5 = single point of failure for the chain; 1 = not a binding constraint.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UhRb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UhRb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 424w, https://substackcdn.com/image/fetch/$s_!UhRb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 848w, 
https://substackcdn.com/image/fetch/$s_!UhRb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 1272w, https://substackcdn.com/image/fetch/$s_!UhRb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UhRb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png" width="1222" height="1056" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1056,&quot;width&quot;:1222,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:190298,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://dante126.substack.com/i/196714213?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!UhRb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 424w, 
https://substackcdn.com/image/fetch/$s_!UhRb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 848w, https://substackcdn.com/image/fetch/$s_!UhRb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 1272w, https://substackcdn.com/image/fetch/$s_!UhRb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36fb491c-f69a-4f92-84f4-57e4b0be942a_1222x1056.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Three observations from this table that surprised me when I started doing this work.</p><p><strong>First, HALEU enrichment is the single tightest knot in the chain.</strong></p><p>HALEU &#8212; high-assay low-enriched uranium, 5&#8211;19.75% U-235 &#8212; is what every advanced reactor needs for its first core:</p><ul><li><p><a href="https://oklo.com/">Oklo</a> Aurora,</p></li><li><p><a href="https://www.terrapower.com/">TerraPower</a> Natrium,</p></li><li><p><a href="https://x-energy.com/">X-energy</a> Xe-100,</p></li><li><p><a href="https://kairospower.com/">Kairos Power</a> KP-FHR.</p></li></ul><p>Until 2024, virtually all commercial HALEU came from Russia. Today, Centrus Energy has produced the first ~900 kg of US-origin HALEU at Piketon, Ohio. That is the entire commercial Western supply.</p><p><strong>Second, conversion is almost as tight &#8212; and there is no way to play it directly on the listed US tape.</strong></p><p>The single operating US conversion facility is <a href="https://www.solsticeam.com/">ConverDyn&#8217;s</a> Metropolis Works in Illinois, running at roughly 7 ktU/yr against an original nameplate of 15 ktU/yr. Its parent is Honeywell. Honeywell is a $137B mega-cap where conversion is a low single-digit percent of revenue. There is no listed pure-play. This matters for the screen because it means even if you correctly identify conversion as the tightest commercial bottleneck, you cannot express it cleanly through a single name. Anyone who tells you they have a &#8220;conversion trade&#8221; via Honeywell is overstating their position.</p><p><strong>Third, advanced fuel fabrication (TRISO and metallic alloys) is also acute, with similarly thin investable exposure.</strong> The NRC granted X-energy the first-ever Category II TRISO fuel fabrication license in February 2026. X-energy is private. 
The only public direct play in advanced fuel fab is BWX Technologies (NYSE: BWXT) &#8212; and BWXT is a $19B mid-cap trading near its all-time high, well-covered, and structurally above the size cap most thematic books carry.</p><p><strong>Mining sits below those three in severity.</strong> It is a thematic-beta trade with a structural overlay, not a structural trade with a price overlay. That distinction matters: if uranium spot rolls over 25%, mining-name multiples compress fast. The conversion and HALEU bottlenecks don&#8217;t decompress that way.</p><p><strong>The DOE award everyone should be paying attention to</strong></p><p>On January 6, 2026, the US Department of Energy awarded $2.7 billion [4], split evenly three ways:</p><ul><li><p><strong>$900M to American Centrifuge Operating</strong> (a Centrus Energy subsidiary) for HALEU at Piketon, Ohio.</p></li><li><p><strong>$900M to General Matter</strong> for HALEU at the former Paducah Gaseous Diffusion Plant in Kentucky. <a href="https://generalmatter.com/">General Matter</a> only emerged from stealth in April 2025 and signed its DOE land lease in August 2025.</p></li><li><p><strong>$900M to Orano Federal Services</strong> for LEU at Project IKE in Oak Ridge, Tennessee &#8212; a piece of a roughly $5B greenfield enrichment project.</p></li></ul><p>Plus a smaller $28M supplemental award to Global Laser Enrichment [5] (Silex / Cameco JV) for next-gen technology.</p><p>The structure of this award is, to me, the most consequential signal in the McKinsey article. The federal government had a choice: concentrate the bet behind one US-owned producer, or seed three separate efforts. <strong>It chose three.</strong> That decision compresses per-name optionality versus a winner-take-all outcome, but it converts the question from &#8220;will US-owned HALEU exist?&#8221; (speculative) to &#8220;which of three named producers will execute first?&#8221; (handicapping).</p><p>Two of the three are private. 
The only listed name that won a tranche is <strong>Centrus Energy (NYSE: LEU)</strong>. That is why every conversation about US enrichment exposure starts and often ends with Centrus &#8212; the math of public-market exposure forces it.</p><p><strong>What this means for stock-picking</strong></p><p>If you&#8217;re a thematic investor with a US-listed mandate, the McKinsey frame collapses to a few hard observations.</p><p>1. <strong>HALEU enrichment is where bottleneck severity, federal funding, and listed exposure all converge.</strong> This is where the work has to be most rigorous, because the names are crowded and the cone of outcomes is wide.</p><p>2. <strong>Conversion is structurally critical but offers no clean public expression.</strong> A future Solstice / Honeywell Advanced Materials spinoff is the most-watched corporate-action catalyst in the cycle.</p><p>3. <strong>Mining is investable but it is a uranium-price trade with a structural overlay, not the other way around.</strong> The order of those words is the difference between a 30% drawdown and a five-bagger.</p><p>4. <strong>The picks-and-shovels lane</strong> &#8212; waste handling, dosimetry, decommissioning instrumentation &#8212; <strong>is its own structural thesis</strong>, and there is exactly one filter-compliant mid-cap in it. I&#8217;ll come back to that in Post 4.</p><p>5. <strong>The advanced reactor adjacency</strong> (NuScale, Oklo, Nano Nuclear, BWXT, GE Vernova) <strong>is the demand engine for the entire chain.</strong> But FOAK economics are still unproven and the narrative is loud. 
Post 5.</p><p>The single most important question I&#8217;m asking through the rest of this series isn&#8217;t &#8220;which of these names is great.&#8221; It&#8217;s &#8220;which of these names is great <em>at a price I should actually pay</em>.&#8221; Most of them aren&#8217;t, today.</p><h1><strong>What&#8217;s coming</strong></h1><ul><li><p><strong>Post 2 &#8212; HALEU enrichment.</strong> Centrus Energy as the McKinsey anchor name. ASP Isotopes&#8217; Quantum Leap Energy subsidiary as the optionality slot. Why I think one of these is fundamentally cheap right now and the other one isn&#8217;t.</p></li><li><p><strong>Post 3 &#8212; Conversion.</strong> The bottleneck nobody can play directly, and the Solstice spin that might fix that.</p></li><li><p><strong>Post 4 &#8212; Picks-and-shovels.</strong> One mid-cap that passes the screen in two of seven segments simultaneously.</p></li><li><p><strong>Post 5 &#8212; SMR demand.</strong> Why I think the post-CFPP-cancellation reset on NuScale is more advanced than the market recognizes &#8212; and why that doesn&#8217;t mean I&#8217;m long.</p></li><li><p><strong>Post 6 &#8212; The book.</strong> Five-name long book, position-sized, with explicit falsification triggers for each.</p></li></ul><p>Subscribe if you want this in your inbox over the next few weeks.</p><p><strong>Further reading</strong></p><p>[1] McKinsey &amp; Co. 
&#8212; <a href="https://www.mckinsey.com/industries/electric-power-and-natural-gas/our-insights/understanding-domestic-nuclear-fuel-production-options-in-the-united-states">Understanding domestic nuclear fuel production options in the United States</a></p><p>[2] <a href="https://www.osti.gov/biblio/6770612">Jimmy Carter&#8217;s Executive Order</a></p><p>[3] <a href="https://www.kazatomprom.kz/en/page/uranium_market">Kazatomprom</a> - Uranium market</p><p>[4] DOE &#8212; <a href="https://www.energy.gov/articles/us-department-energy-awards-27-billion-restore-american-uranium-enrichment">Awards $2.7 billion to restore American uranium enrichment</a></p><p>[5] ANS Nuclear Newswire &#8212; <a href="https://www.ans.org/news/article-7652/doe-awards-27b-for-haleu-and-leu-enrichment/">DOE awards $2.7B for HALEU and LEU enrichment</a></p><p>World Nuclear News &#8212; <a href="https://www.world-nuclear-news.org/articles/us-enrichment-funding-reactions">US enrichment funding recipients flesh out plans</a></p><p><a href="https://www.congress.gov/bill/118th-congress-house-bill/1042">Prohibiting Russian Uranium Imports Act</a> P.L. 118-62</p><p><em>Capacity Factor is a six-part series on US nuclear fuel-cycle equities. Next post: HALEU enrichment.</em></p>]]></content:encoded></item><item><title><![CDATA[The $500 Billion Umbrella]]></title><description><![CDATA[OpenAI called it the largest infrastructure project in history. Now they call it an umbrella.]]></description><link>https://www.airealist.ai/p/the-500-billion-umbrella</link><guid isPermaLink="false">https://www.airealist.ai/p/the-500-billion-umbrella</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Wed, 06 May 2026 15:39:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!E-FD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E-FD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E-FD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!E-FD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!E-FD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!E-FD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E-FD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2225729,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/196113652?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E-FD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 
424w, https://substackcdn.com/image/fetch/$s_!E-FD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!E-FD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!E-FD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07549c52-1fb4-475e-ba27-2cd410d59132_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In January 2025, Sam Altman stood in the White House beside Donald Trump, Masayoshi Son, and Larry Ellison to announce the largest AI infrastructure project in history. Stargate: $500 billion, four years, a network of gigawatt-scale data centers across the United States and eventually the world. Fifteen months later, the project is collapsing from the periphery inward &#8212; and the center isn&#8217;t holding either.</p><p><strong>The scorecard.</strong> In March, OpenAI and Oracle scrapped plans to expand the flagship Stargate campus in Abilene, Texas, from 1.2 gigawatts to 2 gigawatts after financing negotiations broke down. [1] Crusoe, the site developer, had already been struggling with reliability problems &#8212; a winter storm took liquid-cooling infrastructure offline for days. [2]</p><p>Microsoft swept in to rent the abandoned 900 megawatt expansion site from Crusoe. [3] Stargate was supposed to free OpenAI from Microsoft&#8217;s cloud. Now, Microsoft is occupying the data center that OpenAI couldn&#8217;t fill. Oracle, Stargate&#8217;s infrastructure partner, is the landlord to Microsoft at the site OpenAI abandoned. The access moat built a building. Someone else moved in.</p><p>On April 9, OpenAI paused Stargate UK entirely, citing energy costs and the regulatory environment &#8212; and, per Bloomberg, reining in spending ahead of a planned IPO. [4] The Nscale partnership announced in September 2025 &#8212; 8,000 Nvidia processors at Cobalt Park, Tyneside, first quarter 2026 &#8212; passed its own deadline without breaking ground. [5] In Abu Dhabi, Iran&#8217;s Islamic Revolutionary Guard Corps has threatened to destroy the $30 billion Stargate UAE facility, releasing satellite imagery of the site. [6]</p><p>Three sites. Three different failure modes. Financing (Abilene). Energy costs and regulation (UK). Missile threats (UAE). 
The original Abilene campus is operational &#8212; multiple buildings running Nvidia GPUs for OpenAI. But that campus predated the Stargate announcement. The new infrastructure &#8212; the expansion, the international sites, the multi-gigawatt network &#8212; is what the $500 billion was supposed to buy. None of it has materialized.</p><p>Stargate is not an outlier. Thirty to fifty percent of all US data center builds planned for 2026 face delays or cancellation &#8212; roughly half the industry&#8217;s pipeline. [7] Of the 16 gigawatts of planned capacity, only 5 are under construction. By 2027, it gets worse: 6.3 gigawatts under construction against 21.5 announced. [8] The bottleneck is not money &#8212; it is transformers, switchgear, and batteries that nobody can source fast enough. Stargate is just the project with its name on the White House lawn.</p><p><strong>The pivot.</strong> While the sites were stalling, OpenAI abandoned its plan to build and own data centers altogether. In mid-March, The Information reported that OpenAI is now renting server capacity from cloud providers instead of building its own facilities. [9] The company restructured its entire compute team in response to this shift. [10] Total projected spending dropped from $1.4 trillion through 2033 to $600 billion through 2030. [11] OpenAI signed a $100 billion expansion of its AWS agreement &#8212; making Amazon, not Oracle or SoftBank, the de facto third-party infrastructure backbone. [12]</p><p>On April 29, the Financial Times reported that OpenAI has &#8220;in practice abandoned the joint venture.&#8221; [13] One person involved with Stargate said the company had &#8220;sidelined first-party data centers.&#8221; An insider close to SoftBank put it more bluntly: &#8220;People can basically define what &#8216;Stargate&#8217; is for themselves. 
To some extent, any compute project involving SoftBank or Oracle can be called &#8216;Stargate.&#8217;&#8221; [13] OpenAI itself now calls it &#8220;an umbrella for our compute strategy.&#8221; In Norway, another Stargate-branded site fell through; OpenAI couldn&#8217;t close an offtake deal with Nscale at the Narvik facility, and Microsoft stepped in to lease the capacity instead. [13] Partners are &#8220;feeling let down and misled.&#8221; One source told the FT they prefer Microsoft as a tenant because &#8220;they are more creditworthy.&#8221; [13]</p><p>On April 11, three of Stargate&#8217;s original infrastructure leads &#8212; including Peter Hoeschele, who ran the early datacenter effort &#8212; left OpenAI for Meta. [14] The people who built the project are leaving. The day the FT story ran, OpenAI published a blog post claiming it had &#8220;surpassed&#8221; its 10 gigawatt target, with &#8220;more than 3 GW added in the last 90 days alone.&#8221; [15] The language was careful: &#8220;The financing models and partnership structures may evolve, but what matters is capacity coming online at scale.&#8221; This is the tell. Three gigawatts of leased capacity from AWS and Oracle is not three gigawatts of Stargate infrastructure. When you rent a hotel room, you don&#8217;t get to claim you built a hotel. The pivot may produce better economics for OpenAI &#8212; controlling chip decisions while renting the buildings is a defensible strategy. The question is whether the $500 billion investment thesis survives the change.</p><p><strong>What the financing reveals.</strong> SoftBank, Stargate&#8217;s financial partner, took out a $40 billion unsecured bridge loan on March 27 with a twelve-month maturity. [16] The loan&#8217;s primary purpose: funding a $30 billion follow-on investment in OpenAI, bringing SoftBank&#8217;s total equity exposure to approximately $64.6 billion in a single pre-IPO company. 
[17] The loan matures in March 2027, before most Stargate sites will produce a kilowatt. SoftBank is financing the equity bet, not the infrastructure. Oracle, the designated builder, carries over $100 billion in debt on $30 billion in equity, with CDS spreads at their highest since 2009 and its own bondholders suing over undisclosed financing needs. [18] Beyond the original Abilene campus, nobody is financing Stargate construction.</p><p>OpenAI has also signed chip deals totaling nearly 27 gigawatts &#8212; with Nvidia, AMD, Broadcom, and Cerebras. [19] Stargate&#8217;s total planned capacity, as of September 2025, is approximately 7 gigawatts. [20] The chip commitments exceed the infrastructure capacity by roughly 4-to-1. Either the chips go into other people&#8217;s data centers &#8212; which is what &#8220;renting from AWS&#8221; means &#8212; or the commitments are aspirational on both sides.</p><p><strong>The sovereign compute casualty.</strong> The UK pause is not just about energy costs. OpenAI for Countries &#8212; the program extending Stargate to the UK, Australia, Greece, the UAE, Slovakia, Kazakhstan, and others &#8212; was a sovereignty product. [21] The pitch: run frontier models locally within your jurisdiction on dedicated infrastructure. That requires physical infrastructure that OpenAI controls. If OpenAI can&#8217;t build it in the UK &#8212; stable grid, rule of law, English-speaking talent, George Osborne on the payroll &#8212; it can&#8217;t build it in Kazakhstan or Greece either.</p><p>Stargate was the largest AI infrastructure announcement ever made. Fifteen months later, the company that announced it calls it &#8220;an umbrella.&#8221; No international site has broken ground. The builder is being sued by its bondholders. The financier is providing equity financing through a 12-month loan. The people who ran the project are leaving for Meta. Partners who signed up to build data centers are watching Microsoft take the leases. 
The $500 billion bought a valuation, not a data center.</p><p><strong>What happens next?</strong> Two scenarios.</p><p>First, Oracle&#8217;s balance sheet forces a reckoning. Over $100 billion in debt, negative free cash flow, and a quarter-trillion dollars in off-balance-sheet lease commitments are grounds for a credit downgrade. [18] Oracle can no longer finance the buildout at investment-grade rates. The 4.5 gigawatt agreement with OpenAI shrinks or restructures. SoftBank&#8217;s bridge loan matures in March 2027 without the infrastructure to justify a rollover. The Stargate venture is formally wound down or absorbed into existing bilateral cloud contracts.</p><p>Second, the sovereign compute product dies. OpenAI for Countries promised governments dedicated infrastructure inside their borders. If OpenAI is renting, not building, the infrastructure is Amazon&#8217;s or Microsoft&#8217;s &#8212; subject to US jurisdiction, not sovereign control. Governments that signed memoranda of understanding on the promise of sovereign AI discover they bought ChatGPT Edu licenses and a press photo with Sam Altman. The dependency on US cloud infrastructure that the sovereign product was supposed to escape remains intact.</p><p>For any AI infrastructure deal that follows &#8212; Stargate or otherwise &#8212; the test is simple: a site under construction, a power purchase agreement in force, and a builder whose balance sheet can finish the job. 
Anything less is a press release.</p><div><hr></div><h3>Notes</h3><p>[1] Brody Ford, Edward Ludlow, and Dina Bass, <a href="https://www.bloomberg.com/news/articles/2026-03-06/oracle-and-openai-end-plans-to-expand-flagship-data-center">&#8220;Oracle and OpenAI End Plans to Expand Flagship Data Center,&#8221;</a> <em>Bloomberg</em>, March 6, 2026.</p><p>[2] <a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/openais-massive-stargate-data-center-canceled-as-firm-cant-reach-terms-with-oracle-operator-struggles-with-reliability-issues-meta-said-to-be-interested-in-snatching-excess-capacity">&#8220;OpenAI&#8217;s massive Stargate data center canceled as firm can&#8217;t reach terms with Oracle,&#8221;</a> Tom&#8217;s Hardware, March 8, 2026. Crusoe liquid-cooling disruption during winter weather is cited in the piece.</p><p>[3] Dina Bass and Brody Ford, <a href="https://www.bloomberg.com/news/articles/2026-03-27/microsoft-rents-data-center-project-developed-for-oracle-openai">&#8220;Microsoft Rents Data Center Project Developed for Oracle, OpenAI,&#8221;</a> <em>Bloomberg</em>, March 27, 2026. Crusoe confirmed approximately 900 MW capacity, with the first building expected in mid-2027. Earlier <a href="https://www.bloomberg.com/news/articles/2026-03-24/microsoft-to-rent-texas-data-center-dropped-by-oracle-openai">Bloomberg reporting</a> (March 24) cited approximately 700 MW; the difference likely reflects site capacity vs. initial IT load.</p><p>[4] <a href="https://www.bloomberg.com/news/articles/2026-04-09/openai-pauses-stargate-uk-data-center-effort-citing-energy-costs">&#8220;OpenAI Pauses Stargate UK Data Center Citing Energy Costs,&#8221;</a> <em>Bloomberg</em>, April 9, 2026. 
Bloomberg reports OpenAI is &#8220;reining in ambitious spending plans ahead of a highly anticipated public listing.&#8221; OpenAI statement: &#8220;We continue to explore Stargate UK and will move forward when the right conditions, such as regulation and the cost of energy, enable long-term infrastructure investment.&#8221; See also <a href="https://www.cnbc.com/2026/04/09/openai-halts-uk-stargate-project.html">CNBC</a>, April 9, 2026.</p><p>[5] <a href="https://finance.yahoo.com/sectors/technology/articles/openai-flagship-uk-data-project-080000994.html">&#8220;OpenAI&#8217;s flagship UK data project delayed in setback for Starmer,&#8221;</a> <em>The Telegraph</em>, April 4, 2026. The original September 2025 announcement specified ~8,000 Nvidia processors at Cobalt Park, with a Q1 2026 target.</p><p>[6] <a href="https://www.tomshardware.com/tech-industry/iran-threatens-complete-and-utter-annihilation-of-openais-usd30b-stargate-ai-data-center-in-abu-dhabi-regime-posts-video-with-satellite-imagery-of-chatgpt-makers-premier-1gw-data-center">&#8220;Iran threatens &#8216;complete and utter annihilation&#8217; of OpenAI&#8217;s $30B Stargate AI data center in Abu Dhabi,&#8221;</a> Tom&#8217;s Hardware, April 5, 2026. IRGC Brigadier General Ebrahim Zolfaghari&#8217;s statements; satellite imagery of the site included in the IRGC video.</p><p>[9] <em>The Information</em>, reporting on OpenAI&#8217;s shift from building to renting data center capacity, mid-March 2026. 
Cited by <a href="https://www.datacenterdynamics.com/en/news/openai-reorganizes-leadership-amid-data-center-strategy-readjustment/">Data Center Dynamics</a>, <a href="https://thedeepdive.ca/openai-abandons-own-data-center-plans-reshuffles-stargate-leadership-as-financing-falters/">The Deep Dive</a>, <a href="https://www.cnbc.com/2026/03/22/openai-data-center-pivot-underscores-wall-street-ipo-concerns.html">CNBC</a>, and others.</p><p>[10] <a href="https://www.datacenterdynamics.com/en/news/openai-reorganizes-leadership-amid-data-center-strategy-readjustment/">&#8220;OpenAI reorganizes leadership amid data center strategy readjustment,&#8221;</a> Data Center Dynamics, March 18, 2026. Sachin Katti appointed to oversee Stargate groups; the compute team split into three divisions.</p><p>[11] <a href="https://www.cnbc.com/2026/03/22/openai-data-center-pivot-underscores-wall-street-ipo-concerns.html">&#8220;OpenAI&#8217;s data center pivot underscores Wall Street spending concerns ahead of IPO,&#8221;</a> CNBC, March 22, 2026. Total projected compute spending reduced from $1.4 trillion (through 2033) to $600 billion (through 2030).</p><p>[12] OpenAI expanded its existing AWS agreement by $100 billion over eight years; AWS was designated the exclusive third-party cloud distribution provider for OpenAI&#8217;s enterprise platform. <a href="https://www.cnbc.com/2026/02/27/open-ai-funding-round-amazon.html">CNBC</a>, February 27, 2026.</p><p>[16] SoftBank $40 billion unsecured bridge financing facility, March 27, 2026. Twelve-month maturity (March 25, 2027). Syndicated by JPMorgan Chase, Goldman Sachs, Mizuho, SMBC, and MUFG. Interest rate not publicly disclosed as of publication. <a href="https://tech-insider.org/softbank-40-billion-loan-openai-stargate-2026/">Source</a>.</p><p>[17] SoftBank&#8217;s cumulative OpenAI equity exposure: $19 billion initial Stargate equity + $30 billion follow-on = $49 billion confirmed. 
Additional Vision Fund 2 positions bring the estimated total to approximately $64.6 billion (~13% ownership). OpenAI&#8217;s funding round closed at $122 billion in March 2026 at an $852 billion post-money valuation (initial $110B close in February expanded to $122B by final close). Author compilation from <a href="https://www.spglobal.com/market-intelligence/en/news-insights/research/softbabnk-openai-oracle-and-mgx-commit-to-100b-for-stargate-ai-infrastructure">S&amp;P Global</a>, <a href="https://www.cnbc.com/2026/02/27/open-ai-funding-round-amazon.html">CNBC</a>, <a href="https://www.cnbc.com/2026/04/15/openai-stargate-norway-project-microsoft.html">CNBC April 15</a>, and SoftBank disclosures.</p><p>[19] Nvidia: 10 GW LOI, September 2025. AMD: 6 GW definitive agreement, October 2025. Broadcom: 10 GW custom silicon term sheet, October 2025. Cerebras: $10 billion / 750 MW inference deal, January 2026. Sources: respective company announcements and <a href="https://www.tomshardware.com/tech-industry/openai-couldnt-finance-its-data-centers-so-it-took-control-of-hardware-instead">Tom&#8217;s Hardware compilation</a>, February 24, 2026.</p><p>[20] OpenAI, <a href="https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/">&#8220;Building the compute infrastructure for the Intelligence Age,&#8221;</a> April 29, 2026, confirms the original 10 GW commitment: &#8220;When we announced Stargate in January 2025, we committed to securing 10GW of AI infrastructure in the United States by 2029.&#8221; September 23, 2025, expansion announcement brought the total to &#8220;nearly 7 gigawatts&#8221; of Stargate-branded planned capacity.</p><p>[21] OpenAI for Countries program: UK, Australia, Greece, UAE, Slovakia, Kazakhstan, and others. <a href="https://openai.com/global-affairs/openai-for-countries/">OpenAI</a>, September 2025. 
See also <a href="https://www.engadget.com/ai/openai-pauses-its-stargate-uk-data-center-plan-115626978.html">&#8220;OpenAI pauses its Stargate UK data center plan,&#8221;</a> Engadget, April 9, 2026.</p><p>[18] Oracle Corporation Form 10-Q, period ended November 30, 2025 (<a href="https://www.sec.gov/Archives/edgar/data/0001341439/000119312525315925/orcl-20251130.htm">SEC filing</a>). Total debt: $108.1 billion ($8.1B current + $100.0B non-current). Total stockholders&#8217; equity: $30.5 billion. Off-balance-sheet lease commitments of $248 billion are disclosed in notes to financial statements. Bondholder lawsuit: Ohio Carpenters&#8217; Pension Plan v. Oracle, filed January 14, 2026, NYSC. <a href="https://www.bloomberg.com/news/articles/2026-01-15/oracle-sued-over-disclosures-tied-to-18-billion-bond-offering">Bloomberg</a>, January 15, 2026.</p><p>[7] <a href="https://www.sightlineclimate.com/research/data-center-outlook">Sightline Climate, 2026 Data Center Outlook</a>. Of ~16 GW of US data center capacity planned for 2026 across 140 projects, only ~5 GW is under active construction. 25% of projects have not disclosed their powering strategy. See also <a href="https://www.bloomberg.com/news/newsletters/2026-04-01/us-data-center-boom-relies-on-hard-to-find-electrical-equipment">Bloomberg</a>, April 1, 2026.</p><p>[8] 2027 pipeline: 6.3 GW under construction vs. 21.5 GW announced. Beyond 2028, 37 GW of planned capacity has not broken ground, and only 4.5 GW of that has begun work. <a href="https://futurism.com/science-energy/data-centers-construction-supply">Futurism</a>, April 2026; <a href="https://www.zerohedge.com/technology/half-us-data-centers-are-set-be-canceled-or-delayed-2026">ZeroHedge analysis</a> citing Sightline Climate and Canaccord.</p><p>[14] Peter Hoeschele, Shamez Hemani, and Anuj Saharan left OpenAI and are joining Meta. 
Hoeschele led the early Stargate datacenter effort; Hemani worked on computing strategy; Saharan led within the computing organization. <a href="https://www.bloomberg.com/news/articles/2026-04-11/former-openai-stargate-leaders-plan-to-join-meta-platforms">&#8220;Former OpenAI Stargate Leaders Plan to Join Meta Platforms,&#8221;</a> <em>Bloomberg</em>, April 11, 2026.</p><p>[13] <em>Financial Times</em>, reported April 29, 2026. OpenAI has &#8220;in practice abandoned the joint venture.&#8221; One person involved with Stargate said the company had &#8220;sidelined first-party data centers.&#8221; OpenAI described Stargate as &#8220;an umbrella for our compute strategy.&#8221; A person close to SoftBank: &#8220;People can basically define what &#8216;Stargate&#8217; is for themselves. To some extent, any compute project involving SoftBank or Oracle can be called &#8216;Stargate.&#8217;&#8221; Norway Stargate site abandoned; Microsoft leased the Narvik facility from Nscale. Partners &#8220;feeling let down and misled.&#8221; Source preference for Microsoft as tenant: &#8220;They are more creditworthy.&#8221; Cited via <a href="https://www.tomshardware.com/tech-industry/artificial-intelligence/openai-has-effectively-abandoned-first-party-stargate-data-centers-in-favor-of-more-flexible-deals-company-now-prefers-to-lease-compute-and-says-stargate-is-an-umbrella-term">Tom&#8217;s Hardware</a>, April 30, 2026. &#8220;Define for themselves&#8221; quote via <a href="https://finance.biggo.com/news/gwEJ350BLfE1EzqP7-BQ">BigGo Finance</a>, May 1, 2026, citing FT sources. See also <a href="https://www.cnbc.com/2026/04/15/openai-stargate-norway-project-microsoft.html">CNBC</a>, April 15, 2026, for details on Norway.</p><p>[15] OpenAI, <a href="https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/">&#8220;Building the compute infrastructure for the Intelligence Age,&#8221;</a> April 29, 2026. 
Claims to have &#8220;surpassed&#8221; 10 GW target with &#8220;more than 3GW added in the last 90 days alone.&#8221; Note: &#8220;capacity&#8221; in OpenAI&#8217;s usage includes leased capacity from third-party providers (AWS, Oracle, Microsoft), not only self-built infrastructure. The blog&#8217;s language &#8212; &#8220;the financing models and partnership structures may evolve&#8221; &#8212; is an implicit acknowledgment of the FT reporting published the same day.</p>]]></content:encoded></item><item><title><![CDATA[The Round Trip]]></title><description><![CDATA[Four hyperscalers reported $700 billion in combined AI infrastructure spending. The number that matters is how much of their reported AI revenue is their own investment coming home.]]></description><link>https://www.airealist.ai/p/the-round-trip</link><guid isPermaLink="false">https://www.airealist.ai/p/the-round-trip</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Mon, 04 May 2026 09:28:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2Jlv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2Jlv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2Jlv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!2Jlv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!2Jlv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!2Jlv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2Jlv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1431165,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/196115882?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2Jlv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 
424w, https://substackcdn.com/image/fetch/$s_!2Jlv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!2Jlv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!2Jlv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe829ca4a-e89c-4187-b013-b4a83a08470a_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>Three days in April told the story the earnings calls didn&#8217;t.</p><p>On Sunday, April 27, Microsoft and OpenAI announced they were ending their exclusivity arrangement. The partnership that defined the first era of commercial AI &#8212; Microsoft&#8217;s billions in exchange for sole access to OpenAI&#8217;s models &#8212; was restructured into a looser arrangement: a non-exclusive license through 2032, OpenAI free to serve any cloud, Microsoft no longer paying OpenAI a revenue share. The word &#8220;exclusive&#8221; became &#8220;non-exclusive.&#8221; [1]</p><p>On Monday, OpenAI&#8217;s models went live on Amazon Web Services. Bedrock customers could now access GPT and Codex alongside Anthropic&#8217;s Claude, which had been on Bedrock since 2023. Andy Jassy posted on X to celebrate. [2]</p><p>On Tuesday evening, all four of the world&#8217;s largest technology companies reported quarterly earnings within hours of each other. Amazon, Microsoft, Alphabet, and Meta collectively guided capital expenditures to approximately $700 billion in 2026 &#8212; the largest single-year infrastructure commitment in the history of corporate America. [3] Each reported what it called AI revenue. No two defined the term the same way.</p><p>The market moved. Microsoft surged after reporting capital expenditure $3.4 billion below expectations. Meta dropped 6% after raising its capex guidance by $10 billion. [4] Alphabet climbed on Google Cloud&#8217;s 63% revenue growth. Amazon beat on every line. [5] The consensus held: AI spending is working.</p><p>What the consensus missed is that the spending and the revenue are partly the same money.</p><h2>The Numbers Everyone Saw</h2><p>The headline figures are genuinely impressive. AWS grew 28%, Google Cloud crossed $20 billion at 63% growth, Azure hit 40% in constant currency, and Meta&#8217;s ad revenue surged 33%. 
[6][7][8][9] The cloud businesses are performing. The question is what else the earnings showed.</p><p>Amazon&#8217;s net income surged 77% to $30.3 billion. Buried in the 8-K: $16.8 billion of that came from pre-tax gains on its investment in Anthropic, booked as non-operating income. [10] Strip the Anthropic gain, and Amazon&#8217;s net income growth was respectable but unremarkable.</p><p>Alphabet&#8217;s net income rose 81% to $62.6 billion. The filing disclosed $37.7 billion in gains from nonmarketable equity securities &#8212; a category that includes Alphabet&#8217;s stakes in both Anthropic (estimated at 14%) and SpaceX (estimated at 6%). [11] The gains alone exceeded Google Cloud&#8217;s entire quarterly revenue.</p><p>Meta&#8217;s earnings included an $8 billion one-time tax benefit from the One Big Beautiful Bill Act &#8212; a different kind of inflation, clearly labeled and widely noted. [12]</p><p>Of the four companies, only Microsoft reported earnings growth driven primarily by operations rather than investment gains or tax adjustments. Its $4.27 EPS, up 21%, reflected actual growth in cloud and software revenue. 
[13] This matters for what follows.</p><p>The question nobody on the earnings calls asked: how much of the AI revenue driving cloud business growth comes from the same companies whose rising valuations are boosting net income?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JS9Z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JS9Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 424w, https://substackcdn.com/image/fetch/$s_!JS9Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 848w, https://substackcdn.com/image/fetch/$s_!JS9Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 1272w, https://substackcdn.com/image/fetch/$s_!JS9Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JS9Z!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:885,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73330,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/196115882?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JS9Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 424w, https://substackcdn.com/image/fetch/$s_!JS9Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 848w, https://substackcdn.com/image/fetch/$s_!JS9Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 1272w, https://substackcdn.com/image/fetch/$s_!JS9Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fe1957c-42e5-4f99-8b3a-6e2696932573_1400x885.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h2>The Capital Loop</h2><p>The financing architecture of frontier AI has converged on a single structure. Call it the round-trip.</p><p><strong>Step one: the hyperscaler invests equity in the model company.</strong> Amazon has committed up to $33 billion in Anthropic &#8212; $8 billion in prior rounds, $5 billion in April 2026, and up to $20 billion more tied to commercial milestones. It also committed $50 billion to OpenAI in February, with $15 billion disbursed initially. [14] Alphabet committed up to $40 billion in Anthropic &#8212; $3 billion in prior rounds, $10 billion in April 2026, and up to $30 billion contingent on performance. [15] Microsoft invested $5 billion in Anthropic in November 2025, alongside Nvidia&#8217;s $10 billion. [16] Microsoft&#8217;s prior investment in OpenAI exceeds $13 billion. 
[17]</p><p><strong>Step two: the model company commits to spend multiples of the investment on the hyperscaler&#8217;s infrastructure.</strong> Anthropic committed more than $100 billion to AWS over ten years, securing up to 5 gigawatts of Trainium capacity. [18] It committed $30 billion to Microsoft Azure. [19] It signed a deal with Google and Broadcom for 5 gigawatts of TPU capacity. [20] OpenAI committed $250 billion to Azure through 2032 &#8212; though that commitment is now non-exclusive. [21] OpenAI committed to 2 gigawatts of Trainium capacity on AWS. [22]</p><p>Add it up: Anthropic alone has committed at least $130 billion in cloud infrastructure spending to the three hyperscalers that have collectively pledged up to $88 billion in equity to Anthropic. OpenAI has committed over $250 billion to cloud providers that have invested over $63 billion in OpenAI. Across both model companies, the committed infrastructure spending exceeds the equity investment by more than 2-to-1.</p><p><strong>Step three: the hyperscaler reports the resulting cloud consumption as AI revenue.</strong> Amazon cites $15 billion in annualized AI revenue run rate, noting that over 100,000 customers run Claude on Bedrock. [23] Microsoft reports $37 billion in annualized AI revenue, which includes &#8220;all revenue from model builders&#8221; on Azure. [24] Google Cloud&#8217;s $20 billion in quarterly revenue includes Anthropic&#8217;s TPU consumption, and the company&#8217;s backlog of $462 billion &#8212; which nearly doubled quarter over quarter &#8212; now includes TPU hardware agreements. [25]</p><p>None of the three discloses how much of their reported AI revenue comes from companies in which they&#8217;ve invested.</p><p><strong>Step four: the hyperscaler books mark-to-market gains on the equity investment.</strong> These are unrealized gains &#8212; the investment&#8217;s value has risen on paper, but the hyperscaler hasn&#8217;t sold. 
Amazon recorded $16.8 billion in pre-tax gains from Anthropic in Q1 alone &#8212; gains driven by Anthropic&#8217;s valuation rising from $183 billion (September 2025) to $380 billion (February 2026). [26] Alphabet recorded $37.7 billion in gains from nonmarketable equity securities, a category that includes both Anthropic and SpaceX. [27] Both gains are booked as income in the same quarter the hyperscaler reports growing AI revenue from those same companies.</p><p>The loop is self-reinforcing. The hyperscaler invests, which funds the model company&#8217;s growth. The model company spends on the hyperscaler&#8217;s infrastructure, which grows the cloud business. The cloud growth supports the hyperscaler&#8217;s stock price. The model company&#8217;s valuation rises &#8212; driven partly by the revenue it generates, which is partly the hyperscaler&#8217;s own infrastructure spend. The hyperscaler books the valuation gain as income.</p><p>This is not fraud. It is not accounting manipulation. Every transaction is arm&#8217;s-length, audited, and disclosed. But the market is pricing the revenue growth and the investment gains as if they are two independent signals of AI&#8217;s value. They are substantially the same signal, measured twice.</p><h2>The Gains Amplifier</h2><p>The revenue side of the round-trip is meaningful but bounded. Even if Anthropic&#8217;s entire $100 billion AWS commitment were disbursed evenly over 10 years, it would amount to roughly $10 billion annually &#8212; significant, but less than 7% of AWS&#8217;s current run rate. The round-trip revenue is a growing fraction of cloud revenue, not the majority.</p><p>The income statement side is a different order of magnitude.</p><p>Amazon&#8217;s $16.8 billion gain on its Anthropic investment in Q1 exceeded the company&#8217;s entire AWS operating income for the quarter. 
[28] Strip the gain, and Amazon&#8217;s EPS falls from $2.78 to roughly $1.55 &#8212; an adjusted figure that barely clears the $1.66 consensus. [29] The disclosure is there. The headline isn&#8217;t.</p><p>Alphabet&#8217;s $37.7 billion in equity gains &#8212; which include SpaceX and other private investments alongside Anthropic &#8212; exceeded Google Cloud&#8217;s combined quarterly revenue and operating income. [30] Alphabet&#8217;s reported EPS was $5.11 against a consensus of $2.63. Strip the equity gains: CNBC reported the adjusted EPS at $2.62 &#8212; a one-cent <em>miss</em>. [31] The most celebrated earnings beat of the quarter was, on an adjusted basis, not a beat at all. To be precise: not all of Alphabet&#8217;s $37.7 billion in gains came from Anthropic &#8212; SpaceX is likely the largest single contributor, and the filing doesn&#8217;t disaggregate. But the structural point holds for every dollar that did come from AI investments: a substantial portion of the gains that inflated Alphabet&#8217;s headline earnings came from private companies whose valuations the hyperscalers&#8217; own capital helped create.</p><p>The gains amplifier compounds the revenue story. The cloud business is growing partly because major companies are spending on infrastructure. That growth justifies higher capex. Higher capex builds more infrastructure. More infrastructure attracts more model company commitments. Model company valuations rise. Analysts revise EPS estimates upward. 
The cycle continues until the model companies either stop growing, go public (crystallizing the gains), or diversify away from the infrastructure that supports the loop.</p><p>The OpenAI-Microsoft restructuring is evidence that the third scenario &#8212; diversification &#8212; is already underway.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zGjh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zGjh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 424w, https://substackcdn.com/image/fetch/$s_!zGjh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 848w, https://substackcdn.com/image/fetch/$s_!zGjh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 1272w, https://substackcdn.com/image/fetch/$s_!zGjh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zGjh!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:850,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:111213,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/196115882?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zGjh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 424w, https://substackcdn.com/image/fetch/$s_!zGjh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 848w, https://substackcdn.com/image/fetch/$s_!zGjh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 1272w, https://substackcdn.com/image/fetch/$s_!zGjh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f2189f1-aa0f-4191-a413-fe7c614e5e89_1400x850.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h2>The &#8220;Primary&#8221; Ratchet</h2><p>The word &#8220;primary&#8221; appears in every major hyperscaler-model company agreement. It is doing extraordinary financial work.</p><p>Amazon calls itself Anthropic&#8217;s &#8220;primary cloud provider&#8221; and &#8220;primary training partner.&#8221; [32] The April 2026 announcement extended this designation through a ten-year, $100 billion commitment. But Anthropic simultaneously holds commitments from Google (5 gigawatts of TPU capacity), Microsoft ($30 billion in Azure), and Nvidia ($10 billion in an optimization partnership). Claude is available on all three major clouds. &#8220;Primary&#8221; means first among several, not sole provider.</p><p>Microsoft was OpenAI&#8217;s exclusive cloud partner. 
Then, in February 2026, OpenAI signed a $50 billion deal with Amazon &#8212; including exclusive rights for OpenAI&#8217;s Frontier agent tool on AWS. Microsoft publicly objected, insisting it maintained exclusive API rights. [33] Two months later, the exclusivity was gone. The April 27 restructuring made Microsoft&#8217;s license non-exclusive, permitted OpenAI to serve all products on any cloud, and eliminated Microsoft&#8217;s revenue share payments to OpenAI. [34] The word &#8220;exclusive&#8221; became &#8220;primary.&#8221; Give it two renegotiations and &#8220;primary&#8221; may become &#8220;significant.&#8221;</p><p>I examined this pattern two weeks ago in the Amazon-Anthropic context: three qualifying axes &#8212; contract language, product geography, and access scope &#8212; each narrowing with every successive deal. [35] The OpenAI-Microsoft restructuring confirms the pattern is structural, not company-specific. Model companies outgrow exclusivity. The infrastructure commitments survive in the contract, but the revenue concentration doesn&#8217;t survive in practice. The market&#8217;s reaction was telling: Microsoft&#8217;s stock rose on the restructuring. Ending the revenue share and retaining a non-exclusive license through 2032 was priced as discipline, not loss. That may be right &#8212; for Microsoft. The question is whether the same logic applies to Amazon and Google, whose round-trips are still deepening.</p><p>The financial asymmetry is sharp. The hyperscaler&#8217;s infrastructure investment &#8212; $200 billion in Amazon&#8217;s case, $190 billion for Microsoft &#8212; is sunk. Data centers and custom silicon can&#8217;t be redeployed to non-AI workloads overnight. The model company&#8217;s commitment is contractual: real, binding, but portable across providers in a multi-cloud world. 
The model company diversifies its supply; the hyperscaler has already poured the concrete.</p><h2>The Control Case</h2><p>Meta reported the same week and serves as the experiment&#8217;s control group.</p><p>Meta&#8217;s AI capex is enormous &#8212; $125 to $145 billion guided for 2026, raised from $115 to $135 billion this quarter due to higher component pricing. [36] But Meta&#8217;s structure is fundamentally different. It doesn&#8217;t invest equity in external model companies. It doesn&#8217;t receive compute commitments from AI startups. Its AI investment is entirely internal: model training, inference infrastructure, and product integration. The return shows up in one place &#8212; advertising revenue.</p><p>That return is real. Meta&#8217;s ad revenue grew 33% to $55 billion, driven by AI improvements in ad targeting, with impressions up 19% and price per ad up 12%. [37] No mark-to-market gains inflated the earnings. The $8 billion tax benefit was clearly disclosed and widely noted.</p><p>Meta&#8217;s 33% revenue growth on pure internal AI investment &#8212; no round-trip, no circular gains, no mark-to-market windfalls &#8212; is what AI spending looks like when the market can see both sides of the ledger. The market punished Meta not for bad results but for guiding to roughly $135 billion in capex at the midpoint without the gains amplifier that made Amazon&#8217;s and Alphabet&#8217;s earnings look transformative.</p><h2>The Revenue That Can&#8217;t Be Counted</h2><p>The round-trip creates a measurement problem that neither the companies nor the analysts have solved.</p><p>Microsoft&#8217;s $37 billion in annualized AI revenue includes revenue from model builders running on Azure &#8212; but also Copilot enterprise seats, GitHub Copilot, and Azure AI services that have nothing to do with the round-trip. [38] The OpenAI-specific component is undisclosed but likely a fraction of the total. Before April 27, that fraction included all of OpenAI&#8217;s API traffic. 
After April 27, OpenAI can run on any cloud. How much of the $37 billion is at risk of migration? Microsoft hasn&#8217;t said. Amy Hood noted on the call that Azure could have grown above 40% if the company hadn&#8217;t allocated GPU capacity to first-party products like Copilot, framing the growth as supply-constrained. [39] That narrative was credible when OpenAI was exclusive. It reads differently when OpenAI has options.</p><p>Anthropic&#8217;s $30 billion run-rate revenue is reported on a gross basis &#8212; counting total end-customer spend as revenue and booking cloud infrastructure costs as expenses. [40] OpenAI disputes this approach, arguing it inflates Anthropic&#8217;s figure by approximately $8 billion relative to a net reporting basis. [41] The dispute matters here because gross reporting means Anthropic&#8217;s revenue includes the full hyperscaler infrastructure payment as both a top-line figure and an expense. The same dollar appears in both Anthropic&#8217;s revenue and AWS&#8217;s or Google Cloud&#8217;s AI revenue. This is standard revenue recognition &#8212; not double-counting in the GAAP sense &#8212; but it means the market is valuing both sides of the same transaction. To be clear: Anthropic&#8217;s cloud spending is real compute demand, the same infrastructure purchase any enterprise makes at market rates. The circularity is not in consumption but in financing &#8212; the equity investment that funds the model company and the valuation gains that flow from that same company&#8217;s growth.</p><p>Google&#8217;s $462 billion backlog now includes TPU hardware agreements with Anthropic. [42] Alphabet expects to recognize just over 50% of that backlog as revenue within 24 months. 
[43] When a cloud provider&#8217;s backlog includes committed purchases from a company the cloud provider has invested $40 billion in, the backlog is partly a measure of the cloud provider&#8217;s own capital commitment cycling through the system.</p><p>The aggregate picture: three hyperscalers have collectively pledged over $140 billion in equity to two model companies &#8212; over $150 billion, including Nvidia&#8217;s $10 billion. Those model companies have committed over $380 billion in infrastructure spending back to the same hyperscalers. The hyperscalers report the resulting cloud revenue as AI growth. They book the rising valuations as income. And the market prices both signals &#8212; the revenue and the gains &#8212; as independent validation that AI is working.</p><p>Most of it is working. The cloud growth is predominantly organic: enterprise AI adoption is real, developer demand for inference is real, and the shift to AI-native workloads is structural. The hyperscalers would argue the round-trip is simply a customer acquisition cost &#8212; that anchoring the model company to their infrastructure attracts organic enterprise customers who arrive for Claude or GPT and stay for the platform. That argument has merit. The question is whether the market is pricing AI revenue at the margin of a customer acquisition funnel or at the margin of independent organic demand.</p><p>But the gains that turned good quarters into spectacular ones &#8212; $16.8 billion at Amazon, $37.7 billion at Alphabet &#8212; are the round-trip showing up in the income statement. And those gains are entirely a function of rising private valuations that the hyperscalers&#8217; own investments helped create.</p><h2>What Breaks</h2><p>The round-trip is stable as long as three conditions hold: <strong>the model companies keep growing</strong>, <strong>the infrastructure commitments convert to actual spend</strong>, and <strong>the private valuations keep rising</strong>. 
All three are under pressure.</p><p>The capex itself is increasingly debt-funded. Amazon&#8217;s trailing twelve-month free cash flow collapsed 95% to $1.2 billion &#8212; meaning its $200 billion in 2026 capex requires substantial debt issuance. [44] Amazon would note that its FCF was similarly compressed during the AWS buildout a decade ago, and that investment proved transformative. The difference is structural: the current investment includes $83 billion in equity deployed to companies whose revenue partly cycles back through Amazon&#8217;s own infrastructure &#8212; a financing loop the AWS buildout did not have. And much of the equity &#8220;deployed&#8221; to model companies remains on paper: of Amazon&#8217;s $33 billion in commitments to Anthropic, roughly $13 billion has been disbursed; the remaining $20 billion is tied to commercial milestones that may or may not be met. The round-trip depends on commitments converting to cash. The commitments are real. The cash is conditional.</p><p>Anthropic&#8217;s growth is extraordinary &#8212; $1 billion in annualized revenue at the end of 2024, $30 billion by April 2026. [45] But OpenAI&#8217;s CFO has reportedly said the company cannot afford its promised infrastructure spending, and OpenAI is missing internal targets for users and revenue. [46] If the model companies&#8217; growth decelerates, the infrastructure commitments &#8212; which are contractual but demand-paced &#8212; may disburse more slowly than the backlog implies.</p><p>The IPO question cuts both ways. Anthropic is reportedly considering a listing as early as October 2026 at a potential valuation above $380 billion, with secondary market interest at $800 billion. [47] An IPO would crystallize the hyperscalers&#8217; gains &#8212; converting unrealized mark-to-market into a liquid position with a market price. But it would also end the valuation escalation that produces the gains amplifier. 
A public Anthropic trading at 25 times revenue is a known quantity. A private Anthropic valued on the latest secondary trade between rounds is an appreciating asset every quarter. The hyperscalers benefit more from the run-up to an IPO than from the event itself.</p><p>The &#8220;primary&#8221; ratchet is the structural risk the market has not priced. Every renegotiation loosens the model company&#8217;s commitment to a single provider. Microsoft went from exclusive to non-exclusive in seven years. The model companies would frame the diversification differently &#8212; not as instability but as leverage. A model company that can serve any cloud sets the terms. But the hyperscaler that poured the concrete can&#8217;t repour it. The infrastructure is sunk &#8212; Amazon&#8217;s $200 billion in 2026 capex is building data centers and manufacturing custom silicon that will be operational for a decade. The model companies&#8217; commitments are real but operate on a different timescale. A $100 billion, 10-year commitment is $10 billion a year. A model company that can serve any cloud will allocate that spend to wherever the price, performance, and capacity are best. &#8220;Primary&#8221; protects the first call; it doesn&#8217;t protect the margin.</p><p>The round-trip ratio &#8212; equity invested divided by committed revenue, adjusted for mark-to-market gains &#8212; connects the spending story to the earnings story. Until the hyperscalers disclose how much of their AI revenue comes from companies they&#8217;ve invested in, the market will continue to treat revenue growth and investment gains as independent signals. They&#8217;re not. 
</p><p>They are two readings of the same thermometer, and the temperature is partly the hyperscalers&#8217; own heat.</p><div><hr></div><h3>Notes</h3><p>[1] Microsoft Blog, <a href="https://blogs.microsoft.com/blog/2026/04/27/the-next-phase-of-the-microsoft-openai-partnership/">&#8220;The next phase of the Microsoft-OpenAI partnership,&#8221;</a> April 27, 2026; OpenAI, &#8220;Next phase of Microsoft partnership,&#8221; April 27, 2026.</p><p>[2] CNBC, <a href="https://www.cnbc.com/2026/04/28/openai-brings-models-to-aws-after-ending-exclusivity-with-microsoft.html">&#8220;OpenAI brings its models to Amazon&#8217;s cloud after ending exclusivity with Microsoft,&#8221;</a> April 28, 2026.</p><p>[3] Author&#8217;s compilation from Q1 2026 earnings reports: <a href="https://ir.aboutamazon.com/news-release/news-release-details/2026/Amazon-com-Announces-First-Quarter-Results/">Amazon</a> ~$200B (8-K, FY2026 guidance reiterated); <a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">Microsoft</a> ~$190B (CY2026, per CFO Amy Hood on Q3 FY2026 earnings call, April 29, 2026); <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">Alphabet</a> $180&#8211;190B (raised from $175&#8211;185B, Q1 2026 earnings call); <a href="https://www.cnbc.com/2026/04/29/meta-q1-earnings-report-2026.html">Meta</a> $125&#8211;145B (raised from $115&#8211;135B, Q1 2026 earnings release). Midpoints sum to approximately $700B.</p><p>[4] <a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">Microsoft</a> Q3 FY2026 capex of $31.9B vs. $35.3B consensus (Visible Alpha); <a href="https://www.cnbc.com/2026/04/29/meta-q1-earnings-report-2026.html">Meta</a> shares fell approximately 6% in after-hours trading following capex guidance increase.</p><p>[5] Alphabet shares rose approximately 6% after-hours (<a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">CNBC</a>); Amazon beat on revenue ($181.5B vs. 
$177.3B consensus) and EPS ($2.78 vs. $1.66 consensus) (<a href="https://www.cnbc.com/2026/04/29/amazon-amzn-q1-earnings-report-2026.html">CNBC</a>).</p><p>[6] <a href="https://www.sec.gov/Archives/edgar/data/0001018724/000101872426000012/amzn-20260331xex991.htm">Amazon 8-K, Q1 2026</a>: AWS revenue $37.59B, +28% YoY. StreetAccount consensus was $36.64B.</p><p>[7] <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">Alphabet Q1 2026 earnings release</a>: Google Cloud revenue $20.0B, +63% YoY. Operating income $6.6B, operating margin 32.9%, up from 17.8% in Q1 2025.</p><p>[8] Microsoft Q3 FY2026 earnings call, April 29, 2026: Azure and other cloud services grew 40% in constant currency. AI annualized revenue $37B, +123% YoY. Revenue figure includes &#8220;all revenue from model builders&#8221; per <a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">CNBC reporting</a>.</p><p>[9] <a href="https://www.sec.gov/Archives/edgar/data/0001326801/000162828026028364/meta-03312026xexhibit991.htm">Meta Q1 2026 8-K</a>: Total revenue $56.31B, +33% YoY. Advertising revenue ~$55B. Ad impressions +19%, average price per ad +12%.</p><p>[10] <a href="https://www.sec.gov/Archives/edgar/data/0001018724/000101872426000012/amzn-20260331xex991.htm">Amazon 8-K, Q1 2026</a>: &#8220;First quarter 2026 net income includes pre-tax gains of $16.8 billion included in non-operating income from our investments in Anthropic.&#8221; Net income $30.3B, +77% YoY.</p><p>[11] <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">Alphabet Q1 2026 earnings release</a>: Net income $62.57B, +81% YoY. &#8220;Other income&#8221; of $37.7B, &#8220;primarily the result of net unrealized gains on our nonmarketable equity securities.&#8221; Alphabet&#8217;s private investments include stakes in Anthropic (estimated 14%) and SpaceX (estimated 6%). 
The filing does not disaggregate gains by investment.</p><p>[12] <a href="https://www.sec.gov/Archives/edgar/data/0001326801/000162828026028364/meta-03312026xexhibit991.htm">Meta Q1 2026 8-K</a>: Net income $26.77B. Includes $8.03B income tax benefit tied to the One Big Beautiful Bill Act and U.S. Treasury Notice 2026-7. &#8220;Diluted EPS would have been $3.13 lower without this benefit.&#8221;</p><p>[13] <a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">Microsoft Q3 FY2026 earnings release</a>: EPS $4.27, +21% YoY. Revenue $82.89B, +18% YoY. Operating income up 20%.</p><p>[14] Amazon-Anthropic: $8B in prior investments (multiple rounds, 2023&#8211;2024); $5B in April 2026 at $350B valuation; up to $20B additional tied to commercial milestones. Anthropic blog, <a href="https://www.anthropic.com/news/anthropic-amazon-compute">&#8220;Anthropic and Amazon expand collaboration,&#8221;</a> April 20, 2026. Amazon-OpenAI: $50B committed February 2026 ($15B initial + $35B conditional). <a href="https://www.cnbc.com/2026/04/29/openai-drift-from-microsoft-to-amazon-turns-aggressive-after-subtlety.html">CNBC</a>, February 2026.</p><p>[15] Alphabet-Anthropic: ~$3B in prior investments; $10B at $350B valuation April 24, 2026; up to $30B additional tied to performance targets. <a href="https://techcrunch.com/2026/04/24/google-to-invest-up-to-40-billion-in-anthropic/">TechCrunch</a>, April 24, 2026; Bloomberg, April 24, 2026.</p><p>[16] Microsoft-Anthropic: $5B investment, November 2025. Nvidia-Anthropic: $10B, November 2025. Anthropic committed to $30B Azure compute and 1 GW of Nvidia Grace Blackwell / Vera Rubin capacity. Microsoft Blog, <a href="https://blogs.microsoft.com/blog/2025/11/18/microsoft-nvidia-and-anthropic-announce-strategic-partnerships/">&#8220;Microsoft, NVIDIA and Anthropic announce strategic partnerships,&#8221;</a> November 18, 2025. 
See also <a href="https://www.anthropic.com/news/microsoft-nvidia-anthropic-announce-strategic-partnerships">Anthropic announcement</a>.</p><p>[17] Microsoft&#8217;s total investment in OpenAI exceeds $13B across multiple rounds since 2019. Exact figure is not publicly disclosed in aggregate; $13B is the widely reported estimate. <a href="https://www.cnbc.com/2025/11/18/anthropic-ai-azure-microsoft-nvidia.html">CNBC</a>.</p><p>[18] Anthropic blog, <a href="https://www.anthropic.com/news/anthropic-amazon-compute">&#8220;Anthropic and Amazon expand collaboration,&#8221;</a> April 20, 2026: &#8220;We are committing more than $100 billion over the next ten years to AWS technologies, securing up to 5GW of new capacity to train and run Claude. The commitment spans Graviton and Trainium2 through Trainium4 chips.&#8221;</p><p>[19] Microsoft Blog, <a href="https://blogs.microsoft.com/blog/2025/11/18/microsoft-nvidia-and-anthropic-announce-strategic-partnerships/">&#8220;Microsoft, NVIDIA and Anthropic announce strategic partnerships,&#8221;</a> November 18, 2025: &#8220;Anthropic has committed to purchase $30 billion of Azure compute capacity and to contract additional compute capacity up to one gigawatt.&#8221;</p><p>[20] Anthropic blog, <a href="https://www.anthropic.com/news/google-broadcom-partnership-compute">&#8220;Anthropic expands partnership with Google and Broadcom,&#8221;</a> April 7, 2026. Broadcom SEC filing showed the deal includes 3.5 GW of compute. Separately, Google&#8217;s $40B investment includes 5 GW of TPU capacity over five years.</p><p>[21] OpenAI&#8217;s Azure commitment exceeds $250B through 2032, per multiple reporting outlets citing Microsoft deal terms (now non-exclusive per April 27, 2026 restructuring). Exact figure not disclosed in Microsoft&#8217;s public filings; $250B is the widely cited estimate. B-tier sourcing. 
See <a href="https://www.cnbc.com/2026/04/29/openai-drift-from-microsoft-to-amazon-turns-aggressive-after-subtlety.html">CNBC</a> for deal history.</p><p>[22] CNBC, <a href="https://www.cnbc.com/2026/04/28/openai-brings-models-to-aws-after-ending-exclusivity-with-microsoft.html">&#8220;OpenAI brings its models to Amazon&#8217;s cloud,&#8221;</a> April 28, 2026. OpenAI committed to 2 GW of AWS Trainium for training.</p><p>[23] Anthropic blog, <a href="https://www.anthropic.com/news/anthropic-amazon-compute">&#8220;Anthropic and Amazon expand collaboration,&#8221;</a> April 20, 2026: &#8220;over 100,000 customers now run Claude on Amazon Bedrock.&#8221; Amazon Q4 2025 earnings call: AI services annualized run-rate revenue of $15B. Reiterated in Q1 context by multiple outlets.</p><p>[24] Microsoft Q3 FY2026 earnings call, April 29, 2026. <a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">CNBC</a>: &#8220;The number includes business from clients running AI services on Azure, including all revenue from model builders, as well as revenue from Microsoft&#8217;s own AI tools.&#8221;</p><p>[25] Alphabet Q1 2026 earnings call: Backlog $462B, CFO Anat Ashkenazi noted increase driven by &#8220;strong demand for enterprise AI offerings and the inclusion of TPU hardware sales.&#8221; Expects to recognize &#8220;just over 50% of the backlog as revenue over the next 24 months.&#8221; <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">CNBC</a>.</p><p>[26] Amazon 8-K, Q1 2026: $16.8B pre-tax gains from Anthropic investments. Anthropic&#8217;s valuation history: $183B (September 2025, Series F); $380B (February 2026, Series G per <a href="https://www.anthropic.com/news/anthropic-series-g">Anthropic blog</a>, February 12, 2026).</p><p>[27] Alphabet Q1 2026 earnings release: $37.7B gains from nonmarketable equity securities. Includes Anthropic and SpaceX among other private investments. 
<a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">CNBC</a>.</p><p>[28] <a href="https://www.sec.gov/Archives/edgar/data/0001018724/000101872426000012/amzn-20260331xex991.htm">Amazon 8-K, Q1 2026</a>: AWS operating income was $11.5B (derived: total operating income $23.9B minus North America $8.3B minus International segment operating income). The $16.8B Anthropic gain exceeds this figure. Note: AWS operating income figure derived from segment data; verify against 10-Q when available.</p><p>[29] Author&#8217;s non-GAAP adjustment. Amazon Q1 2026 net income of $30.3B included $16.8B pre-tax Anthropic gains. At an assumed ~20% effective tax rate on the gain, the after-tax impact is approximately $13.4B, or approximately $1.23 per diluted share. $2.78 reported EPS minus $1.23 &#8776; $1.55 adjusted EPS, compared to the $1.66 consensus estimate. Methodological note: exact tax treatment of the Anthropic gain depends on the structure of the investment and the applicable tax rate, which will be disclosed in the 10-Q. This adjustment is not provided by the company and is constructed by the author for analytical purposes.</p><p>[30] Alphabet Q1 2026: Google Cloud revenue $20.0B, operating income $6.6B, combined $26.6B. Equity gains of $37.7B exceed this combined figure. <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">CNBC</a>.</p><p>[31] CNBC, <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">&#8220;Alphabet Q1 2026 earnings,&#8221;</a> April 29, 2026. Alphabet disclosed the equity securities gains added $2.35 to diluted EPS. Adjusted EPS of $2.62 vs. 
$2.63 expected by analysts polled by LSEG.</p><p>[32] Anthropic blog, <a href="https://www.anthropic.com/news/anthropic-amazon-compute">&#8220;Anthropic and Amazon expand collaboration,&#8221;</a> April 20, 2026: &#8220;We named AWS our primary cloud provider in 2023 and our primary training partner in 2024.&#8221;</p><p>[33] Microsoft blog, February 2026, day of OpenAI-Amazon announcement: &#8220;Microsoft maintains its exclusive license and access to intellectual property across OpenAI models and products. &#8230; Azure remains the exclusive cloud provider of stateless OpenAI APIs.&#8221; <a href="https://techcrunch.com/2026/04/27/openai-ends-microsoft-legal-peril-over-its-50b-amazon-deal/">TechCrunch</a>, April 27, 2026.</p><p>[34] Microsoft Blog, <a href="https://blogs.microsoft.com/blog/2026/04/27/the-next-phase-of-the-microsoft-openai-partnership/">&#8220;The next phase of the Microsoft-OpenAI partnership,&#8221;</a> April 27, 2026. Key terms: non-exclusive license through 2032; OpenAI can serve all products on any cloud; Microsoft no longer pays revenue share to OpenAI; OpenAI revenue share payments to Microsoft continue through 2030, capped.</p><p>[35] <a href="https://www.airealist.ai/">&#8220;The Price of &#8216;Primary,&#8217;&#8221;</a> published April 21, 2026, The AI Realist.</p><p>[36] <a href="https://www.sec.gov/Archives/edgar/data/0001326801/000162828026028364/meta-03312026xexhibit991.htm">Meta Q1 2026 8-K</a>: 2026 capex guidance $125&#8211;145B, raised from prior $115&#8211;135B. &#8220;This reflects our expectations for higher component pricing this year and, to a lesser extent, additional data center costs to support future year capacity.&#8221;</p><p>[37] <a href="https://www.sec.gov/Archives/edgar/data/0001326801/000162828026028364/meta-03312026xexhibit991.htm">Meta Q1 2026 8-K</a>: Advertising revenue ~$55B, +33% YoY. Ad impressions +19%, average price per ad +12%.</p><p>[38] Microsoft Q3 FY2026 earnings call, April 29, 2026. 
<a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">CNBC</a>: AI annualized revenue of $37B includes revenue from model builders on Azure, Copilot enterprise seats, GitHub Copilot, and other first-party AI tools.</p><p>[39] <a href="https://www.cnbc.com/2026/04/29/microsoft-msft-q3-2026-earnings.html">Microsoft Q3 FY2026 earnings call</a>, April 29, 2026. CFO Amy Hood: Azure could have grown above 40% absent GPU supply constraints / allocation to first-party products.</p><p>[40] <a href="https://sacra.com/c/anthropic/">Sacra</a>, &#8220;Anthropic revenue, valuation &amp; funding&#8221; (accessed April 30, 2026): &#8220;Anthropic reports revenue from cloud resellers (AWS, Google, Microsoft) on a gross basis &#8212; counting total end-customer spend as revenue and booking partner payouts as expenses &#8212; which inflates top-line figures relative to net-reporting peers.&#8221;</p><p>[41] OpenAI&#8217;s position per multiple reports: Anthropic&#8217;s gross revenue overstates by approximately $8B vs. net reporting. Cited by <a href="https://sacra.com/c/anthropic/">Sacra</a>, TNW, Remio.</p><p>[42] Alphabet Q1 2026 earnings call, April 29, 2026: Backlog of $462B now includes TPU hardware agreements. CFO Anat Ashkenazi noted the increase was driven by &#8220;strong demand for enterprise AI offerings and the inclusion of TPU hardware sales.&#8221; <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">CNBC</a>.</p><p>[43] Alphabet Q1 2026 earnings call, April 29, 2026: Expects to recognize &#8220;just over 50% of the backlog as revenue over the next 24 months.&#8221; <a href="https://www.cnbc.com/2026/04/29/alphabet-googl-q1-2026-earnings.html">CNBC</a>.</p><p>[44] <a href="https://www.sec.gov/Archives/edgar/data/0001018724/000101872426000012/amzn-20260331xex991.htm">Amazon Q1 2026</a>: trailing twelve-month free cash flow of $1.2B, down from $25.9B in Q1 2025, a decline of approximately 95%. 
Operating cash flow declined while capital expenditure (including finance leases and purchase of property/equipment) increased. Multiple outlets citing 8-K data; verify against 10-Q when available.</p><p>[45] Anthropic revenue trajectory: ~$1B ARR end of 2024; $9B end of 2025; $14B February 2026 (per Series G announcement); $30B April 2026 (per Anthropic compute announcement). <a href="https://www.anthropic.com/news/anthropic-amazon-compute">Anthropic blog</a>, April 20, 2026; <a href="https://sacra.com/c/anthropic/">Sacra</a>.</p><p>[46] WSJ report, April 2026: OpenAI missed internal targets for active users and revenue. OpenAI CFO Sarah Friar and CEO Sam Altman disputed the report. <a href="https://www.cnbc.com/2026/04/28/openai-brings-models-to-aws-after-ending-exclusivity-with-microsoft.html">CNBC</a>, April 28, 2026, referencing WSJ.</p><p>[47] Anthropic IPO discussions: Goldman Sachs, JPMorgan, Morgan Stanley advising; potential October 2026 listing; expected raise exceeding $60B. <a href="https://thenextweb.com/news/anthropic-ipo-plan-ai-unicorn">TNW</a>, April 15, 2026, referencing Bloomberg. Secondary market offers at $800B+.</p>]]></content:encoded></item><item><title><![CDATA[Simplify Up, Enforce Down]]></title><description><![CDATA[The EU spent five months simplifying the AI Act for the companies it can reach. 
Last night, after twelve hours, those negotiations failed.]]></description><link>https://www.airealist.ai/p/simplify-up-enforce-down</link><guid isPermaLink="false">https://www.airealist.ai/p/simplify-up-enforce-down</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Thu, 30 Apr 2026 14:41:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QN8W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QN8W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QN8W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 424w, https://substackcdn.com/image/fetch/$s_!QN8W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 848w, https://substackcdn.com/image/fetch/$s_!QN8W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 1272w, https://substackcdn.com/image/fetch/$s_!QN8W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!QN8W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png" width="1248" height="832" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1248,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1184578,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/196005179?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QN8W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 424w, https://substackcdn.com/image/fetch/$s_!QN8W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 848w, https://substackcdn.com/image/fetch/$s_!QN8W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 1272w, https://substackcdn.com/image/fetch/$s_!QN8W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ce8868f-e8ee-4e83-bad5-6f005c2b8e13_1248x832.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Twelve hours. That is how long the second political trilogue on the EU AI Act Omnibus lasted on April 28, 2026. When it ended without agreement, the institutions issued statements. Compliance teams updated their Slack channels. Law firms published client alerts by midnight.</p><p>Nobody called DeepSeek or MiniMax. Nobody called the teams maintaining the Llama and Qwen download pages on Hugging Face. They were not in the room. They never needed to be.</p><p>The AI Act&#8217;s original deadlines are now back in force. 
High-risk AI obligations &#8212; employment screening, credit scoring, biometric systems, law enforcement tools &#8212; apply from 2 August 2026 as written. A follow-up trilogue is scheduled for 13 May 2026, and a July publication in the Official Journal remains theoretically possible.[1] But five months of negotiation just collapsed over a single unresolved file, and any CTO or General Counsel who has been planning against an assumed delay should stop.</p><h2><strong>What Actually Failed</strong></h2><p>The Omnibus collapsed over Annex I &#8212; specifically, the conformity-assessment architecture for AI systems embedded in regulated products: industrial machinery, medical devices, and in-vitro diagnostics.[2] The European Parliament wanted AI Act requirements horizontally integrated into the sectoral safety laws governing those products. The Council did not converge on that approach. Twelve hours of negotiation produced no bridge.</p><p>This was not a minor procedural dispute. The conformity-assessment architecture determines who certifies what, under which framework, and inspected by which authority. The Standing Committee of European Doctors had already objected to medical devices being moved out of the AI Act&#8217;s high-risk framework and into sectoral-only oversight.[3] Sectoral regulators have their own conformity regimes. 
The Parliament&#8217;s proposal to shift AI medical devices into those regimes produced exactly the resistance that a proposal to weaken overlapping safety requirements tends to produce.</p><p>The package that failed included the high-risk deadline delay (Annex III systems to December 2, 2027; Annex I systems to August 2, 2028), the nudifier ban, and the reinstated registration requirement for self-assessed non-high-risk systems &#8212; a provision the Commission had tried to delete, and both institutions had independently restored.[4] All of it is now on hold until May 13 at the earliest, July at the latest, and August 2 if negotiations fail entirely.</p><p>Dutch MEP Kim van Sparrentak put it plainly: &#8220;Big Tech is probably popping champagne. While European companies that care about safety and did their homework now face regulatory chaos.&#8221;[5] She named the wrong beneficiary. Big Tech was never the problem the Omnibus was meant to solve.</p><h2><strong>Who Was Never at the Table</strong></h2><p>Here is what five months of Omnibus negotiations produced: an extended debate among European institutions, industry associations, and member states over how to make compliance easier for European deployers.</p><p>The actors who justified the controls were not party to that debate. They are not subject to it.</p><p>A US foundation model provider operating under contractual terms of service and API access restrictions faces AI Act obligations primarily at the GPAI layer &#8212; Chapter V, Articles 51 through 55, which govern general-purpose AI models. 
Those provisions entered into force on 2 August 2025 and were never part of the Omnibus dispute.[6] The high-risk deployer obligations that the Omnibus was trying to delay apply to the European companies that deploy those models in employment, credit, and public-safety contexts, not to the labs that built them.</p><p>A Chinese open-weight model distributed through Hugging Face, downloaded by a European startup, and deployed in a recruitment pipeline sits in a more ambiguous position. The AI Act claims jurisdiction twice: Article 2(1)(a) covers any provider placing a system on the EU market regardless of location; Article 2(1)(c) extends to any third-country provider whose outputs are used in the Union.[7] And the open-source exemption under Article 2(12) collapses when the system is deployed in a high-risk context, which EU recruitment is, under Annex III. But claiming jurisdiction and exercising it are different operations. Enforcement against a Chinese lab with no EU legal entity, no EU revenue recognition, and no EU contractual relationship is a different exercise from enforcement against a Frankfurt insurance company that bought an HR screening tool from a certified vendor. The Frankfurt company is in the registration database. The Chinese lab is not.</p><p>This is the governance paradox the Omnibus negotiations made visible. The five months of debate were about how to adjust rules for the population that was already complying. The population that justified the rules &#8212; and that the rules structurally struggle to reach &#8212; had no seat at the table because the table has no jurisdiction over them.</p><p>The more effective the governance mechanism for the governed, the sharper the line between the governed and the ungoverned. The Omnibus was trying to move the line. 
It collapsed before it could.</p><h2><strong>What August 2 Actually Means</strong></h2><p>Three things are in force regardless of what happens on May 13.</p><p>Article 4 AI literacy obligations have been in effect since 2 February 2025. Every provider and deployer must ensure that staff working with AI systems have sufficient AI literacy for their role. No prescribed curriculum; no direct administrative fine attached; liability exposure arises through the revised Product Liability Directive and national tort law when inadequate training contributes to harm.[8]</p><p>Article 50 transparency obligations apply from 2 August 2026 for new systems: disclose chatbot interactions, label AI-generated content in public-interest contexts, and mark synthetic audio and video. The watermarking sub-provision &#8212; the machine-readable component &#8212; is the one provision still genuinely in play at trilogue, with Parliament proposing November 2, 2026, and the Council proposing February 2, 2027.[9] Everything else in Article 50 lands on August 2 as written.</p><p>High-risk deployer obligations &#8212; conformity assessments, technical documentation, registration, human oversight mechanisms &#8212; apply from August 2 under the original law. If May 13 closes a deal and July publication clears, those obligations move to December 2, 2027. If not, <strong>they will go live in 94 days</strong>. Building a governance architecture that can absorb either outcome is not a delay strategy. It is the only strategy that works in both scenarios.</p><h2><strong>What May 13 Changes and What It Doesn&#8217;t</strong></h2><p>A deal on May 13 delivers the delay. Compliance teams gain sixteen months on Annex III high-risk obligations, from August 2, 2026 to December 2, 2027. Any CTO managing a Q3 sprint against August 2 would welcome that.</p><p>What it does not change: the conformity assessment dispute that sank April 28 had nothing to do with foundation model providers, open-weight distributors, or cross-border API operators. 
It was a dispute between European institutions about European products deployed in European markets. The labs that built the systems the framework was designed to constrain have faced GPAI obligations since August 2025, and enforcement mechanisms that are still being built.</p><p>May 13 will determine whether the governed get sixteen more months. If it closes and the July publication clears, this piece&#8217;s urgency evaporates, but its structural argument does not. The delay arrives; the boundary does not move.</p><p>What does not change in either scenario: <strong>while the EU institutions were spending five months negotiating compliance architecture for companies that were already in the room, the companies that were never in the room kept building</strong>. DeepSeek, MiniMax, Kimi, and others have shipped new models. The open-weight frontier moved. Every month of trilogue is a month the rest of the world does not spend waiting for a conformity assessment framework to resolve. The Act governs the governed. The ungoverned are compounding.</p><div><hr></div><h3>Notes</h3><p>[1] <a href="https://www.europarl.europa.eu/legislative-train/package-digital-package/file-digital-omnibus-on-ai">European Parliament Legislative Train Schedule, Digital Omnibus on AI</a>, updated April 29, 2026.</p><p>[2] <a href="https://iapp.org/news/a/eu-ai-act-reform-talks-stall-as-key-compliance-deadline-looms">IAPP, &#8220;EU AI Act reform talks stall as key compliance deadline looms,&#8221;</a> April 29, 2026. 
Cypriot Council Presidency official statement: &#8220;It was not possible to reach an agreement with the European Parliament.&#8221; MLex Chief AI Correspondent Luca Bertuzzi confirmed the specific sticking point: &#8220;talks broke down around 2 am, with the expected fault line on the European Parliament&#8217;s push to move sectoral legislation from Annex I Section A to B.&#8221; See also <a href="https://thenextweb.com/news/eu-ai-act-omnibus-deal-fails-april-2026-talks">TNW</a>, April 29, 2026.</p><p>[3] <a href="https://www.cpme.eu/news/move-fast-and-break-things-must-not-endanger-patient-safety-medical-devices-must-remain-under-safeguards-of-the-ai-act">CPME, &#8220;&#8217;Move fast and break things&#8217; must not endanger patient safety: Medical devices must remain under safeguards of the AI Act,&#8221;</a> March 25, 2026.</p><p>[4] <a href="https://www.nicfab.eu/en/posts/digital-omnibus-ai-plenary-vote/">NicFab, &#8220;Digital Omnibus on AI: EP Adopts Position (569 Votes),&#8221;</a> March 27, 2026. 
Both Parliament and Council reinstated the registration obligation after the Commission proposed deleting it; <a href="https://www.edpb.europa.eu/news/news/2026/edpb-and-edps-support-streamlining-ai-act-implementation-call-stronger-safeguards_en">EDPB and EDPS Joint Opinion 1/2026</a> supported reinstatement.</p><p>[5] Van Sparrentak quote via <a href="https://iapp.org/news/a/eu-ai-act-reform-talks-stall-as-key-compliance-deadline-looms">IAPP citing Reuters</a>, April 29, 2026.</p><p>[6] <a href="https://iapp.org/news/a/ai-act-omnibus-what-just-happened-and-what-comes-next">IAPP, &#8220;AI Act Omnibus: What just happened and what comes next?&#8221;</a>, April 29, 2026.</p><p>[7] <a href="https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689">EU AI Act (Regulation (EU) 2024/1689)</a>, Article 2(1)(a) (providers placing systems on the EU market regardless of location), Article 2(1)(c) (third-country providers/deployers where outputs are used in the EU), and Article 2(12) (open-source exemption collapses for high-risk systems under Article 6(2) and Annex III).</p><p>[8] <a href="https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689">EU AI Act, Article 4</a> (AI literacy), in force from 2 February 2025 per Article 113(a). The AI Act does not create a standalone civil liability cause of action; liability exposure for AI-related harm arises through the <a href="https://eur-lex.europa.eu/eli/dir/2024/2853/oj">revised Product Liability Directive (Directive (EU) 2024/2853)</a>, transposition deadline 9 December 2026, and national tort law. The AI Liability Directive was formally withdrawn by the Commission in October 2025.</p><p>[9] CDT Europe AI Bulletin, April 2026; <a href="https://www.aoshearman.com/en/insights/digital-omnibus-on-ai-what-is-really-on-the-table-as-trilogues-begin">A&amp;O Shearman trilogue analysis</a>. 
Parliament proposes November 2, 2026 for watermarking; Council proposes February 2, 2027.</p>]]></content:encoded></item><item><title><![CDATA[More Sovereign, Different Stack: The Builder Tax]]></title><description><![CDATA[Sovereignty forces a different stack. The Commission's framework prices neither the cost nor the exit it delivers.]]></description><link>https://www.airealist.ai/p/more-sovereign-different-stack-the</link><guid isPermaLink="false">https://www.airealist.ai/p/more-sovereign-different-stack-the</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Wed, 29 Apr 2026 06:36:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ZXSo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZXSo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZXSo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!ZXSo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!ZXSo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!ZXSo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZXSo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2272018,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/195836795?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZXSo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!ZXSo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 
848w, https://substackcdn.com/image/fetch/$s_!ZXSo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!ZXSo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb703c0d3-988e-4c26-87e7-57bac2e8d94d_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>AWS European Sovereign Cloud GmbH is 100% owned by Amazon.com Inc.[1] It launched in January 2026 with improved technical 
isolation, EU-resident operations, and a thinner managed-service catalog than the commercial AWS track.[2] Microsoft&#8217;s Sovereign Public Cloud sits in the same architectural category.[3] Neither severed the US parent-company chain. Microsoft France told the French Senate under oath on June 10, 2025, that the company cannot guarantee EU data sovereignty against US authority requests. The CLOUD Act (18 U.S.C. &#167; 2713) and FISA Section 702 (50 U.S.C. &#167; 1881a) apply regardless of where the data sits.[4]</p><p>On April 17, 2026, the European Commission awarded framework contracts under its Cloud III procurement, worth up to &#8364;180 million, to handle sensitive EU institutional workloads over six years.[5] Four consortia were prequalified.[6] <em><a href="https://www.airealist.ai/p/ten-percent-sovereign">Ten Percent Sovereign</a></em> surfaced Canadian jurisdiction dependencies in the Proximus and OVHcloud consortia; they are not considered here.[7] The two clean awardees &#8212; Scaleway (Iliad/Niel) and StackIT (Schwarz Digits) &#8212; clear both the SEAL-3 &#8220;Digital Resilience&#8221; bar and the legal-pathway test: no US parent, no US technology dependency at the substrate, no extraterritorial chain. The Commission&#8217;s framework names this divide and prices nothing else. The buyer who reads SEAL-3 as &#8220;compliant for sensitive workloads&#8221; is also committing to a different architectural stack, one that hyperscalers cannot deliver within the sovereign lane. That commitment is the Builder Tax. It is paid in productivity. It is partially redeemed in portability. The framework prices neither.</p><h2>What SEAL Is Selecting For, And What It Isn&#8217;t</h2><p><em><a href="https://www.airealist.ai/p/ten-percent-sovereign">Ten Percent Sovereign</a></em> examined the April 17 award as a procurement signal rather than a sovereignty signal. 
This piece extends that analysis to the cost the signal does not surface.</p><p>After member states failed to agree on a European cybersecurity certification scheme for cloud services (EUCS),[9] the Commission needed an internal-market-compatible substitute it could deploy unilaterally. The Cloud Sovereignty Framework methodology, published October 20, 2025,[8] does that work. SEAL &#8212; Sovereignty Effectiveness Assurance Levels &#8212; runs from SEAL-0 (no sovereignty) to SEAL-4 (full EU supply chain from chips to software).[10] It is the measurable yardstick the Commission applied to the Cloud III procurement track without waiting for Council unanimity.</p><p>The framework&#8217;s eight objectives cover strategic, legal-and-jurisdictional, data-and-AI, operational, supply-chain, technology, security-and-compliance, and environmental dimensions, weighted to a 100% sovereignty score.[11] SEAL-3 &#8212; &#8220;Digital Resilience&#8221; &#8212; is defined as immunity from supply-chain disruption by non-EU third parties.[12]</p><p>That is a procurement signal, not a technical one. SEAL-3 says: this provider&#8217;s stack would not break if a non-EU government attempted to coerce it. It does not say: &#8220;This provider can run the workload you actually have, on the architecture you actually use.&#8221;</p><p>The framework gets the legal-pathway leg right by design. Scaleway and StackIT have no extraterritorial chain at any layer of the stack. AWS Sovereign Cloud and Azure Sovereign Cloud &#8212; the two principal hyperscaler-built alternatives &#8212; share the same legal posture: improved technical separation, EU-resident operations, customer-managed encryption. And a US parent that the CLOUD Act and FISA reach, regardless. The two products are technically distinct, legally identical. The clean SEAL-3 pool is genuinely more sovereign in the dimension the framework measures.</p><p>That is the leg the framework names. 
What it does not name is the cost, or what the cost incidentally delivers.</p><p>The argument matters now, not in retrospect, because mini-competitions have not yet begun. The April 17 award was the prequalification, not the workload assignment.[13] Before the first sovereign-track workload is committed to a SEAL-3 provider, the Commission will publish an updated Cloud Sovereignty Framework. Member states preparing their own procurement under CADA &#8212; the forthcoming Cloud and AI Development Act, expected on May 27, 2026, under Article 114 TFEU[14] &#8212; will look to that updated framework as a template.</p><h2>The Architectural Fork</h2><p>Sovereignty is a stack philosophy, not a smaller catalog. The buyer who procures sovereignty under the SEAL-3 frame is committing, without the framework saying so, to a different architectural posture. The hyperscaler-sovereign tracks cannot match it inside the sovereign lane. The clean awardees can match it only because the open-source ecosystem they participate in is cross-vendor by definition. SEAL-3 says: this provider&#8217;s stack is sovereign. What it does not say is that the architecture the buyer must run on that stack is the cloud-native commodity stack, and only that stack.</p><p>Hyperscalers compete on proprietary managed-service depth. AWS Bedrock for managed multi-vendor proprietary-frontier-model APIs. Amazon SageMaker for a full-lifecycle ML platform. AWS Lambda for serverless compute with a deep ecosystem of triggers and integrations. DynamoDB, Aurora, Step Functions, Kinesis. Azure equivalents: Azure OpenAI Service, Azure Machine Learning, Azure Functions, Cosmos DB. Google Cloud equivalents: Vertex AI, Cloud Run, Spanner. 
The buyer who builds on these primitives captures real productivity gains and accepts lock-in by construction.[15] Migration cost rises with every proprietary primitive consumed, because the application architecture is shaped by the primitive&#8217;s API surface rather than by an open standard.</p><p>The clean SEAL-3 awardees cannot match that proprietary depth. They do not have the engineering scale, the R&amp;D budget, or the ten-year head start. Scaleway publishes approximately 60 distinct services; StackIT publishes 42.[16] The hyperscalers publish more than 200 each.[17] What Scaleway and StackIT can offer &#8212; and offer cleanly &#8212; is the open-source commodity stack: Kubernetes, S3-compatible object storage, PostgreSQL-compatible managed databases, Kafka-compatible streaming, OpenSearch, OpenTelemetry, Prometheus and Grafana, vLLM and SGLang for open-weight model inference, and Confidential Computing primitives where available. Open standards. Portable workloads. No managed-primitive lock-in.</p><p>This is not a smaller catalog. This is a different stack philosophy.</p><p>The objection is real and worth answering directly: hyperscalers offer Kubernetes, too. Amazon EKS runs in the AWS European Sovereign Cloud region. Azure Kubernetes Service runs in Microsoft Sovereign Public Cloud. A buyer can, in principle, run pure Kubernetes on AWS Sovereign and capture portability while staying inside the hyperscaler track. The objection collapses on price and rationale. The 15% sovereign premium buys the sovereignty wrapper around the proprietary catalog above the Kubernetes layer &#8212; Bedrock, SageMaker, Lambda, the managed-service depth that justifies the AWS procurement decision. 
A buyer paying that premium for pure Kubernetes-and-open-source on AWS Sovereign is paying for sovereignty assurances that the legal-pathway analysis already deemed insufficient &#8212; the parent-company chain still reaches via &#167; 2713 &#8212; to run a workload that commodity infrastructure delivers cheaper elsewhere. The architectural fork cuts both ways.</p><p>Lock-in operates through two mechanisms. The catalog constraint: only certain primitives are natively available, and consuming them shapes the application architecture. The contract gate: production at scale requires an enterprise commitment. Hyperscaler proprietary catalogs operate through both simultaneously. Bedrock shapes the application around its API surface. SageMaker ties productivity gains to the enterprise commitment level. Lambda triggers lock the hardest &#8212; applications written against Lambda&#8217;s trigger ecosystem cannot be ported without rewriting the trigger logic.</p><p>The clean SEAL-3 stack has no equivalent catalog constraint. The application architecture is shaped by the open-source API, which is the same API that the buyer can deploy on Hetzner, on IONOS, on bare-metal infrastructure in a customer-controlled colocation, or migrate to a future SEAL-3 awardee. The architecture is not lock-in-shaped because the open-source standard does the work that the managed-service catalog would otherwise do.</p><p>This is what the Commission&#8217;s framework implicitly procures without naming. SEAL-3 says: this provider&#8217;s stack is sovereign. What the provider can actually deliver, given its scale and its catalog, is the cloud-native commodity stack. So the framework selects providers that &#8212; by architectural necessity, not by buyer choice &#8212; force a commodity-stack architecture on every workload that lands on them. The framework prices the sovereignty. 
It does not name that sovereignty equals architectural commitment.</p><h2>The Productivity Cost</h2><p>The productivity cost is real, and it falls on the typical sensitive-workload buyer harder than on the frontier-AI workload buyer.</p><p>The managed multi-vendor proprietary-frontier-model API category is the sharpest. A buyer wanting to A/B-test Claude against GPT, Mistral, and Cohere through a single API &#8212; with managed billing, rate limits, observability, and safety filters &#8212; has that capability natively on the hyperscaler commercial track. The clean SEAL-3 pool runs no managed frontier-model service at all. Scaleway runs open-weight LLMs through its own managed inference; StackIT serves open-weight LLMs through AI Model Serving. Neither runs a multi-vendor-through-one-API service.</p><p>The European frontier-model precedent worth naming is Mistral itself. In February 2024, Mistral premiered Mistral Large first on Azure, and its frontier deployment posture has historically been multi-cloud, with Azure first.[20] Mistral Compute &#8212; restructured around 13,800 NVIDIA GB300 GPUs at the Eclairion-operated Bruy&#232;res-le-Ch&#226;tel site &#8212; is now in the planning phase, not yet in commercial operation as of publication date.[21] The CIO who wants a managed frontier-model service in 2026 chooses between a hyperscaler-sovereign service (delivered partially, with US-parent exposure) and a clean SEAL-3 service (open-weight inference, self-managed).</p><p>The full-lifecycle ML platform category matches the same pattern. The hyperscaler reference is a managed, end-to-end platform that covers data preparation, training, deployment, monitoring, and governance through a single console. The clean SEAL-3 pool offers IaaS GPU access, managed Kubernetes, and open-source ML tooling (Kubeflow, MLflow, Ray, KServe). The buyer assembles the lifecycle from open-source components rather than consuming a managed abstraction. 
That cost hits teams shipping their first model in weeks. It disappears for teams running mature ML platforms where the open-source assembly is already in place.</p><p>The thinner managed-platform depth across managed-Kafka, FaaS with rich trigger ecosystems, managed-streaming pipelines, and multi-region active-active is the commodity-layer cost. Scaleway can offer managed Kubernetes, but its managed Kafka offering is newer and less feature-rich than its AWS counterpart. StackIT has managed PostgreSQL and managed object storage, but does not run a counterpart to Step Functions or Cosmos DB.[22] Both providers can run workloads that hyperscalers can run. Neither can run <em>every</em> workload. The gap is concentrated in the proprietary layer. Confidential Computing runs asymmetrically: StackIT&#8217;s Confidential Kubernetes exceeds what the hyperscaler-sovereign tracks offer in the same form factor, while AWS Nitro Enclaves and Azure Confidential VMs cover the commercial regions in ways the clean awardees do not.[22]</p><p>The engineering capacity to operate commodity-stack architecture is the second cost. A team assembling Kubernetes, vLLM, and Kafka on a SEAL-3 provider is doing more architectural work than a team consuming Bedrock, SageMaker, and Lambda on AWS commercial. In practice, this means additional DevOps and platform engineering hires&#8212;or retraining existing staff&#8212;to manage infrastructure that the hyperscaler&#8217;s managed services would otherwise abstract away. That cost falls on engineering organizations that did not budget for it when the procurement decision was made on legal-pathway grounds alone.</p><p>The productivity cost is not catastrophic. 
It is unevenly distributed across workload categories and is absent from the framework&#8217;s signal.</p><h2>The Exit Optionality</h2><p>The opposite leg of the trade is what re-architecture into the commodity stack delivers: exit from hyperscaler lock-in.</p><p>CNCF and FinOps Foundation research on cloud-native architecture has consistently shown that workloads built on standard Kubernetes APIs and open-source data infrastructure carry materially lower migration costs than workloads built on hyperscaler proprietary primitives.[23] The mechanism is straightforward: open API contracts, portable data formats, provider-agnostic observability. Migration from one Kubernetes deployment to another is a redeploy. Migration across hyperscaler proprietary stacks is a rewrite. A streaming pipeline built on Kafka-on-Kubernetes with open-source consumers can move from Scaleway to StackIT by redeploying manifests. The same pipeline built on Amazon MSK with Kinesis triggers and Lambda consumers requires rewriting every integration point.</p><p>Sovereign procurement forces commoditization. Commoditization delivers portability. The buyer that lands on Scaleway&#8217;s commodity stack today can migrate to StackIT, to a future SEAL-3 awardee, to a Hetzner deployment with BSI credentials, or to a customer-controlled colocation &#8212; at redeploy cost, not rewrite cost. The buyer on AWS Commercial who has consumed the hyperscaler-managed service stack has no equivalent path. Even the migration from AWS commercial to AWS European Sovereign Cloud is not free &#8212; the Bedrock catalog differs, GPU instances are absent, and the Lambda trigger ecosystem in the sovereign region is a subset of the commercial one.</p><p>This is the option value, not the exercise value. Most enterprise migrations do not happen. 
But every workload carries an embedded option to migrate, and the strike price of that option is the migration cost &#8212; a rewrite for hyperscaler-proprietary workloads, a redeploy for commodity-stack workloads. The option does not become more likely to be exercised by being cheaper to exercise. It just becomes cheaper to exercise. Pricing it as if every workload will eventually migrate overstates the benefit; pricing it at zero understates it.</p><p>The broader European commodity-stack market makes the optionality concrete. Hetzner Online GmbH operates one of the largest commodity cloud businesses in Europe, is classified by BSI as a critical infrastructure operator under KRITIS, and holds C5 Type 2 certification as of March 25, 2026.[24] IONOS Cloud Solutions generated &#8364;177 million in segment revenue in fiscal 2024, holds BSI C5 and IT-Grundschutz certifications, won the federal sovereign cloud contract for ITZBund in April 2024, and signed a strategic cooperation agreement with the BSI in January 2026.[25] Neither was prequalified for the Cloud III award. Both run the same commodity stack as the SEAL-3 awardees. A workload deployed on Scaleway or StackIT can migrate to either platform without rewriting the application, only the deployment manifests.</p><p>A company that has re-architected onto the commodity stack is less locked into any individual cloud provider than one running on hyperscaler managed services. Vendor concentration risk is lower. Migration-cost contingencies in the diligence model are lower. Exit optionality at the infrastructure layer is higher. None of this is in the SEAL designation. All of it is in the architectural pattern the SEAL designation forces.</p><h2>What Mini-Competitions Will Reveal</h2><p>The SEAL-3 framework has not yet been operationally tested. The April 17 award was the prequalification. Workload mini-competitions will follow over the coming months and quarters. 
Three questions arise.</p><p>First: Will the productivity cost be operationally confirmed? If more than thirty percent of mini-competitions over the next twelve months produce fewer than two qualifying bids &#8212; concentrated in workloads that depend on managed frontier-model APIs, full-lifecycle ML platforms, or Lambda-trigger-shaped serverless architectures &#8212; the cost is confirmed and binding.[27]</p><p>Second: Will the carve-out volume reveal where the framework selects against itself? Workloads the SEAL-3 pool cannot serve will be granted exemptions to use SEAL-2 providers. If those exemptions accumulate into a parallel track that quietly does most of the high-value work, the framework&#8217;s signal is inconsistent with its own selection criteria.</p><p>Third: Is the optionality leg empirically real? Workloads that land on a SEAL-3 provider and subsequently migrate to a different SEAL-3 provider &#8212; or to a non-prequalified European commodity-stack provider &#8212; will demonstrate the migration-cost asymmetry the architectural fork predicts.</p><p>If mini-competitions produce competitive bids across all workload categories, if exemption volume stays marginal, and if no observable migration-cost asymmetry emerges between the commodity-stack pool and the proprietary-stack pool, the thesis is wrong and the Builder Tax does not exist.</p><h2>The Builder Tax</h2><p>Sovereignty in the EU cloud, on April 28, 2026, is a two-track procurement choice with a trade neither side names.</p><p>One track: AWS Sovereign Cloud and Azure Sovereign Cloud. Improved technical separation. US legal chain intact. A thinner version of the proprietary catalog, a 15% premium, and the same parent-company exposure to CLOUD Act and FISA reach.</p><p>The other track: Scaleway and StackIT. Legal-pathway test cleared by construction. 
A commodity stack &#8212; Kubernetes, S3-compatible storage, open-source data infrastructure, open-weight inference &#8212; that costs the buyer the proprietary catalog&#8217;s productivity gains and delivers, in exchange, exit from the lock-in that catalog enforces. The exit is not theoretical: Hetzner and IONOS run the same commodity stack outside the SEAL-3 pool, and a workload built on open standards can move between any of them at redeploy cost.</p><p>The framework names neither side of the trade. It rates the providers and sends a signal &#8212; that SEAL-2 carve-outs are equivalent options for sensitive workloads &#8212; that flattens the divide the legal-pathway analysis surfaces. Mini-competitions will reveal what the framework has been quiet about.</p><p>The Builder Tax is the productivity loss on the managed-service layer that the clean awardees cannot deliver. It is partially redeemed by portability &#8212; exit from hyperscaler lock-in &#8212; that the commodity-stack architecture delivers. The Tax falls harder on CTOs running mature managed-service platforms, where the productivity cost is the binding constraint, than on CISOs, who should weigh the legal-pathway leg more heavily and treat the productivity cost as cost-of-compliance. Diligence should weigh exit optionality even more heavily.</p><p>I would recommend applying two tests to any SEAL-tier vendor evaluation. First, evaluate which managed primitives the provider can deliver natively, which require rearchitecture, and which are unavailable. Second, clarify the legal-pathway exposure of the proposed architecture &#8212; parent-company chain, supplier chain, EU-only operational independence &#8212; for the workload&#8217;s data sensitivity. 
A SEAL-3 designation that does not answer those questions is a sovereignty score, not a procurement signal.</p><p>The Commission&#8217;s updated framework will determine whether the Builder Tax becomes a priced cost in EU public-sector AI infrastructure or remains silent inside a procurement signal. The decisions made in the next six months are the decisions that get templated into the regulatory architecture.</p><p>The framework scores sovereignty. It does not mention the productivity the buyer loses or the portability the buyer gains. Every company should decide whether the portability gained is worth the productivity lost.</p><h2>Notes</h2><p>[1] AWS, <a href="https://press.aboutamazon.com/aws/2026/1/aws-launches-aws-european-sovereign-cloud-and-announces-expansion-across-europe">&#8220;AWS Launches AWS European Sovereign Cloud and Announces Expansion Across Europe,&#8221;</a> Potsdam, January 15, 2026. The AWS European Sovereign Cloud GmbH parent structure is detailed in AWS&#8217;s own published documentation; the operating entity is wholly owned by Amazon.com Inc. The Brandenburg datacenter region opened with two Availability Zones at GA. See also AWS Security Blog, <a href="https://aws.amazon.com/blogs/aws/opening-the-aws-european-sovereign-cloud/">&#8220;Opening the AWS European Sovereign Cloud,&#8221;</a> January 15, 2026.</p><p>[2] AWS European Sovereign Cloud service catalog at general availability: approximately 90 services with plans to expand. Two Availability Zones at launch. Pricing premium of approximately 15% versus commercial EU regions, per independent benchmarking by tecRacer across EC2, S3, RDS, and Lambda price points (cited and analyzed by Cloudvisor). Bedrock available at GA but limited to Amazon Nova Lite and Nova Pro models &#8212; Anthropic Claude, Mistral, Meta Llama, and other proprietary frontier models are absent. No GPU instances. No CloudFront. 
Source: AWS European Sovereign Cloud documentation; comparative analysis per Cloudvisor, <a href="https://cloudvisor.co/aws-european-sovereign-cloud/">&#8220;Sovereignty as a Service: The AWS European Sovereign Cloud is Live,&#8221;</a> published shortly after launch. AWS publishes &#8220;200+ services&#8221; globally; the comparison figure is Cloudvisor&#8217;s, not an AWS-published number. AWS European Sovereign Cloud also achieved SOC 2 Type 1 and C5 Type 1 attestation reports plus seven ISO certifications covering 69 services on March 16, 2026 &#8212; see AWS Security Blog, <a href="https://aws.amazon.com/blogs/security/aws-european-sovereign-cloud-achieves-first-compliance-milestone-soc-2-and-c5-reports-plus-seven-iso-certifications/">&#8220;AWS European Sovereign Cloud achieves first compliance milestone.&#8221;</a></p><p>[3] Microsoft Sovereign Public Cloud &#8212; the Microsoft-operated EU-Data-Boundary track &#8212; is generally available across European Azure datacenter regions for European customers, supporting Azure, Microsoft 365, Microsoft Security, and Power Platform. The track adds Data Guardian (EU-resident operator approval), External Key Management (customer-controlled encryption), and Regulated Environment Management to the commercial Azure platform. See Microsoft Learn, <a href="https://learn.microsoft.com/en-us/azure/azure-sovereign-clouds/microsoft-sovereign-cloud">&#8220;What is Microsoft Sovereign Cloud?&#8221;</a>. Distinguish from Microsoft Sovereign Private Cloud (Azure Local-based, customer-deployed) and National Partner Clouds &#8212; these are different products. The Sovereign Public Cloud is the AWS European Sovereign Cloud architectural analog for the comparison in this piece.</p><p>[4] AWS European Sovereign Cloud GmbH is 100% owned by Amazon.com Inc. per AWS&#8217;s published structure. Microsoft Sovereign Public Cloud runs on Microsoft Corporation infrastructure; Microsoft Corporation is a US-incorporated entity. 
The CLOUD Act&#8217;s compelled disclosure provision (<a href="https://www.law.cornell.edu/uscode/text/18/2713">18 U.S.C. &#167; 2713</a>) requires US-incorporated providers to produce data within their &#8220;possession, custody, or control&#8221; regardless of where the data is stored. FISA Section 702 (<a href="https://www.law.cornell.edu/uscode/text/50/1881a">50 U.S.C. &#167; 1881a</a>) authorizes warrantless collection of non-US persons&#8217; communications by US intelligence agencies. The Court of Justice of the European Union&#8217;s <em>Schrems II</em> ruling (<a href="https://curia.europa.eu/juris/liste.jsf?num=C-311%2F18">Case C-311/18</a>, July 16, 2020) cited Section 702 as incompatible with EU fundamental rights. Microsoft France told the French Senate under oath on June 10, 2025 that the company cannot guarantee EU data sovereignty against US authority requests; the testifying witnesses were Anton Carniaux (Director of Public and Legal Affairs, Microsoft France) and Pierre Lagarde (Technical Director, Public Sector). The hearing was held by the Senate inquiry commission on public procurement and digital sovereignty, chaired by Senator Simon Uzenat. Carniaux&#8217;s response to the question of whether Microsoft could guarantee that French citizens&#8217; data would not be transmitted to US authorities without French consent was: &#8220;Non, je ne peux pas le garantir.&#8221; (&#8220;No, I cannot guarantee it.&#8221;) See SDxCentral, <a href="https://www.sdxcentral.com/news/microsoft-tells-french-lawmakers-it-cant-protect-user-data-from-us-demands/">&#8220;Microsoft tells French lawmakers it can&#8217;t protect user data from US demands,&#8221;</a> July 2025; Senate transcript at <a href="https://www.senat.fr/">senat.fr</a>. AWS contests the framing, citing its own Supplementary Addendum and zero-disclosure record since June 2020 (&#8220;there have been no data requests to AWS that resulted in disclosure of enterprise or government content data stored outside the U.S. to the U.S. 
government&#8221; &#8212; AWS, <a href="https://aws.amazon.com/compliance/cloud-act/">&#8220;Clarifying Lawful Overseas Use of Data Act,&#8221;</a>). The legal exposure is statutory, not retrospective enforcement: the absence of past disclosure does not eliminate the possibility of future ones. For broader analysis of the AWS and Microsoft sovereign cloud legal architectures, see Julien Simon, <a href="https://julsimon.medium.com/two-sovereign-clouds-one-legal-wall-ff39bcc0432b">&#8220;Two Sovereign Clouds, One Legal Wall,&#8221;</a> February 2026.</p><p>[5] European Commission, <a href="https://commission.europa.eu/news-and-media/news/commission-advances-cloud-sovereignty-through-strategic-procurement-2026-04-17_en">&#8220;Commission advances cloud sovereignty through strategic procurement,&#8221;</a> April 17, 2026. The Cloud III procurement is structured as a Dynamic Purchasing System with an estimated total value of up to &#8364;180 million over six years.</p><p>[6] Ibid. The Commission also prequalified two further consortia &#8212; a Post Telecom consortium with OVHcloud and CleverCloud, and a Proximus consortium that uses S3NS, Clarence and Mistral. This piece&#8217;s analysis focuses on Scaleway and StackIT as the clean reference pool because the other two prequalified consortia carry documented foreign-jurisdictional exposure (the OVHcloud Canadian subsidiary case examined in <em><a href="https://julsimon.medium.com/the-sovereignty-mirage-why-european-clouds-wont-save-your-data-a565e82127f5">The Sovereignty Mirage</a></em>, December 4, 2025; and the Datacenter United ownership structure following the February 28, 2025 Proximus datacenter sale to a Cordiant Capital-managed investment vehicle). 
The piece&#8217;s binary thesis tracks the cleanly comparable cases.</p><p>[7] Julien Simon, <a href="https://www.airealist.ai/p/ten-percent-sovereign">&#8220;Ten Percent Sovereign,&#8221;</a> The AI Realist, April 17, 2026.</p><p>[8] European Commission, <em><a href="https://commission.europa.eu/document/download/09579818-64a6-4dd5-9577-446ab6219113_en">Cloud Sovereignty Framework</a></em>, Version 1.2.1, October 20, 2025.</p><p>[9] On the EUCS deadlock and the Commission&#8217;s response, see European Cybersecurity Certification Scheme for Cloud Services (EUCS) drafting history at the <a href="https://interoperable-europe.ec.europa.eu/collection/eucs-european-cybersecurity-certification-scheme-cloud-services">Interoperable Europe Portal</a>; ENISA, <a href="https://www.enisa.europa.eu/topics/certification/cybersecurity-market/cloud-cybersecurity-market">Cloud Cybersecurity Certification Scheme</a> market and status pages, 2024&#8211;2025.</p><p>[10] The SEAL scoring scheme &#8212; SEAL-0 (no sovereignty), SEAL-1 (Jurisdictional Sovereignty), SEAL-2 (Data Sovereignty), SEAL-3 (Digital Resilience), SEAL-4 (Full Digital Sovereignty: complete EU control across the supply chain) &#8212; is documented in European Commission, <em><a href="https://commission.europa.eu/document/download/09579818-64a6-4dd5-9577-446ab6219113_en">Cloud Sovereignty Framework</a></em>, Version 1.2.1 (October 20, 2025). 
For the April 17, 2026 procurement, the Commission set SEAL-2 as the minimum eligibility threshold; the prequalified Scaleway and StackIT consortia clear SEAL-3.</p><p>[11] The eight Cloud Sovereignty Framework objectives, with their published weights summing to 100% of the Sovereignty Score: SOV-1 Strategic Sovereignty (15%); SOV-2 Legal and Jurisdictional Sovereignty (10%); SOV-3 Data and AI Sovereignty (10%); SOV-4 Operational Sovereignty (15%); SOV-5 Supply Chain Sovereignty (20%); SOV-6 Technology Sovereignty (15%); SOV-7 Security and Compliance (10%); SOV-8 Environmental Sovereignty (5%). Source: European Commission, <em><a href="https://commission.europa.eu/document/download/09579818-64a6-4dd5-9577-446ab6219113_en">Cloud Sovereignty Framework</a></em>, Version 1.2.1, October 20, 2025.</p><p>[12] European Commission, <em><a href="https://commission.europa.eu/document/download/09579818-64a6-4dd5-9577-446ab6219113_en">Cloud Sovereignty Framework</a></em>, Version 1.2.1, SEAL-3 definition (Digital Resilience).</p><p>[13] European Commission, <a href="https://commission.europa.eu/news-and-media/news/commission-advances-cloud-sovereignty-through-strategic-procurement-2026-04-17_en">&#8220;Commission advances cloud sovereignty through strategic procurement,&#8221;</a> April 17, 2026 announcement: &#8220;This is the first step. Mini-competitions will follow over the coming months.&#8221;</p><p>[14] CADA was originally listed for Q1 2026 in the European Commission&#8217;s <a href="https://commission.europa.eu/strategy-and-policy/strategy-documents/commission-work-programme/commission-work-programme-2026_en">2026 Work Programme</a> (October 20, 2025) under Article 114 TFEU. 
The proposal is now expected on May 27, 2026, per techUK&#8217;s <a href="https://www.techuk.org/resource/dispatch-from-brussels-updates-on-eu-tech-policy-march.html">&#8220;Dispatch from Brussels&#8221;</a> (March 2026), as a flagship of the Commission&#8217;s &#8220;tech sovereignty package&#8221; &#8212; alongside a parallel revision of EU public procurement rules. Stated objectives include tripling EU data centre capacity, EU-wide eligibility requirements for cloud service providers, and a single EU-wide cloud policy for public administrations and procurement. See European Parliament Legislative Train Schedule, <a href="https://www.europarl.europa.eu/legislative-train/theme-a-new-plan-for-europe-s-sustainable-prosperity-and-competitiveness/file-cloud-and-ai-development-act">&#8220;Cloud and AI Development Act&#8221;</a>; EPRS briefing, <a href="https://www.europarl.europa.eu/thinktank/en/document/EPRS_BRI(2025)779251">&#8220;Cloud and AI Development Act&#8221;</a> (PE 779.251, December 2025).</p><p>[15] The Catalog-and-Contract Test framework was developed in the AI Tooling vertical of this publication; see Julien Simon, <a href="https://julsimon.medium.com/open-source-closed-orbit-b874004f7517">&#8220;Open Source, Closed Orbit: The Hardware Monopolist&#8217;s Guide to Owning Open Source,&#8221;</a> The AI Realist, March 2026. The test originally targeted lock-in mechanisms in the consumer side of the AI ecosystem (model hosting, inference platforms, developer tooling); this piece extends the framework to the build side of the cloud stack, where the same dual-mechanism (catalog constraint + contract gate) operates.</p><p>[16] <a href="https://www.scaleway.com/en/all-products/">Scaleway product navigation</a>, retrieved April 28, 2026. 
<a href="https://www.stackit.de/en/products/">StackIT product documentation portal</a>, retrieved April 28, 2026.</p><p>[17] AWS publishes <a href="https://aws.amazon.com/products/">&#8220;200+ services&#8221;</a> globally per its corporate communications. Microsoft Azure and Google Cloud publish comparable counts. Counts vary by methodology &#8212; distinct services, distinct API endpoints, or distinct billable units produce different numbers. For the purposes of this piece, the order-of-magnitude comparison is sufficient.</p><p>[19] AWS European Sovereign Cloud Bedrock catalog at GA: limited to Amazon Nova Lite and Nova Pro, per AWS European Sovereign Cloud documentation and <a href="https://cloudvisor.co/aws-european-sovereign-cloud/">Cloudvisor analysis</a>. Anthropic Claude, Meta Llama, Mistral, and other proprietary or open-weight frontier models from third-party vendors are absent from the sovereign-region Bedrock catalog at GA.</p><p>[20] Microsoft Azure blog, <a href="https://azure.microsoft.com/en-us/blog/microsoft-and-mistral-ai-announce-new-partnership-to-accelerate-ai-innovation-and-introduce-mistral-large-first-on-azure/">&#8220;Microsoft and Mistral AI announce new partnership to accelerate AI innovation and introduce Mistral Large first on Azure,&#8221;</a> February 26, 2024. Mistral Large premiered first on Azure AI before becoming available on other deployment surfaces.</p><p>[21] Mistral Compute press materials, 2025&#8211;2026. The original announcement contemplated 18,000 NVIDIA Grace Blackwell systems hosted via Scaleway. The structure was subsequently reorganized around approximately 13,800 NVIDIA GB300 GPUs at the Eclairion-operated Bruy&#232;res-le-Ch&#226;tel site (44 MW), funded through a $830M (~&#8364;750M) debt facility from a seven-bank consortium (Bpifrance, BNP Paribas, Cr&#233;dit Agricole CIB, HSBC, La Banque Postale, MUFG, Natixis CIB), announced March 30, 2026. 
See DatacenterDynamics, <a href="https://www.datacenterdynamics.com/en/news/mistral-ai-raises-830m-in-debt-financing-for-data-center-in-paris-france/">&#8220;Mistral AI raises $830m in debt financing for data center in Paris, France.&#8221;</a> Operations expected to begin Q2 2026.</p><p>[22] <a href="https://www.stackit.de/en/products/">StackIT product portfolio</a>, retrieved April 28, 2026; <a href="https://www.scaleway.com/en/all-products/">Scaleway product documentation</a>, retrieved April 28, 2026. Confidential Computing detail: StackIT publishes a <a href="https://www.stackit.de/en/product/stackit-kubernetes-engine">Confidential Kubernetes</a> offering that the hyperscaler-sovereign tracks do not match in the same managed-Kubernetes form factor; AWS <a href="https://aws.amazon.com/ec2/nitro/nitro-enclaves/">Nitro Enclaves</a> (AWS commercial regions), <a href="https://learn.microsoft.com/en-us/azure/confidential-computing/confidential-vm-overview">Azure Confidential VMs</a>, and <a href="https://cloud.google.com/confidential-computing">GCP Confidential VMs</a> (commercial regions) provide managed Confidential Computing that the clean SEAL-3 pool does not match in the hyperscaler-managed-VM form factor. AWS European Sovereign Cloud&#8217;s launch catalog at GA does not include managed Confidential Computing as a top-line service offering.</p><p>[23] Cloud Native Computing Foundation, <a href="https://www.cncf.io/reports/">Annual Survey</a>, 2024 and 2025 editions &#8212; documenting Kubernetes adoption in production at 66%+ of respondents and rising multi-cloud deployment as the dominant pattern. FinOps Foundation, <a href="https://stateoffinops.org/">State of FinOps</a> reports, 2024 and 2025 editions &#8212; documenting that organizations running multi-cloud commodity-stack architectures report materially lower switching costs than those dependent on single-provider proprietary services. 
The comparative-cost mechanism is also analyzed in Mompo Redoli &amp; Ullah, <a href="https://arxiv.org/abs/2504.11007">&#8220;Kubernetes in the Cloud vs. Bare Metal: A Comparative Study of Network Costs,&#8221;</a> arXiv:2504.11007, April 2025.</p><p>[24] Hetzner Online GmbH, <a href="https://www.hetzner.com/news/hetzner-receives-bsi-c5-certification/">&#8220;Hetzner receives BSI C5 Type 2 certification,&#8221;</a> March 25, 2026. Hetzner is classified as a critical infrastructure operator under the German KRITIS regime per BSI designation; see <a href="https://docs.hetzner.com/general/company-and-policy/information-security-at-hetzner/">Hetzner information security documentation</a>.</p><p>[25] IONOS Group SE <a href="https://www.ionos-group.com/fileadmin/Publications/Berichte/IONOS_Annual_Report_2024.pdf">FY2024 Annual Report</a>: IONOS Cloud Solutions segment revenue &#8364;177 million; total IONOS Group revenue &#8364;1.56 billion. IONOS BSI C5 attestation achieved November 7, 2023; IONOS BSI IT-Grundschutz certification achieved September 2022. ITZBund framework contract awarded April 2, 2024 &#8212; see IONOS, <a href="https://www.ionos-group.com/investor-relations/publications/announcements/ionos-builds-cloud-solution-for-the-german-federal-administration.html">&#8220;IONOS builds cloud solution for the German federal administration,&#8221;</a> five-year term, &#8364;410M ceiling. 
BSI strategic cooperation agreement signed January 13, 2026, by BSI President Claudia Plattner and IONOS CTO Markus Noga; see <a href="https://www.bsi.bund.de/DE/Service-Navi/Presse/Pressemitteilungen/Presse2026/260113_Digitale_Souveraenitaet_Cloud_Computing.html">BSI press release</a> and <a href="https://www.ionos.de/newsroom/news/ionos-und-bsi-vereinbaren-strategische-kooperation-fuer-souveraene-cloud-sicherheit-in-deutschland/">IONOS newsroom</a>.</p><p>[26] Disclosure: the author serves as AI Operating Partner at Fortino Capital, a European private equity firm whose portfolio includes companies whose cloud architecture decisions are within the scope of this piece&#8217;s analysis. This disclosure illustrates the structural diligence question the piece names; it is not an endorsement of any specific provider or procurement choice.</p><p>[27] Mini-competition outcome thresholds are derived from comparable EU framework procurement programs and represent an order-of-magnitude observable signal. The &#8220;thirty percent&#8221; threshold is the author&#8217;s calibration, not a Commission-defined metric. Note: the Commission does not systematically publish mini-competition bid counts. 
The first falsifiability condition depends on whether the Commission or member-state procurement authorities make bid data available &#8212; either directly or through parliamentary scrutiny of sovereign cloud procurement outcomes.</p>]]></content:encoded></item><item><title><![CDATA[Ten Percent Sovereign]]></title><description><![CDATA[The European Commission awarded &#8364;180 million under a framework that scores legal sovereignty at ten percent &#8212; and didn&#8217;t see what an Ontario court had just done.]]></description><link>https://www.airealist.ai/p/ten-percent-sovereign</link><guid isPermaLink="false">https://www.airealist.ai/p/ten-percent-sovereign</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Wed, 22 Apr 2026 08:37:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8PCP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8PCP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8PCP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!8PCP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!8PCP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!8PCP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8PCP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2019713,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/194821600?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8PCP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!8PCP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 
848w, https://substackcdn.com/image/fetch/$s_!8PCP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!8PCP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b476f4d-d815-439c-8a9c-ad2ca7550b2e_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>On page six of the European Commission&#8217;s Cloud Sovereignty Framework (CSF), in a paragraph no one asked for, nothing 
required, and apparently no lawyer reviewed, the Commission explains why Legal and Jurisdictional Sovereignty counts for only ten percent of the Sovereignty Score. The text reads: <em>&#8220;The weighting considers that the procurement procedure already contains significant safeguards in certain domains such as SOV-2 (Legal and Jurisdictional) and SOV-7 (Security and Compliance).&#8221;</em>[1]</p><p>This is the Commission arguing, in its own document, that it downweighted legal sovereignty because the procurement process was already handling it. It is difficult to produce a more self-incriminating sentence. </p><p>Six months after that framework was published, the Commission awarded &#8364;180 million in framework contracts under it.[2] The press release notes that one winning consortium &#8220;leverages capacities of partners S3NS, Clarence, and Mistral from a technical environment based on Google Cloud technology, exclusively operated by EU companies.&#8221;[3] That will certainly raise a few eyebrows.</p><p>And did the Commission ignore that the consortium&#8217;s Belgian data center footprint is owned, with fifty percent voting control, by a Canadian-managed investment vehicle? And that one month before the Commission published its framework, the jurisdiction that manages that vehicle had issued a ruling from the Ontario Court of Justice &#8212; compelling a French sovereign-cloud champion to hand over EU-hosted customer data to Canadian police, and dismissing France&#8217;s blocking statute as an &#8220;empty vessel&#8221;?[4] </p><p>Oops.</p><p>Either way, the Commission did not relax the definition of sovereignty. It scored it down to ten percent.</p><h2>The Award</h2><p>The EU&#8217;s Cloud III Dynamic Purchasing System ran the procurement. 
Four consortia won framework contracts &#8212; maximum authorized spend &#8364;180 million, six-year term &#8212; to provide sovereign cloud services to EU institutions and agencies.[5]</p><p>The winners: Post Luxembourg&#8217;s data center subsidiary DEEP, in partnership with OVHcloud and Clever Cloud. Schwarz Digits&#8217; StackIT, the cloud unit of the Schwarz Group behind the Lidl and Kaufland retail chains. Iliad&#8217;s Scaleway. And a Proximus-led consortium bundling S3NS, Clarence, Mistral AI, and Thales.[6]</p><p>No winner reached SEAL-4 (Sovereignty Effectiveness Assurance Level 4), &#8220;Full Digital Sovereignty&#8221;, defined as <em>&#8220;Technology and operations under complete EU control, subject only to EU law, with no critical non-EU dependencies.&#8221;</em>[7]</p><p>Three of the four &#8212; Post/OVHcloud/Clever Cloud, StackIT, and Scaleway &#8212; reached SEAL-3, defined as <em>&#8220;EU law applicable and enforceable, EU actors exercising meaningful but not full influence; service, technology or operations under marginal control of non-EU third parties.&#8221;</em> The Proximus consortium reached SEAL-2, defined as <em>&#8220;EU law applicable and enforceable, with material non-EU dependencies remaining; service, technology or operations under indirect control of non-EU third parties.&#8221;</em> </p><p>The tier progression encodes a distinction that the piece will return to. SEAL-2 permits &#8220;<em>material non-EU dependencies remaining</em>.&#8221; SEAL-3 permits only &#8220;<em>marginal control of non-EU third parties</em>.&#8221; SEAL-4 permits no critical non-EU dependencies at all and requires operations &#8220;<em>subject only to EU law</em>.&#8221; The framework treats these as cumulative steps on a single ladder. They are not. 
They are three different questions, and a consortium can satisfy the first two while remaining legally exposed on the third.</p><p>The Commission drafted SEAL-4 as Full Digital Sovereignty, published it in the framework, and awarded &#8364;180 million without using it. Europe&#8217;s most credible sovereign cloud providers, competing in a procurement designed by the Commission to measure sovereignty, could not clear the sovereignty tier the Commission itself defined as the goal. SEAL-4 is not a target. It is the placeholder the framework reserves for a future in which European digital sovereignty exists.</p><h2>The Eight Objectives</h2><p>The Cloud Sovereignty Framework, Version 1.2.1, was published by the Commission in October 2025, after iterative drafts spanning roughly two years. It organizes sovereignty into eight objectives, with weightings that sum to 100%.[8]</p><p>The critical ratio, for a PE or procurement reader: <strong>supply chain</strong> (SOV-5) and <strong>technology</strong> (SOV-6) together carry 35% of the Sovereignty Score; <strong>operational sovereignty</strong> (SOV-4) another 15%; <strong>strategic sovereignty</strong> (SOV-1) 15%. <strong>Legal and jurisdictional sovereignty</strong> (SOV-2) &#8212; the objective that measures exposure to non-EU legal reach &#8212; carries a 10% weighting. <strong>Security and compliance</strong> (SOV-7): another 10%; <strong>data and AI sovereignty</strong> (SOV-3): 10%; <strong>environmental sustainability</strong> (SOV-8): 5%.[9]</p><div class="pullquote"><p>The framework, with its heaviest weights, measures whether a provider&#8217;s hardware and software stack is built and operated in the EU. It measures, with one of its lightest weights, whether the provider&#8217;s data is actually beyond non-EU legal reach.</p></div><p>This is defensible only if physical and operational EU control substantially produces legal EU insulation. The assumption is load-bearing. 
It is also contestable.</p><p>The framework&#8217;s SOV-2 scoring rubric enumerates two specific foreign instruments: the United States CLOUD Act and the Chinese Cybersecurity Law, with a residual reference to &#8220;non-EU laws with cross-border reach.&#8221;[10] The CLOUD Act deserves its prominence. It is the most documented threat to EU data sovereignty, with Microsoft, Google, and Amazon each publishing transparency reports detailing US authority demands for EU-hosted data. The framework was right to center it.</p><p>But the framework was published in October 2025. One month earlier, on September 25, 2025, a Canadian court issued a ruling that demonstrated exactly what the framework&#8217;s &#8220;non-EU laws with cross-border reach&#8221; catch-all does not capture&#8212;and, materially, does not name.</p><h2>The Ottawa Ruling</h2><p>The case is <em>R. v. OVH Group SA and H&#233;bergement OVH Inc.</em>, Court File 24-000659, Ontario Court of Justice, before Justice Heather Perkins-McVey.[11]</p><p>In April 2024, the Royal Canadian Mounted Police (RCMP) obtained a Production Order under section 487.014(1) of the Canadian Criminal Code, targeting subscriber data and metadata associated with four IP addresses hosted on OVH servers in France, the United Kingdom, and Australia. The investigation is a national security matter; the underlying facts are sealed. </p><div class="pullquote"><p>The jurisdictional question is: does a Canadian criminal court have jurisdiction to compel a French cloud provider, through its Montreal-based subsidiary, to produce data stored entirely on European soil?</p></div><p><strong>OVH&#8217;s defense was triple-layered</strong>. The French parent argued it had no physical presence in Canada and that its Montreal subsidiary, H&#233;bergement OVH Inc., was a separate legal entity that did not control the parent&#8217;s data. The French Blocking Statute &#8212; Loi 68-678 of July 26, 1968, strengthened by Decree No. 
2022-207 &#8212; prohibited French companies and personnel from disclosing economic or technical data to foreign public authorities outside formal treaty channels, on penalty of six months&#8217; imprisonment and fines up to &#8364;90,000 for legal persons. And Canada and France have a Mutual Legal Assistance Treaty (MLAT); France&#8217;s Ministry of Justice had offered expedited processing. The lawful channel was sitting open.[12] </p><p><strong>The court rejected all three.</strong></p><p>On jurisdiction, Justice Perkins-McVey applied what Canadian courts call the &#8220;virtual presence&#8221; doctrine: a seven-year settled line across four provinces holding that a foreign corporation with a &#8220;real and substantial connection&#8221; to Canada &#8212; data centers, Canadian customers, marketing targeting Canadian users &#8212; falls within Canadian jurisdiction regardless of where its data is physically stored.[13] OVH conceded that it controlled the data and was capable of responding to a lawful court order.[14] From there, the question answered itself.</p><p>On the blocking statute, the court acknowledged that Article 1 bis of Loi 68-678 applied. France&#8217;s SISSE &#8212; the Service de l&#8217;Information Strat&#233;gique et de la S&#233;curit&#233; &#201;conomiques, designated as the official enforcement point under the 2022 reforms &#8212; had written to OVH in May 2024 asserting that disclosure would violate French law. The court weighed that evidence and dismissed it. Only one conviction had ever been recorded under Article 1 bis in nearly half a century since it was added to the statute in 1980. Expert witnesses could not point to a single case in which the statute had been respected by a foreign court. Justice Perkins-McVey adopted the English High Court&#8217;s &#8220;empty vessel&#8221; characterization &#8212; the standard articulated by Butcher J in <em>Tugushev v. Orlov</em> and applied in <em>Joshua v. 
Renault</em> &#8212; concluding that <strong>the blocking statute, in practical effect, is an empty vessel.</strong>[15]</p><p>On MLAT, the court held the treaty process permissive, not mandatory: Canadian courts retain jurisdiction to issue production orders where the target is present in Canada, and the availability of MLAT does not preclude that jurisdiction.[16]</p><p>The application to revoke the production order was dismissed. OVH Group SA and H&#233;bergement OVH Inc. were ordered to comply by October 27, 2025. OVH filed for judicial review to the Ontario Superior Court of Justice through Miller Thomson at the end of October 2025. As of this writing, that appeal is pending.[17]</p><p>Three facts carry forward from the ruling into the framework analysis.</p><p><strong>The &#8220;virtual presence&#8221; doctrine is not a one-off improvisation by a trial court</strong>. It is a seven-year settled line across British Columbia, Alberta, Quebec, and Ontario, anchored by two provincial Courts of Appeal and reinforced by two 2025 Superior Court rulings extending the doctrine to restraint and management orders. A single Newfoundland ruling that pointed the other way was explicitly declared non-binding.</p><p><strong>OVH&#8217;s own sovereignty marketing was quoted by the Crown as evidence of control</strong>. OVH&#8217;s corporate website describes the company as <em>&#8220;part of an active ecosystem that shares the same values, and a common vision of a sovereign cloud&#8221;</em> &#8212; language lifted directly into the Crown&#8217;s compendium at Tab 4 as proof that OVH operates as a unified global enterprise over which its Canadian subsidiary exercises possession or control.[18] The branding that OVH built to sell sovereignty became the evidence that defeated its sovereignty defense in court.</p><p>And a twenty-five-year-old Ontario Superior Court ruling, <em>Wilson v. 
Servier Canada</em>, had already held in the civil context that <strong>French blocking statutes should be given &#8220;minimal weight&#8221; when they conflict with Canadian proceedings</strong> anchored on a real and substantial connection to Ontario.[19] <em>Wilson</em> was a civil class action, not a criminal production order, and the analogy has limits. But the Canadian courts&#8217; general posture toward French extraterritorial protection predates 2024 by decades. </p><p>CSF Version 1.2.1 was published one month after Justice Perkins-McVey released her ruling. A Commission drafter scoping the CSF&#8217;s SOV-2 risk vectors would have found it on a first search.</p><div class="pullquote"><p>The framework that Europe will use to classify cloud providers as sovereign does not, on its face, incorporate the jurisdiction that had just demonstrated, against a French sovereign-cloud champion, exactly what the framework claims to measure.</p></div><h2>The Proximus Consortium</h2><p>Of the four winning consortia, only the Proximus-led group scored SEAL-2. It also carries the most complex non-EU exposure, and the Commission&#8217;s press release names the exposure on both ends of the stack.</p><p><strong>The consortium bundles five entities</strong>. Clarence is a Luxembourg-based joint venture between Proximus and LuxConnect (a Luxembourg state-owned company) that deploys Google Distributed Cloud in an air-gapped configuration. S3NS is Thales Cloud S&#233;curis&#233; SAS, a Thales subsidiary licensed to operate Google Cloud technology from three &#206;le-de-France facilities under the French &#8220;Cloud de Confiance&#8221; framework, qualified SecNumCloud 3.2 in December 2025.[20] Mistral AI provides model services. Thales contributes security infrastructure, including hardware security modules. 
Proximus is the Belgian telecommunications incumbent &#8212; with the Belgian state holding an economic stake of 53.51 percent through SFPI-FPIM (Soci&#233;t&#233; F&#233;d&#233;rale de Participations et d&#8217;Investissement) &#8212; and acts as a systems integrator for Belgian federal agencies, covering more than 70,000 users.[21]</p><p><strong>Two of the five &#8212; S3NS and Clarence &#8212; run Google Cloud technology</strong>. Both sit within sovereign-cloud qualification constraints intended to insulate the technology from its American owner. The mechanism is corporate and operational: Thales holds a controlling interest in S3NS with Google as a minority partner, and only cleared EU citizens can access production systems. Thales holds the hardware security module keys, and the Google software stack is deployed through a technical quarantine intended to prevent upstream control. The French ANSSI referential for SecNumCloud 3.2 caps non-EU ownership at 24 percent individually and 39 percent collectively, covering both share capital and voting rights, direct or indirect. Google&#8217;s stake in S3NS is structurally constrained below these caps by the qualification requirement itself; the exact equity split has not been publicly disclosed by Thales or Google.[22] For its part, the Commission&#8217;s April 17 press release accepts the arrangement at SEAL-2 &#8212; &#8220;Data Sovereignty&#8221; in the framework&#8217;s own naming. It describes the technical environment as &#8220;<em>based on Google Cloud technology, exclusively operated by EU companies</em>.&#8221;[23]</p><p><strong>The SecNumCloud qualification is real</strong>. The personnel, key-custody, and operational constraints it imposes on Google-powered partnerships materially reduce the likelihood that a CLOUD Act demand on the American partner reaches tenant data, the specific threat vector the framework was designed to address. 
What SecNumCloud does not address is an exposure category involving different actors in the stack: the physical infrastructure owner, the infrastructure owner&#8217;s shareholders, and the shareholders&#8217; investment manager. That exposure is not within the qualified entity&#8217;s control, nor is it within the framework&#8217;s named risk vectors.</p><p><strong>The more direct exposure is elsewhere in the consortium&#8217;s stack</strong>. The integrator workloads Proximus runs for Belgian federal agencies &#8212; the logging infrastructure, the operational telemetry, the coordination surfaces &#8212; run on Belgian data centers. Proximus itself owned those facilities until March 2025, when it sold them to Datacenter United for &#8364;128 million in cash, against a combined enterprise value of &#8364;200.5 million.[24]</p><p>Who owns Datacenter United? TINC NV, a Belgian-listed infrastructure fund, holds 47.5 percent economic / 50 percent voting. Cordiant Digital Infrastructure Limited, together with a second Cordiant-managed vehicle, holds 47.5 percent economic / 50 percent voting, with aggregate equity consideration of &#8364;92.3 million. Friso Haringsma, the CEO, holds the remaining 5 percent through non-voting shares.[25]</p><p>Let&#8217;s keep digging. Cordiant Digital Infrastructure Limited is a closed-end investment company listed on the London Stock Exchange since February 2021. 
Its investment manager &#8212; the firm that controls investment decisions, allocation, and board representation on behalf of the listed company &#8212; is Cordiant Capital Inc., a private markets asset management firm headquartered in Montreal, Quebec, with additional offices in London, Luxembourg, and S&#227;o Paulo.[26] </p><div class="pullquote"><p>The entity with fifty percent voting control over the company that owns the Belgian data centers hosting the Proximus consortium&#8217;s integrator stack is a UK-listed vehicle managed by a Canadian firm.</p></div><p>The analytical parallel to <em>OVH</em> is imperfect, and the piece acknowledges this. Cordiant Capital Inc. is an asset manager; it does not directly access tenant data, and a Canadian production order to Cordiant would face a higher &#8220;possession or control&#8221; hurdle than OVH Canada faced. A fund&#8217;s management jurisdiction imposes regulatory obligations on the fund, not automatically on the tenants of its portfolio companies.</p><p>The objections are real. What they do not close is the structural exposure. Belgium has no equivalent to Loi 68-678. <strong>The legal barrier the </strong><em><strong>OVH</strong></em><strong> ruling dismissed is not even in place to be dismissed here</strong>. The seven-year Canadian precedent line is still expanding, and <em>In re TD Bank Production Order</em> in 2025 narrowed the &#8220;possession or control&#8221; test further &#8212; <strong>the Quebec Superior Court found that a Canadian parent had possession or control over data held by a US subsidiary on the strength of its corporate control, without requiring direct technical access</strong>. That test applied to the Cordiant chain is harder than OVH Canada faced, but not categorically different from the one TD Bank lost on.[27] </p><div class="pullquote"><p>The framework&#8217;s SOV-2 names only the CLOUD Act and the Chinese Cybersecurity Law. 
It does not name the Canadian &#8220;virtual presence&#8221; doctrine. It does not name the UK Investigatory Powers Act 2016, whose &#8220;telecommunications operator&#8221; definition under section 261(10) reaches any person who provides a service to UK users or controls a system operated from the UK, and whose technical capability notices under section 253 can be served extraterritorially. It does not name the Five Eyes coordination framework through which extraterritorial demands are routinely shared.</p></div><p>The exposure is present in the ownership chain, but has not yet been demonstrated in an enforcement event. Risk committees should assess exposure categories that the framework does not name without assuming any specific exposure has materialized.</p><h2>What Actually Exists</h2><p>Three of the four winners come closer to what the framework&#8217;s SEAL-4 language describes.</p><p>The <strong>Post Luxembourg consortium</strong> runs through DEEP, the data center subsidiary of Post Telecom (itself a subsidiary of POST Luxembourg, 100% owned by the Luxembourg state). DEEP operates three data centers in Luxembourg at Windhof, Kayl, and Betzdorf, carrying Tier IV Uptime Institute certification &#8212; Kayl and Betzdorf at Tier IV Constructed Facility level, Windhof at Tier IV Design Documents level &#8212; and hosts OVHcloud infrastructure and Clever Cloud workloads on sovereign Luxembourg soil.[28] OVHcloud itself is listed on Euronext Paris; the Klaba family holds approximately 81 percent of shares.[29] Clever Cloud is French and privately held. The full stack runs on EU-owned infrastructure in an EU jurisdiction. 
This is a genuine sovereignty claim at SEAL-3, and if the framework&#8217;s SOV-2 actually tested legal exposure, it could plausibly have scored higher.</p><p><strong>StackIT</strong>, the cloud unit of Schwarz Digits, runs in data centers the Schwarz Group has operated for decades &#8212; DC01 in Neckarsulm, DC08 in Ellhofen, DC10 in Ostermiething, Austria, and a new 200 MW facility under construction at L&#252;bbenau with an &#8364;11 billion capex commitment through 2027.[30] Schwarz Group, the privately held Schwarz family holding that also controls Lidl and Kaufland, reported fiscal year 2024 total sales of &#8364;175.4 billion; Schwarz Digits segment revenue was &#8364;1.9 billion, with StackIT as one of several business units within the digital segment.[31] No non-EU entity appears in the ownership or operational chain.</p><p><strong>Scaleway</strong>&#8217;s parent is Iliad, the French telecommunications group controlled by Xavier Niel. Its Paris-anchored sovereign-qualified infrastructure &#8212; Paris DC2 through DC5, Lyon, Marseille, and facilities in Poland &#8212; is operated by OpCore, carved out of Scaleway in July 2024 and converted into a fifty-fifty joint venture between Iliad and InfraVia Capital Partners at its March 2025 closing, at an enterprise value of &#8364;860 million and with a &#8364;2.5 billion capex commitment over ten years.[32] InfraVia is a French independent private equity firm headquartered in Paris, regulated under the EU&#8217;s Alternative Investment Fund Managers Directive.</p><p>Three of the four winners run infrastructure that the framework was designed to produce. The SOV-5 Supply Chain weight &#8212; the heaviest in the rubric at 20% &#8212; rewards EU-owned hardware and operations, and the SEAL-2 eligibility floor keeps non-sovereign offerings out entirely. The result: <strong>all four winners operate on EU soil with EU-controlled physical layers. 
Physical sovereignty is genuine at three of the four.</strong></p><p>The framework succeeded at physical sovereignty. It was not designed to deliver legal sovereignty, and SOV-2, at 10%, was never going to deliver it. The SEAL-4 definition says it plainly: &#8220;<em>subject only to EU law</em>.&#8221;</p><p>That is not a question of where servers sit. The question is whether every legal avenue through which a non-EU authority could compel disclosure has been closed. None of the four winners can answer yes.</p><div class="pullquote"><p>No winner reached SEAL-4. The burning question is why.</p></div><p>The Commission has not published individual scoring breakdowns. We know the tier outcomes; we do not know which objectives each winner failed, by how much, or whether the gap was structural or marginal. That opacity is itself a finding.</p><h2>The Governance Pattern</h2><p>The Commission&#8217;s sovereign cloud framework is the third European attempt in three years to codify cloud sovereignty as a procurement instrument. EUCS &#8212; the European Union Cybersecurity Certification Scheme for cloud services &#8212; stalled in draft over exactly this question: <strong>whether the highest assurance level should require ownership by an EU entity not subject to non-EU law</strong>. France pushed for the requirement, with Italy and Spain supporting it. A coalition of Member States, including Denmark, Estonia, Greece, Ireland, the Netherlands, Poland, and Sweden, resisted, signing a joint non-paper in July 2022; by mid-2023, approximately twelve Member States opposed the sovereignty requirement. The final March 2024 draft dropped the ownership requirement in favor of an International Company Profile Attestation.[33] The proposed Cloud and AI Development Act (CADA), announced but not yet tabled as of April 2026, is the Commission&#8217;s next attempt.[34]</p><p>The CSF filled the gap left by EUCS, and CADA has not closed it. 
The Commission chose to do what it could procure &#8212; a scoring rubric tied to a dynamic purchasing system &#8212; rather than what it could not secure politically: <strong>a legal definition of sovereignty with binding ownership constraints.</strong></p><p>For three years, Paris held the line in the EUCS negotiations: sovereignty meant EU ownership, full stop, no hyperscaler joint ventures, no Cloud de Confiance workarounds. Italy and Spain agreed. Twelve Member States did not. The EUCS draft dropped the ownership requirement in March 2024. The maximalist French demand became a scoring coefficient that the Commission preemptively justified as redundant.</p><p>CISPE, the European cloud providers&#8217; trade association, called the outcome six months early. Its October 24, 2025 statement &#8220;No Such Thing as &#8216;75% Sovereign&#8217;&#8221; warned that the framework&#8217;s ten-percent legal weighting would allow hyperscaler-backed offerings to qualify; Secretary General Francisco Mingorance was quoted saying &#8220;<em>the big players will be able to achieve very high overall scores that minimize the impact of poor results in legal and jurisdictional sovereignty &#8212; which only account for 10% of the total score</em>.&#8221; The Commission did not dispute the analysis. It published the award six months later, exactly as described.[36]</p><h2>What Would Have to Break</h2><p>The Commission&#8217;s April 17 press release announcing the &#8364;180 million award also announced the framework revision that would incorporate the lessons learned from the award. In the same press release. The Commission published the awards and the planned fix in a single document, which is either unusual procedural efficiency or a tell about what the Commission expected the reaction to be. Lessons are ordinarily learned after events. 
Here they were scheduled before the event&#8217;s ink dried.</p><p><strong>The CSF&#8217;s defenders can argue the framework was never meant to be final</strong>. SEAL-4 exists in the definition even though no winner reached it; CADA is on the legislative calendar; the Commission stated in its April 17 announcement that it &#8220;will publish an updated version of the Sovereign Cloud Framework based on lessons learned from this tender.&#8221; True. Insufficient. A lessons-learned revision is a calibration instrument &#8212; it adjusts weights and contributing factors. The structural question is whether a future CSF iteration will restructure the SOV-2 weighting and extend the named risk vectors beyond the CLOUD Act and the Chinese Cybersecurity Law, not as an administrative update, but as a political choice.</p><p><strong>The real test is whether the ten-percent SOV-2 weight reflects a drafting accident that a later iteration will correct, or a negotiated outcome</strong> reflecting a coalition that cannot accept binding legal sovereignty as a procurement requirement. The Commission&#8217;s own justification &#8212; that procurement procedure &#8220;already contains significant safeguards&#8221; at SOV-2 &#8212; points to the second.</p><div class="pullquote"><p>If SOV-2 were weighted at thirty percent, the Sovereignty Score formula would downrank all winners below the SEAL-3 threshold that the other objectives produced.</p></div><p>The Proximus consortium&#8217;s composite score would fall below what the other three winners achieved. If SOV-2 contributing factors extended beyond the two named threats, Commonwealth extraterritorial doctrines would enter the evaluation matrix. <strong>The ten-percent weight and the two named threats are the mechanisms by which the framework was shipped to procurement on its intended timeline</strong>. 
Removing them reopens the EUCS deadlock.</p><p>This thesis would be wrong if CADA, when tabled, restructures the SOV-2 weighting and extends the named risk vectors to cover Canadian, UK, and Five Eyes extraterritorial reach, and does so with binding effect. It would be wrong if the Ontario Superior Court of Justice overturns <em>OVH</em> on the pending appeal, unwinding the case on narrow procedural grounds that deny broader doctrinal implications.</p><p>None of these seems likely. CADA has been on the European Parliament&#8217;s legislative train since 2025 and has not been tabled; the Q1 2026 tabling window closed without a Commission proposal. The <em>OVH</em> appeal, even if successful, unwinds the specific case but not the doctrine &#8212; the Canadian precedent line is seven years and four provinces deep, and one reversal does not erase two Courts of Appeal and two 2025 Superior Court rulings. European institutions process data continuously; the framework contracts are live now.</p><p>What actually breaks the pattern is an enforcement event. A Canadian production order that reaches a Cordiant-managed Belgian data center; a UK Investigatory Powers Act technical capability notice served under section 253 on a telecommunications operator in the ownership chain of a Commission-awarded tenant&#8217;s infrastructure; a CJEU ruling, on preliminary reference under GDPR Articles 44&#8211;48, that personal data processed under a CSF-awarded contract fell outside the transfer framework because the CSF&#8217;s sovereignty assurance did not substantially block non-EU access. Until one of those arrives, the framework holds. </p><p>The CSF&#8217;s enforcement failure, like the failures that preceded EUCS and CADA, is not a bug. It is the negotiated outcome. The Commission&#8217;s page-six explanation &#8212; that SOV-2 is downweighted because procurement procedure &#8220;already contains significant safeguards&#8221; &#8212; is, read strictly, accurate. 
The procurement process did contain safeguards. They were the ones the Commission chose to put there.</p><div class="pullquote"><p>The Cloud Sovereignty Framework is not a measurement instrument. It is a justification instrument.</p></div><p>The weights were set, the named threats were chosen, and the SEAL tiers were defined to produce a specific outcome: a procurement the Commission could run, with winners it could defend, under a rubric it controlled. The <em>OVH</em> ruling didn&#8217;t fit that rubric. The Cordiant ownership chain didn&#8217;t fit that rubric. The Canadian virtual presence doctrine didn&#8217;t fit that rubric. So they were not named, SOV-2 was weighted at 10%, and the contracts were awarded on schedule.</p><p>The European Commission did not relax the definition of sovereignty on April 17, 2026. <strong>It built a scoring matrix that made its preferred outcome look like a sovereignty finding</strong>, awarded &#8364;180 million under it, and announced the lessons-learned revision in the same press release. The next framework will be more defensible. It will not be more sovereign.</p><p>Did you really expect something else?</p><div><hr></div><h3>Notes</h3><p>[1] European Commission, Directorate-General for Digital Services, <em><a href="https://commission.europa.eu/document/download/09579818-64a6-4dd5-9577-446ab6219113_en">Cloud Sovereignty Framework</a></em>, Version 1.2.1, October 2025, p. 6 (Section 5, Computation of Sovereignty Score). The document cover carries &#8220;Version 1.2.1 &#8211; Oct. 2025&#8221;; external sources (CISERO, Interoperable Europe Portal, CISPE responses) place publication on or about October 20, 2025. Direct quotation from the document&#8217;s justification for SOV-2 and SOV-7 weighting.</p><p>[2] European Commission press release, &#8220;<a href="https://ec.europa.eu/commission/presscorner/detail/en/ip_26_833">Commission advances cloud sovereignty through strategic procurement</a>,&#8221; IP/26/833, April 17, 2026. 
&#8364;180 million is the maximum authorized spend under the framework contract, not a committed disbursement; actual call-offs across the six-year term will determine spend. Mechanism: Cloud III Dynamic Purchasing System. The same announcement states that the Commission &#8220;will also publish an updated version of the Sovereign Cloud Framework based on lessons learned from this tender&#8221;; this is cited in the body&#8217;s &#8220;What Would Have to Break&#8221; section as the Commission&#8217;s own acknowledgment of a lessons-learned revision cycle.</p><p>[3] European Commission press release, April 17, 2026, describing the Proximus-led consortium&#8217;s technical architecture. Direct quotation.</p><p>[4] <em>R. v. OVH Group SA and H&#233;bergement OVH Inc.</em>, Ontario Court of Justice (Ottawa), Court File 24-000659, Heather E. Perkins-McVey J., decision released September 25, 2025. <a href="https://drive.google.com/file/d/1QVwO9lPmxuDSQsGd9fHH3QN_ToXs2LQ8/view">Signed final decision</a> available via David Fraser, McInnes Cooper. The &#8220;empty vessel&#8221; characterization originates in Butcher J&#8217;s judgment in <em>Tugushev v. Orlov</em>, [2021] EWHC 1514 (Comm) at paragraph [33], and was applied to Loi 68-678 in <em>Thomas John Joshua et al v. Renault S.A.</em>, [2024] EWHC 1424 (KB) at paragraphs [77]&#8211;[78]. Adopted in <em>R. v. OVH</em> at paragraph [115].</p><p>[5] European Commission announcement, April 17, 2026. The four awardees are identified in the Commission&#8217;s release.</p><p>[6] Winner composition per Commission announcement. Thales is named as a consortium partner in Commission and Proximus materials; the Commission press release specifically names S3NS, Clarence, and Mistral in describing the Google Cloud technology base.</p><p>[7] <em>Cloud Sovereignty Framework</em>, Version 1.2.1, Section 3, page 3. 
SEAL tier names and definitions verbatim from the document: SEAL-0 &#8220;No Sovereignty&#8221;; SEAL-1 &#8220;Jurisdictional Sovereignty&#8221;; SEAL-2 &#8220;Data Sovereignty&#8221;; SEAL-3 &#8220;Digital Resilience&#8221;; SEAL-4 &#8220;Full Digital Sovereignty.&#8221;</p><p>[8] <em>Cloud Sovereignty Framework</em>, Version 1.2.1, October 2025. Document cover dated &#8220;Oct. 2025&#8221;; contemporaneous coverage and Commission register place publication on or about October 20, 2025.</p><p>[9] <em>Cloud Sovereignty Framework</em>, Version 1.2.1, Section 5, page 6 weighting table. Per-objective weights: SOV-1 Strategic Sovereignty 15%, SOV-2 Legal &amp; Jurisdictional Sovereignty 10%, SOV-3 Data &amp; AI Sovereignty 10%, SOV-4 Operational Sovereignty 15%, SOV-5 Supply Chain Sovereignty 20%, SOV-6 Technology Sovereignty 15%, SOV-7 Security &amp; Compliance Sovereignty 10%, SOV-8 Environmental Sustainability 5%. SOV-5 is the largest single weight; SOV-8 is the smallest and, notably, the only objective titled &#8220;Sustainability&#8221; rather than &#8220;Sovereignty.&#8221;</p><p>[10] <em>Cloud Sovereignty Framework</em> v1.2.1, SOV-2 contributing factors. The framework enumerates the US CLOUD Act and the Chinese Cybersecurity Law and adds a catch-all reference to &#8220;non-EU laws with cross-border reach&#8221; without naming specific Commonwealth jurisdictions.</p><p>[11] <em>R. v. OVH Group SA and H&#233;bergement OVH Inc.</em>, Ontario Court of Justice, Court File 24-000659, Ottawa, Heather E. Perkins-McVey J., decision released September 25, 2025. Crown counsel: Michael Fawcett; OVH counsel: Scott Spencer (Miller Thomson). The ruling is a statutory review under the Criminal Code s. 487.0193(4) of a Production Order issued under s. 487.014. The underlying Production Order was issued in April 2024; paragraph [1] of the ruling gives April 11, and paragraph [24] gives April 19 with specific IP/date associations, the latter treated as authoritative. 
Investigation sealed (national security).</p><p>[12] Summary of OVH&#8217;s position per paragraphs [3]&#8211;[9] of the ruling. French blocking statute at Loi 68-678 of July 26, 1968, with Article 1 bis added by Loi n&#176; 80-538 of 16 July 1980, strengthened by Decree No. 2022-207 of February 18, 2022. Statutory penalty scheme under Article 3, as amended: imprisonment up to 6 months and fines up to &#8364;18,000 for individuals; for legal persons, the fine is fixed at 5&#215; the individual fine under Article 131-38 of the French Code p&#233;nal, producing a maximum of &#8364;90,000.</p><p>[13] <em>British Columbia (Attorney General) v. Brecknell</em>, 2018 BCCA 5 (CanLII); <em>R v Love</em>, 2022 ABCA 269 (CanLII); <em>In the Matter of textPlus Inc.</em>, 2022 ONSC 7413 (CanLII); <em>In re TD Bank Production Order</em>, 2025 QCCS 2094; <em>R v Binance Holdings Ltd.</em>, 2025 ONSC 7113. Per <em>R. v. OVH</em> ruling paragraphs [40]&#8211;[53]. The Newfoundland authority <em>In the Matter of an application to obtain a Production Order</em>, 2018 Carswell Nfld 19, is distinguished as non-binding at paragraph [55]. The doctrine remains contested within the Canadian privacy bar &#8212; see David Fraser, Canadian Privacy Law Blog, December 5, 2025, arguing that <em>Brecknell</em> itself was wrongly decided on its facts (Craigslist had voluntarily accepted jurisdiction, a fact the <em>OVH</em> court did not preserve as a limiting principle). The piece&#8217;s framing reflects the doctrine as applied by the courts, not uniform agreement among Canadian commentators.</p><p>[14] Ruling at paragraph [60]: &#8220;OVH Parent has acknowledged that it controls the data sought and is capable of responding to a lawful court order (OVH Factum at para. 34).&#8221; Jurisdictional analysis at paragraphs [56]&#8211;[63].</p><p>[15] Ruling at paragraphs [79]&#8211;[123]. &#8220;Empty vessel&#8221; standard adopted from Butcher J in <em>Tugushev v. 
Orlov</em>, [2021] EWHC 1514 (Comm) at paragraph [33], applied to Loi 68-678 in <em>Thomas John Joshua et al v. Renault S.A.</em>, [2024] EWHC 1424 (KB) at paragraphs [77]&#8211;[78]. Sole reported Article 1 bis conviction is the &#8220;Christopher X&#8221; case, Cour de cassation chambre criminelle, 12 December 2007, pourvoi n&#176; 07-83.228, addressed at ruling paragraph [85]. Australian authority on the French Blocking Statute: <em>ACCC v. Prysmian Cavi e Sistemi Energia S.R.L.</em> (No 4), [2012] FCA 1323. US authority: <em>Societe Nationale Industrielle Aerospatiale v. US District Court</em>, 482 U.S. 522 (1987). The 2022 SISSE reforms under Decree No. 2022-207 are discussed at paragraphs [86]&#8211;[87]. The SISSE letter to OVH dated May 27, 2024 appears as Appendix B to the Barri&#232;re Affidavit (ruling at paragraph [108]); secondary press reports reference a further letter in January 2025 not directly quoted in the ruling.</p><p>[16] Ruling at paragraphs [97]&#8211;[105]. Canadian authority on MLAT&#8217;s permissive nature: <em>R v Strong</em>, 2020 ONSC 7528 at paragraphs 103, 112. The court&#8217;s conclusion that MLAT is not mandatory and does not preclude Canadian production orders appears at paragraphs [17] and [99].</p><p>[17] Ruling disposition at paragraphs [124]&#8211;[126]. <a href="https://www.heise.de/en/news/Canadian-Court-OVHcloud-from-France-must-hand-over-user-data-11092029.html">Appeal filing confirmed by Heise Online</a>, November 26, 2025, citing OVH filings. As of April 2026, no Ontario Superior Court ruling on the judicial review has been released.</p><p>[18] Ruling at paragraph [12], quoting OVH Canada&#8217;s &#8220;About Us&#8221; and &#8220;Our Values&#8221; pages as reproduced in the Crown&#8217;s compendium at Tab 4.</p><p>[19] <em>Wilson v. Servier Canada Inc.</em>, 2000 CanLII 22407 (ONSC), cited at ruling paragraph [95]. 
An Ontario class action against a French pharmaceutical company, concerning French Civil Code Article 15 (a civil blocking statute asserting exclusive French jurisdiction). Civil procedural context, not criminal production orders; the analogy holds on Canadian judicial posture toward French extraterritorial protection in cross-border contexts, but does not establish criminal production-order precedent.</p><p>[20] S3NS (Thales Cloud S&#233;curis&#233; SAS), RCS Paris 908 211 980, qualified SecNumCloud 3.2 by ANSSI on December 17, 2025. Thales/Google joint venture structure per <a href="https://www.s3ns.io/cgu">S3NS General Terms of Use</a>.</p><p>[21] Proximus NV, 2025 Integrated Annual Report, capital structure section. The Belgian state economic stake held via SFPI-FPIM (Soci&#233;t&#233; F&#233;d&#233;rale de Participations et d&#8217;Investissement) is stated at 53.51%. Integrator scope: Proximus NXT SECaaS1 press materials reference 10 key government entities and 70,000+ users served under the federal framework; the Commission has not published a specific tender-linked agency count.</p><p>[22] SecNumCloud v3.2 referential (ANSSI, 2022) caps non-EU ownership of qualified entities at 24% individually and 39% collectively, applied to both share capital and voting rights, whether held directly or indirectly. Thales is the controlling shareholder of S3NS; the exact Google equity stake has not been publicly disclosed by Thales or Google. 
ANSSI&#8217;s December 17, 2025, qualification of S3NS 3.2 is the formal finding that the referential&#8217;s ownership test is met.</p><p>[23] European Commission press release, April 17, 2026, describing the Proximus-led consortium&#8217;s technical architecture: &#8220;Proximus leverages capacities of partners S3NS, Clarence, and Mistral from a technical environment based on Google Cloud technology, exclusively operated by EU companies.&#8221;</p><p>[24] Proximus NV press release, &#8220;<a href="https://www.proximus.com/news/2025/20250303-proximus-sells-its-datacenters.html">Proximus sells its data centers to Datacenter United</a>,&#8221; March 3, 2025. Transaction structure: &#8364;128 million cash consideration to Proximus; combined enterprise value &#8364;200.5 million. Post-sale, Proximus remains the anchor tenant under a 10-year master services agreement with annual pricing review.</p><p>[25] <a href="https://www.cordiantcap.com/cordiant-digital-infrastructure-limited-to-acquire-stakes-in-two-belgian-data-center-providers-expanding-its-presence-in-western-europe/">Datacenter United ownership structure</a> per TINC NV announcement and Cordiant Digital Infrastructure Limited regulatory disclosure, March 2025. TINC NV: 47.5% economic / 50% voting. Cordiant Digital Infrastructure Limited, together with a second Cordiant-managed fund: 47.5% economic / 50% voting, with aggregate equity consideration of &#8364;92.3 million across both Cordiant vehicles. Friso Haringsma (CEO): 5% non-voting.</p><p>[26] Cordiant Digital Infrastructure Limited has been listed on the London Stock Exchange (ticker CORD) since February 2021. Investment management is performed by Cordiant Capital Inc., a private markets asset manager headquartered in Montreal, Quebec, with additional offices in London, Luxembourg, and S&#227;o Paulo. Per Cordiant Capital Inc. corporate disclosures.</p><p>[27] <em>In re TD Bank Production Order</em>, 2025 QCCS 2094 (Quebec Superior Court). 
The Court found TD Bank Canada had &#8220;possession or control&#8221; over records held by its US subsidiary on the strength of corporate control, without requiring direct technical access by Canadian employees. See also <em>R v Binance Holdings Ltd.</em>, 2025 ONSC 7113, extending <em>Brecknell</em>&#8217;s &#8220;real and substantial connection&#8221; test to restraint and management orders. Per <em>R. v. OVH</em> ruling paragraph [53].</p><p>[28] Post Telecom corporate structure: Post Telecom is a subsidiary of POST Luxembourg, which is 100% owned by the Luxembourg state. DEEP is Post Telecom&#8217;s data center subsidiary, consolidating the former EBRC, Digora, Elgon, and POST Telecom cloud units under a single brand. Three data centers at Windhof, Kayl, and Betzdorf; Tier IV certification status per Uptime Institute public database: Kayl (Tier IV Design + Constructed Facility, 2013), Betzdorf (Tier IV Design + Constructed Facility, 2015), Windhof (Tier IV Design Documents only).</p><p>[29] OVH Groupe SA, <a href="https://corporate.ovhcloud.com/en/newsroom/news/fy2024-annual-results/">FY2024 Annual Results</a>, October 23, 2024. The Klaba family&#8217;s share of capital increased from ~68% to ~81% following the 2024 buyback completion, per OVHcloud&#8217;s press release and Bredin Prat&#8217;s deal notes. Voting rights are materially higher through French double-voting loyalty provisions; the controlling provision for listed companies is Article L.22-10-46 of the Code de commerce since the 2020 recodification (Ordonnance n&#176; 2020-1142), cross-referencing Article L.225-123; loi Florange (Loi n&#176; 2014-384 of 29 March 2014); voting rights reported at approximately 82% as of late 2025.</p><p>[30] Schwarz Group data center operations: DC01 Neckarsulm, DC08 Ellhofen (Germany), DC10 Ostermiething (Austria). 
L&#252;bbenau facility: 200 MW planned capacity, &#8364;11 billion capex commitment cited in Schwarz Group materials, target completion 2027.</p><p>[31] Schwarz Group FY24 press communication, May 22, 2025. Group total sales &#8364;175.4 billion for fiscal year 2024 (March 1, 2024 to February 28, 2025); Schwarz Group reports on a March&#8211;February fiscal year. Schwarz Digits segment revenue &#8364;1.9 billion; Schwarz Digits includes StackIT, XMCloud, and other digital operations &#8212; StackIT portion not separately disclosed. StackIT&#8217;s L&#252;bbenau facility is the fourth German site; additional capacity sites, including Berlin and the Austrian Ostermiething facility, bring the operational footprint to 4&#8211;7 data centers, depending on the definitional scope.</p><p>[32] InfraVia Capital Partners and Iliad Group press release, &#8220;<a href="https://infraviacapital.com/the-iliad-group-and-infravia-partner-to-develop-a-major-european-hyperscale-data-center-platform/">The Iliad Group and InfraVia partner to develop a major European hyperscale data center platform</a>,&#8221; December 4, 2024 (announcement); closing March 2025. OpCore was carved out of Scaleway on July 1, 2024, as a standalone Iliad data-center vehicle; the 50/50 Iliad/InfraVia JV structure was announced on December 4, 2024, and closed in March 2025. Enterprise value &#8364;860 million at JV formation; capex commitment stated as &#8220;more than &#8364;2.5 billion&#8221; over a decade, structured as joint capital deployment commitments of the 50/50 shareholders rather than a contractually enforceable obligation to a third party. InfraVia Capital Partners registered as an Alternative Investment Fund Manager with the French AMF (GP 08 000018) under the EU&#8217;s AIFMD framework.</p><p>[33] EUCS (European Union Cybersecurity Certification Scheme for Cloud Services) consultation history per ENISA public records. 
The High+ assurance level&#8217;s &#8220;Annex J&#8221; sovereignty requirements &#8212; EU headquarters, EU ownership, immunity from non-EU law &#8212; were contested between 2021 and 2024. A July 2022 joint non-paper opposing the sovereignty requirement was signed by at least Denmark, Estonia, Greece, Ireland, the Netherlands, Poland, and Sweden, with Lithuania sometimes added. France, supported by Italy and Spain, championed the requirement; Germany&#8217;s position was mixed. By mid-2023, EUISS reporting cited approximately twelve Member States opposing. The March 22, 2024 ENISA draft dropped sovereignty requirements in favor of an International Company Profile Attestation (ICPA). The scheme had not been formally adopted as of April 2026.</p><p>[34] <a href="https://www.europarl.europa.eu/legislative-train/theme-a-new-plan-for-europe-s-sustainable-prosperity-and-competitiveness/file-cloud-and-ai-development-act">Cloud and AI Development Act (CADA)</a>, status &#8220;Announced&#8221; on the European Parliament Legislative Train as of March 20, 2026. The expected Q1 2026 tabling window closed without a Commission proposal.</p><p>[35] Emmanuel Macron, Artificial Intelligence Action Summit, Grand Palais, Paris, February 10, 2025. &#8220;France is back in the AI race&#8221; per <a href="https://www.france24.com/en/france/20250210-live-macron-speaks-on-the-future-of-ai-at-global-summit-in-paris">France 24 live coverage</a>, February 10, 2025. &#8220;Plug baby, plug&#8221; quotation per <a href="https://www.csis.org/analysis/frances-ai-action-summit">CSIS analysis, &#8220;France&#8217;s AI Action Summit&#8221;</a>. &#8364;109 billion investment figure and &#8364;50 billion UAE contribution per contemporaneous summit coverage. 
The Paris AI Action Summit took place February 10&#8211;11, 2025; the EUCS Annex J sovereignty requirements were dropped from the ENISA draft in March 2024.</p><p>[36] <a href="https://cispe.cloud/">CISPE (Cloud Infrastructure Services Providers in Europe)</a>, &#8220;No Such Thing as &#8216;75% Sovereign&#8217;: CISPE Responds to the European Commission&#8217;s Sovereign Cloud Framework,&#8221; October 24, 2025. Francisco Mingorance's quotation per interview with Incyber News, November 2025. CISPE subsequently issued a January 2026 position paper expanding the critique.</p>]]></content:encoded></item><item><title><![CDATA[The Price of “Primary”]]></title><description><![CDATA[Jensen Huang said the quiet part out loud on April 15. Amazon just priced it.]]></description><link>https://www.airealist.ai/p/the-price-of-primary</link><guid isPermaLink="false">https://www.airealist.ai/p/the-price-of-primary</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Tue, 21 Apr 2026 11:37:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!w-xw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w-xw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w-xw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 424w, 
https://substackcdn.com/image/fetch/$s_!w-xw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!w-xw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!w-xw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w-xw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2349740,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/194903098?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w-xw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 
424w, https://substackcdn.com/image/fetch/$s_!w-xw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!w-xw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!w-xw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d7ae39a-4627-476e-89ec-9e1eb91de978_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>On April 15, Jensen Huang told Dwarkesh Patel that Trainium and TPU external growth was &#8220;one hundred percent Anthropic&#8221; &#8212; that Anthropic &#8220;is a unique instance, not a trend.&#8221;[1] The market read that as reassurance about Nvidia&#8217;s position. <a href="https://www.airealist.ai/">Yesterday&#8217;s piece</a> argued it was the opposite: a concession, on the record, that the entire alt-silicon market is one customer, and that the customer exists because its equity-for-compute need lined up with a hyperscaler that had both the silicon and the checkbook.</p><p>Six days later, and nine days before earnings, Amazon priced that concession.</p><p>Today&#8217;s announcement pushes the investment structure to a potential cumulative position of $33 billion: Amazon invests $5 billion in Anthropic now and up to $20 billion more &#8220;tied to certain commercial milestones,&#8221; on top of the $8 billion already committed.[2] What the $25 billion of incremental equity buys, under the contract, is one word: <em>primary.</em></p><h2>Three rounds, three times the same word</h2><p><strong>September 2023:</strong> Amazon invests $4 billion. Anthropic names AWS &#8220;its primary cloud provider.&#8221;[3]</p><p><strong>November 2024:</strong> Amazon invests another $4 billion. Anthropic names AWS &#8220;our primary cloud and training partner.&#8221;[4]</p><p><strong>April 2026:</strong> Amazon invests $5 billion now, with up to $20 billion more contingent on milestones. Anthropic&#8217;s language: &#8220;We continue to choose AWS as our primary training and cloud provider for mission-critical workloads.&#8221;[5]</p><p>Three rounds. Three times the same word. 
At $4 billion cumulative, &#8220;primary.&#8221; At $8 billion cumulative, &#8220;primary.&#8221; At up to $33 billion cumulative, still &#8220;primary&#8221; &#8212; now narrowed by a qualifier that did not exist in either previous agreement. &#8220;For mission-critical workloads&#8221; is not a deepening of exclusivity. It is a ring-fence around the workloads to which AWS retains a claim and, by extension, a marker of those to which it does not.</p><p>The reason the ring-fence appears in this round and not the previous two is disclosed in the same document: Claude is now &#8220;the only frontier AI model available to customers on all three of the world&#8217;s largest cloud platforms: AWS (Bedrock), Google Cloud (Vertex AI), and Microsoft Azure (Foundry).&#8221;[5] &#8220;Primary&#8221; in 2026 refers to one of three formally first-class hyperscaler relationships; in 2023, it referred to the only such relationship.</p><h2>The triangulation mechanics</h2><p>The $25 billion reads as defensive spending when compared with the two deals that preceded it.</p><p>On November 18, 2025, Microsoft invested up to $5 billion in Anthropic, and Nvidia committed up to $10 billion, against a $30 billion Anthropic commitment to Azure compute and up to 1 gigawatt (GW) of Grace Blackwell and Vera Rubin capacity.[6] On April 6, 2026, Broadcom disclosed in an 8-K that Anthropic had committed to approximately 3.5 GW of Google TPU capacity starting in 2027, on top of the 1 GW already coming online in 2026 through the October 2025 Google Cloud agreement.[7] Mizuho estimates Broadcom&#8217;s Anthropic-attributed revenue at $21 billion in 2026 and $42 billion in 2027.[8]</p><p>Amazon&#8217;s response, two weeks later, commits up to $33 billion in cumulative equity against $100 billion in Anthropic AWS spend over 10 years and 5 GW of Trainium capacity, with roughly 1 GW expected by the end of 2026.[2] Across the three hyperscalers, Anthropic now holds up to $40 billion in potential equity commitments against 
compute spend exceeding $130 billion flowing back to them.</p><p>AWS remains primary by the measure that matters operationally &#8212; 5 GW is five times the initial Azure commitment and one-and-a-half times the 2027 Google ramp. That is what Jassy&#8217;s quote points to, and it is the real commercial substance: capacity, Claude Platform native integration inside AWS accounts, and the Annapurna engineering loop on Trainium chip design.[2] None of that is nothing. It is also not what Amazon has historically paid for. The equity-to-capacity ratio has moved from $4 billion per &#8220;primary cloud provider&#8221; to $4 billion per &#8220;primary cloud and training partner,&#8221; and now up to $25 billion per &#8220;primary training and cloud provider for mission-critical workloads.&#8221; Capacity scaled. The word narrowed.</p><p>The round-trip is affordable for each hyperscaler because of the other side of Anthropic&#8217;s balance sheet. Anthropic disclosed today that run-rate revenue has &#8220;surpassed $30 billion, up from approximately $9 billion at the end of 2025&#8221; &#8212; a 3.3x expansion in four months.[5] At that growth rate, losing primary status to Azure or Google Cloud is not a loss of a billion-dollar customer. It is a tens-of-billions-per-year revenue stream compounding on the capex already committed.[9]</p><h2>What the $20 billion actually is</h2><p>The $20 billion tranche is not an investment commitment. It is a commercial-milestone-gated option on further equity issuance, with the milestones undisclosed in today&#8217;s filing. Broadcom&#8217;s April 8-K used parallel language from the other direction: &#8220;The consumption of such expanded AI compute capacity by Anthropic is dependent on Anthropic&#8217;s continued commercial success.&#8221;[7]</p><p>Both sides of every deal in the triangle are now demand-contingent. The equity flows if Anthropic keeps compounding. The compute flows if Anthropic keeps compounding. 
The hyperscaler revenue flows if Anthropic keeps compounding. None of it flows if the $9 billion to $30 billion trajectory from end-2025 to April-2026 is a pull-forward rather than a durable curve.</p><p>That is the unstated reason &#8220;for mission-critical workloads&#8221; appeared in the 2026 language. If growth continues, the qualifier is decorative &#8212; everything becomes mission-critical. If growth pauses, the qualifier means that Anthropic retains operational flexibility to shift non-critical workloads across the three clouds based on price and capacity, which is exactly what a compute buyer with three first-class supplier relationships should do.</p><p>The qualifier hovers over a second distribution boundary that took shape two weeks earlier. Amazon Bedrock launched Claude Mythos Preview on April 7 as a &#8220;gated research preview&#8221; &#8212; available only in the US East (N. Virginia) region and only to allow-listed organizations.[10] Anthropic declined to make Mythos generally available, citing its autonomous hacking capabilities. Internal testing surfaced a 27-year-old OpenBSD vulnerability and chained Linux kernel privilege-escalation exploits, among other findings across major operating systems and browsers.[11]</p><p>Eight days after the Bedrock launch, the European Commission opened a formal inquiry into the model, invoking the EU general-purpose AI Code of Practice that Anthropic has signed: assessment obligations apply to services &#8220;that may or may not be offered in Europe.&#8221;[12] The &#8220;meaningful expansion of international inference in Asia and Europe&#8221; that today&#8217;s Amazon press release cites applies to the standard Claude product line. Mythos &#8212; the most mission-critical workload Anthropic ships &#8212; sits inside Bedrock but outside that expansion. 
&#8220;Primary&#8221; is now narrowing on three axes: the linguistic qualifier in the contract, the product-geography tiering that restricts the frontier cybersecurity model to a single US region within Bedrock, and the regulatory boundary that keeps that tier out of the EU.</p><h2>Jensen&#8217;s concession, priced</h2><p>Yesterday&#8217;s piece argued that custom AI silicon is additive to Nvidia rather than substitutive because the only external customer who matters is Anthropic &#8212; and Anthropic exists because of a difficult-to-replicate capital-structure arrangement. Today&#8217;s announcement confirms the diagnosis from both sides. Trainium&#8217;s external pull-through is so concentrated in Anthropic that Amazon paid up to $25 billion in incremental equity to hold contract language already in place at $8 billion cumulative, plus a narrowing qualifier. The lab, which was a structural anomaly in November 2024, has extracted up to $40 billion in potential hyperscaler equity in six months, even as compute commitments exceed $130 billion. The capital-structure arrangement that built the one customer is now the default funding model for the labs behind the leaders.</p><p>Yesterday&#8217;s piece closed on who the next Anthropic would be. Today prices the first one: three rounds of &#8220;primary,&#8221; a narrowing to &#8220;mission-critical,&#8221; a second tier carved out at the product-geography boundary, and $33 billion of potential equity held by a single hyperscaler that cannot afford to let the relationship downgrade.</p><p>The market had all day Monday to read the Amazon press release as a capital raise. 
The number it should read instead is $25 billion &#8212; the price of a word, in a document that simultaneously narrows the word&#8217;s scope.</p><div><hr></div><h3>Notes</h3><p>[1] <a href="https://www.dwarkesh.com/p/jensen-huang">&#8220;Jensen Huang &#8211; TPU competition, why we should sell chips to China, &amp; Nvidia&#8217;s supply chain moat,&#8221;</a> Dwarkesh Podcast, April 15, 2026. Huang stated that Anthropic &#8220;is a unique instance, not a trend&#8221; and that Trainium and TPU external growth is &#8220;one hundred percent Anthropic.&#8221; See <a href="https://www.airealist.ai/">&#8220;Anthropic Is Not a Trend,&#8221;</a> The AI Realist, April 20, 2026, for the Additive-vs-Substitutive Test framework and the capital-structure argument referenced throughout this piece.</p><p>[2] <a href="https://www.aboutamazon.com/news/company-news/amazon-invests-additional-5-billion-anthropic-ai">&#8220;Amazon and Anthropic expand strategic collaboration,&#8221;</a> Amazon, April 21, 2026. Structure: $5 billion disbursed now, up to $20 billion more &#8220;tied to certain commercial milestones,&#8221; on top of the previously committed $8 billion. Commercial commitment: Anthropic to spend more than $100 billion on AWS technologies over ten years, including Trainium2, Trainium3, Trainium4, and future generations; up to 5 GW capacity. Claude Platform on AWS provides native Anthropic console access through AWS accounts without additional credentials or billing relationships. The Annapurna engineering collaboration is described in the same document as daily communication between the Anthropic and AWS engineering teams regarding the Trainium chip design.</p><p>[3] <a href="https://www.anthropic.com/news/anthropic-amazon">&#8220;Expanding access to safer AI with Amazon,&#8221;</a> Anthropic, September 25, 2023. 
Original announcement of $4 billion with AWS as &#8220;primary cloud provider.&#8221; Language is precise and worth preserving in quotation: &#8220;primary cloud provider,&#8221; not &#8220;exclusive&#8221; and not &#8220;sole.&#8221; Google was already an Anthropic investor at this time; the &#8220;primary cloud provider&#8221; designation was formally applied to AWS alone.</p><p>[4] <a href="https://www.anthropic.com/news/anthropic-amazon-trainium">&#8220;Powering the next generation of AI development with AWS,&#8221;</a> Anthropic, November 22, 2024. An additional $4 billion, bringing the cumulative total to $8 billion. AWS designation upgraded to &#8220;primary cloud and training partner.&#8221; The Trainium training commitment appears here for the first time.</p><p>[5] <a href="https://www.anthropic.com/news/anthropic-amazon-compute">&#8220;Anthropic and Amazon expand collaboration for up to 5 gigawatts of new compute,&#8221;</a> Anthropic, April 20, 2026 (dated one day prior to the Amazon press release). Full quote: &#8220;We continue to choose AWS as our primary training and cloud provider for mission-critical workloads.&#8221; The &#8220;mission-critical workloads&#8221; qualifier is new in the 2026 language and does not appear in the 2023 or 2024 announcements. The triple-hyperscaler framing (&#8220;Claude remains the only frontier AI model available to customers on all three of the world&#8217;s largest cloud platforms&#8221;) is also new. 
Revenue disclosure: &#8220;Our run-rate revenue has now surpassed $30 billion, up from approximately $9 billion at the end of 2025.&#8221; Consumer strain disclosure: &#8220;Our unprecedented consumer growth, in particular, has impacted reliability and performance for free, Pro, Max, and Team users, especially during peak hours.&#8221; The operational case for capacity expansion sits alongside the capital-structure analysis and does not displace it.</p><p>[6] <a href="https://blogs.microsoft.com/blog/2025/11/18/microsoft-nvidia-and-anthropic-announce-strategic-partnerships/">&#8220;Microsoft, NVIDIA and Anthropic announce strategic partnerships,&#8221;</a> Microsoft Official Blog, November 18, 2025. Anthropic commits to $30 billion of Azure compute capacity and up to 1 GW of capacity on Nvidia Grace Blackwell and Vera Rubin systems. Microsoft invests up to $5 billion; Nvidia invests up to $10 billion. Anthropic valuation at the time of this deal: approximately $350 billion, up from $183 billion in September 2025. See also <a href="https://www.cnbc.com/2025/11/18/anthropic-ai-azure-microsoft-nvidia.html">CNBC coverage</a> of the valuation trajectory.</p><p>[7] <a href="https://www.sec.gov/Archives/edgar/data/1730168/000119312526144028/d87999d8k.htm">Broadcom Form 8-K,</a> filed April 6, 2026. Direct quote: &#8220;Anthropic, beginning in 2027, will access through Broadcom approximately 3.5 gigawatts as part of the multiple gigawatts of next-generation TPU-based AI compute capacity committed by Anthropic. 
The consumption of such expanded AI compute capacity by Anthropic is dependent on Anthropic&#8217;s continued commercial success.&#8221; This is the disclosure that formalizes demand-contingency on the compute side, mirroring the milestone-gating on Amazon&#8217;s $20 billion tranche.</p><p>[8] <a href="https://www.cnbc.com/2026/04/06/broadcom-agrees-to-expanded-chip-deals-with-google-anthropic.html">&#8220;Broadcom agrees to expanded chip deals with Google, Anthropic,&#8221;</a> CNBC, April 6, 2026. Mizuho analyst Vijay Rakesh estimates: $21 billion in Broadcom AI revenue from Anthropic in 2026, $42 billion in 2027. Broadcom&#8217;s role is that of a TPU manufacturing partner to Google; the Anthropic relationship routes through Google Cloud with Broadcom as the silicon supplier.</p><p>[9] Amazon&#8217;s 2026 capital expenditure guidance of approximately $200 billion was disclosed on the Q4 2025 earnings call in February 2026, as referenced by CNBC in <a href="https://www.cnbc.com/2026/04/20/amazon-invest-up-to-25-billion-in-anthropic-part-of-ai-infrastructure.html">&#8220;Amazon to invest up to another $25 billion in Anthropic,&#8221;</a> April 20, 2026. The $5 billion disbursed today is roughly 2.5 percent of guided 2026 capex; the $25 billion incremental equity ceiling, spread over the unknown milestone period, would represent a materially smaller annualized percentage. The comparison is given to scale the current-year cash impact, not to conflate multi-year equity ceilings with annualized capex.</p><p>[10] <a href="https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-bedrock-claude-mythos/">&#8220;Amazon Bedrock now offers Claude Mythos Preview (Gated Research Preview),&#8221;</a> AWS What&#8217;s New, April 7, 2026. AWS description: Mythos Preview is available only in the US East (N. 
Virginia) region through Amazon Bedrock, with access &#8220;limited to an initial allow-list of organizations.&#8221; AWS frames the release posture as a &#8220;deliberately cautious approach to release, prioritizing internet-critical companies and open-source maintainers.&#8221;</p><p>[11] <a href="https://www.anthropic.com/glasswing">&#8220;Claude Mythos Preview and Project Glasswing,&#8221;</a> Anthropic, April 7, 2026. Anthropic announced Mythos on the same day as the Bedrock launch, citing a material capability jump over Claude Opus 4.6 and declining to make the model generally available. The core Project Glasswing partners are eleven organizations (Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks), with approximately forty organizations total in the broader testing cohort. Anthropic is providing up to $100 million in usage credits to Glasswing partners and $4 million to open-source security organizations. Technical details of the autonomous vulnerability discovery &#8212; including the twenty-seven-year-old OpenBSD bug, chained Linux kernel privilege-escalation exploits, and a browser sandbox escape &#8212; are documented in the <a href="https://red.anthropic.com/2026/mythos-preview/">&#8220;Mythos Preview Technical Report,&#8221;</a> Anthropic Frontier Red Team, April 2026.</p><p>[12] <a href="https://www.reuters.com/business/media-telecom/anthropic-talks-eu-including-its-cyber-security-models-commission-says-2026-04-17/">&#8220;Anthropic talks to EU, including on its cyber security models, Commission says,&#8221;</a> Reuters, April 17, 2026. 
European Commission spokesman Thomas Regnier confirmed the April 15 briefing and invoked the EU general-purpose AI Code of Practice, to which Anthropic is a signatory: &#8220;In this framework, there is an obligation to assess and mitigate risks that could come from a service that may or may not be offered in Europe.&#8221; See also <a href="https://www.pymnts.com/artificial-intelligence-2/2026/anthropic-briefs-eu-regulators-on-mythos-cybersecurity-concerns/">&#8220;Anthropic Briefs EU Regulators on Mythos Cybersecurity Concerns,&#8221;</a> PYMNTS, April 17, 2026, citing Agence France-Presse coverage of the same briefing.</p>]]></content:encoded></item><item><title><![CDATA[Anthropic Is Not a Trend]]></title><description><![CDATA[Jensen Huang said the quiet part out loud. The market is misreading what he said.]]></description><link>https://www.airealist.ai/p/anthropic-is-not-a-trend</link><guid isPermaLink="false">https://www.airealist.ai/p/anthropic-is-not-a-trend</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Mon, 20 Apr 2026 05:31:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nTp6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nTp6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!nTp6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!nTp6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!nTp6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!nTp6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nTp6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2256415,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/194710536?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" 
alt="" srcset="https://substackcdn.com/image/fetch/$s_!nTp6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!nTp6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!nTp6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!nTp6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F214cc136-2960-4e16-b622-f847aac8fcb6_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On April 15, 2026, Jensen Huang sat down with Dwarkesh Patel for what was meant to be a measured defense of Nvidia&#8217;s competitive position [1]. Nvidia had just closed fiscal 2026 with $215.9 billion in revenue, $193.7 billion in data center, and a full-year GAAP gross margin of 71.1% [2]. The brief was clear: acknowledge the alternatives, reassert the moat, reassure the analysts. Asked why Anthropic had committed to multi-gigawatt deployments on Google TPU and Broadcom-designed silicon in October 2025 [3] &#8212; when Nvidia&#8217;s own benchmarks claim the best price-performance in the market &#8212; Jensen reached for the simplest possible denial.</p><p>&#8220;Anthropic is a unique instance, not a trend,&#8221; he said [1]. Then, warming to the argument, he went further. Without Anthropic, he asked, why would there be any TPU growth at all? Why would there be any Trainium growth at all? One hundred percent Anthropic, in both cases. One customer.</p><p>Jensen was speaking in the context of external, third-party commercial demand &#8212; the only demand that tells you whether hyperscaler custom silicon has broken out of single-company use. Google&#8217;s internal TPU footprint (Search, Ads, YouTube, Gemini serving) dwarfs any outside customer and was never in question. The concession that matters is that outside Google&#8217;s and AWS&#8217;s own walls, TPU and Trainium growth as training platforms is one lab.</p><p>The quote was meant to shrink the alt-silicon story. 
It succeeded in shrinking it so completely that it confirmed the single-customer concentration the &#8220;everyone&#8217;s a chip company&#8221; narrative was designed to obscure. Two of the most-hyped custom silicon programs in the industry are, by the CEO of the market leader&#8217;s own admission, dependent on a single AI lab. If that lab changes its capital structure, slows down, or rebalances its next generation toward Nvidia, external TPU and external Trainium revenue go with it. The CEO of the market leader said this out loud, on the record, twice across two segments of the same interview. A careful communicator could argue this was deliberate narrative-setting rather than a Streisand effect &#8212; Jensen is meticulous, and the framing locks in a clean &#8220;second trend starting&#8221; storyline if a second lab emerges. Either read sharpens the point. Deliberate or accidental, the concession is that alt-silicon growth is a capital-structure phenomenon concentrated in one customer, not a technology wave.</p><p>The piece that follows takes Jensen seriously on the concentration, and then asks the question he did not want to be asked. <strong>If alt-silicon growth is one customer, what produced that customer? And what conditions would produce the next one?</strong></p><h2>The engineering is real. The question is what it replaces.</h2><p>The press cycle around custom AI silicon has been continuous since late 2025. Meta is shipping multiple MTIA generations at scale for ranking and is now deploying them for some generative-AI inference [4]. Microsoft rolled Maia 200 into production Azure clusters through the winter [5]. Anthropic&#8217;s engineers are writing low-level kernels that interface directly with AWS Trainium and contributing to the Neuron stack [6]. OpenAI&#8217;s Broadcom-designed accelerator has been taped out and is scheduled for deployment at 10 gigawatts starting in 2026 [7]. 
Tesla taped out its AI5 chip on April 15 and announced AI6 and Dojo 3 are in development [8]. Broadcom closed Q1 FY2026 with $8.4 billion in AI revenue, up 106% YoY, and disclosed a $73 billion AI backlog [9]. Google&#8217;s Gemini 3, launched November 18, 2025, was trained on Google&#8217;s TPUs per Google&#8217;s public framing and Jeff Dean&#8217;s November 22 Stanford presentation; Google has not published a granular silicon-mix disclosure [10].</p><p>These are not press releases. They are real engineering, at real silicon teams, shipping on real timelines. Dismissing them is a mistake. So is assuming they substitute for what came before.</p><p>The market has been pricing the second possibility. Nvidia stock fell roughly 12% across November 2025, with a single-day drawdown of as much as 7% intraday on November 25 that wiped out roughly $250 billion in market cap, triggered by The Information&#8217;s report that Meta was evaluating TPU deployment from 2027, compounding the Gemini 3 launch a week earlier [11]. The sell-side was not unified: Bernstein&#8217;s Stacy Rasgon, Melius&#8217;s Ben Reitzes, and Evercore&#8217;s Mark Lipacis each published notes acknowledging additivity at the megawatt layer while flagging concentration and pull-forward risk. The bearish version the drawdown priced in more aggressively than any published note: Broadcom&#8217;s $73 billion AI backlog &#8212; disclosed December 11, 2025 as spanning &#8220;over the next six quarters&#8221; and flagged by Hock Tan as &#8220;a minimum&#8221; [9] &#8212; as the counterweight to Nvidia&#8217;s $95.2 billion in purchase and capacity commitments in the FY2026 10-K [12]. The implicit trade is that custom silicon is substituting for Nvidia at the hyperscaler level, and Nvidia&#8217;s growth has to slow.</p><p>The deployment data says otherwise. 
On February 24, 2026, Meta announced a multi-year, multi-generation Instinct agreement with AMD for 6 gigawatts of GPUs, with the first 1-gigawatt tranche on MI450 and subsequent tranches spanning future Instinct generations &#8212; at a press-estimated $60 billion or so over five years (AMD has not disclosed a dollar value; CFO Jean Hu described it as &#8220;double-digit billions per gigawatt&#8221;) and a performance-based warrant granting Meta up to 160 million AMD shares, gated on three concurrent conditions (GPU shipment milestones, AMD share-price thresholds with the final tranche at $600, and Meta technical and commercial milestones), worth roughly 10% AMD equity at full vesting [13]. </p><p>One week earlier, Meta signed a separate multi-year commitment with Nvidia for millions of GPUs [14]. Meta&#8217;s forward 2026 capex guidance is $115 to $135 billion, up from $72 billion in actuals in 2025 [15]. Both bets grow simultaneously. </p><p>Microsoft&#8217;s Q2 FY2026 capex came in at $37.5 billion &#8212; a quarterly record. Two-thirds went to GPUs and CPUs, according to Amy Hood [16]. Microsoft is deploying Maia 200 in production while buying Nvidia at a pace that would have been unthinkable eighteen months ago. </p><p>Nvidia itself grew fiscal 2026 data center revenue 68% year over year to $193.7 billion [2]. Its forward supply commitments grew faster: from $16.1 billion at the prior year-end to $95.2 billion at the close of FY2026 &#8212; roughly 5.9x, a 491% increase that is about seven times the revenue growth rate [12]. That gap is itself interpretable. Read charitably, it reflects Nvidia pre-booking TSMC CoWoS-L capacity for H2 2026 Vera Rubin deliveries and HBM4 supply through 2027. Read skeptically, it is pull-forward that has to be digested if AI capex growth decelerates before commitments convert to shipments. 
Either read is compatible with the additive thesis; neither vindicates it automatically.</p><p>This is the core fact the market is misreading, and the argument this piece is built around. <strong>Custom silicon is real. It is not displacing Nvidia. It is being added on top of Nvidia purchases, at every hyperscaler except one.</strong> That distinction &#8212; between substitution and addition &#8212; determines whether Nvidia&#8217;s concentration risk is easing or about to break. Additive custom silicon leaves Nvidia growth intact as the AI workload itself grows faster than custom silicon can absorb. Substitutive custom silicon caps Nvidia growth even as the workload grows. Right now, the data is almost entirely additive. Call this the <strong>Additive-vs-Substitutive Test</strong> &#8212; the first filter every hyperscaler silicon announcement should pass through, answered every ninety days by the quarterly capex breakouts. So far, with one exception, the answer is addition. And Jensen&#8217;s Dwarkesh admission, read carefully, explains why.</p><p>Two refinements sharpen the picture. First, additive deployment depends on workload growth outrunning custom silicon&#8217;s absorption rate &#8212; a bet on the AI capex supercycle continuing. If frontier training hits a scaling plateau in 2027 and AI capex growth drops from roughly 50% to 15% year over year, the same custom silicon plans that are additive today become substitutive tomorrow &#8212; same MTIA, same Maia, same Trainium2, cutting into an Nvidia share that can no longer grow its way around them. The additive story is growth-rate-dependent, not baked into the architecture. </p><p>Second, additive in megawatts is not additive in Nvidia&#8217;s revenue pool. 
Much of what hyperscalers spend on Broadcom XPU alongside Blackwell is spending Nvidia would otherwise have captured, though a portion of custom silicon demand &#8212; Meta&#8217;s ad-ranking workloads in particular &#8212; was never competitive for Nvidia anyway. Broadcom&#8217;s consolidated gross margin ran at 77% in Q1 FY2026, but full AI rack systems &#8212; where Broadcom resells the expensive HBM memory, substrates, and advanced packaging it buys from third-party suppliers rather than earning its usual margin on chip design alone &#8212; carry materially lower margins. CFO Kirsten Spears guided on the December 11, 2025, Q4 FY2025 call to roughly 100 basis points of sequential gross-margin compression entering Q1 FY2026 as rack-scale mix grew; on the March 4 Q1 call, she softened it, telling analysts the impact &#8220;is actually not going to be substantial at all.&#8221; UBS&#8217;s Timothy Arcuri pressed Hock Tan on that same Q1 call, framing rack-scale margins at &#8220;maybe 45%, 50%&#8221;; Tan rejected the framing, telling Arcuri he &#8220;must be a bit hallucinating&#8221; [9]. </p><p>Nvidia&#8217;s data center segment carries a materially higher gross-margin structure than a pass-through-heavy XPU system [2]. Compute deployment can grow at every hyperscaler while Nvidia&#8217;s share of the capex dollar compresses, which is the scenario the market is actually pricing when it sells Nvidia on a Google announcement. Additivity in megawatts does not mean additivity in margin.</p><h2>Google is the exception that proves the rule.</h2><p>Google is that exception. Gemini 3 was trained entirely on TPUs, and this is load-bearing not because the chips are better &#8212; Nvidia still insists it wins on price-performance, and the alt-silicon camp has not yet accepted Jensen&#8217;s public invitation to publish comparative results on MLPerf or InferenceMAX [1]. It is load-bearing because Google has had a decade to do so. 
The first TPU was deployed internally in 2015. Ten years of hardware generations, the XLA compiler, and JAX. Gemini 3 was not the moment Google pivoted away from Nvidia. It was the moment the public caught up to a migration that had already been completed within the company, long before the model launched.</p><p>No other hyperscaler has a decade. AWS announced Trainium in 2020 and shipped its first Trn1 instances in October 2022. Microsoft began talking about Athena, the project that became Maia, in 2022. Meta&#8217;s MTIA program began serious deployment in 2023, and its first use of generative AI training is only now starting [4]. All of these programs are four years old or less. Google&#8217;s is ten. The gap is not technical genius; it is that Google built the software stack before anyone else thought they needed to, and had the internal workload volume to justify hardware iteration when no one else would have seen the return.</p><p>The Google exception matters because it sets the bar for what &#8220;full-stack silicon exit&#8221; actually requires: a decade of compounding software investment, a vertically integrated model organization that designs training code against the hardware rather than PyTorch-on-CUDA defaults, and the balance sheet to absorb the opportunity cost of running below Nvidia performance during the transition. That is a short list. It contains exactly one company. Everyone else operates inside a different constraint. Why a partial exit happened at all is a different question, and the answer flips from engineering to finance.</p><h2>Why there is only one Anthropic.</h2><p>Which brings us back to Jensen&#8217;s concession. If TPU and Trainium external growth is one customer, why that customer? The answer is not that Anthropic found Trainium and TPU to be objectively better chips than Blackwell. 
The answer, as Jensen himself explained in the next breath on the Dwarkesh podcast, is capital structure.</p><p>Jensen&#8217;s account is worth reconstructing because it is the most candid public explanation by an Nvidia executive of how the alt-silicon market was created. By his telling, Anthropic in its early growth phase needed five to ten billion dollars of equity investment to fund its compute consumption &#8212; a scale no venture capital firm would commit to an unprofitable AI lab. Google could write that check. Amazon could write that check. Nvidia, at the time, could not. Nvidia had never done large equity investments and had not internalized that the labs&#8217; capital structure was inseparable from their silicon choice. By the time Nvidia understood this, Anthropic had already signed the equity-for-compute deals that locked its training on TPU and, later, on Trainium. Jensen described this as his miss [1].</p><p>The implication is the thesis of the piece. Anthropic&#8217;s silicon path was not decided by chip quality. It was decided by who could write the equity-for-compute check at the moment the check was needed. <strong>The alt-silicon market was not a technology event. It was a capital-structure event</strong> &#8212; the <strong>Capital-Structure-Not-Chip-Quality Mechanism</strong>, the second framework the reader can take away. Large labs needing billions in equity against compute consumption do not evaluate silicon options on FLOPs per dollar. They evaluate which counterparty can fund them for the next 18 months, and they accept whatever silicon comes with it.</p><p>Read through this lens, Jensen&#8217;s &#8220;unique instance&#8221; line flips meaning. Anthropic is not unique because its leadership preferred exotic silicon. Anthropic is unique because its capital needs and timing lined up with a hyperscaler that had both the chips and the checkbook. 
OpenAI went the other way &#8212; taking Microsoft&#8217;s capital, contingent on Azure compute, which at the time was overwhelmingly Nvidia. Azure&#8217;s own alt-silicon program (Athena, later Maia) had been in development since 2022 but was not production-ready for frontier training at the decisive capital moments. Compounding the capital path, OpenAI&#8217;s training stack had co-evolved with Nvidia from GPT-2 onward: CUDA kernels, Nvidia-native distributed training, and inference tuned to Nvidia's topology. Anthropic, founded in 2021 with a clean start, had no comparable stack to port. The result: OpenAI trained on Nvidia, and its aggregate compute posture, as Jensen conceded, remains vastly Nvidia-weighted even with the AMD MI450 deal of October 2025 and Broadcom-designed custom silicon now in production [7][14].</p><p>The forward-looking question is: who is the next Anthropic? Not in the sense of capability, but in the sense of the capital-structure setup &#8212; a lab needing five to ten billion in equity for compute, arriving at a moment when a non-Nvidia hyperscaler can write the check and Nvidia cannot. The answer, increasingly, is: nobody, because the arbitrage is closing on both sides.</p><p>It is closing from Nvidia&#8217;s side because Nvidia is now writing those checks itself. The $30 billion equity stake Nvidia finalized in OpenAI as part of that company&#8217;s $110 billion round earlier in 2026 &#8212; a scaled-back structure that succeeded the &#8220;up to $100 billion&#8221; infrastructure letter of intent Nvidia signaled in September 2025 &#8212; and the up-to-$10 billion Anthropic stake announced November 18, 2025 alongside Microsoft&#8217;s own up-to-$5 billion commitment and a $30 billion Anthropic-Azure compute deal [17] are the mechanism&#8217;s correction. 
Jensen told the Morgan Stanley Technology, Media &amp; Telecom Conference on March 4, 2026, that both investments are likely Nvidia&#8217;s last before those companies go public, and returned to the theme on Dwarkesh a month later [17]. </p><p>The arbitrage is also closing from the other side &#8212; not because Google and Amazon stopped writing compute-linked equity checks, but because the number of counterparties willing to write them proliferated. Microsoft&#8217;s commercial RPO backlog reached $625 billion by year-end 2025, with approximately 45% tied to OpenAI per analyst estimates (Microsoft has not broken this out) [16]. Oracle took its slice through Stargate. Google continues writing checks to Anthropic, linked to the TPU [3]. The arbitrage is no longer a narrow window between Nvidia&#8217;s old reluctance and its new willingness. It is a multi-counterparty equity-for-compute market that any sufficiently capital-hungry frontier lab can draw against, with silicon attached to whichever counterparty wins the allocation. The next Anthropic does not need to exist as a single, uniquely situated actor, because the arrangement that created the first one has become the default funding model. And labs that draw against this market are multi-sourced from day one &#8212; training on TPU, inferring on Trainium, committing to Nvidia for next-generation capacity. That spreads silicon share across vendors; it does not concentrate it on any single alt-silicon platform.</p><p>This is why &#8220;Anthropic is not a trend&#8221; may well be right, but not for the reason Jensen intended. The concentration is not concentrating further because labs with Anthropic&#8217;s capital profile are now routinely multi-sourced &#8212; a training cluster on TPU, an inference posture on Trainium, a compute commitment on Nvidia, every vendor paid. 
Whether Anthropic itself rebalances back toward Nvidia in its next generation &#8212; the company has simultaneously committed to multi-gigawatt Grace Blackwell and Vera Rubin deployments alongside its Trainium and TPU expansions [17] &#8212; will determine whether the &#8220;one customer&#8221; even stays at one in the way Jensen meant.</p><h2>What is actually being deployed.</h2><p>With that frame in place, the deployment picture becomes legible. Each hyperscaler silicon program deserves a clean, honest reading, because press coverage has systematically blurred the distinction between inference and training, between announced and shipped products, and between generative-AI workloads and older recommendation workloads that happen to use the same silicon family.</p><p>Meta&#8217;s MTIA program is the most mature of the non-Google efforts, with four generations shipped and hundreds of thousands of units deployed for ads ranking and feed personalization. MTIA is not yet the primary training platform for Meta&#8217;s frontier generative models &#8212; that remains Nvidia and increasingly AMD MI450-class silicon under the 6-gigawatt February agreement [13]. Meta&#8217;s own engineers describe MTIA as a ranking workhorse progressing toward generative-AI inference. Meta formalized this trajectory on April 14, 2026 &#8212; one day before Jensen recorded with Dwarkesh &#8212; announcing an expanded Broadcom partnership through 2029 covering multiple MTIA generations, with a 1-gigawatt commitment Meta described as the opening installment of a multi-gigawatt buildout, and with MTIA &#8212; per Broadcom&#8217;s announcement &#8212; becoming the first AI silicon on a 2nm process [4]. All of that capacity sits inside Meta&#8217;s 2026 capex envelope alongside, not in place of, the continued Nvidia commitments and the 6-gigawatt AMD Instinct deployment. Calling Meta a chip company is accurate. 
Calling Meta&#8217;s chip program a substitute for its Nvidia and AMD commitments misreads the capex guide: $115 to $135 billion in 2026, with growth across both custom and merchant silicon [15].</p><p>Microsoft&#8217;s Maia 200 entered production Azure clusters through the winter, in the US Central region, supplying only a fraction of Azure&#8217;s AI capacity. The deployment-level truth of the Microsoft bet is the $37.5 billion quarterly capex figure, two-thirds of which is allocated to GPU and CPU, disclosed on the Q2 FY2026 earnings call [16]. Satya Nadella disclosed on that same call that Microsoft added nearly one gigawatt of AI capacity in the quarter. Almost all of that gigawatt is not Maia.</p><p>AWS Trainium is the alt-silicon platform with the deepest model-level co-design story. Anthropic&#8217;s engineers are explicitly contributing to the Neuron software stack and writing kernels that interface directly with Trainium silicon [6]. Project Rainier &#8212; the 500,000-chip Trainium2 cluster in New Carlisle, Indiana &#8212; was activated in October 2025 and, per AWS CEO Matt Garman at launch, is running and training Anthropic&#8217;s models today. AWS committed to doubling it to one million Trainium2 chips by the end of 2025 [18]. This is the strongest non-Google case for model-platform convergence on non-Nvidia silicon. It is also, per Jensen&#8217;s admission, the entire external Trainium adoption story.</p><p>OpenAI&#8217;s custom silicon program with Broadcom, announced October 2025 as a term sheet for 10 gigawatts starting in 2026, is the largest single non-Nvidia commitment by any AI lab &#8212; though Hock Tan on the Q1 FY2026 call specified OpenAI contributions of only &#8220;over 1 gigawatt&#8221; in 2027, back-weighting the remaining nine gigawatts into 2028 and 2029 [7]. 
OpenAI is also the anchor customer for a 6-gigawatt multi-generation AMD Instinct deployment announced on October 6, 2025 (the first gigawatt on MI450, with future generations to follow) [19], and the counterparty to Nvidia&#8217;s $30 billion equity investment, accompanied by a Grace Blackwell/Vera Rubin compute commitment [17]. Aggregating publicly announced OpenAI compute commitments across counterparties &#8212; Microsoft Azure, Oracle&#8217;s Stargate arrangement, AMD, Broadcom, and Nvidia &#8212; produces a disclosed sum somewhere in the $800 billion to $1.2 trillion range over 2025&#8211;2035, depending on how much weight the reader gives Stargate&#8217;s $500 billion headline (aspirational: Musk has publicly disputed its funding and SoftBank&#8217;s underwriting remains uncertain) [20]. Even discounting Stargate entirely, the aggregate is well north of half a trillion dollars. Every major vendor is represented. Nvidia remains the largest line item. This is additivity on an Olympian scale.</p><p>Tesla&#8217;s AI5, taped out on April 15, 2026, is a dual-purpose chip Musk has positioned for both Optimus inference and data-center training &#8212; in his own words, &#8220;AI5, AI6 and subsequent chips will be excellent for inference and at least pretty good for training,&#8221; with Dojo&#8217;s training mission persisting &#8220;in the form of a large number of AI6 SoCs on a single board&#8221; [8]. It is not a pure FSD accelerator, but it is also not yet shipping at scale or training any xAI frontier model. xAI&#8217;s actual frontier training cluster, Colossus 2, is Nvidia-based: approximately 555,000 Nvidia GPUs across H100, H200, and Blackwell generations at the Memphis complex, with Musk targeting 1.5 gigawatts by April 2026 (independent satellite-based observers have flagged delivered capacity as materially below the stated target) [21]. 
The company most stylistically associated with the &#8220;build our own chips&#8221; narrative is, at the frontier training layer today, entirely Nvidia.</p><p>Pulling these five programs together, the pattern is consistent. Real engineering. Real silicon. Narrow deployments relative to the overall AI capacity being stood up. Additive to Nvidia, not substitutive. The Google exception is not evidence the pattern is changing; it is evidence of what it takes to be the exception, and no other hyperscaler has the ingredients.</p><p>The deployment picture explains where we are. The real question is: where are we going?</p><h2>The lock-in is migrating up the stack.</h2><p>The silicon story does not end here. The deeper argument &#8212; the one Jensen made on the same Dwarkesh podcast, in a completely different context, without noticing what he was conceding &#8212; is that the competitive battleground is moving from the kernel layer to the model architecture layer.</p><p>For most of 2023 and 2024, the dominant framing of Nvidia&#8217;s moat was CUDA. The argument ran: even if competitors build comparable silicon, the CUDA ecosystem, with its libraries, kernels, compiler toolchain, and fifteen years of accumulated developer familiarity, is what keeps frontier labs on Nvidia hardware. Switching costs were described at the level of writing new attention kernels, porting custom CUDA extensions, and rewriting inference serving infrastructure.</p><p>This argument has become progressively less true. OpenAI wrote Triton to abstract kernel generation across backends; Triton&#8217;s backend, per Jensen&#8217;s own description, contains substantial Nvidia-contributed technology, and it is also the path through which OpenAI compiles for non-Nvidia targets [1]. Anthropic writes kernels directly to Trainium with AWS's help and feeds architectural input into Trainium3 [6]. Google&#8217;s JAX and XLA stack is co-designed primarily against TPU. 
The hyperscaler labs have staff who can write low-level kernel code for multiple silicon targets, and the institutional capacity to do it. Jensen acknowledged this on Dwarkesh: Nvidia&#8217;s kernel engineers are deeply embedded inside their AI lab partners&#8217; stacks, and &#8220;It&#8217;s not unusual that by the time we&#8217;re done optimizing their stack or optimizing a particular kernel, their model sped up by 3x, 2x, 50%&#8221; [1]. That is a description of a joint engineering relationship, not a lock-in through opacity. An Nvidia reading would note that the embedding itself is the moat &#8212; the density of engineering relationships does not require CUDA exclusivity to function. Both readings are defensible. What is not defensible is the 2023-era claim that CUDA kernels themselves are the primary barrier. The CUDA kernel moat is real but smaller than it was.</p><p>The real moat is now forming at the model architecture layer, and the evidence comes from the places the industry pretends not to see. In August 2025, DeepSeek released V3.1 with a numerical format alignment called UE8M0 FP8, co-designed with forthcoming Chinese domestic silicon. In a top-pinned comment on its official WeChat account, DeepSeek clarified that UE8M0 FP8 is, in the company&#8217;s own words, &#8220;designed for the next generation of domestically produced chips to be released soon&#8221; [22]. A frontier Chinese lab is designing its model&#8217;s numerical representation around a specific upcoming domestic silicon architecture. The model is being co-designed with the hardware. The portability story &#8212; train on Nvidia, serve on anything &#8212; does not hold at this level of precision co-design.</p><p>The DeepSeek case is not isolated. Gemini 3 is co-designed with TPU&#8217;s interchip interconnect topology. Anthropic is not just writing kernels for Trainium; it is feeding architectural requirements into Trainium3 [6]. 
Each case is a model whose training path is tuned to the primitives of a specific hardware family.</p><p>The industry tried to standardize low-precision formats in 2023 under the Open Compute Project&#8217;s Microscaling (MX) specification, published September 2023, with scale-factor formats including UE8M0 &#8212; the unsigned 8-bit power-of-2 scale DeepSeek referenced. That effort partially worked and partially fragmented. Both Nvidia&#8217;s Blackwell and forthcoming Chinese domestic accelerators implement MX-compatible FP8. But the model-layer decision of which MX variant to tune to, and which hardware&#8217;s tensor-core quirks to target, still produces silicon-specific optimization paths that do not port cleanly. Nvidia&#8217;s NVFP4 on Blackwell, the MX-standard MXFP4 across AMD and Intel, and DeepSeek&#8217;s explicit alignment of UE8M0 scale factors to upcoming Chinese domestic silicon are three model-layer decisions that each lock trained weights to a specific silicon family [22].</p><p>This is the <strong>Model-Layer Lock-In Migration</strong>, the third framework the reader can take away. The moat is moving from the kernel up. A model trained with UE8M0 quantization on Ascend-class silicon does not run day one at equivalent quality on Blackwell. A model trained with sparse MoE routing optimized for TPU&#8217;s 3D torus does not run day one on an Nvidia NVL72 optimized for NVLink Switch. Silicon-specific optimizations creep up the stack, out of the kernel, into the model itself. Switching cost is no longer kernel rewrite time. It is a training run costing hundreds of millions of dollars and months at the frontier scale.</p><p>This has a consequence that CUDA lock-in never had. CUDA had a switching-cost moat at the ecosystem and habit levels. Painful to switch away from, but in principle reproducible &#8212; build enough libraries, fund enough kernel ports, give it a decade, and a competitor could assemble a comparable stack. 
That is roughly what Triton, XLA, and the Neuron SDK are doing. The CUDA moat is not disappearing, but it is being incrementally eroded by compiler-layer investment, and Jensen knows it. It is why he spent a third of the Dwarkesh interview repositioning Nvidia&#8217;s advantage as the density of engineering relationships inside the AI lab partners&#8217; stacks rather than the kernel ecosystem itself [1].</p><p>Model-layer co-design is a different kind of moat. It does not sit in a rebuildable library. It sits inside trained weights that cost nine-figure sums to reproduce. A lab that trains its next-generation model natively against TPU topology, or against Trainium&#8217;s interconnect, or against a Chinese domestic chip&#8217;s UE8M0 scale format, has embedded the silicon dependency into the model artifact itself. The switching cost is not &#8220;rewrite your kernels&#8221;; it is &#8220;rerun your training&#8221;&#8212;a cost that cannot be amortized by compiler work, only paid by running the training again on different silicon for another hundred-million-plus and three-to-six months. Compiler abstraction &#8212; the Triton and MLIR lineage &#8212; can bridge kernel-level differences. It cannot undo the numerical format in which a model was trained, or the routing topology against which its experts were tuned. The moat is moving to the layer where abstraction cannot reach.</p><p>The most striking evidence is not from Anthropic or DeepSeek. It is from Jensen. On the same podcast, arguing for why Nvidia should be allowed to keep selling compute to Chinese customers despite U.S. export controls, he made an almost startling case: if Chinese open-weight models end up optimized for Huawei&#8217;s silicon architecture rather than Nvidia&#8217;s, that would be a real strategic loss, because those models would then diffuse to markets outside China and establish Huawei as the reference platform [1]. 
This is the model-silicon co-design argument, as this piece frames it, stated exactly by the CEO who stands to lose the most from it. Jensen sees it happening. He is warning about it, out loud, against himself, because he sees it happening to him too.</p><h2>What would have to break.</h2><p>Every thesis has to be falsifiable to be useful. This one has four specific tests to watch through 2026 and 2027.</p><p>The first test is the additive vs. substitutive test applied to quarterly capex breakouts. As long as Meta, Microsoft, AWS, and Google all continue to grow both their custom silicon deployments and their Nvidia purchases in parallel, the additive story holds. If any single hyperscaler reports a quarterly capex breakout showing custom silicon growth and a simultaneous absolute reduction in Nvidia purchases &#8212; not slower growth, but a cut &#8212; the picture changes. Nothing in the data through Q1 2026 suggests this is imminent.</p><p>The second test is whether any frontier lab outside Google publishes a training run conducted entirely on non-Nvidia silicon, at a scale and benchmark level comparable to Gemini 3 or the current Claude generation. Anthropic is the candidate most likely to clear this bar, given the kernel-level Trainium work now in production at Project Rainier [18]. Reuters reported on April 9, 2026, that Anthropic is exploring the design of its own silicon alongside its Trainium, TPU, and Nvidia commitments &#8212; active but early, with no specific design or dedicated chip team publicly committed [18]. Pre-committed threshold: if Anthropic&#8217;s next flagship Claude model, shipping in 2026 or 2027, is disclosed as trained mostly or entirely on Trainium, on TPU, or on a hybrid Trainium-TPU configuration with no Nvidia in the loop, the &#8220;one customer is all there is&#8221; thesis weakens. 
If that model also runs day one with equivalent quality on Nvidia inference fleets, the model-layer lock-in thesis weakens with it.</p><p>The third test is whether NVLink Fusion &#8212; Nvidia&#8217;s initiative to allow its interconnect to integrate with non-Nvidia accelerators, including AWS&#8217;s forthcoming Trainium4 [23] &#8212; actually ships on schedule. If it does, the re-coupling of AWS to Nvidia at the networking layer is explicit, and the partial alt-silicon exit becomes narrower than the commitments suggest. If it fails to ship or ships with technical constraints, AWS&#8217;s alt-silicon story becomes more independent than it currently appears.</p><p>The fourth test is the portability bar at the model layer. If an open-source frontier model ships with documented day-one parity across five or more silicon backends &#8212; Nvidia, Google TPU, AWS Trainium, AMD Instinct, and a Chinese domestic target &#8212; the model-layer lock-in migration is arrested. This has not yet happened at frontier quality. The trajectory, per the DeepSeek UE8M0 example, is in the opposite direction.</p><p>If all four tests resolve the way they have so far, the picture is stable. Nvidia remains the dominant silicon vendor in a market growing faster than any custom silicon program can absorb. Broadcom captures the non-Nvidia design-house layer, with a $73 billion AI backlog &#8212; disclosed on the December 11, 2025 Q4 FY2025 call as spanning &#8220;over the next six quarters&#8221; and flagged by Hock Tan as &#8220;a minimum&#8221; &#8212; concentrated across six XPU customer relationships, four publicly named (Google, Meta, Anthropic, OpenAI) and two unnamed, with one widely reported by analysts to be ByteDance [9]. Tan&#8217;s stated line of sight to chip revenue &#8220;significantly in excess of $100 billion&#8221; in 2027 frames the forward opportunity, though Tan was explicit that the figure is chips-only and excludes rack and system revenue [9]. 
TSMC&#8217;s CoWoS-L packaging capacity for leading-edge AI silicon and HBM4 supply from Micron, SK hynix, and Samsung together form the binding constraint at the manufacturing layer &#8212; and the primary reason every player&#8217;s announced timelines slip six to twelve months relative to their press releases. Google remains the one full-stack exception. Anthropic remains the one candidate to cross over. Everyone else runs Nvidia as the first-class platform and accumulates custom silicon as an additive hedge.</p><p>The two-layer story is this. Near term, over the next three to five years: deployment is Nvidia-plus-custom, with &#8220;plus&#8221; being the operative word. Longer term, five to ten years: the model-layer lock-in migration is the real battleground. Models are being co-designed with specific silicon in ways that make training runs non-portable, and the vendor that captures the largest installed base of frontier models co-designed for its silicon inherits the switching costs that CUDA used to carry. Today, that is still Nvidia, by a wide margin. Tomorrow, it is a contest between Nvidia&#8217;s NVFP4, Google&#8217;s TPU-native training paths, AWS&#8217;s Trainium-Anthropic co-design, and China&#8217;s UE8M0-plus-Huawei combination &#8212; the model layer fragmenting along silicon-specific lines faster than the silicon itself diversifies at the chip layer.</p><p>The concentration Jensen described on April 15 was real. The reassurance he tried to offer was not. And the part of the moat he spent a decade building at the CUDA kernel layer is not the part that will decide the next cycle, because the lock-in is moving to where the trained weights are.</p><div><hr></div><h3>Notes</h3><p>[1] Dwarkesh Patel, <a href="https://www.dwarkesh.com/p/jensen-huang">&#8220;Jensen Huang &#8211; TPU competition, why we should sell chips to China, &amp; Nvidia&#8217;s supply chain moat,&#8221;</a> Dwarkesh Podcast, April 15, 2026. 
All Jensen Huang statements cited in this piece are drawn from this transcript and the associated video/audio.</p><p>[2] NVIDIA Corporation, <a href="https://www.sec.gov/Archives/edgar/data/0001045810/000104581026000021/nvda-20260125.htm">Form 10-K for fiscal year ended January 25, 2026</a>, SEC filing; see also Q4 FY2026 CFO commentary, February 2026. Revenue of $215.9B, data center revenue $193.7B, Q4 GAAP gross margin 75.0%, full-year 71.1%, operating cash flow $102.7B, free cash flow $96.6B, $4.5B charge in Q1 FY26 for H20 excess inventory following April 2025 U.S. export license requirements.</p><p>[3] Anthropic, <a href="https://www.anthropic.com/news/expanding-our-use-of-google-cloud-tpus-and-services">&#8220;Expanding our use of Google Cloud TPUs and Services,&#8221;</a> corporate blog post, October 23, 2025; Anthropic, <a href="https://www.anthropic.com/news/google-broadcom-partnership-compute">&#8220;Expanding our partnership with Google and Broadcom,&#8221;</a> corporate blog post, April 7, 2026; Broadcom Q4 FY2025 earnings disclosure of $11 billion additional Anthropic custom silicon order following $10 billion earlier in FY2025. The October 2025 commitment spans up to one million TPUs, bringing over a gigawatt of compute capacity online in 2026; the April 2026 expansion adds 3.5 gigawatts of next-generation TPU capacity routed via Broadcom, coming online from 2027.</p><p>[4] Meta Platforms corporate engineering blog, MTIA deployment updates through 2025; Meta, <a href="https://about.fb.com/news/2026/04/meta-partners-with-broadcom-to-co-develop-custom-ai-silicon/">&#8220;Meta Partners With Broadcom to Co-Develop Custom AI Silicon,&#8221;</a> April 14, 2026. MTIA v1 was announced in 2023; v2 entered broad ranking deployment in 2024; v3 and v4 extended capabilities through 2025. MTIA 300, announced March 2026, is already running Meta&#8217;s ranking and recommendation workloads, with the remaining chips in the new four-chip family slated through 2027. 
Training for Llama frontier models remains primarily on Nvidia and, under the February 24, 2026 agreement, on AMD Instinct (MI450 first gigawatt, future generations following). The April 14 expansion commits to more than 1 gigawatt of MTIA compute as the opening installment of a multi-gigawatt buildout through 2029, with Hock Tan stepping off Meta&#8217;s board into an advisory role focused on the custom silicon roadmap. Broadcom&#8217;s accompanying press release characterized MTIA as the &#8220;industry&#8217;s first 2nm AI compute accelerator&#8221; &#8212; an attribution to Broadcom&#8217;s own marketing, not independent verification.</p><p>[5] Microsoft Azure blog and press releases on Maia 200 deployment, January 2026. Initial production clusters in US Central Azure region.</p><p>[6] Anthropic, <a href="https://www.anthropic.com/news/anthropic-amazon-trainium">&#8220;Powering the next generation of AI development with AWS,&#8221;</a> corporate blog post, November 22, 2024; AWS re:Invent 2024 keynote by Matt Garman; Project Rainier activation coverage, October 29, 2025. Anthropic&#8217;s own published description: &#8220;we&#8217;re writing low-level kernels that allow us to directly interface with the Trainium silicon, and contributing to the AWS Neuron software stack to strengthen Trainium. Our engineers work closely with Annapurna&#8217;s chip design team to extract maximum computational efficiency from the hardware.&#8221;</p><p>[7] OpenAI and Broadcom, <a href="https://openai.com/index/openai-and-broadcom-announce-strategic-collaboration/">&#8220;OpenAI and Broadcom announce strategic collaboration to deploy 10 gigawatts of OpenAI-designed AI accelerators,&#8221;</a> co-announcement, October 13, 2025; Broadcom Q1 FY2026 earnings call, March 4, 2026, confirming OpenAI as sixth XPU customer with deployment beginning H2 2026 and targeted completion by end of 2029. Only a term sheet was signed at announcement, not a binding purchase order. 
On the Q1 FY26 call, Hock Tan specified OpenAI&#8217;s 2027 contribution at &#8220;&gt;1 gigawatt,&#8221; effectively back-weighting the bulk of the 10-gigawatt commitment into 2028 and 2029.</p><p>[8] Electrek, <a href="https://electrek.co/2026/04/15/tesla-ai5-chip-taped-out-musk-ai6-dojo3/">&#8220;Tesla taped out AI5 chip, Musk says &#8212; nearly 2 years behind schedule,&#8221;</a> April 15, 2026; Tom&#8217;s Hardware, &#8220;Elon Musk demonstrates first sample of Tesla AI5 processor,&#8221; April 15, 2026; Musk public statements on X, August 8 and August 10, 2025, including &#8220;AI5, AI6 and subsequent chips will be excellent for inference and at least pretty good for training&#8221; and &#8220;Dojo 3 arguably lives on in the form of a large number of AI6 SoCs on a single board.&#8221; AI5 is positioned for Optimus inference and for Tesla data-center clusters as Dojo&#8217;s training-mission successor, manufactured split between TSMC Arizona and Samsung Taylor, Texas, with mass production expected late 2026 to 2027. AI5 is not currently training any xAI frontier model; xAI&#8217;s Colossus 2 remains Nvidia-based.</p><p>[9] Broadcom Q4 FY2025 earnings call, December 11, 2025 (source for $73 billion AI backlog figure, which Hock Tan characterized as spanning &#8220;over the next six quarters&#8221; and flagged as &#8220;a minimum&#8221;); Broadcom Q1 FY2026 earnings release and conference call, March 4, 2026; Futurum Group, <a href="https://futurumgroup.com/insights/broadcom-q1-fy-2026-earnings-driven-by-xpu-momentum/">&#8220;Broadcom Q1 FY 2026 Earnings Driven by XPU Momentum,&#8221;</a> March 5, 2026. Q1 FY26 AI revenue $8.4B (+106% YoY); Q2 guidance $10.7B (+140% YoY). Hock Tan on Q1 FY26 call stated line of sight to chip revenue &#8220;significantly in excess of $100 billion&#8221; in 2027, and was explicit that the figure is chips-only (XPUs, switch chips, DSPs) and excludes rack and system revenue. 
Six XPU customer relationships disclosed, four publicly named by Broadcom (Google TPU, Meta, Anthropic, OpenAI) and two unnamed; one of the unnamed customers is widely reported by sell-side analysts (Cantor Fitzgerald, CNBC-cited analysts, The Information) to be ByteDance &#8212; an analyst attribution, not a Broadcom confirmation. On gross margin: Q1 FY26 consolidated non-GAAP gross margin was 77%; CFO Kirsten Spears on the December 11, 2025 Q4 FY25 call guided to approximately 100 basis points of sequential compression entering Q1 FY26, tied to the higher mix of AI system-level sales that include third-party pass-through costs. On the March 4, 2026 Q1 FY26 call, Spears softened the framing, telling analysts &#8220;the impact relative to our overall mix is actually not going to be substantial at all.&#8221; UBS analyst Timothy Arcuri pressed Hock Tan on that same Q1 FY26 call, framing rack-scale gross margins at &#8220;maybe 45%, 50%&#8221; and asking whether blended margin could drop 500 basis points as racks scale. Tan rejected the framing, telling Arcuri he &#8220;must be a bit hallucinating.&#8221;</p><p>[10] Gemini 3 launched November 18, 2025, announced by Jeff Dean, chief scientist at Google DeepMind and Google Research, on X and in Google&#8217;s official product blog. Dean&#8217;s subsequent Stanford University presentation on November 22, 2025 framed Gemini 3 as the culmination of Google&#8217;s decade-long TPU program, with training path disclosed publicly as TPU-based. Google has not published a granular silicon-mix model card disclosure, so &#8220;trained on TPUs&#8221; is based on Google&#8217;s public framing and Dean&#8217;s presentation rather than on a formal quantitative disclosure. 
On-premises Gemini 2.5 deployments via Google Distributed Cloud do run on Nvidia Blackwell; the TPU training claim applies to the primary cloud training and serving path, not to all Gemini deployments.</p><p>[11] Coverage of Nvidia&#8217;s November 2025 drawdown: <a href="https://fortune.com/2025/11/25/google-gemini-3-versus-chatgpt-market-reaction-nvidia-selloff/">Fortune, &#8220;Markets wipe $250 billion off Nvidia as they digest Google&#8217;s revenge,&#8221;</a> November 25, 2025; <a href="https://www.cnbc.com/2025/11/25/nvidia-shares-today-google-meta-ai-chip-report.html">CNBC, &#8220;Nvidia stock falls 4% on report Meta will use Google AI chips,&#8221;</a> November 25, 2025. Nvidia stock fell approximately 12% across November 2025, with a single-day decline on November 25 that ranged from 4% (close) to 7% (intraday low), wiping out approximately $250 billion in market capitalization. The primary catalyst for the November 25 move was The Information&#8217;s report that Meta was evaluating TPU deployment from 2027, compounding Gemini 3&#8217;s launch on November 18. Nvidia publicly defended its position, asserting &#8220;greater performance, versatility, and fungibility than ASICs.&#8221;</p><p>[12] NVIDIA Corporation <a href="https://www.sec.gov/Archives/edgar/data/0001045810/000104581026000021/nvda-20260125.htm">Form 10-K, fiscal year 2026</a>, SEC filing. Direct language from the filing: &#8220;the Company&#8217;s consolidated outstanding inventory purchase and long-term supply and capacity obligations balance was $95.2 billion.&#8221; Prior-year balance was $16.1 billion, implying approximately 5.9x year-over-year increase. 
The 10-K notes a significant portion of this balance relates to inventory purchase obligations.</p><p>[13] AMD, <a href="https://www.amd.com/en/newsroom/press-releases/2026-2-24-amd-and-meta-announce-expanded-strategic-partnersh.html">&#8220;AMD and Meta Announce Expanded Strategic Partnership to Deploy 6 Gigawatts of AMD GPUs,&#8221;</a> February 24, 2026 press release; AMD Q4 2025 earnings call with CEO Lisa Su and CFO Jean Hu; <a href="https://www.themarketsdaily.com/2026/02/27/advanced-micro-devices-expands-meta-ai-deal-6gw-instinct-gpu-plan-custom-mi450-and-160m-share-warrant.html">Markets Daily coverage of the warrant and GPU plan,</a> February 27, 2026. Five-year agreement for up to 6 gigawatts of AMD Instinct GPUs across multiple generations: first 1-gigawatt tranche built on MI450 architecture, subsequent tranches spanning future Instinct generations. AMD did not disclose a dollar value; CFO Jean Hu described economics as &#8220;double-digit billions per gigawatt.&#8221; The $60 billion over five years figure is a press and analyst estimate (AP, Deseret News) rather than an AMD-disclosed number. Performance-based warrant grants Meta up to 160 million AMD shares, vesting gated on three concurrent conditions: GPU shipment milestones, AMD share-price thresholds (with the final tranche at $600), and Meta technical and commercial milestones. Full vesting corresponds to roughly 10% AMD equity.</p><p>[14] Reporting on Meta&#8217;s February 2026 multi-year agreement with Nvidia preceding the AMD announcement; see synthesis in humai.blog, &#8220;Meta Is Buying Millions of Nvidia Chips,&#8221; March 9, 2026. 
Meta described the two deals as a supplier diversification strategy rather than substitution.</p><p>[15] Meta Platforms Q4 2025 earnings call, late January 2026; DataCenterDynamics, <a href="https://www.datacenterdynamics.com/en/news/meta-estimates-2026-capex-to-be-between-115-135bn/">&#8220;Meta estimates 2026 capex to be between $115-135bn,&#8221;</a> March 11, 2026. 2025 capex $72.2 billion; 2026 guidance $115-135 billion. Meta also established a new Meta Compute division in early 2026 to consolidate AI data center operations.</p><p>[16] Microsoft, <a href="https://www.microsoft.com/en-us/investor/events/fy-2026/earnings-fy-2026-q2">FY26 Second Quarter Earnings Conference Call transcript</a>, Microsoft Investor Relations; CNBC, &#8220;Microsoft (MSFT) Q2 earnings report 2026,&#8221; January 28, 2026. Amy Hood on the call: capital expenditures $37.5 billion, roughly two-thirds on short-lived assets primarily GPUs and CPUs. Satya Nadella confirmed nearly one gigawatt of total capacity added in the quarter. RPO of $625 billion, up 110% YoY. The ~45% tied to OpenAI is a sell-side analyst attribution rather than a Microsoft disclosure; Microsoft has not broken out OpenAI&#8217;s share of its commercial RPO.</p><p>[17] CNBC, <a href="https://www.cnbc.com/2026/03/04/nvidia-huang-openai-investment.html">&#8220;Nvidia CEO Huang says $30 billion OpenAI investment &#8216;might be the last,&#8217;&#8221;</a> March 4, 2026 (Morgan Stanley Technology, Media &amp; Telecom Conference). The &#8220;might be the last&#8221; characterization of the OpenAI and Anthropic investments was made at Morgan Stanley TMT on March 4, 2026; Jensen returned to the same theme on the Dwarkesh podcast on April 15, 2026. The $30 billion Nvidia equity stake in OpenAI is part of OpenAI&#8217;s $110 billion round, finalized in early 2026 &#8212; a restructuring of the earlier September 2025 &#8220;up to $100 billion&#8221; Nvidia-OpenAI infrastructure letter of intent. 
On the November 2025 Anthropic transaction: per Microsoft&#8217;s official announcement and Anthropic&#8217;s corresponding blog post, <a href="https://blogs.microsoft.com/blog/2025/11/18/microsoft-nvidia-and-anthropic-announce-strategic-partnerships/">&#8220;Microsoft, NVIDIA and Anthropic announce strategic partnerships,&#8221;</a> November 18, 2025, Nvidia committed to invest up to $10 billion in Anthropic, while Microsoft committed to invest up to $5 billion. Anthropic simultaneously committed to $30 billion in Azure compute over time and a multi-gigawatt Nvidia Grace Blackwell / Vera Rubin commitment.</p><p>[18] AWS press and Matt Garman, <a href="https://datacenter.news/story/aws-s-11bn-indiana-data-centre-powers-anthropic-s-ai-growth">Project Rainier launch coverage</a>, October 29, 2025; AWS Indiana data center in New Carlisle, running approximately 500,000 Trainium2 chips at launch, with AWS&#8217;s stated target of doubling to one million Trainium2 chips by end of 2025 (independent confirmation of the end-of-2025 target achievement is not publicly available). Garman at launch described the cluster as running and training Anthropic&#8217;s models. Total Amazon stake in Anthropic reached $8 billion prior to the November 2025 Azure transaction. On the Anthropic-own-silicon exploration: Reuters exclusive by Cherney &amp; Seetharaman published April 9, 2026 and widely republished through April 10; see <a href="https://www.siliconrepublic.com/machines/anthropic-reportedly-mulls-designing-own-chips-amid-shortage">Silicon Republic summary, &#8220;Anthropic reportedly mulls designing own chips amid shortage.&#8221;</a> 
The reporting cites three sources and frames the effort as early-stage, with no publicly committed design or dedicated chip team, occurring alongside Anthropic&#8217;s existing multibillion-dollar compute commitments with Nvidia, AWS, Google, and Microsoft.</p><p>[19] AMD and OpenAI, <a href="https://www.amd.com/en/newsroom/press-releases/2025-10-6-amd-and-openai-announce-strategic-partnership-to-d.html">&#8220;AMD and OpenAI Announce Strategic Partnership to Deploy 6 Gigawatts of AMD GPUs,&#8221;</a> corporate press release, October 6, 2025. First 1-gigawatt deployment on MI450 architecture; subsequent tranches span future AMD Instinct generations. Equity warrant structure parallel to the Meta deal (up to 160 million shares, milestone-gated on shipment, AMD share-price thresholds, and OpenAI technical milestones).</p><p>[20] Author aggregation of publicly announced OpenAI compute commitments across counterparties, per company disclosures and financial press coverage, late 2025 through early 2026. Components include: Microsoft ($250 billion multi-year Azure compute commitment, confirmed via Microsoft&#8217;s October 28, 2025 OpenAI recapitalization disclosure); Oracle OCI via the SoftBank-led Stargate infrastructure initiative (announced aspirational scale of $500 billion &#8212; Musk has publicly disputed the funding, SoftBank&#8217;s underwriting remains uncertain, and Oracle&#8217;s contract scope has been reported at closer to $300 billion over five years); AMD (press-estimated $60 billion over 6 gigawatts of multi-generation Instinct GPUs per the October 6, 2025 strategic partnership [19]); Broadcom (custom silicon over 10 gigawatts, commitment value not separately disclosed); Nvidia (Grace Blackwell / Vera Rubin compute commitment accompanying the $30 billion equity investment [17], successor to the September 2025 &#8220;up to $100 billion&#8221; infrastructure LOI); and AWS ($38 billion over seven years, announced November 3, 2025). 
Figures reflect announced commitments at varying degrees of bindingness, not disbursed spending. The defensible aggregate range is $800 billion to $1.2 trillion across 2025-2035, depending on how much weight the reader assigns to Stargate&#8217;s aspirational $500 billion headline. The &#8220;over a trillion dollars&#8221; figure is an author calculation from these components, not a single officially reported number, and includes the full Stargate headline value.</p><p>[21] xAI Memphis complex reporting, Q1 2026; Musk public statements on X, January 17, 2026 and April 15, 2026; Tom&#8217;s Hardware and Epoch AI satellite-imagery analysis of delivered capacity, January&#8211;February 2026. The Memphis complex (Colossus 1 + Colossus 2 + the MACROHARDRR building purchased December 30, 2025) collectively houses approximately 555,000 Nvidia GPUs spanning H100, H200, and Blackwell generations &#8212; not a single homogeneous Blackwell cluster. Musk&#8217;s stated target for 1.5-gigawatt capacity by April 2026 (per January 17 X post) is a target, not independently verified; satellite-based analyses through early 2026 estimated delivered cooling capacity materially below 1 gigawatt. 
Dojo (custom D-series training chip) was wound down August 2025 (Bloomberg, August 7; Musk confirmation, August 10, 2025); the training mission persists via AI5/AI6-based board clusters per Musk&#8217;s own framing.</p><p>[22] DeepSeek official WeChat account, top-pinned comment accompanying V3.1 release, August 21, 2025; CNBC, <a href="https://www.cnbc.com/2025/08/22/deepseek-hints-latest-model-supported-by-chinas-next-generation-homegrown-ai-chips.html">&#8220;DeepSeek hints latest model will be compatible with China&#8217;s &#8216;next generation&#8217; homegrown AI chips,&#8221;</a> August 22, 2025; South China Morning Post, <a href="https://www.scmp.com/tech/big-tech/article/3322688/tech-war-deepseek-hints-china-close-unveiling-home-grown-next-generation-ai-chips">&#8220;DeepSeek hints China close to unveiling home-grown next-generation AI chips,&#8221;</a> August 21, 2025. DeepSeek&#8217;s own statement, in translated form: &#8220;UE8M0 FP8 is designed for the next generation of domestically produced chips to be released soon.&#8221; DeepSeek did not name the chip vendor. Technical precision: UE8M0 is the unsigned 8-bit power-of-2 scale-factor format defined within the Open Compute Project&#8217;s Microscaling (MX) specification v1.0, published September 2023 &#8212; not a Chinese-invented format. Element types in MXFP8 are E4M3 or E5M2; UE8M0 is the shared per-block scale. Nvidia Blackwell also natively supports MXFP8 with E8M0 scales. DeepSeek&#8217;s alignment is therefore of its model&#8217;s scale-factor behavior to the forthcoming Chinese domestic silicon (reportedly Moore Threads MUSA 3.1 and VeriSilicon VIP9000 in some variants) that implements MXFP8 &#8212; a model-layer co-design decision, not the invention of a new numerical format. 
The V3.1 technical paper states the model was trained &#8220;using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.&#8221; Separately, FT and Reuters reported (August 13&#8211;14, 2025) that DeepSeek attempted to train its R2 model on Huawei Ascend accelerators in mid-2025 but reverted to Nvidia H20 after encountering training-stability issues.</p><p>[23] NVIDIA developer communications and industry coverage of NVLink Fusion integration with AWS Trainium4, disclosed in late 2025 around AWS re:Invent. Architectural integration is described at the fabric layer, allowing NVLink-compatible interconnect to integrate with non-Nvidia accelerators. Specific keynote venue and shipping date subject to verification against primary AWS and Nvidia disclosures.</p>]]></content:encoded></item><item><title><![CDATA[The HAI Numbers Moved — What It Means]]></title><description><![CDATA[Five shifts in Stanford&#8217;s 2026 AI Index. Here are the forces behind them.]]></description><link>https://www.airealist.ai/p/the-hai-numbers-moved-what-it-means</link><guid isPermaLink="false">https://www.airealist.ai/p/the-hai-numbers-moved-what-it-means</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Wed, 15 Apr 2026 08:51:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Dx9h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dx9h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Dx9h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!Dx9h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!Dx9h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Dx9h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dx9h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2479526,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/194274245?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" 
alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dx9h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 424w, https://substackcdn.com/image/fetch/$s_!Dx9h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 848w, https://substackcdn.com/image/fetch/$s_!Dx9h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 1272w, https://substackcdn.com/image/fetch/$s_!Dx9h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F07cfa05e-b59c-4b64-9c78-9d0a6e14e2b5_1376x768.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>On April 14, 2026, the <a href="https://hai.stanford.edu/">Stanford Institute for Human-Centered AI</a> published its ninth annual AI Index report. Four hundred and twenty-three pages. Nine chapters. Fifteen top takeaways. Data from Epoch AI, McKinsey, the IEA, Zeki, Brookings, the NRC, and dozens of academic studies. It is the most comprehensive, independently sourced picture of AI&#8217;s trajectory available anywhere.[1]</p><p>It is also, almost entirely, a report about <em>what</em>. What was spent? What was built? What was measured? What countries produced how many models, filed how many patents, and deployed how many robots? The data is rigorous, the charts are clean, and the sourcing is transparent.</p><p>What the report does not do &#8212; what an annual data report is not designed to do &#8212; is explain <em>why</em> any of it is happening, or <em>what breaks</em> when the underlying structure shifts. Stanford provides the measurement. The diagnosis requires a different kind of analysis.</p><p>This piece starts where the AI Index stops. It takes the five most consequential shifts between the 2025 and 2026 editions, layers on the analytical frameworks this publication has been building for the past year, and delivers the analysis the data requires, but the report doesn&#8217;t attempt. The data is Stanford&#8217;s. The diagnosis is mine.</p><h2>Five shifts in twelve months</h2><p>Reading the 2025 and 2026 AI Index reports back-to-back produces an uncanny effect. The format is nearly identical. The chapters follow the same structure. 
But the numbers have moved &#8212; in some cases, dramatically &#8212; and the direction of movement tells a story the report itself doesn&#8217;t narrate.</p><p><strong>Shift 1: The U.S.-China model performance gap closed.</strong></p><p>The 2025 report noted the gap was narrowing. On MMLU, MATH, and HumanEval, margins that had been 17&#8211;32 percentage points at the end of 2023 compressed to low single digits by late 2024.[2] The 2026 report delivers the conclusion: &#8220;U.S. and Chinese models have traded the lead multiple times since early 2025.&#8221;[3] The top U.S. model leads by 2.7% as of March 2026 &#8212; a margin that fluctuated throughout the year and briefly went to zero in February 2025, when DeepSeek-R1 matched the top American model.[4]</p><p>This is not a marginal update. It is a reclassification. Twelve months ago, the framing was &#8220;U.S. leads in models.&#8221; Today, the framing is &#8220;the lead changes hands monthly.&#8221;</p><p><strong>Shift 2: AI talent stopped moving to the United States.</strong></p><p>The number of AI researchers and developers moving to the U.S. dropped 89% between 2017 and 2025. The decline accelerated &#8212; down 80% in the last year alone.[5] Net talent flow to the U.S. fell from 324.6 (rolling average, 2022) to 26.0 in 2025.[6] India&#8217;s net outflow was -16.9. Canada went negative at -7.1. These are not fluctuations. The U.S. talent magnet is losing its field strength at exactly the moment the model performance advantage has evaporated.</p><p>A necessary qualification: the U.S. talent <em>stock</em> remains enormous &#8212; 220,520 AI authors and inventors, four times that of any other country. The flow is collapsing, but the base is still dominant. The real question is how long the stock sustains competitiveness without the flow. Talent is not a static asset. It depreciates through attrition, retirement, and the speed at which the frontier moves. 
A stockpile that isn&#8217;t replenished is a wasting asset on a long enough timeline.</p><p>The 2025 report lacked the data to flag this trend. It became visible only when the Zeki longitudinal dataset appeared in the 2026 edition.</p><p><strong>Shift 3: Capex went exponential. Revenue followed, but on a different curve.</strong></p><p>The 2025 report&#8217;s most cited finding was the cost decline: inference prices fell 280&#215; in eighteen months for GPT-3.5-equivalent performance.[7] That finding survives in the 2026 edition. But the 2026 data introduces the other side of the ledger.</p><p>OpenAI&#8217;s annual compute spend rose from roughly $280 million in 2022 to $16.3 billion in 2025.[8] Anthropic went from around $420 million to $8.3 billion over the same period.[9] Google&#8217;s annual capital expenditure exceeded $150 billion in 2025, per Citi Research estimates cited in the report.[10] The Stargate venture announced a $100&#8211;500 billion range.[11] The HAI report&#8217;s 2025 data captures the acceleration, but the 2026 guidance announced since has escalated further &#8212; the Big Five hyperscalers&#8217; aggregate 2026 capex is now projected at $660&#8211;690 billion, with CreditSights estimating as high as $750 billion.</p><p>Private AI investment in the U.S. alone reached $285.9 billion in 2025 (private investment events, including venture capital and private equity, per Quid &#8212; a distinct measure from infrastructure capex). That is more than 23 times China&#8217;s $12.4 billion in private investment, though, as the report notes, this likely understates China&#8217;s total AI spending given government-guided funds.[12]</p><p>Revenue is growing, too. OpenAI reached an estimated annualized $25 billion. 
Anthropic reached an estimated $19 billion.[13] The report positions these as &#8220;historically fast&#8221; growth trajectories &#8212; OpenAI outpacing Uber, Cheniere, and Moderna after crossing $1 billion in annual revenue.[14]</p><p>But the report does not perform the operation that matters: dividing the spend by the revenue and asking whether the structure is sustainable. It does not ask how any of these companies finance their compute. It does not check the depreciation schedules. It does not distinguish committed from disbursed. The numbers are presented. The financial architecture is absent.</p><p><strong>Shift 4: &#8220;AI sovereignty&#8221; became an official policy category.</strong></p><p>The 2025 report had no sovereignty section. The 2026 report introduces a new analytical framework that breaks sovereignty into five layers: infrastructure, data, models, applications, and talent.[15] Across those layers, the report tracks state-backed AI supercomputing clusters (Europe: 3 to 44 between 2018 and 2025), data localization measures (East Asia Pacific leads with 77; North America has 3), and model production by region (U.S.: 1,618 cumulative; China: 849; Europe and Central Asia: 666; South Asia: 21; Latin America: 2).[16]</p><p>The report maps Nvidia&#8217;s AI Factory partnerships and OpenAI&#8217;s Stargate country-level agreements geographically.[17] It notes that &#8220;private firms are playing an increasingly central role in building what many governments designate as national AI infrastructure.&#8221;[18]</p><p>This is data. It is useful data. But it describes the distribution of assets, not the structure of dependencies. It counts the switches without mapping who holds them.</p><p><strong>Shift 5: The labor market cracked along generational lines.</strong></p><p>The 2025 report noted productivity gains across several occupations. The 2026 report adds the cost side: employment for U.S. 
software developers aged 22&#8211;25 has fallen nearly 20% from its 2022 peak, even as headcount for older developers continues to grow.[19] Customer support agents show the same generational pattern.[20] AI agent deployment remains in single digits across nearly all business functions.[21] One-third of surveyed organizations expect workforce reductions in the coming year, with the reductions concentrated in service operations, supply chain, and software engineering.[22]</p><p>The productivity data is simultaneously positive (14&#8211;15% gains in customer support, 26% in software development, 50% in marketing output) and negative (the METR study found experienced developers became 19% slower with AI assistance, though the team has been unable to replicate the finding in later work).[23] The clearest pattern: gains are largest in structured, measurable work. For now, they shrink in tasks requiring deeper reasoning.[24]</p><p>These five shifts are not independent. The talent collapse feeds the performance convergence &#8212; if researchers stop flowing to the U.S. while China&#8217;s output scales, the gap closes. The convergence drives the capex arms race &#8212; when the model is no longer the differentiator, the infrastructure bet becomes the competitive move. The capex arms race creates the energy demand. And the energy demand, combined with the sovereignty imperative, drives the infrastructure buildout that the HAI sovereignty section now measures. The report presents these dynamics in separate chapters. The story is the chain between them.</p><h2>What the data means &#8212; and what it doesn&#8217;t</h2><p>Stanford&#8217;s contribution is the measurement. The five shifts above are real, sourced, and reproducible. What follows is the explanation the data requires.</p><h3>The performance convergence is a capex problem, not a capability story</h3><p>The U.S.-China model convergence sounds like a technology story. It isn&#8217;t. 
It&#8217;s a financial architecture story.</p><p>When models converge at the frontier &#8212; when the top four organizations are separated by just 22 Elo points (a chess-inspired rating system) on the Arena Leaderboard, and six sit within 79[25] &#8212; the competitive differentiator shifts from capability to cost, reliability, and distribution. The report notes this shift but doesn&#8217;t trace its financial consequence. If the model is no longer the moat, then the infrastructure beneath it is. And infrastructure is a capex game with a specific financial architecture.</p><p><a href="https://www.airealist.ai/p/welcome-to-hotel-abilene">Hotel Abilene</a> and the capex series that followed it mapped this dynamic. When AI companies compete on infrastructure rather than intelligence, the financial dynamics change: depreciation schedules matter more than benchmark scores. Free cash flow after capex determines survival. The Commitment-vs-Spend Gap &#8212; the ratio between announced investment and actual capital expenditure &#8212; becomes the diagnostic metric.[26]</p><p>The HAI report provides the ingredients for this analysis. Google&#8217;s estimated $150 billion in annual capex (per Citi Research). OpenAI&#8217;s compute spend trajectory. The Stargate $100&#8211;500 billion range. But it does not apply the Depreciation Lens (has Google extended server useful life assumptions to flatter the income statement while capex rises?), the FCF Sustainability Test (is the incremental AI capex funded from operating cash flow or from debt?), or the Revenue Attribution Problem (when Google adds AI features to existing products, is the incremental revenue &#8220;AI revenue&#8221;?).[27]</p><p>The consumer surplus finding &#8212; $172 billion in annual value to U.S. consumers (estimated via willingness-to-accept survey, N=2,000), with a median per-user value that tripled between 2025 and 2026[28] &#8212; is the most revealing number in the entire report. 
If most of the value from generative AI accrues to consumers through free tools, and innovators historically capture only ~3% of total social returns (the Nordhaus finding the report itself cites[29]), then the capex-to-revenue conversion problem isn&#8217;t a timing issue. It&#8217;s a built-in feature of the economy. The money goes in at the infrastructure layer. The value emerges at the consumer level. The distance between those two layers is the entire AI financial problem.</p><h3>The talent collapse validates the doom loop &#8212; across multiple countries simultaneously</h3><p>The 89% decline in AI talent moving to the United States is not an immigration policy footnote. It is a systemic shift in the global talent market with direct implications for every country in the AI Realist&#8217;s national analysis series.</p><p>For Japan, the HAI data confirms what the doom loop predicts.[30] Japan has 6,280 AI authors and inventors &#8212; fewer than Singapore&#8217;s 6,610, despite having more than 20 times the population. Japan&#8217;s net talent flow: 0.0.[31] Not positive, not negative. Zero. The doom loop&#8217;s terminal state: a system that neither produces nor attracts the talent it needs, and has reached equilibrium at a level far below what its economic scale would suggest.</p><p>For India, the data confirms the services equilibrium thesis. India has the largest net outflow of any country tracked (-16.9).[32] Fifty thousand AI researchers and developers are in the system; the system exports them. The HAI data makes the &#8220;building for everyone but themselves&#8221; diagnosis quantitative.</p><p>For France, the model count is the verdict. France produced one notable model in 2025. Europe as a whole produced two.[33] The total cumulative model count for Europe and Central Asia (666) trails China&#8217;s (849) and is less than half of the U.S. total (1,618).[34] <a href="https://www.airealist.ai/p/mistral-succeeded-frances-ai-strategy">Mistral Succeeded. 
France&#8217;s AI Strategy Didn&#8217;t</a> was a diagnosis of a sixty-year state apparatus pattern &#8212; the tax credit system, the Grandes &#201;coles pipeline, the state investment bank. The HAI data is the annual physical.</p><p>The per capita talent density numbers reveal a different cut: Switzerland leads the world at 110.5 per 100,000 inhabitants, Singapore at 109.5.[35] These are countries that appear in the HAI data as punching above their weight &#8212; and they&#8217;re the same countries the sovereignty analysis identifies as facing the trilemma most acutely. High talent density, deep foreign platform dependency, no viable path to full-stack sovereignty.</p><h3>The sovereignty section describes the map. It doesn&#8217;t show the switches.</h3><p>HAI&#8217;s new five-layer sovereignty framework (infrastructure, data, model, application, talent) is a taxonomic contribution. It organizes the conversation. This is not a failing of the report &#8212; HAI is a measurement exercise, and a rigorous one. The gap exists because the questions that matter most for investment and policy decisions require a different kind of analysis, one that starts where measurement ends. Specifically: mapping not who <em>has</em> what, but who can <em>take it away</em>.</p><p>The coercion stack &#8212; the three-switch model published in <a href="https://www.airealist.ai/p/access-disable-destroy">Access, Disable, Destroy</a> &#8212; operates at a different analytical level.[36] HAI counts state-backed supercomputing clusters. The coercion stack asks: if the chips inside those clusters are fabricated by one Taiwanese foundry (which the HAI report itself flags as a dependency[37]), and the cloud services running on those chips are operated by companies subject to U.S. 
jurisdiction, and the models trained on that compute are licensed under terms that permit geographic restrictions &#8212; then what is the &#8220;sovereignty&#8221; of the cluster?</p><p>The report maps Nvidia AI Factory partnerships across dozens of countries.[38] <a href="https://www.airealist.ai/p/every-country-needs-sovereign-ai">Every Country Needs Sovereign AI. Jensen Is Selling It.</a> mapped the same partnerships and asked the question HAI doesn&#8217;t: is the Nvidia AI Factory model sovereignty or capture? The &#8220;closed orbit&#8221; framework &#8212; Nvidia&#8217;s ecosystem as a black hole that routes all activity back to Nvidia hardware[39] &#8212; directly explains the pattern the HAI map displays. More countries are standing up &#8220;sovereign&#8221; compute infrastructure. Nearly all of it runs on Nvidia silicon, managed through Nvidia&#8217;s software stack, with maintenance dependencies that create ongoing operational leverage.</p><p>The data localization count (77 measures in East Asia Pacific, 3 in North America) is similarly descriptive without being diagnostic. The Entity Test asks a different question: does localization create legal sovereignty, or does it create a data residency requirement that the CLOUD Act&#8217;s compelled disclosure provision (18 U.S.C. &#167; 2713) renders moot?[40] A data center in Frankfurt running on U.S.-controlled cloud infrastructure, using U.S.-designed chips, operated by a subsidiary of a U.S. parent company, does not become &#8220;sovereign&#8221; because German data localization law requires the bits to stay in Germany. The bits comply. 
The legal exposure doesn&#8217;t.</p><p>HAI acknowledges that &#8220;open-source development is starting to redistribute participation.&#8221; But the data supports the closed orbit thesis: open source redistributes <em>activity</em> without redistributing <em>sovereignty</em>, because the hardware dependency persists through every fork, every download, and every deployment.[41]</p><h3>The labor market data confirms the process thesis, not the tool thesis</h3><p>The productivity studies the report compiles &#8212; a 26% gain in software development, 14&#8211;15% in customer support, 50% in marketing output[42] &#8212; are study-level findings. They describe outcomes in specific contexts with specific implementations. The report presents them as evidence that &#8220;AI&#8217;s productivity effects are highly context dependent&#8221; and that &#8220;gains are strongest when work can be divided into well-defined, repeatable tasks.&#8221;[43]</p><p>The observation is correct. But &#8220;context dependent&#8221; is a description, not an explanation. <a href="https://www.airealist.ai/p/ai-tools-work-does-your-engineering">AI Tools Work. Your Engineering Process May Not.</a> supplied the mechanism: AI coding tools amplify existing organizational strengths and weaknesses.[44] Organizations with specification discipline, test-driven development practices, and senior engineer judgment as governance layers get the gains. Organizations without those structures end up with debt.</p><p>The junior developer employment decline &#8212; nearly 20% among ages 22&#8211;25[45] &#8212; is the predictable consequence. It is not evidence that AI replaces junior developers. It is evidence that organizations, when given a tool that produces code faster, reduce the hiring they perceive as most substitutable. The METR finding that experienced developers were 19% <em>slower</em> with AI assistance[46] is not a contradiction &#8212; it&#8217;s the same thesis from the other end. 
Experienced developers, whose value lies in judgment and architectural decisions rather than code velocity, gain less from a tool that accelerates code generation. Their productivity metric is not lines per hour.</p><p>The learning penalty finding &#8212; software engineers who relied heavily on AI for learning showed no measurable speed improvement[47] &#8212; closes the loop. If junior developers are hired less, and the juniors who are hired learn less because they lean on AI tools, the senior talent pipeline degrades. The result is a doom loop in the labor market, identical in mechanism to the country-level doom loops in the national analysis series.</p><h3>The energy data shows the demand. Nobody is showing the supply.</h3><p>AI data center power capacity reached 29.6 GW, comparable to New York state at peak demand.[48] Grok 4&#8217;s estimated training emissions reached 72,816 tons of CO&#8322; equivalent.[49] Annual GPT-4o inference water consumption, per HAI estimates, may exceed the drinking water needs of 12 million people.[50] The IEA projects data center electricity consumption will roughly double between 2024 and 2030, with the U.S. accounting for the largest share.[51]</p><p>These are the demand numbers.</p><p>What no measurement exercise can do &#8212; not this one, not any &#8212; is apply the Nuclear Delivery Test: reactor status, fuel type, distance. 
Every hyperscaler nuclear announcement has a credibility score against those three questions, and most score poorly on them.[52] The 29.6 GW demand figure appears without any assessment of whether the supply pipeline &#8212; SMRs that require HALEU fuel that doesn&#8217;t exist at a commercial scale, sited in locations that don&#8217;t overlap with existing datacenter clusters, on timelines that don&#8217;t close before 2030 &#8212; will produce operational power.[53]</p><p>The bridge technology question &#8212; what powers AI datacenters through 2030 while the nuclear press releases age &#8212; is the analytical gap <a href="https://www.airealist.ai/p/the-half-life-of-a-press-release">The Half-Life of a Press Release</a> fills. The answer is natural gas, and the companies that benefit from the nuclear-datacenter timeline mismatch are not the ones in the press releases.[54]</p><p>The energy efficiency data is worth noting: DeepSeek V3&#8217;s training emissions were 597 tons, compared to Grok 4&#8217;s 72,816.[55] Per-query inference energy for Claude 4 Opus (5&#8211;6 Wh) is a quarter of DeepSeek V3.2 (23 Wh).[56] These efficiency spreads are enormous &#8212; two orders of magnitude at the training level, four to one at inference &#8212; and they directly affect the economic viability of the Reasoning Tax framework.[57] A reasoning model that generates 40,000 output tokens per query at 23 Wh per query falls into a different economic and environmental category than one that generates the same tokens at 5 Wh per query.</p><h2>What was the thesis is now data</h2><p>The most useful exercise in reading the 2026 report alongside a year of published analysis is watching arguments that were thesis-stage become data-confirmed &#8212; and noting where the data complicate the thesis rather than confirm it.</p><p><strong>Confirmed: Model convergence shifts competition to infrastructure.</strong> Published in &#8220;Hotel Abilene&#8221; and &#8220;<a 
href="https://www.airealist.ai/p/open-source-closed-orbit">Open Source, Closed Orbit</a>.&#8221; HAI 2026 data: four companies within 22 Elo points at the frontier, six within 79; competitive pressure &#8220;shifting toward cost, reliability, and domain-specific performance.&#8221;[58]</p><p><strong>Confirmed: European AI sovereignty is a policy aspiration without industrial substance.</strong> Published in the France piece and subsequent sovereignty analysis. HAI 2026 data: Europe produced two notable models in 2025. France produced one. Europe and Central Asia&#8217;s cumulative model count (666) trails China&#8217;s (849) and is under half of the U.S. total (1,618). Meanwhile, Europe has 44 state-backed supercomputing clusters, second only to China. Infrastructure without intelligence.[59]</p><p><strong>Confirmed: The U.S. structural advantage is at the chip and cloud layers, not the model layer.</strong> Published in <a href="https://www.airealist.ai/p/access-disable-destroy">Access, Disable, Destroy</a>. HAI 2026 data: model performance gap at 2.7%; TSMC fabricates nearly every leading AI chip; the U.S. hosts 5,427 data centers, more than 10 times as many as any other country.[60] The advantage is physical and legal, not algorithmic.</p><p><strong>Complicated: The process-vs-tool distinction is blurrier than &#8220;AI Tools Work&#8221; assumed.</strong> The METR study found experienced developers became 19% slower with AI assistance &#8212; consistent with the thesis that AI tools amplify existing process weaknesses.[61] But the 2026 report adds a wrinkle: METR has been unable to replicate the finding, primarily because developers now refuse to work without AI tools.[62] If the tool has become indispensable even though it measurably slows people down, the organizational-process explanation may be necessary but not sufficient. The tool is changing the process faster than the process can be measured. 
This doesn&#8217;t break the thesis: specification discipline and test-driven development still separate organizations that get gains from those that don&#8217;t. But it means the baseline has shifted under the measurement.</p><p>Three of these were published months before the HAI data confirmed them. That&#8217;s what the analysis does: it identifies the forces before the measurement catches up. The doom loops, the coercion stack, the capex frameworks &#8212; these aren&#8217;t predictions. They&#8217;re descriptions of systems that produce predictable outcomes. The HAI report provided 423 pages of those outcomes. The frameworks were already waiting for the data.</p><h3>Where the data pushes back</h3><p>An honest reading requires asking whether the 2026 report <em>contradicts</em> anything this publication has argued. Two findings deserve scrutiny.</p><p>The first is the productivity J-curve. U.S. productivity growth reached 2.7% in 2025, nearly double the prior decade&#8217;s average.[63] A study of 12,000 European firms found a 4% boost in labor productivity from AI adoption.[64] The OECD projects annual labor productivity growth of 0.2 to 1.3 percentage points for G7 economies over the next decade.[65] These are not vendor claims. They are peer-reviewed macro findings. The capex thesis &#8212; that the financial structure of AI investment contains risks the market underprices &#8212; does not assume AI doesn&#8217;t work. But the bull case is accumulating data. A note of intellectual honesty: the consumer surplus finding used earlier in this piece to frame the value-capture gap, and the J-curve hypothesis cited here as the strongest challenge to the capex thesis, come from the same researcher &#8212; Erik Brynjolfsson, who sits on HAI&#8217;s steering committee. His data supports both readings. The bear case says the gap between value created and value captured is permanent. The bull case says it closes as organizations learn to monetize AI. 
Both are defensible; the data has not yet resolved the question. The risks in the financial architecture remain (depreciation manipulation, debt-funded capex, commitment-vs-spend gaps), but the &#8220;what if it works?&#8221; scenario is no longer speculative. It&#8217;s showing up in the productivity statistics.</p><p>The second is France. <a href="https://www.airealist.ai/p/mistral-succeeded-frances-ai-strategy">Mistral Succeeded. France&#8217;s AI Strategy Didn&#8217;t.</a> argued that France cannot build frontier AI. The HAI adoption data shows France ranking fifth globally in population-level AI diffusion at 44% &#8212; ahead of the United Kingdom, Germany, and the United States.[66] France can&#8217;t build. But it adopts more aggressively than the countries that can. This doesn&#8217;t break the thesis. It sharpens it. France is a structurally enthusiastic consumer of technology; it is structurally incapable of producing it. The sixty-year state apparatus pattern optimizes for adoption and regulation, not for production. The HAI data makes that distinction crisper than the original piece did.</p><p>Neither finding overturns a published thesis. But both add complexity that the next iteration of the analysis must absorb.</p><h2>Where the data stops</h2><p>The HAI AI Index is the best longitudinal dataset on AI&#8217;s trajectory. It is also, by design, a measurement exercise. It cannot diagnose causal mechanisms. It cannot trace legal pathways. It cannot run financial sustainability tests. It cannot model coercion scenarios.</p><p>These are not criticisms. They are boundary conditions. The AI Index measures what can be counted. This publication explains what the counts mean.</p><p>Five questions the data raises but doesn&#8217;t answer:</p><p>The <strong>financial architecture</strong> of AI investment &#8212; depreciation schedules, FCF sustainability, debt vs. 
equity financing of capex, the gap between committed and disbursed capital &#8212; is the difference between knowing how much was spent and knowing whether the spend survives a revenue disappointment.</p><p>The <strong>legal architecture</strong> of sovereignty &#8212; CLOUD Act pathways, entity test analysis, the distinction between data residency and data access jurisdiction. Without it, counting localization measures tells you nothing about whether they provide legal protection.</p><p>The <strong>coercion topology</strong> &#8212; who holds the off switch at each layer, through what legal and commercial mechanism, and on what timeline. Without it, a map of AI asset distribution is mistaken for an understanding of AI power.</p><p>The <strong>process layer</strong> beneath productivity statistics &#8212; what separates the enterprises that achieve 26% gains from those where experienced developers slow down. Without it, outcomes are measured, but mechanisms are invisible.</p><p>The <strong>supply-side energy analysis</strong> &#8212; reactor status, fuel availability, geographic mismatch, bridge technology economics &#8212; is the difference between knowing the demand and knowing whether the demand gets met.</p><p>Stanford measures the field. This publication maps the forces underneath it. The 2026 data confirms that the forces we&#8217;ve been mapping are real &#8212; and the two places the data pushes back make the analysis sharper, not weaker.</p><p>You can wait twelve months for the 2027 report to tell you what happened. Or you can read the structural diagnosis here every week, while there&#8217;s still time to act on it.</p><div><hr></div><h3>Notes</h3><p>[1] Stanford Institute for Human-Centered Artificial Intelligence, &#8220;Artificial Intelligence Index Report 2026,&#8221; April 2026. Ninth edition. 423 pages, 9 chapters. All HAI citations in this piece reference this edition unless noted as the 2025 edition. 
Available at <a href="https://hai.stanford.edu/ai-index/2026-ai-index-report">https://hai.stanford.edu/ai-index/2026-ai-index-report</a>.</p><p>[2] Stanford HAI AI Index Report 2025, Chapter 2, &#8220;Technical Performance.&#8221; MMLU gap: 17.5 pp; HumanEval gap: 31.6 pp at end of 2023; all compressed to single digits by end of 2024.</p><p>[3] HAI 2026, Top Takeaways, #2.</p><p>[4] HAI 2026, Top Takeaways, #2. Arena Elo ratings as of March 2026: Anthropic 1,503; xAI 1,495; Google 1,494; OpenAI 1,481; Alibaba 1,449; DeepSeek 1,424. Chapter 2 Highlights.</p><p>[5] HAI 2026, Chapter 1, Section 1.8. Source: Zeki Data, 2025. The 89% figure is cited in both Top Takeaways (#7) and Chapter 1 Highlights (#9).</p><p>[6] HAI 2026, Section 1.8, &#8220;Mobility,&#8221; Figure 1.8.6. Rolling 12-month average.</p><p>[7] HAI 2025, Chapter 1, Report Highlights #6. Cost to query a model scoring 64.8 on MMLU fell from $20.00 to $0.07 per million tokens between November 2022 and October 2024.</p><p>[8] HAI 2026, Section 4.2, Figure 4.2.20. Source: Epoch AI, 2026. OpenAI compute spend includes R&amp;D, inference, and unattributed categories. The 2022 figure is approximately $280 million; the 2025 figure is $16.3 billion.</p><p>[9] HAI 2026, Section 4.2, Figure 4.2.20. Anthropic's 2022 figure is approximately $420 million; its 2025 figure is $8.3 billion.</p><p>[10] HAI 2026, Section 4.2, &#8220;Capital Expenditures.&#8221; Google &#8220;reporting more than $150 billion in annual capex in 2025.&#8221; Source: Citi Research, Figure 4.2.21.</p><p>[11] HAI 2026, Section 4.1, timeline entry January 21, 2025. &#8220;Between $100 billion and $500 billion&#8221; by 2029. Post-HAI data: Q4 2025 and Q1 2026 earnings guidance from the Big Five hyperscalers (Amazon, Alphabet, Microsoft, Meta, Oracle) produced aggregate 2026 capex projections of $660&#8211;690 billion (CreditSights, Futurum, multiple analyst estimates, February 2026). 
Amazon&#8217;s $200 billion single-year commitment was the largest in corporate history. CreditSights subsequently raised its aggregate estimate to ~$750 billion. Morgan Stanley projects hyperscaler borrowing exceeding $400 billion in 2026, more than double 2025&#8217;s $165 billion. Capex-to-revenue ratios for major hyperscalers reached 45&#8211;57%, levels previously associated with industrial or utility companies, not technology firms.</p><p>[12] HAI 2026, Top Takeaways, #7 and Chapter 4 Highlights, #2. The report explicitly notes that &#8220;private investment figures likely understate China&#8217;s total AI spending, as government guidance funds have deployed an estimated $184 billion into AI firms between 2000 and 2023.&#8221;</p><p>[13] HAI 2026, Section 4.2, Figure 4.2.18. Source: Epoch AI, 2026. Described as &#8220;annualized revenue estimates&#8221; from &#8220;direct company statements or established media reporting&#8221; &#8212; i.e., these are not from audited filings. The report cautions that figures are &#8220;directional rather than precise.&#8221;</p><p>[14] HAI 2026, Section 4.2, Figure 4.2.19.</p><p>[15] HAI 2026, Section 8.3, &#8220;AI Sovereignty.&#8221; The five layers: infrastructure sovereignty, data sovereignty, model sovereignty, application sovereignty, and talent sovereignty.</p><p>[16] HAI 2026, Section 8.3. Supercomputing clusters: Figure 8.3.1 (Epoch AI data). Data localization measures: Figure 8.3.3 (Ferracane et al., 2026). Model production: Figure 8.3.4 (Epoch AI cumulative model releases, 2018&#8211;2025). Note that these are all publicly documented models, not just &#8220;notable&#8221; models per Epoch AI criteria &#8212; a broader count than Chapter 1&#8217;s notable model dataset.</p><p>[17] HAI 2026, Section 8.3, Figure 8.3.2. 
Geographic map of Nvidia AI Factory and OpenAI Stargate country-level partnerships.</p><p>[18] HAI 2026, Section 8.3, &#8220;Infrastructure Sovereignty.&#8221;</p><p>[19] HAI 2026, Section 4.4, &#8220;Workforce Impact,&#8221; Figure 4.4.29. Source: Brynjolfsson et al., 2025. &#8220;Employment for software developers ages 22&#8211;25 had fallen close to 20% from its 2022 peak.&#8221; Also cited in Top Takeaways (#9).</p><p>[20] HAI 2026, Section 4.4, Figure 4.4.29. Customer service agents show a parallel generational pattern.</p><p>[21] HAI 2026, Section 4.3, &#8220;Deployment Stages,&#8221; Figure 4.3.7. &#8220;Scaled use was in the single digits for nearly all functions.&#8221;</p><p>[22] HAI 2026, Chapter 4 Highlights, #8. McKinsey survey.</p><p>[23] HAI 2026, Section 4.4, &#8220;Productivity Trends,&#8221; Figure 4.4.27. Customer support: Brynjolfsson et al., 2025 (14&#8211;15%). Software development: Cui et al., 2025 (26%). Marketing: Ju &amp; Aral, 2025 (50%). METR: Becker et al., 2025 (-19%). The report notes METR &#8220;has not been able to replicate the results in a later study, primarily due to a growing reluctance among developers to work without AI.&#8221;</p><p>[24] HAI 2026, Section 4.4. &#8220;Gains are strongest when work can be divided into well-defined, repeatable tasks with clear quality monitoring.&#8221;</p><p>[25] HAI 2026, Chapter 2 Highlights, #2. Arena Elo ratings as of March 2026 cited in note [4]: Anthropic (1,503) to OpenAI (1,481) spans 22 points; Anthropic (1,503) to DeepSeek (1,424) spans 79 points.</p><p>[26] Frameworks published in <a href="https://www.airealist.ai/p/hotel-abilene">Hotel Abilene</a> (March 2026), <a href="https://www.airealist.ai/p/cloud-vs-clout">Cloud vs. Clout</a> (March 2026), and <a href="https://www.airealist.ai/p/train-deploy-write-down">Train, Deploy, Write Down</a> (March/April 2026), The AI Realist, www.airealist.ai. 
The Commitment-vs-Spend Gap is defined in the capex-finance vertical as the ratio of announced AI investment commitments to actual capital expenditure in the most recent filing.</p><p>[27] The Depreciation Lens, FCF Sustainability Test, and Revenue Attribution Problem are analytical frameworks published across the CapEx-Finance series. Specific definitions: Microsoft extended the useful life of servers from 4 to 6 years in 2022 (the reference case for the Depreciation Lens); Meta, Google, and Amazon followed suit. The Revenue Attribution Problem asks whether &#8220;AI revenue&#8221; is genuinely new or reclassified revenue from products that now contain AI features.</p><p>[28] HAI 2026, Section 4.2, Highlight: &#8220;What Is Generative AI Worth?,&#8221; Figure 4.2.22. Brynjolfsson et al., 2026. Consumer surplus grew from $112B to $172B annually. The median value per user tripled, from $3.40 to $11.40.</p><p>[29] HAI 2026, Section 4.2. &#8220;This pattern is consistent with findings by Nordhaus (2004) that innovators historically capture only ~3% of total social returns from major technologies.&#8221;</p><p>[30] <a href="https://www.airealist.ai/p/japan-built-the-bullet-train-why">Japan Built the Bullet Train. Why Can&#8217;t It Build an LLM?</a>, The AI Realist. The doom loop: Low AI salaries &#8594; talent leaves &#8594; companies can&#8217;t build AI &#8594; companies buy from U.S. &#8594; domestic ecosystem stays small &#8594; no market pressure to raise salaries &#8594; low AI salaries.</p><p>[31] HAI 2026, Section 1.8, Figure 1.8.6. Japan&#8217;s 2025 net flow: 0.0. Total AI authors and inventors: 6,280 (Figure 1.8.1). Singapore: 6,610.</p><p>[32] HAI 2026, Section 1.8, Figure 1.8.6. India 2025 net flow: -16.9.</p><p>[33] HAI 2026, Chapter 1, Section 1.1, Figure 1.1.1. France: 1 notable model in 2025. Europe total: 2 notable models in 2025. 
Note: this is based on the Epoch AI &#8220;notable models&#8221; dataset, which applies criteria such as state-of-the-art performance and high citation counts. The broader model count in Chapter 8 (666 cumulative for Europe and Central Asia) covers all publicly documented models.</p><p>[34] HAI 2026, Section 8.3, Figure 8.3.4.</p><p>[35] HAI 2026, Section 1.8, Figure 1.8.2. Switzerland: 110.5 AI authors and inventors per 100,000 inhabitants. Singapore: 109.5.</p><p>[36] <a href="https://www.airealist.ai/p/access-disable-destroy">Access, Disable, Destroy</a>, The AI Realist, March 2026. The coercion stack: three layers (chips, cloud, models), mapping who holds the off switch, through what legal/commercial mechanism, and on what timeline.</p><p>[37] HAI 2026, Top Takeaways, #3. &#8220;A single company, TSMC, fabricates almost every leading AI chip, making the global AI hardware supply chain dependent on one foundry in Taiwan.&#8221;</p><p>[38] HAI 2026, Section 8.3, Figure 8.3.2.</p><p>[39] <a href="https://www.airealist.ai/p/open-source-closed-orbit">Open Source, Closed Orbit</a>, The AI Realist, March 2026, 6,439 words, 100 footnotes. The &#8220;black hole&#8221; framework: Nvidia&#8217;s ecosystem is centripetal &#8212; every contribution routes activity back to Nvidia hardware, converting open-source adoption into hardware lock-in.</p><p>[40] CLOUD Act compelled disclosure provision, 18 U.S.C. &#167; 2713. The Entity Test is published in The AI Realist's sovereignty vertical. Key factors: ownership chain, incorporation jurisdiction, personnel with technical access, contractual dependencies, and management control. If any factor creates a link to U.S. jurisdiction, the compelled disclosure provision may apply. Note: data localization (where data physically resides) is distinct from data governance (who controls processing decisions under GDPR&#8217;s controller/processor framework). 
The Entity Test applies to both, because the question is jurisdiction over the entity, not the location of the bits.</p><p>[41] <a href="https://www.airealist.ai/p/open-source-closed-orbit">Open Source, Closed Orbit</a>. Open-weight models can be downloaded, fine-tuned, and deployed on any hardware. But the most performant deployment requires Nvidia hardware, Nvidia&#8217;s inference stack (TensorRT-LLM, NIM), and Nvidia&#8217;s optimization libraries. The fork is free. The performance depends on the hardware.</p><p>[42] See note [23].</p><p>[43] HAI 2026, Section 4.4, &#8220;Productivity Trends.&#8221;</p><p>[44] <a href="https://www.airealist.ai/p/ai-tools-work-your-engineering-process">AI Tools Work. Your Engineering Process May Not.</a>, The AI Realist, March 2026, 5,380 words, 24 footnotes.</p><p>[45] See note [19].</p><p>[46] See note [23]. Becker et al., 2025 (METR). -19% speed for experienced open-source developers using AI assistance.</p><p>[47] HAI 2026, Section 4.4, Figure 4.4.27. Shen and Tamkin, 2025. &#8220;Software engineers who relied heavily on AI for learning showed no measurable speed improvement.&#8221;</p><p>[48] HAI 2026, Top Takeaways, #10.</p><p>[49] HAI 2026, Section 1.4, Figure 1.4.3. Epoch AI estimate.</p><p>[50] HAI 2026, Section 1.4, Figure 1.4.8. &#8220;Annual estimates for GPT-4o inference range from about 1.3 to 1.6 kiloliters [presumably billion liters], which, at the high end, exceeds the annual drinking water needs of 12 million people.&#8221; The units, as stated in the report, appear internally inconsistent with the figure and comparison &#8212; verify against the original source (de Vries and Gao, 2025) before citing the absolute number. The order-of-magnitude comparison (AI inference water use &#8776; drinking water for 12M people) appears consistent between the figure and the text.</p><p>[51] HAI 2026, Section 1.4, Figure 1.4.13. Source: IEA, 2025. 
Note: &#8220;Data in this chart reflects IEA projections rather than observed consumption.&#8221;</p><p>[52] <a href="https://www.airealist.ai/p/the-half-life-of-a-press-release">The Half-Life of a Press Release</a>, The AI Realist, March 2026, 4,896 words, 88 footnotes. The Nuclear Delivery Test: three questions (reactor status, fuel type, distance) producing a credibility score for any hyperscaler-nuclear deal announcement.</p><p>[53] The four-layer mismatch framework: geography mismatch (nuclear sites &#8800; datacenter sites), timeline mismatch (NRC licensing path exceeds the 2025&#8211;2028 crisis window), cost mismatch (nuclear construction costs have risen faster than inflation for four decades), fuel supply mismatch (HALEU commercial production at scale does not exist as of 2026).</p><p>[54] <a href="https://www.airealist.ai/p/the-half-life-of-a-press-release">The Half-Life of a Press Release</a>, bridge technology analysis. Natural gas peaker plants and combined-cycle gas can be permitted and built in 18&#8211;36 months, compared to 5&#8211;20 years for nuclear.</p><p>[55] HAI 2026, Section 1.4, Figure 1.4.3. DeepSeek V3: 597 tons CO&#8322;eq. Grok 4: 72,816 tons CO&#8322;eq. Both figures are Epoch AI estimates. The DeepSeek figure is dramatically lower than peer models at comparable parameter count; this may reflect differences in training efficiency, hardware utilization, energy grid carbon intensity, or disclosure methodology. Treat the absolute number as directional.</p><p>[56] HAI 2026, Section 1.4, Figure 1.4.5. Claude 4 Opus: 5.13 Wh per medium-length prompt. DeepSeek V3.2: 23.13 Wh. Source: Jegham et al., 2025. A medium-length prompt is approximately 1,000 input tokens and 1,000 output tokens.</p><p>[57] The Reasoning Tax: total cost multiplier for reasoning-heavy workloads versus standard inference. Published in the AI tooling vertical of The AI Realist. 
The formula accounts for token inflation (reasoning models generate 10&#8211;40&#215; more output tokens per query) and price premium.</p><p>[58] HAI 2026, Chapter 2 Highlights, #2.</p><p>[59] HAI 2026, Chapter 1 Highlights, Figures 1.1.1, 8.3.1, and 8.3.4.</p><p>[60] HAI 2026, Top Takeaways, #2 and #3.</p><p>[61] See note [23]. Becker et al., 2025 (METR). The -19% finding for experienced open-source developers was the most widely cited negative productivity result in the field.</p><p>[62] HAI 2026, Section 4.4. The report notes METR &#8220;has not been able to replicate the results in a later study, primarily due to a growing reluctance among developers to work without AI, and that developers in late 2025 were likely sped up by AI relative to the original study period.&#8221;</p><p>[63] HAI 2026, Section 4.4, Figure 4.4.28. Brynjolfsson, 2026. 2.7% U.S. productivity growth in 2025, framed through the &#8220;J-curve&#8221; hypothesis &#8212; organizations absorb the costs of adopting AI before larger productivity gains materialize.</p><p>[64] HAI 2026, Section 4.4, Figure 4.4.28. Aldasoro et al., 2026. Study of 12,000 European firms (2019&#8211;2024). 4% increase in labor productivity from AI adoption; 5.9 percentage point gain for every 1% spent on training.</p><p>[65] HAI 2026, Section 4.4, Figure 4.4.28. Filippucci et al. (OECD, 2025). Projected annual gains: +0.4 to +1.3 pp (U.S./UK) vs. +0.2 to +0.8 pp (Italy/Japan).</p><p>[66] HAI 2026, Section 4.3, Highlight: &#8220;Measuring Signals of AI Diffusion,&#8221; Figure 4.3.12. Source: Microsoft AI Economy Institute, 2025. France: 44.0% AI diffusion (second half 2025), ranked 5th globally. United States: 28.3%, ranked 24th. UAE: 64.0%, ranked 1st. Singapore: 60.9%, ranked 2nd.</p>]]></content:encoded></item><item><title><![CDATA[The Quiet Tax on Parallel AI Coding Work — and Why I Built Canopy]]></title><description><![CDATA[Claude Code wants you to do one thing at a time. 
Real engineering doesn't work that way.]]></description><link>https://www.airealist.ai/p/the-quiet-tax-on-parallel-ai-coding</link><guid isPermaLink="false">https://www.airealist.ai/p/the-quiet-tax-on-parallel-ai-coding</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Tue, 14 Apr 2026 06:24:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HjjB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HjjB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HjjB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 424w, https://substackcdn.com/image/fetch/$s_!HjjB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 848w, https://substackcdn.com/image/fetch/$s_!HjjB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 1272w, https://substackcdn.com/image/fetch/$s_!HjjB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!HjjB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png" width="1287" height="1287" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/753db764-764e-4793-bacf-55f6b538564f_1287x1287.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1287,&quot;width&quot;:1287,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1763193,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/194155287?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae4cc99d-bc0d-4f5a-83da-8ebc4420e5d8_1544x1450.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HjjB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 424w, https://substackcdn.com/image/fetch/$s_!HjjB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 848w, https://substackcdn.com/image/fetch/$s_!HjjB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 1272w, https://substackcdn.com/image/fetch/$s_!HjjB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F753db764-764e-4793-bacf-55f6b538564f_1287x1287.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><a href="https://github.com/juliensimon/canopy">https://github.com/juliensimon/canopy</a></figcaption></figure></div><p>I spend a lot of my working day (and weekends too) inside Claude Code. Not demoing it, not writing about it: using it, on real code, for real projects. When you live inside a tool at that depth, you stop noticing its features and start noticing its friction. 
The rough edges you&#8217;d forgive in a weekend experiment become the things you think about in the shower.</p><p>Here&#8217;s the one I couldn&#8217;t stop thinking about: Claude Code is the most productive tool I&#8217;ve added to my workflow in years, and it quietly assumes you only do one thing at a time.</p><p>Real engineering doesn&#8217;t work like that. You&#8217;re refactoring a module when a production bug lands. A teammate pings you for a review while you&#8217;re in the middle of a test. An idea for a side experiment shows up right as you&#8217;re about to ship. The work is parallel. The tools are serial.</p><p>You already know the workaround. Stash your changes. Check out a new branch. Maybe clone the repo again. Maybe reach for git worktree add, copy your .env by hand, run your setup script, open another terminal, start another Claude session, and, three minutes later, try to remember what you were doing in the first place.</p><p>None of this is hard. That&#8217;s the trap. Each step is small, so you don&#8217;t notice the cost. But at the end of the day, you&#8217;ve paid it fifty times. Terminal tabs you can&#8217;t identify. Worktrees rotting in ~/code/worktrees/. Claude sessions whose IDs you&#8217;ll never find again. A quiet tax on every task.</p><p>I got tired of paying it. So I built <a href="https://github.com/juliensimon/canopy">Canopy</a>.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;2fc94ffc-106b-40a2-871e-4cc3045882db&quot;,&quot;duration&quot;:null}"></div><p><a href="https://github.com/juliensimon/canopy">Canopy</a> is a native macOS app that turns parallel Claude Code sessions into a first-class workflow.</p><p>One window. Tabbed worktrees. &#8984;1&#8211;9 to jump between them. Conversations that auto-resume when you reopen the app &#8212; no --resume flags, no session archaeology. 
Per-project config for the files you always copy and the setup commands you always run. One click to merge a worktree back and clean it up, so nothing rots.</p><p>The thing I find most interesting, now that I&#8217;ve been using it for a couple of months, isn&#8217;t any single feature. It&#8217;s what disappears. The friction between &#8220;I should look at this&#8221; and actually looking at it collapses to nothing. Context switches stop feeling like context switches. You stop batching work to avoid the overhead, because there isn&#8217;t any overhead left to avoid.</p><p>It&#8217;s built in SwiftUI &#8212; no Electron, no web view &#8212; signed and notarized, and free under the AGPL-3.0 license. If you use Claude Code seriously, I think it&#8217;s worth ten minutes of your afternoon.</p><p>&#10145;&#65039; Download: <strong><a href="https://github.com/juliensimon/canopy/releases/latest/download/Canopy.dmg">https://github.com/juliensimon/canopy/releases/latest/download/Canopy.dmg</a></strong></p><p>&#10145;&#65039; Homebrew: <strong>brew install --cask juliensimon/canopy/canopy</strong></p><p>&#10145;&#65039; Source: <strong><a href="https://github.com/juliensimon/canopy">https://github.com/juliensimon/canopy</a></strong></p><p>If you try it, please give it a &#11088;&#65039; and tell me what to build next. That&#8217;s how this keeps getting better!</p><div><hr></div><p></p>]]></content:encoded></item><item><title><![CDATA[The Verification Tax]]></title><description><![CDATA[The GPU ate the AI narrative. 
The CPU kept doing the work.]]></description><link>https://www.airealist.ai/p/the-verification-tax</link><guid isPermaLink="false">https://www.airealist.ai/p/the-verification-tax</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Fri, 10 Apr 2026 15:13:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!R8s8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R8s8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R8s8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 424w, https://substackcdn.com/image/fetch/$s_!R8s8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 848w, https://substackcdn.com/image/fetch/$s_!R8s8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!R8s8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!R8s8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2071400,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193799578?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R8s8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 424w, https://substackcdn.com/image/fetch/$s_!R8s8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 848w, https://substackcdn.com/image/fetch/$s_!R8s8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!R8s8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F964af7c0-feb9-4977-92ff-323a324d6536_4096x2304.jpeg 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In April 2025, a team at Together AI and Agentica tried to teach a 14-billion-parameter model to write better code. The method: generate candidate solutions, run them against unit tests, reward the ones that pass. Reinforcement learning from verifiable rewards &#8212; RLVR. The model ran on 32 Nvidia H100 GPUs. The GPUs were not the bottleneck.</p><p>The bottleneck was the CPUs. Every training step required the model to generate 16 candidate solutions per problem, then execute each in a sandboxed environment against 5 or more unit tests. 
At training scale &#8212; a thousand problems per iteration &#8212; that meant over sixteen thousand separate verifications per step, each requiring sandboxed execution, output comparison, and cleanup. Together AI had to build a custom verification service capable of processing over 1,000 code executions per minute across 100 concurrent sandboxes.[1] The verification layer, not the generation layer, determined how fast the model could learn.</p><p>This is the pattern the industry hasn&#8217;t named yet. GPU compute for AI generation scales linearly and benefits from every hardware generation Nvidia ships. CPU compute for AI verification scales superlinearly but resists acceleration. The workload is arbitrary program execution, which requires operating system services, file system access, and process isolation that GPU architectures do not provide.[2] As AI shifts from &#8220;generate&#8221; to &#8220;generate and verify,&#8221; the verification layer becomes the bottleneck on how fast models improve through reinforcement learning. The Verification Tax is the CPU-side cost multiplier that grows with every increase in RL training ambition &#8212; more completions per prompt, more complex verification, longer execution times &#8212; and it compounds in ways the GPU scaling curves do not capture.</p><p>The CPU&#8217;s role in AI datacenters is not shrinking. It is differentiating into three distinct jobs &#8212; feeding GPUs, verifying RL outputs, and running inference for smaller models &#8212; and all three are growing simultaneously. The most consequential of the three is verification, because it creates genuinely new demand that cannot be served by the hardware the industry spent the last three years stockpiling.</p><h2>Three jobs the GPU narrative erased</h2><h3>The feeder</h3><p>Every GPU training run depends on CPUs for data loading, preprocessing, tokenization, and batch assembly. When the CPU cannot keep pace, GPUs idle. 
Nvidia&#8217;s own documentation acknowledges that dense multi-GPU systems &#8220;train models much faster than data can be provided by the input pipeline, leaving GPUs starved for data.&#8221;[3]</p><p>The industry&#8217;s response has been to throw more cores at the problem. In the DGX-2 (2018), each V100 GPU had roughly three CPU cores. In the DGX A100 (2020), sixteen. The GB300 NVL72, Nvidia&#8217;s current flagship, deploys thirty-six Grace ARM cores per Blackwell Ultra GPU, connected via NVLink-C2C (Nvidia&#8217;s chip-to-chip coherent link) at 900 GB/s. That is a twelve-fold increase in cores per GPU from the DGX-2 era.[4] At hyperscale, this investment has worked: Meta&#8217;s 54-day Llama 3 405B training run on 16,384 H100 GPUs recorded 419 unexpected interruptions, but only two were CPU failures: the binding constraint was storage throughput, not CPU compute.[5]</p><p>The feeder role is being architecturally resolved. The verification role is not.</p><h3>The verifier</h3><p>Reinforcement learning from verifiable rewards &#8212; the method behind DeepSeek-R1&#8217;s reasoning capabilities, DeepCoder&#8217;s coding performance, and an increasing share of frontier model post-training &#8212; works in three steps. First, the model generates candidate answers on GPU. Second, each answer is tested for correctness on CPU: run the code, check the math, compare the output. Third, the results feed back to the model as a training signal: reinforce what worked, penalize what didn&#8217;t. Generation dominates wall-clock time, at roughly 80&#8211;90% of each training step, though pipelining increasingly overlaps it with verification.[6]</p><p>The majority of RLHF for general instruction following still uses learned reward models that run entirely on GPU, adding VRAM pressure but no CPU verification cost. 
Constitutional AI, RLAIF, and reward model ensembles &#8212; the dominant approach at Anthropic, Google, and OpenAI for non-code tasks &#8212; are GPU-on-GPU pipelines.[7] The Verification Tax applies where correctness is checked by execution, not by a model.</p><p>For math RL &#8212; the dominant form in DeepSeek-R1&#8217;s primary training &#8212; verification is trivially cheap: extracting a numerical answer and comparing to ground truth costs under a millisecond on any CPU. Format compliance via regex is similarly lightweight. The Verification Tax is negligible for these workloads. But code verification &#8212; executing generated programs in sandboxed environments against unit test suites &#8212; costs one to ten seconds per execution, a thousand to ten thousand times more expensive per check.[8] For agentic verification, the cost compounds further: each check may require loading test fixtures, spinning up mock APIs, populating environment state, and comparing final outcomes &#8212; a setup that can exceed the execution time itself. Multi-step agent tasks requiring environment interaction can take minutes. The tax is workload-dependent, and the workloads where it is highest are exactly the workloads the industry is scaling toward: code generation, agent training, and tool use.</p><p>GRPO &#8212; Group Relative Policy Optimization, the algorithm that powered DeepSeek-R1 &#8212; amplifies CPU demand through group sampling.[9] For each prompt, GRPO generates &#8216;G&#8217; candidate completions, typically four to sixteen, with well-funded labs pushing to sixty-four. Rewards are normalized within each group. At G=8 &#8212; a common configuration for cost-constrained code RL &#8212; a training step with 1,024 prompts requires 8,192 separate verifications. For code RL with five unit tests per problem, that becomes approximately 41,000 test executions per step. At G=16, the number doubles to 82,000. 
With ten-second execution timeouts, completing this within a reasonable window demands hundreds of concurrent CPU cores running sandboxed environments &#8212; each consuming 500 MB to 2 GB of RAM for its isolated process, loaded dependencies, and test fixtures.[10]</p><p>The DeepCoder training run is the best-documented example at scale. Together AI trained a 14B model on 32 H100 GPUs for two and a half weeks using GRPO+. Each RL iteration verified over a thousand coding problems, each against multiple unit tests, requiring the custom Together Code Interpreter to run more than 100 concurrent sandboxes at over 1,000 executions per minute.[11] Two leading open-source RL training frameworks &#8212; ByteDance&#8217;s veRL and Hugging Face&#8217;s TRL &#8212; confirm the pattern: veRL&#8217;s documentation shows that allocating greater CPU resources for concurrent code verification reduces the reward computation stage by 10&#8211;30%, and TRL&#8217;s GRPOTrainer delegates verification entirely to user-provided reward functions, leaving the verification infrastructure as a gap the user must fill.[12]</p><p>The verification layer can also be decoupled from the training cluster and scaled independently on commodity compute. Together AI&#8217;s TCI service is exactly this &#8212; a separate verification endpoint. In theory, the approximately 87% of cloud CPU capacity that sits idle on average could absorb verification workloads at near-zero marginal cost.[13] In practice, large RL training runs operate in dedicated datacenter clusters where idle general-purpose cloud instances are not co-located, and the latency penalty of routing verification through distant commodity compute degrades RL training efficiency. 
veRL&#8217;s architecture interleaves reward computation with sampling for precisely this reason: the tighter the generation-verification loop, the faster training converges.[14] The decoupled approach works, but at a cost to convergence speed, and speed is what frontier labs compete on.</p><p>The structural problem: GPU compute for generation scales roughly linearly with model size and output length. CPU compute for verification scales with the product of completions &#215; tests &#215; execution time. As models improve, they write more complex code, which requires longer execution times. As training matures, more diverse test suites are needed to prevent reward hacking. Models learn to exploit weak test suites &#8212; writing code that passes the tests without solving the problem. The response is more tests, more edge cases, more adversarial inputs. That arms race directly increases verification cost. The result is a superlinear CPU scaling requirement that increases with the ambition of RL training, and no hardware accelerator addresses it.</p><p>Jensen Huang named the consequence on Nvidia&#8217;s most recent earnings call: &#8220;The number of tokens that are being generated has really, really gone exponential, and so we need to inference at a much higher speed.&#8221;[15] Dion Harris, Nvidia&#8217;s head of AI infrastructure, was more specific at GTC 2026: &#8220;CPUs are becoming the bottleneck in terms of growing out this AI and agentic workflow.&#8221;[16] Nvidia anticipated this: Grace shipped in 2023 as a GPU companion, and the March 2026 launch of Vera &#8212; marketed as the first CPU &#8220;purpose-built for agentic AI&#8221; &#8212; positions Nvidia as the only vendor selling both the generation silicon and the verification silicon in the same rack. 
Bank of America projects the datacenter CPU market could more than double, from $27 billion in 2025 to $60 billion by 2030 &#8212; driven substantially by AI inference and verification demand.[17]</p><h3>The inference engine</h3><p>The third CPU job is the one vendors talk about most and practitioners deploy least: running LLM inference directly on CPUs without a GPU.</p><p>The economics are real but bounded. For quantized models with fewer than 7 billion parameters on existing infrastructure, where CPU cycles are essentially free at the margin, CPU inference is often the cheapest option. Intel&#8217;s Xeon processors with AMX (Advanced Matrix Extensions) can run quantized Llama 3.2 3B at up to 57 tokens per second &#8212; double the throughput achieved without AMX.[18] AMD&#8217;s PACE framework with speculative decoding achieves approximately 380 tokens per second on Llama 3.1 8B using EPYC 9575F processors, per AMD&#8217;s published benchmarks.[19] ARM-based servers are competitive: AWS Graviton instances running llama.cpp show up to four times the performance of x86 alternatives in favorable configurations.[20]</p><p>The validation that matters came in February 2026, when Meta became the first hyperscaler to deploy Nvidia Grace CPUs as standalone processors at scale &#8212; without GPU companions &#8212; for agentic AI workloads.[21] Nvidia&#8217;s Ian Buck confirmed that Grace delivers &#8220;2x the performance per watt on those backend workloads&#8221; in Meta&#8217;s datacenters.[22] The deployment targets workloads that are memory-bandwidth-intensive rather than compute-intensive: agent orchestration, tool calling, context management, and sequential reasoning chains that waste GPU parallelism.</p><p>The boundary is clear. Below roughly 7 billion parameters, CPU inference is cost-effective on existing infrastructure. 
From 7 to 20 billion, the economics depend on utilization and latency requirements.[23] Above 30 billion, CPUs cannot deliver acceptable latency for interactive use cases. AMD&#8217;s own benchmarks show a 70-billion-parameter model producing first-token latency of 76 seconds for 32 concurrent requests on EPYC 9965 &#8212; functional for offline processing, not for a chatbot.[24]</p><p>The harder constraint is software, not silicon. Red Hat has stated explicitly that vLLM &#8220;is not intended for CPU-based inference and has not been optimized for CPU performance.&#8221;[25] Intel and AMD are contributing CPU backends &#8212; Intel through SGLang with native AMX support, AMD through ZenDNN &#8212; but production maturity lags GPU serving stacks by years. The most advanced CPU inference stack in existence, Apple&#8217;s Metal and MLX framework, is highly optimized for Apple Silicon &#8212; and unavailable in the datacenter. The hardware is arriving faster than the software to run it.</p><h2>The Verification Tax</h2><p>Consider a lab training a coding model with GRPO on 1,024 problems per iteration at G=8, with 5 tests per completion. That is 1,024 &#215; 8 &#215; 5 = 40,960, or roughly 41,000 sandbox executions per training step. At ten seconds per execution, the upper end of the range, the verification workload is about 114 CPU-hours per step &#8212; hundreds of thousands of CPU-hours across a training run of several thousand steps. Scale to G=16 and the numbers double. The GPU cluster generating those completions may be 32 H100s; the CPU cluster verifying them needs hundreds of cores running continuously.</p><p>Now increase the ambition. Double the batch size. Move from math verification (under one millisecond per check) to code verification (one to ten seconds). 
Raise G from 4 to 16.</p><p>Moving from math RL to code RL with larger batches can increase CPU verification demand by a factor of 8,000 to 80,000 (2&#215; batch, 4&#215; G, and 1,000&#215; to 10,000&#215; per check), while GPU demand for the generation phase increases only eightfold.[26]</p><p>The GPU-to-CPU compute ratio inverts.</p><p>The tax compounds along three dimensions that each grow with training ambition. Completions per prompt (G in GRPO): increasing G from 4 to 16 to 64 improves the reward signal quality but linearly multiplies CPU verification demand.[27] Verification complexity per completion: the cost gap between math checking and code execution spans orders of magnitude; agent verification adds another. Batch scale: every additional prompt adds G more verifications.</p><p>This is why the Verification Tax matters for infrastructure planning. A CTO designing an RL training cluster who provisions CPUs based on supervised training ratios will discover, mid-training, that verification is the pacing constraint. The GPUs will generate completions faster than the CPUs can verify them. Training will slow to the speed of the verification layer &#8212; not because the GPUs are expensive, but because the CPUs were assumed to be free and nobody budgeted for them. The required CPU-to-GPU core ratio depends on verification complexity: math RL needs no more than the supervised training ratio, while code RL at G=16 can require an order of magnitude more CPU cores than the GPU cluster&#8217;s own hosts provide.</p><h2>Who benefits from the demand shift?</h2><h3>Intel&#8217;s accidental position</h3><p>Intel&#8217;s AI accelerator strategy is a graveyard. Ponte Vecchio struggled. Rialto Bridge was shelved. Falcon Shores was demoted to an internal test chip in January 2025.[28] Gaudi 3 missed its revenue targets. 
Intel&#8217;s interim co-CEO Michelle Johnston Holthaus conceded: &#8220;We&#8217;re not yet participating in the cloud-based AI data center market in a meaningful way.&#8221;[29]</p><p>This leaves Xeon as Intel's de facto AI product &#8212; an Infrastructure Reversion by default rather than design.[30] Intel's strategic pivot reflects the recognition. In February 2026, the company announced a multi-year collaboration with SambaNova, backed by approximately $50 million in investment from Intel Capital. The deal positions Xeon as the CPU foundation for heterogeneous inference paired with SambaNova's RDU accelerators.[31] Intel is the only CPU vendor submitting standalone CPU results to MLPerf Inference benchmarks, and Xeon 6 has been selected as the host CPU for Nvidia's next-generation DGX Rubin NVL8.[32]</p><p>The irony is that the Verification Tax creates exactly the demand profile Intel&#8217;s remaining asset can serve. RL verification requires high core counts, large memory, and general-purpose compute &#8212; precisely what Xeon does. Intel&#8217;s upcoming Clearwater Forest (288 E-cores, Intel 18A, H1 2026) and Diamond Rapids (up to 192 P-cores, H2 2026) are positioned for the density and single-thread performance, respectively, that verification workloads demand.[33] Intel did not plan for this to be its primary AI opportunity. Diamond Rapids includes AI-specific features, but the CPU was never the flagship bet. The market arrived anyway. Whether Intel can convert volume demand into margin improvement is a different question: Xeon carries lower margins than accelerators would have, and AMD and ARM are eroding Intel&#8217;s share of the CPU market that the Verification Tax is expanding.</p><h3>AMD and ARM are eating from both sides</h3><p>AMD&#8217;s datacenter share is accelerating. 
Mercury Research data shows AMD reaching approximately 35.5% of x86 server revenue by Q4 2024, with supply chain estimates suggesting it was approaching 40% by early 2025.[34] Datacenter revenue hit $3.7 billion in Q1 2025, up 57% year-over-year.[35] AMD&#8217;s dual strategy &#8212; EPYC for inference on models with fewer than 20 billion parameters, Instinct GPUs for larger models &#8212; is coherent and gaining traction. EPYC Turin&#8217;s 192 cores, 12 DDR5 channels, and 160 PCIe Gen5 lanes give it a raw density advantage over Intel for both the feeder role and CPU inference.[36]</p><p>ARM is expanding from approximately 15% of datacenter CPUs at the end of 2024 toward an ambitious 50% target.[37] Nvidia&#8217;s Grace CPUs dominate the GPU-companion role &#8212; both the GB200 and GB300 NVL72 racks are all-Grace platforms &#8212; and Meta&#8217;s standalone deployment validates Grace for inference and agentic workloads independent of GPUs.[38] In March 2026, ARM launched its first silicon product, the AGI CPU, co-designed with Meta: 136 Neoverse V3 cores and a claimed 2x the performance per rack versus x86.[39] SoftBank&#8217;s $6.5 billion acquisition of Ampere Computing &#8212; which absorbed approximately 1,500 employees, the majority in chip design &#8212; further consolidated ARM&#8217;s datacenter position.[40]</p><p>On paper, SoftBank now owns the instruction set (ARM), the leading independent server CPU designer (Ampere), and an AI accelerator design team (Graphcore) &#8212; a vertically integrated silicon stack that parallels Nvidia&#8217;s Grace-to-Blackwell integration from the CPU side. 
Whether SoftBank can execute on integration remains unproven; its track record with semiconductor acquisitions is mixed at best, and Graphcore generated $4 million in revenue the year before SoftBank bought it.[40]</p><h3>Oracle&#8217;s exit tells you where the puck isn&#8217;t</h3><p>Oracle sold its 32.27% stake in Ampere to SoftBank in November 2025, booking a $2.7 billion pre-tax gain.[41] Larry Ellison framed the exit as &#8220;chip neutrality&#8221; &#8212; Oracle would deploy whatever silicon customers wanted, rather than building its own.[42] The reversal was strikingly rapid: as recently as September 2024, Oracle had disclosed options that could have given it majority control of Ampere by 2027.[43]</p><p>The catalyst was Stargate. Oracle is deploying over 450,000 Nvidia GB200 GPUs at its flagship Abilene campus and has committed to over $300 billion in additional AI infrastructure capacity.[44] At this scale, owning an ARM CPU company introduces perceived bias. Oracle chose agility over integration &#8212; the opposite of Amazon (Graviton), Google (Axion), and Microsoft (Cobalt), all of which design custom ARM server CPUs in-house. Amazon&#8217;s Jassy disclosed this week that its custom chip business &#8212; Graviton, Trainium, Nitro &#8212; generates over $20 billion annually and may be sold externally, validating the integration strategy Oracle rejected.[45]</p><p>Oracle has not stopped using Ampere chips &#8212; it launched A4 instances on AmpereOne M processors in October 2025, claiming 30% better price-performance than AMD EPYC-based alternatives.[46] But under chip neutrality, Ampere is one vendor among many, and Oracle&#8217;s long-term commitment to future Ampere generations remains unstated. 
The structural signal is that Oracle concluded it could capture more value as a neutral infrastructure provider riding Nvidia&#8217;s GPU wave than as a chip company competing on CPU design.</p><h3>The sovereignty gap</h3><p>The Verification Tax creates CPU demand that someone has to supply. Among the incumbents, the competition is between AMD, Intel, and ARM. For regions seeking sovereign supply, the question is whether domestic silicon can serve even a fraction of the demand.</p><p>Europe&#8217;s answer is: not yet. Over &#8364;100 billion is committed across the EU Chips Act and IPCEI programs, but the gap between funding and fielded AI silicon remains wide.[47] SiPearl&#8217;s Rhea1, the flagship of the European Processor Initiative, taped out in July 2025 &#8212; with Neoverse V1 cores on a 6nm process that will be two to three generations behind market leaders by the time it ships.[48] Its first deployment, in the JUPITER exascale system at J&#252;lich, will contribute roughly 5 petaFLOPS to a system built around 23,536 Nvidia H200 GPUs.[49] The sovereignty showcase runs on American silicon. Europe&#8217;s longer-term RISC-V initiatives target 2028&#8211;2030 at the earliest.[50]</p><p>China is moving faster. Huawei&#8217;s Ascend 910C delivers approximately 60&#8211;80% of Nvidia H100 performance on FP16 training benchmarks, per industry analyst estimates, with a production target of 600,000 units in 2026.[51] Cambricon reported revenue of RMB 4.6 billion ($630 million) for the first three quarters of 2025, with year-over-year growth in the first half exceeding 4,000% from a near-zero base in 2024.[52] But these are GPU-class accelerators, not CPUs. On the CPU side, Alibaba&#8217;s XuanTie C950 RISC-V server chip &#8212; announced March 2026, with a built-in tensor processing engine &#8212; reaches roughly Apple M1 (2020) performance levels.[53] RISC-V&#8217;s strategic appeal lies in its architectural independence from both x86 licensing and SoftBank&#8217;s ownership of ARM. 
The difference between Europe and China is not ambition &#8212; it is that China is shipping silicon while Europe is funding research programs.</p><p>Neither ecosystem can yet serve the Verification Tax at scale. The workload is general-purpose CPU execution, which means it runs on whatever CPUs are available &#8212; but running it well requires the core density, memory bandwidth, and single-thread performance that only AMD EPYC, Intel Xeon, Nvidia Grace, and ARM&#8217;s newest server cores deliver competitively. The sovereignty gap in CPUs is smaller than in GPUs, but it exists.</p><h2>What would have to break</h2><p>The Verification Tax thesis breaks down under three conditions.</p><p><strong>GPU-native sandboxing becomes viable.</strong> If a vendor develops efficient execution of arbitrary programs &#8212; with OS services, process isolation, and file system access &#8212; on GPU hardware, the verification workload could move to the same silicon that handles generation. This would require GPUs to acquire capabilities they were explicitly designed to lack. Probability: low in the next three years. Not impossible at longer horizons.</p><p><strong>RL shifts entirely to reward-model verification.</strong> If learned reward models running on GPU produce better training signal than verifiable rewards running on CPU, the tax shrinks to near zero. The evidence runs the other way: DeepSeek-R1, DeepCoder, and multiple frontier labs have demonstrated that verifiable rewards produce stronger reasoning capabilities for domains where verification is possible.[54] The trend is toward more verifiable reward, not less.</p><p><strong>Verification workloads stay confined to code and math.</strong> If RL-based post-training remains limited to coding and mathematical reasoning, the Verification Tax affects a significant but bounded workload category. 
But agentic AI &#8212; models that browse the web, call tools, manage files, and interact with environments &#8212; requires verification that is even more CPU-intensive than code execution. Meta&#8217;s acquisition of Manus, which operates containerized virtual machines where parallel agents write code, debug it, and browse the web autonomously, is a demand signal for exactly the CPU-intensive verification infrastructure the Verification Tax predicts.[55] The workload category is expanding, not contracting.</p><p>The architectural implication is the three-phase compute pipeline &#8212; a disaggregation that the industry has not yet formally recognized. Phase one is generation: prefill and autoregressive decoding on GPU or specialized silicon, already being split by the AWS-Cerebras partnership. Phase two is optimization: gradient computation and GPU-weight updates, well-understood and heavily optimized. Phase three is verification: executing candidate outputs against ground truth in sandboxed environments, on CPU. The first two phases have dedicated hardware, mature software stacks, and billion-dollar investment. Phase three runs on commodity CPUs with no dedicated infrastructure. Companies that treat verification as a first-class infrastructure problem &#8212; purpose-built, separately scaled, independently optimized &#8212; will set the efficiency frontier for the next phase of RL-driven model improvement.</p><p>The CPU never left. It was doing the part of AI that nobody talks about at earnings calls &#8212; the part where you check if the answer is right.</p><div><hr></div><h3>Notes</h3><p>[1] Together AI, &#8220;DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level,&#8221; April 2025. 
&#8220;We&#8217;ve been working on reliably scaling the Together Code Interpreter to 100+ concurrent sandboxes and 1k+ sandbox executions per minute.&#8221; <a href="https://www.together.ai/blog/deepcoder">together.ai</a></p><p>[2] GPU architectures are optimized for massively parallel matrix operations with thousands of simple cores sharing a SIMT execution model. General-purpose program execution &#8212; spawning processes, accessing file systems, managing memory isolation, handling I/O &#8212; requires operating system services that GPU execution environments do not provide. Research into GPU-accelerated containers exists but does not address the full sandboxing requirements of code verification (process isolation, timeout management, output capture, resource limits). The primary bottleneck in sandboxed verification is execution time, not sandbox startup: lightweight container runtimes (Firecracker microVMs, gVisor) have reduced per-sandbox overhead to milliseconds, but the code itself still takes seconds to run.</p><p>[3] Nvidia, &#8220;Rapid Data Pre-Processing with NVIDIA DALI,&#8221; Nvidia Developer Blog. <a href="https://developer.nvidia.com/blog/rapid-data-pre-processing-with-nvidia-dali/">developer.nvidia.com</a></p><p>[4] CPU-per-GPU ratios: DGX-2 (2&#215;Xeon Platinum 8168, 48 cores / 16 GPUs &#8776; 3 cores/GPU); DGX A100 (2&#215;AMD EPYC 7742, 128 cores / 8 GPUs = 16 cores/GPU); GB200/GB300 NVL72 (36 Grace CPUs &#215; 72 cores / 72 GPUs = 36 cores/GPU &#8212; identical ratio across both generations). NVLink-C2C provides 900 GB/s of bidirectional bandwidth between the Grace CPU and the Blackwell GPU &#8212; roughly 7&#215; PCIe Gen5 (128 GB/s per x16 slot). Vision and multimodal training remain more CPU-intensive due to on-the-fly image decoding and augmentation; Nvidia&#8217;s DALI library offloads some of this to GPU, yielding up to 72% faster ResNet-18 training vs. native PyTorch DataLoader per AWS benchmarks. 
<a href="https://www.nvidia.com/en-us/data-center/grace-hopper-superchip/">nvidia.com</a></p><p>[5] Meta, &#8220;Building Meta&#8217;s GenAI Infrastructure,&#8221; Meta Engineering Blog, March 2024. 419 interruptions over 54 days; 2 CPU hardware failures (0.5%); GPU issues accounted for 58.7%. <a href="https://engineering.fb.com/2024/03/12/data-center-engineering/building-metas-genai-infrastructure/">engineering.fb.com</a></p><p>[6] Together AI, &#8220;A practitioner&#8217;s guide to testing and running large GPU clusters for training generative AI models.&#8221; The 80&#8211;90% figure is consistent with the veRL documentation, which shows sampling as the primary bottleneck in GRPO training loops. <a href="https://www.together.ai/blog/a-practitioners-guide-to-testing-and-running-large-gpu-clusters-for-training-generative-ai-models">together.ai</a></p><p>[7] The distinction between &#8220;verification by execution&#8221; (CPU-bound) and &#8220;verification by model judgment&#8221; (GPU-bound) is critical for infrastructure planning. Constitutional AI (Anthropic), RLAIF (Google), and reward model ensembles use learned models to score outputs &#8212; these run on GPU and incur no Verification Tax. RLVR and GRPO with code/math verification execute programs or compare answers &#8212; these are CPU-bound. Most general-purpose instruction following still uses the former; math, code, and agentic post-training increasingly use the latter.</p><p>[8] Execution time ranges are based on LiveCodeBench evaluation methodology and Together Code Interpreter documentation. Typical competitive programming problems take 1&#8211;5 seconds; complex system-level tasks can take 10+ seconds. The 1,000&#215;&#8211;10,000&#215; range reflects the gap between sub-millisecond math checking and 1&#8211;10 second code execution &#8212; the specific multiplier depends on program complexity and test suite depth. 
For compiled languages like Rust or C++, the compilation step alone can take 5&#8211;30 seconds, often exceeding execution time; Python-heavy benchmarks like LiveCodeBench avoid this cost because Python is interpreted.</p><p>[9] DeepSeek-AI, &#8220;DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning,&#8221; January 2025. GRPO generates G completions per prompt and normalizes rewards within each group, eliminating the need for a separate critic model. <a href="https://arxiv.org/abs/2501.12948">arxiv.org</a></p><p>[10] Memory footprint per sandbox: a Python process with typical ML test dependencies (NumPy, PyTorch, standard library) consumes 500 MB&#8211;2 GB of RAM in isolation. With 100+ concurrent sandboxes, verification nodes need 50&#8211;200 GB of RAM dedicated solely to sandbox processes, in addition to memory for the orchestration layer. This makes verification nodes memory-hungry in addition to core-hungry, a constraint that infrastructure planning must account for alongside core counts.</p><p>[11] Together AI, &#8220;DeepCoder,&#8221; op. cit. 32 H100 GPUs, 2.5 weeks training, 24K unique problem-test pairs, $0.03 per problem for the Together Code Interpreter.</p><p>[12] ByteDance veRL documentation, &#8220;Sandbox Fusion Example.&#8221; <a href="https://verl.readthedocs.io/en/latest/examples/sandbox_fusion_example.html">verl.readthedocs.io</a> The GRPOTrainer in Hugging Face&#8217;s TRL (Transformer Reinforcement Learning) library similarly delegates reward computation to a user-provided function &#8212; the framework handles the training loop but leaves verification infrastructure as a gap the user must fill. This pattern is consistent across all major RL training frameworks: generation is well-supported; verification is bespoke. 
<a href="https://github.com/huggingface/trl">github.com/huggingface/trl</a></p><p>[13] Cloud CPU utilization: Cast AI reports companies use only 13% of provisioned CPU capacity on average; industry estimates put global datacenter CPU utilization under 30%. Per Data Center Dynamics. <a href="https://www.datacenterdynamics.com/en/news/only-13-of-provisioned-cpus-and-20-of-memory-utilized-in-cloud-computing-report/">datacenterdynamics.com</a></p><p>[14] veRL&#8217;s training pipeline interleaves reward calculation with sampling: &#8220;As soon as a request completes, its reward is computed immediately&#8212;reducing the overhead of reward evaluation, especially for compute-heavy tasks like test case execution for coding.&#8221; Together AI, &#8220;DeepCoder,&#8221; op. cit. This architectural choice &#8212; tight coupling over decoupled services &#8212; reflects the trade-off between training speed and verification latency.</p><p>[15] Jensen Huang, Nvidia Q4 FY2026 earnings call, per CNBC, March 13, 2026. <a href="https://www.cnbc.com/2026/03/13/nvidia-gtc-ai-jensen-huang-cpu-gpu.html">cnbc.com</a></p><p>[16] Dion Harris, Nvidia head of AI infrastructure, per CNBC, March 13, 2026. &#8220;CPUs are becoming the bottleneck in terms of growing out this AI and agentic workflow.&#8221; <a href="https://www.cnbc.com/2026/03/13/nvidia-gtc-ai-jensen-huang-cpu-gpu.html">cnbc.com</a></p><p>[17] Bank of America, datacenter CPU market forecast ($27B in 2025 to $60B by 2030), per CNBC, March 13, 2026. Analyst estimate. <a href="https://www.cnbc.com/2026/03/13/nvidia-gtc-ai-jensen-huang-cpu-gpu.html">cnbc.com</a></p><p>[18] OpenMetal, &#8220;Intel AMX Enables High-Efficiency CPU Inference for AI Workloads.&#8221; AMX benchmark on Llama 3.2 3B quantized. Independent test, not vendor-published. 
<a href="https://openmetal.io/resources/blog/intel-amx-ai-inference-performance/">openmetal.io</a></p><p>[19] AMD, &#8220;Speculative LLM Inference on the 5th Gen AMD EPYC Processors with Parallel Draft Models (PARD) &amp; AMD Platform Aware Compute Engine (AMD PACE).&#8221; Vendor-published benchmark. <a href="https://www.amd.com/en/developer/resources/technical-articles/2025/speculative-llm-inference-on-the-5th-gen-amd-epyc-processors-wit.html">amd.com</a></p><p>[20] ClearML benchmarks of llama.cpp on AWS Graviton instances. Performance advantage varies significantly by model size, quantization, and instance configuration. The 4&#215; figure represents favorable configurations, not a universal comparison. <a href="https://clear.ml/blog/benchmarking-llama-cpp-on-arm-neoverse-based-aws-graviton-instances-with-clearml">clear.ml</a></p><p>[21] CNBC, &#8220;Meta expands Nvidia deal to use millions of AI chips in data center build-out, including standalone CPUs,&#8221; February 17, 2026. Nvidia confirmed &#8220;first large-scale Grace-only deployment.&#8221; <a href="https://www.cnbc.com/2026/02/17/meta-nvidia-deal-ai-data-center-chips.html">cnbc.com</a></p><p>[22] Ian Buck, Nvidia VP and General Manager of Hyperscale and HPC, per The Register, February 17, 2026. <a href="https://www.theregister.com/2026/02/17/meta_nvidia_cpu/">theregister.com</a></p><p>[23] AMD, &#8220;Advance Data Center AI with Servers Powered by AMD EPYC Processors.&#8221; AMD explicitly positions EPYC for &#8220;inference on models up to ~20B parameters.&#8221; <a href="https://www.amd.com/en/products/processors/server/epyc/ai.html">amd.com</a></p><p>[24] AMD whitepaper, EPYC 9965 inference benchmarks. 70B model, 32 concurrent requests at 1,024 tokens: 76-second time-to-first-token. Vendor-published. 
<a href="https://www.amd.com/en/products/processors/server/epyc/ai/9005-inference.html">amd.com</a></p><p>[25] Red Hat, vLLM documentation: &#8220;vLLM is not intended for CPU-based inference and has not been optimized for CPU performance.&#8221; Intel is contributing SGLang CPU backend with native AMX support; AMD is contributing ZenDNN 5.2 for vLLM on EPYC. Apple&#8217;s Metal Performance Shaders and MLX framework deliver state-of-the-art CPU/Neural Engine inference on Apple Silicon (M-series), but Apple does not sell server hardware &#8212; the most advanced CPU inference stack is confined to consumer devices and developer workstations. <a href="https://docs.redhat.com/en/documentation/red_hat_enterprise_linux_ai/1.4/html-single/serving_models/index">docs.redhat.com</a></p><p>[26] Author calculation. Math verification at &lt;1ms per check vs. code verification at 1&#8211;10 seconds = 1,000&#215;&#8211;10,000&#215; cost increase per check. Batch increase of 2&#215; and G increase from 4 to 16 (4&#215;) compound with the per-check cost increase: 2 &#215; 4 &#215; (1,000 to 10,000) = 8,000&#215; to 80,000&#215; increase in total CPU verification demand. GPU generation demand increases by 2 &#215; 4 = 8&#215; (batch &#215; completions). The range reflects uncertainty in the per-check cost; the structural point &#8212; that verification demand increases by orders of magnitude more than generation demand &#8212; holds across the range.</p><p>[27] DeepSeek-AI, &#8220;DeepSeek-R1,&#8221; op. cit. Group size varies across training stages. The paper describes G values ranging from small (4&#8211;8) in early stages to larger values in later stages. G=64 has been reported in some configurations but may not be representative of the primary training stages.</p><p>[28] TechCrunch, &#8220;Intel won&#8217;t bring its Falcon Shores AI chip to market,&#8221; January 30, 2025. 
<a href="https://techcrunch.com/2025/01/30/intel-wont-bring-its-falcon-shores-ai-chip-to-market/">techcrunch.com</a></p><p>[29] Michelle Johnston Holthaus, Intel interim co-CEO, per Yahoo Finance. <a href="https://finance.yahoo.com/news/intel-ai-dreams-slip-further-132705872.html">finance.yahoo.com</a></p><p>[30] The Infrastructure Reversion Test: when a company repeatedly fails at the intelligence layer and retreats to infrastructure bets. See &#8220;Chip and Mortar&#8221; (The AI Realist, 2025) for the framework applied to Amazon. Intel&#8217;s pattern &#8212; Ponte Vecchio, Rialto Bridge, Falcon Shores, Gaudi &#8212; fits the test: serial intelligence-layer failures followed by retreat to the infrastructure asset (Xeon) that was never the strategic bet. <a href="https://www.airealist.ai/">airealist.ai</a></p><p>[31] Intel-SambaNova partnership announced in February 2026. Intel Capital investment of approximately $50M, per industry reporting. <a href="https://sambanova.ai/">sambanova.ai</a></p><p>[32] Intel, &#8220;Intel Delivers Open, Scalable AI Performance in MLPerf Inference v6.0,&#8221; Intel Newsroom. Intel is the only vendor submitting standalone CPU results. DGX Rubin NVL8 host CPU selection per Tom&#8217;s Hardware reporting. <a href="https://newsroom.intel.com/artificial-intelligence/intel-delivers-ai-performance-mlperf-inference-v6-0">newsroom.intel.com</a></p><p>[33] Intel roadmap: Clearwater Forest (288 E-cores, Intel 18A, H1 2026) and Diamond Rapids (up to 192 P-cores, H2 2026), per Tom&#8217;s Hardware. <a href="https://www.tomshardware.com/tech-industry/semiconductors/intel-chip-roadmap-2026-2028">tomshardware.com</a></p><p>[34] Mercury Research, per Tom&#8217;s Hardware. The supply chain estimate approaching 40% by Q1 2025 is from TweakTown, not Mercury Research directly &#8212; treat as B-tier. 
<a href="https://www.tomshardware.com/pc-components/cpus/amd-gained-consumer-desktop-and-laptop-cpu-market-share-in-2024-server-passes-25-percent">tomshardware.com</a></p><p>[35] AMD Q1 2025 datacenter revenue of $3.7 billion, up 57% year-over-year, per AMD earnings release. <a href="https://ir.amd.com/">ir.amd.com</a></p><p>[36] AMD EPYC 9005 Turin specifications: up to 192 cores, 12 DDR5 channels, 160 PCIe Gen5 lanes. <a href="https://www.amd.com/en/products/processors/server/epyc/9005-series.html">amd.com</a></p><p>[37] ARM datacenter share approximately 15% at the end of 2024 and 50% target, per Benzinga, citing ARM&#8217;s own statements. <a href="https://www.benzinga.com/media/25/03/44566020/arm-aims-for-50-data-center-cpu-market-share-by-2025-challenging-intel-and-amd">benzinga.com</a></p><p>[38] Both GB200 and GB300 NVL72 use Grace CPUs (36 per rack) exclusively. GB300 NVL72, with Blackwell Ultra GPUs, is now shipping &#8212; Microsoft has deployed 4,600+ racks. Meta standalone Grace deployment per CNBC, op. cit.</p><p>[39] ARM, &#8220;Arm expands compute platform to silicon products in historic company first,&#8221; Arm Newsroom. AGI CPU with 136 Neoverse V3 cores, co-designed with Meta. <a href="https://newsroom.arm.com/news/arm-agi-cpu-launch">newsroom.arm.com</a></p><p>[40] SoftBank's acquisition of Ampere Computing for $6.5 billion, an all-cash deal, was announced on March 19, 2025, and closed on November 25, 2025. Ampere employs approximately 1,500 people; the majority are in chip design roles, according to industry reports. SoftBank also owns approximately 90% of Arm Holdings and acquired UK AI chip startup Graphcore in July 2024 for an estimated $500&#8211;600M (undisclosed). Graphcore reported &#163;3.4M ($4M) revenue in 2023 against &#163;103M ($131M) losses, per Sifted, citing UK Companies House filings. 
<a href="https://amperecomputing.com/press/softbank-group-to-acquire-ampere-computing">amperecomputing.com</a></p><p>[41] Oracle 8-K filing, SEC EDGAR. Oracle held 32.27% equity stake; it booked a $2.7 billion pre-tax gain in Q2 FY2026 earnings (quarter ended November 2025). <a href="https://www.sec.gov/Archives/edgar/data/0001341439/000119312525314207/orcl-ex99_1.htm">sec.gov</a></p><p>[42] Larry Ellison, per Oracle Q2 FY2026 earnings call, December 2025: &#8220;Oracle sold Ampere because we no longer think it is strategic for us to continue designing, manufacturing, and using our own chips in our cloud datacenters. We are now committed to a policy of chip neutrality.&#8221;</p><p>[43] Oracle&#8217;s acquisition options for Ampere were disclosed in the September 2024 filing. According to The Register, Oracle could take majority control by 2027. <a href="https://www.theregister.com/AMP/2024/09/26/oracle_ampere_stake_cpu/">theregister.com</a></p><p>[44] Oracle Stargate involvement: 450,000+ GB200 GPUs at Abilene, Texas; $300B+ committed capacity. Per OpenAI announcement, January 2025. The $300B figure is a commitment, not disbursed capital &#8212; see &#8220;Hotel Abilene&#8221; (The AI Realist, 2025) for the Commitment-vs-Spend Gap analysis. <a href="https://openai.com/index/announcing-the-stargate-project/">openai.com</a></p><p>[45] Andy Jassy, Amazon's annual shareholder letter, April 9, 2026. Amazon's custom chip business (Graviton, Trainium, Nitro) at $20B+ annualized revenue, growing triple-digit YoY. Jassy: &#8220;If our chips business were a stand-alone business... our annual run rate would be ~$50 billion. There&#8217;s so much demand for our chips that it&#8217;s quite possible we&#8217;ll sell racks of them to third parties in the future.&#8221; Graviton is used by 98% of the top 1,000 EC2 customers. Per Bloomberg and Electronics Weekly. 
<a href="https://www.bloomberg.com/news/articles/2026-04-09/amazon-is-considering-selling-its-ai-chips-to-other-companies">bloomberg.com</a></p><p>[46] Oracle A4 Standard instances launched in October 2025, based on the AmpereOne M (&#8220;Polaris&#8221;) processor. 96 cores at 3.6 GHz, 12-channel DDR5. Oracle claims 35% better core-for-core performance than the prior generation and 30% better price-performance than AMD EPYC alternatives. Vendor-claimed. Per Next Platform. <a href="https://www.nextplatform.com/2025/10/20/polaris-ampereone-m-arm-cpus-sighted-in-oracle-a4-instances/">nextplatform.com</a></p><p>[47] EU Chips Act: &#8364;69 billion in combined public and private investment catalyzed as of October 2025, per SEMI Europe report. This figure includes both direct public funding and private investment mobilized under the Act&#8217;s framework &#8212; it is not &#8364;69B in government spending. IPCEI Microelectronics and Communication Technologies programs commit additional &#8364;30B+ in public-private investment, per the European Commission. European Court of Auditors Special Report 12/2025 assessed the EU&#8217;s 20% global market share target as unlikely to be met. <a href="https://www.semi.org/sites/semi.org/files/2025-11/SEMI_Chips_Act_Report_Full_Report.pdf">semi.org</a></p><p>[48] SiPearl Rhea1 tapeout July 2025. 80 Arm Neoverse V1 cores, SVE vector units, 4 stacks HBM2E, TSMC N6. Current market leaders (ARM AGI CPU, Nvidia Grace/Vera, AWS Graviton4) use Neoverse V2 or V3 cores at 3nm&#8211;4nm process nodes. <a href="https://www.theregister.com/2025/07/09/sipearl_rhea1_tape_out/">theregister.com</a></p><p>[49] JUPITER exascale system at Forschungszentrum J&#252;lich: 1,300+ Rhea1 nodes providing ~5 PFLOPS as &#8220;Universal Cluster&#8221; module alongside 23,536 Nvidia H200 GPUs. Per HPCwire and Next Platform. 
<a href="https://www.nextplatform.com/2025/07/09/with-money-and-rhea1-tapeout-sipearl-gets-real-about-hpc-cpus/">nextplatform.com</a></p><p>[50] DARE project (Digital Autonomy with RISC-V in Europe), launched in March 2025. &#8364;240 million initial EU funding, coordinated by Barcelona Supercomputing Center. Per The Register. <a href="https://www.theregister.com/2025/03/07/dare_europe_risc_v_project/">theregister.com</a></p><p>[51] Huawei Ascend 910C performance estimated at 60&#8211;80% of Nvidia H100 on FP16 training workloads, per SemiAnalysis and industry analyst estimates. Performance varies significantly by workload and software optimization. Production target of 600,000 units in 2026, per Bloomberg. The binding constraint is HBM memory supply, not processor fabrication: CXMT can manufacture approximately 2 million HBM stacks/year, sufficient for 250,000&#8211;300,000 units at current capacity. <a href="https://newsletter.semianalysis.com/p/huawei-ascend-production-ramp">newsletter.semianalysis.com</a></p><p>[52] Cambricon revenue: RMB 4.6 billion ($630M) for the first three quarters of 2025. H1 2025 year-over-year growth exceeded 4,000%. The extraordinary percentage reflects a near-zero H1 2024 base of approximately RMB 60 million &#8212; absolute revenue remains small relative to Nvidia&#8217;s quarterly datacenter revenue of $60B+. ByteDance preorder of 200,000 chips per TrendForce. <a href="https://www.caixinglobal.com/2025-10-21/cambricon-completes-550-million-private-placement-as-revenue-growth-slows-102374118.html">caixinglobal.com</a></p><p>[53] Alibaba T-Head XuanTie C950, announced March 2026. RISC-V server chip with built-in Tensor Processing Engine for INT4/FP8 inference. SPECint2006 scores at approximately Apple M1 (2020) levels, per The Register. RISC-V International is headquartered in Switzerland; the ISA is royalty-free and open-source. 
<a href="https://www.theregister.com/2026/03/25/alibaba_damo_xuantie_c950_chip/">theregister.com</a></p><p>[54] DeepSeek-R1 demonstrated that pure RL with verifiable rewards produces emergent reasoning capabilities, including chain-of-thought and self-correction. DeepCoder confirmed the coding task pattern. The advantage of verifiable rewards is that they provide perfect reward signals &#8212; no reward model approximation error &#8212; for domains where verification is feasible.</p><p>[55] Meta&#8217;s acquisition of Manus and its connection to agentic verification demand per Futurum Group analysis, February 2026: &#8220;Manus operates containerized virtual machines, each running a parallel agent experiment that writes code, debugs it, browses the web, and retries autonomously.&#8221; <a href="https://futurumgroup.com/insights/will-nvidias-meta-deal-ignite-a-cpu-supercycle/">futurumgroup.com</a></p>]]></content:encoded></item><item><title><![CDATA[Open From Both Sides]]></title><description><![CDATA[Chinese proliferation, trivial alignment removal, and the week the safety moat didn't hold.]]></description><link>https://www.airealist.ai/p/open-from-both-sides</link><guid isPermaLink="false">https://www.airealist.ai/p/open-from-both-sides</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Thu, 09 Apr 2026 07:18:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SJH_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SJH_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SJH_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 424w, https://substackcdn.com/image/fetch/$s_!SJH_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 848w, https://substackcdn.com/image/fetch/$s_!SJH_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!SJH_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SJH_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png" width="1456" height="813" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4762199,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193656786?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SJH_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 424w, https://substackcdn.com/image/fetch/$s_!SJH_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 848w, https://substackcdn.com/image/fetch/$s_!SJH_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 1272w, https://substackcdn.com/image/fetch/$s_!SJH_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd03cee8c-8f6b-41dc-8271-956f7f638132_1920x1072.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>On April 7, 2026, Anthropic announced that it had built the most capable AI model it had ever created &#8212; and that it would not release it to the public. The same week, a Chinese lab released an open-weight model under MIT license that achieves nearly 95% of Claude&#8217;s coding performance. And a solo researcher stripped all safety guardrails from Google&#8217;s newest open model in approximately four days, using a technique that requires no training data, no GPU time, and no expertise beyond following a recipe.</p><p>Three events. One week. Three different failure modes for the same strategy.</p><p>The strategy is access restriction &#8212; the idea, foundational to Western AI safety since 2023, that controlling a model means controlling its capabilities. 
Anthropic&#8217;s decision to withhold its new model is the purest expression of this logic. But two forces are now converging, making it untenable. Chinese labs are releasing open-source models that match Western closed ones, trained entirely on domestically manufactured chips that the US tried to restrict. And safety alignment &#8212; the technical mechanism that&#8217;s supposed to make open release safe &#8212; is being removed from any model within days of publication, at zero cost, by anyone who can type a command. The access-restriction era didn&#8217;t die from a single blow. It is being dissolved from opposite directions simultaneously.[1]</p><h2>The model too dangerous to ship</h2><p>Anthropic built <strong>Claude Mythos Preview</strong> as its next-generation foundation model. It was not designed for cybersecurity. The offensive capabilities emerged as a byproduct of broader improvements in coding, reasoning, and autonomous operation &#8212; precisely what makes them alarming.[2]</p><p>In testing, non-expert Anthropic engineers could ask Mythos to find remote code execution vulnerabilities overnight and wake up to a complete, working exploit. The model discovered a vulnerability in OpenBSD&#8217;s TCP SACK implementation &#8212; a signed integer overflow dating to 1998 &#8212; in one of the most security-hardened operating systems ever built.[3] It found a flaw in FFmpeg&#8217;s H.264 codec dating to a 2010 refactor of 2003-era code, a line that had been hit five million times by automated fuzzing tools without triggering detection. FFmpeg&#8217;s maintainers publicly acknowledged the find and shipped three patches in FFmpeg 8.1.[4]</p><p>The performance gap relative to Anthropic&#8217;s prior models is not incremental. Tested against roughly a thousand open-source repositories, Mythos achieved 595 crashes at tiers 1&#8211;2 and ten full control-flow hijacks on fully patched targets. 
Previous Claude models &#8212; Sonnet 4.6 and Opus 4.6 &#8212; managed 150&#8211;175 tier-1 crashes, approximately 100 at tier 2, a single tier 3, and no tier-5 hijacks.[5] The jump from zero to ten at the highest severity tier is a qualitative threshold, not a quantitative improvement.</p><p>Anthropic&#8217;s response was <strong>Project Glasswing</strong>, a $100 million defensive cybersecurity initiative. Mythos access is restricted to twelve core partners &#8212; AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Linux Foundation, Microsoft, Nvidia, Palo Alto Networks, and one unnamed organization &#8212; with approximately forty total organizations receiving some level of access. The model is delivered through Amazon Bedrock as a gated research preview, available only in US East (N. Virginia), with access controlled by an allow-list &#8212; meaning AWS is simultaneously the delivery infrastructure and one of the twelve partners.[6] Anthropic briefed CISA (the Cybersecurity and Infrastructure Security Agency), the Commerce Department, and senior US officials. The company estimates that other AI labs will reach comparable capabilities within six to eighteen months.[7]</p><p>The withholding decision is consistent with Anthropic&#8217;s identity as the safety-first lab. It is also a strategy whose premises are collapsing.</p><h2>A Tsinghua lab on Huawei chips</h2><p>Eleven days before Anthropic&#8217;s announcement, Zhipu AI &#8212; the Tsinghua University spinout now rebranded as Z.ai &#8212; released GLM 5.1 under the MIT license, one of the most permissive open-source licenses in common use.</p><p>The model is a 744-billion-parameter mixture-of-experts architecture with approximately 40 billion active parameters per token, trained on 28.5 trillion tokens using 100,000 Huawei Ascend 910B processors.[8] No Nvidia hardware. No Western chips at all. 
The Ascend 910B is manufactured by SMIC (Semiconductor Manufacturing International Corporation) at 7nm-class process nodes using DUV (deep ultraviolet) lithography &#8212; not the restricted EUV (extreme ultraviolet) process that US export controls were designed to block.</p><p>Zhipu claims GLM 5.1 reaches 94.6% of Claude Opus 4.6&#8217;s coding performance on the Claude Code evaluation harness (scoring 45.3 versus Opus&#8217;s 47.9) and 58.4 on SWE-Bench Pro, which it says beats both Opus 4.6 (57.3) and GPT-5.4 (57.7) on that benchmark.[9] These figures are vendor-claimed and not independently verified as of the release date &#8212; a critical caveat. But the predecessor GLM-5, which shares the same architecture, ranked first among open-weight models on LMArena Text Arena and was the first open-weight model to score 50 on the Artificial Analysis Intelligence Index, with SWE-bench Verified scores holding up under third-party testing.[10]</p><p>The entity behind this model has serious backing. Zhipu was incubated within Tsinghua University&#8217;s Knowledge Engineering Group and has raised approximately $1.5 billion in funding from a mix of China&#8217;s largest technology companies and state-backed investors.[11] Per WireScreen&#8217;s analysis of Chinese corporate filings, Chinese state entities beneficially own 15.4% of the company.[12] OpenAI has publicly stated that Zhipu benefits from over $1.4 billion in state-backed investment.[13] The US Commerce Department added Zhipu to the Entity List effective January 16, 2025, citing military modernization concerns under the most stringent control designation.[14]</p><p>And yet its most capable model sits on Hugging Face right now, downloadable by anyone, under a license that permits unrestricted commercial use.</p><p>GLM 5.1 is not an outlier: it is the latest point on an accelerating curve. DeepSeek R1 (January 2025) matched OpenAI&#8217;s o1 on reasoning benchmarks at roughly one-seventeenth the training cost. 
Qwen 3 from Alibaba (April 2025) outperformed DeepSeek R1 on coding, math, and reasoning. By late 2025, Alibaba&#8217;s Qwen had surpassed Meta&#8217;s Llama as the most forked model family on Hugging Face by download count, with developers creating over 100,000 Qwen-based derivative models on the platform.[15] The capability gap between Chinese open-weight models and Western closed models has compressed to what observers estimate is 6 to 9 months, and the trend line shows no visible inflection point.</p><p>This is the first arm of the pincer. Withholding a model only constrains capabilities if those capabilities don&#8217;t exist elsewhere. When a state-backed lab releases a near-frontier model under MIT license, trained on sanctioned hardware that the entire US export control architecture was designed to restrict, the &#8220;withhold&#8221; strategy addresses the wrong threat.</p><h2>Four days, zero training, one command</h2><p>On April 2, 2026, Google released Gemma 4, its newest open-weight model family. By approximately April 6, a Hugging Face account called <strong>dealignai</strong> &#8212; a self-described amateur ML researcher &#8212; had uploaded a version of the 31-billion-parameter dense model with all safety alignment permanently removed.[16]</p><p>The method is called CRACK: Controlled Refusal Ablation via Calibrated Knockouts &#8212; a variant of the abliteration technique first popularized by Maxime Labonne in mid-2024, building on the Arditi et al. research. It is not fine-tuning. It requires no training data and no GPU training time. CRACK uses 512 structurally mirrored prompt pairs &#8212; one harmful, one harmless &#8212; to identify the directions in the model&#8217;s weight space that encode safety refusal. It then applies magnitude-preserving orthogonal ablation to surgically remove those directions layer by layer. 
In a nutshell, this technique zeroes out the component of the model&#8217;s internal signal that points in the &#8220;refuse this request&#8221; direction, while keeping the signal&#8217;s overall strength unchanged. The model loses the ability to say no while losing almost none of its general capability: think of it as surgically removing one instrument from an orchestra without changing the volume.</p><p>Per dealignai&#8217;s self-reported benchmarks: 93.7% compliance with harmful prompts on HarmBench, with only a two-point drop on MMLU (76.5% to 74.5%).[17] These benchmarks are self-reported and unverified, but the underlying science is peer-reviewed. The foundational paper &#8212; &#8220;Refusal in Language Models Is Mediated by a Single Direction&#8221; by Arditi et al. &#8212; was published as a main conference poster at NeurIPS 2024. The authors demonstrated, across 13 open-source chat models, that safety refusal is encoded along a single direction in the residual stream &#8212; the model&#8217;s internal representation space. Erase that direction, and the refusal disappears.[18] The finding has been replicated extensively. A separate paper at ICLR 2024 by Qi et al. showed that GPT-3.5 Turbo&#8217;s safety guardrails could be compromised by fine-tuning on as few as ten adversarial examples at a cost of less than twenty cents via the OpenAI fine-tuning API.[19]</p><p>The academic literature on proposed defenses is bleak. <strong>RepNoise</strong>, published at NeurIPS 2024, attempted to make safety-critical representations noisy and thus harder to isolate. Qi et al. broke it at ICLR 2025. The same paper demonstrated that seemingly minor factors &#8212; different random seeds, small hyperparameter adjustments &#8212; were sufficient to recover harmful capabilities. 
Fine-tuning on just one hundred benign data points could largely undo the defense.[20] <strong>Tamper-Resistant Safeguards (TAR)</strong>, which claimed to resist hundreds of fine-tuning steps, were broken in the same paper.[21] <strong>Circuit Breakers</strong>, published at NeurIPS 2024, rerouted harmful representations into an orthogonal space; a subsequent study found that automated multi-turn jailbreaks succeeded against circuit-breaker-protected models 54.2% of the time, though this finding is from an arXiv preprint rather than a peer-reviewed venue.[22]</p><p>The Qi et al. paper &#8212; published at ICLR 2025 and co-authored by Nicholas Carlini &#8212; concluded that durably safeguarding open-weight LLMs with current approaches remains challenging and warned that even evaluating these defenses can mislead audiences into thinking safeguards are more durable than they really are.[23]</p><p>The process has been fully automated. An open-source tool called <strong>Heretic</strong> uses Bayesian optimization to find optimal abliteration parameters and can strip alignment from a model with a single command. Over a thousand community-created uncensored models exist on Hugging Face. Dealignai alone has published more than thirty abliterated models &#8212; including Gemma 4, every size of Qwen 3.5, Mistral Small 4, Nvidia Nemotron, and MiniMax M2.5 &#8212; all within days of their respective releases.[24]</p><p>The pattern is now routine: a lab releases a model, and the uncensored version follows almost immediately. Safety alignment has an effective half-life measured in days.</p><p>This is the second arm of the pincer. Releasing a model with safety alignment only constrains the actors who maintain the alignment. When removal is free, automated, and instantaneous, the &#8220;release safely&#8221; strategy addresses the wrong threat. An uncensored GLM 5.1 is not Mythos &#8212; it lacks the frontier-scale capabilities that produce autonomous exploit discovery. 
But the capabilities emerging at the frontier scale today will reach the near-frontier scale within one or two model generations. The threat is the trajectory, not the current snapshot.</p><h2>The distillation bridge</h2><p>The two arms of the pincer appear independent. One is geopolitical (Chinese open-weight proliferation). The other is technical (alignment removal). But Anthropic&#8217;s own evidence reveals how they connect.</p><p>On February 24, 2026, Anthropic published detailed documentation that three Chinese labs &#8212; DeepSeek, Moonshot, and MiniMax &#8212; had systematically distilled capabilities from Claude using approximately 24,000 fraudulent accounts generating over 16 million exchanges.[25] The operation specifically targeted agentic reasoning, tool use, and coding &#8212; the same capabilities that make Mythos dangerous. MiniMax alone accounted for over 13 million of those exchanges and pivoted within twenty-four hours of Anthropic releasing a new model, redirecting nearly half its traffic to capture the latest capabilities.[26]</p><p>Anthropic warned that illicitly distilled models lack necessary safeguards and that, if open-sourced, the risk multiplies as capabilities spread beyond any single government&#8217;s control.[27] The decision to publish this evidence was itself an act of responsible disclosure &#8212; Anthropic could have addressed the distillation quietly. That it chose transparency strengthens the company&#8217;s credibility even as the evidence complicates its withholding strategy.</p><p>This is Anthropic simultaneously making two claims that exist in tension. The first: we must withhold our most capable model because its capabilities are too dangerous to release. 
The second: competitors are already systematically extracting those capabilities through the API and will soon release them in models with no safeguards.</p><p>If the second claim is true &#8212; and Anthropic&#8217;s own evidence is persuasive &#8212; then withholding Mythos delays proliferation by the six-to-eighteen-month window Anthropic estimates before competitors match its capabilities independently. The FFmpeg patches prove the window&#8217;s value at the individual-vulnerability level: defenders patched a sixteen-year-old flaw because Anthropic found it first. A year of defensive advantage in cybersecurity saves real systems from real exploits. But patching individual bugs does not address the structural problem: a model that can find thousands of vulnerabilities across every major OS and browser, running unrestricted on anyone&#8217;s hardware. The window is a reprieve, not a strategy.</p><p>The distillation evidence also reveals the bridge between the two arms of the pincer. Capabilities developed inside a closed model leak through the API layer via distillation. Once distilled into an open-weight model and released &#8212; as DeepSeek R1 was, under a permissive open license, in January 2025 &#8212; the alignment removal techniques apply. The model that was too dangerous to release is now downloadable, uncensored, and free.</p><p>You cannot export-control a sequence of API calls.[28]</p><h2>Governance for the governed</h2><p>Each of these developments was foreseeable. The problem is that Western AI governance was built on three premises that are now simultaneously false.</p><p><strong>Premise one: frontier capabilities require frontier compute.</strong> The January 2025 AI Diffusion Framework &#8212; the first US export controls on model weights &#8212; explicitly exempted open-weight models. 
The Bureau of Industry and Security acknowledged the logic: once weights are released, they can be copied and sent anywhere instantaneously.[29] The framework relied instead on controlling chips and closed weights above a high compute threshold. But it also created a sliding threshold mechanism: as open models improve, controls on closed models at equivalent capability levels automatically relax. This means open-weight proliferation systematically erodes export controls by design &#8212; each release lowers the ceiling. GLM 5.1&#8217;s training on sanctioned Huawei chips demonstrates that even the hardware bottleneck is not holding: the entire US semiconductor export control architecture was designed to prevent exactly this outcome.</p><p><strong>Premise two: safety alignment is technically durable.</strong> The NTIA&#8217;s July 2024 report on dual-use foundation models with widely available model weights &#8212; the definitive US government analysis &#8212; found that safety mitigations that work for closed models do not reliably work for models with widely available weights.[30] The UK AI Security Institute&#8217;s research found that scaffolding-based defenses &#8212; safety layers bolted around the model rather than trained into it &#8212; can be trivially disabled and that once weights are released, the system cannot be rolled back.[31] As of April 2026, no peer-reviewed defense against alignment removal has survived adversarial evaluation.
The one promising direction &#8212; extended-refusal fine-tuning, which distributes the safety signal across multiple dimensions rather than concentrating it in one &#8212; remains untested at scale against adaptive adversaries.</p><p><strong>Premise three: access restriction equals capability restriction.</strong> This is the core of what I&#8217;ve called in prior pieces &#8220;governance for the governed.&#8221;[32] Anthropic&#8217;s withholding of Mythos constrains actors who would have accessed capabilities through Anthropic. It does nothing about the actor who downloads GLM 5.1 from Hugging Face, runs CRACK on it, and obtains a model with no safety restrictions and near-frontier coding capability. The safety framework governs the governed. The actors who justify the controls operate entirely outside it. Alignment still prevents casual misuse by unsophisticated users &#8212; that value is real but narrow. For enterprise risk teams, the implication is concrete: &#8220;aligned model&#8221; should be treated as a configuration state, not a security property, and is subject to change the moment the model leaves the provider&#8217;s infrastructure.</p><p>The convergence is now industry-wide. On the same day this was written, Meta launched Muse Spark &#8212; its most capable model to date, built by the Meta Superintelligence Lab &#8212; and, for the first time, did not release it as open-weight.[33] The company that built its entire AI identity on open-weight Llama has concluded that its frontier model should stay closed, whether for safety reasons, competitive ones, or both. Anthropic withholds. Meta withholds. And across the Pacific, Zhipu releases under the MIT license. The Western labs are converging on closure at exactly the moment the strategy loses its structural advantage.</p><p>The regulatory landscape offers no resolution.
The EU AI Act provides partial exemptions for open-weight models on transparency and documentation, with full obligations triggering only above the systemic risk threshold.[34] That threshold is based on training compute &#8212; a metric that distillation and post-training advances have decoupled from capability. A model distilled from a frontier system can match its performance at a fraction of the compute, sailing under the regulatory threshold as if it were harmless. The Trump administration&#8217;s July 2025 AI Action Plan contains what may be the strongest federal endorsement of open-weight models to date, calling them strategically valuable.[35] China maintains a dual strategy, requiring domestic models to align with core socialist values through mandatory security assessments while aggressively promoting open-weight releases internationally.[36] There is no enforceable international coordination mechanism for managing these dynamics.</p><h2>What Anthropic&#8217;s own evidence proves</h2><p>The most revealing document in this entire episode is the distillation disclosure. Anthropic&#8217;s evidence that Chinese labs systematically harvested Claude&#8217;s capabilities is credible &#8212; the scale (24,000 accounts, 16 million exchanges), the targeting (agentic reasoning, tool use, coding), and the speed (twenty-four-hour pivots to new model releases) all point to organized, state-adjacent operations.</p><p>But follow the chain. Anthropic builds a frontier model. It restricts access. Competitors distill capabilities through the API anyway. They incorporate those capabilities into their own models, which they release open-weight. Once open-weight, anyone can strip safety alignment in hours using automated tools and peer-reviewed techniques. The model that was too capable to ship is now running uncensored on someone&#8217;s laptop.</p><p>Each link in this chain has been demonstrated independently. Anthropic documented the distillation.
Zhipu documented the open-weight release. Arditi et al. documented the removal of alignment. The full sequence has not yet been traced end-to-end in a single case but every component is operational, and the direction is clear.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Va9G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Va9G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 424w, https://substackcdn.com/image/fetch/$s_!Va9G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 848w, https://substackcdn.com/image/fetch/$s_!Va9G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 1272w, https://substackcdn.com/image/fetch/$s_!Va9G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Va9G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png" width="1456" height="1178" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1178,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:268836,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193656786?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Va9G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 424w, https://substackcdn.com/image/fetch/$s_!Va9G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 848w, https://substackcdn.com/image/fetch/$s_!Va9G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 1272w, https://substackcdn.com/image/fetch/$s_!Va9G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84d02808-5286-4be6-96aa-c937de8666b7_2720x2200.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The Bloomsbury Intelligence and Security Institute drew an instructive parallel to digital piracy and digital rights management. The proliferation of open-weight models, adversarial jailbreaking techniques, and divergent international regulatory frameworks makes comprehensive AI safety unattainable.[37] Just as copy protection was routinely cracked and legal victories against individual platforms proved futile as alternatives proliferated, AI safety enforcement faces the same architecture: the defense is expensive and brittle; the attack is cheap and automated. BISI predicts a shift from prevention to harm reduction, accepting that some misuse is inevitable and allocating resources to mitigate the most severe outcomes.</p><h2>What comes after the safety moat</h2><p>The honest assessment is uncomfortable for every stakeholder. 
Anthropic&#8217;s withholding of Mythos is a reasonable act by a responsible actor operating within a framework that no longer constrains the broader system. The six-to-eighteen-month window matters: every patched vulnerability is a real defense. Concede that. But the window is compressing with every Chinese open-weight release and every automated alignment-removal tool.</p><p>Three dynamics will shape what replaces the access-restriction model.</p><p>First, the capability gap will continue to compress. GLM 5.1 at 94.6% of Claude Opus 4.6 today becomes parity within another iteration or two. The compression is not only Chinese: in the same week, a US open-weight model matched Claude Sonnet 4.6 on at least one practitioner&#8217;s production evaluations, a first for any open-source model against that benchmark.[38] The trajectory since DeepSeek R1 &#8212; January 2025 to April 2026 &#8212; has been relentless. Each new release arrives closer to the frontier, trained on hardware that export controls were supposed to deny.[39]</p><p>Second, alignment removal will continue to get easier. From manual dataset curation in 2023, to the discovery of the refusal direction in 2024, to automated single-command tools in 2025, each step has lowered the barrier. The cost asymmetry is stark: safety alignment demands massive compute and millions of reinforcement learning interactions to install. Removal requires zero training and zero cost. This asymmetry is not incidental: it follows from the finding that alignment is encoded in a shallow, low-dimensional subspace that can be identified and erased without touching the model&#8217;s general capabilities.[40]</p><p>Third, the irreversibility problem has no solution. Once model weights are released, they exist permanently. They cannot be recalled, patched, or updated. They can be infinitely copied at zero marginal cost. 
This is fundamentally different from nuclear materials or biological agents, where possession can be physically tracked and limited. The comparison to digital piracy is apt not as a metaphor but as a structural precedent: the same enforcement architecture that failed for music, film, and software is now failing for AI safety, for the same reasons.</p><p>This thesis would be wrong if any of the three conditions were met. First, a peer-reviewed defense against alignment removal that survives adversarial evaluation at scale within twelve months. Second, export controls that demonstrably prevent Chinese labs from reaching frontier capability for two consecutive model generations. Third, a coordinated international framework that achieves enforceable restrictions on open-weight model release. None appears likely. But they are the conditions to watch.</p><p>Project Glasswing itself may point toward what replaces the safety moat. It is not an access-restriction strategy: it is a harm-reduction strategy. Anthropic is deploying Mythos&#8217;s capabilities defensively, through vetted partners, to patch vulnerabilities before attackers find them. That is closer to the post-piracy model BISI describes: stop trying to prevent all misuse and concentrate resources on the most consequential defensive applications. Whether the AI safety community will make that transition deliberately &#8212; or have it forced on them &#8212; is the open question.</p><p>Anthropic withheld. Zhipu released. dealignai cracked. The capabilities are in the open, ungoverned and ungovernable by any framework that assumes controlling the model controls the capability. The safety moat was real. The water found a way around it.</p><div><hr></div><h3>Notes</h3><p>[1] The three events occurred between approximately March 27, 2026 (GLM 5.1 release) and April 7, 2026 (Anthropic Glasswing announcement). 
The Gemma 4 &#8220;crack&#8221; appeared approximately April 6, per Hugging Face upload timestamps.</p><p>[2] Anthropic, &#8220;<a href="https://red.anthropic.com/2026/mythos-preview/">Assessing Claude Mythos Preview&#8217;s cybersecurity capabilities</a>,&#8221; April 7, 2026. Anthropic describes the capabilities as &#8220;emergent&#8221; &#8212; arising from general capability improvements rather than cybersecurity-specific training.</p><p>[3] Ibid. The OpenBSD vulnerability was independently confirmed: OpenBSD published patch 025_sack.patch.sig for version 7.8, and Simon Willison verified the code age via git blame (<a href="https://simonwillison.net/2026/Apr/7/project-glasswing/">simonwillison.net, April 7, 2026</a>).</p><p>[4] Ibid. The target was FFmpeg&#8217;s H.264 codec, not a standalone &#8220;video software&#8221; application. The &#8220;five million&#8221; figure refers to the number of times the specific code line was executed by automated fuzzing tools &#8212; not five million discrete test runs. FFmpeg publicly acknowledged the patches; see PiunikaWeb, &#8220;<a href="https://piunikaweb.com/2026/04/08/ffmpeg-thanks-claude-mythos-16-year-bug-fix/">FFmpeg thanks Anthropic&#8217;s Claude Mythos for real 16-year bug fix</a>,&#8221; April 8, 2026.</p><p>[5] Ibid. The repository count is described as &#8220;roughly a thousand.&#8221; The tier-5 figure for prior models is implied by contrast (&#8220;zero&#8221; is inferred from the absence of any positive claim), not explicitly stated.</p><p>[6] AWS, &#8220;<a href="https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-bedrock-claude-mythos/">Amazon Bedrock now offers Claude Mythos Preview (Gated Research Preview)</a>,&#8221; April 7, 2026. Available in US East (N. Virginia) only. Access controlled by allow-list; AWS account teams contact approved organizations directly.
AWS CISO Amy Herzog published a companion blog post: &#8220;Building AI Defenses at Scale: Before the Threats Emerge.&#8221;</p><p>[7] Anthropic, &#8220;<a href="https://www.anthropic.com/glasswing">Project Glasswing</a>,&#8221; April 7, 2026. The twelve core partners and the six-to-eighteen-month capability timeline are from the announcement. Logan Graham, Head of Frontier Red Team, quoted in <a href="https://edition.cnn.com/2026/04/07/tech/anthropic-claude-mythos-preview-cybersecurity">CNN, April 7, 2026</a>.</p><p>[8] Z.ai, &#8220;<a href="https://z.ai/blog/glm-5.1">GLM-5.1: Towards Long-Horizon Tasks</a>,&#8221; approximately March 27, 2026. Architecture details from <a href="https://docs.z.ai/guides/llm/glm-5">Z.ai developer documentation</a>. Z.ai states 744B total parameters; HuggingFace metadata shows 754B &#8212; the discrepancy likely reflects different counting methodologies for shared or embedding parameters. The 100,000 Huawei Ascend 910B figure is consistently reported across coverage and attributed to Z.ai, though the primary blog post renders via JavaScript and could not be directly verified in plain text. Vendor-published. Note: SMIC&#8217;s role as the Ascend 910B fabricator is based on TechInsights teardown analysis, not official Huawei or SMIC confirmation. &#8220;7nm-class&#8221; reflects that SMIC does not officially confirm process nodes for specific customers.</p><p>[9] Ibid. All benchmark scores are vendor-claimed and not specifically independently verified for GLM 5.1. Claude Code harness: 45.3 &#247; 47.9 = 94.57%. SWE-Bench Pro comparison scores: GPT-5.4 at 57.7, Opus 4.6 at 57.3.</p><p>[10] The predecessor GLM-5 achieved an LMArena Text Arena score of 1452 (first among open-weight models) and an Artificial Analysis Intelligence Index score of 50. SWE-bench Verified score of 77.8% held up under third-party testing per OfficeChai analysis.</p><p>[11] Zhipu was founded by Tsinghua University professors Tang Jie and Li Juanzi. 
Funding total assembled from Tracxn ($1.4B across 12 rounds as of early 2025), Caixin Global (over &#165;10B as of July 2025), and subsequent rounds. &#8220;Approximately $1.5 billion&#8221; reflects total through late 2025. Key investors include Alibaba, Tencent, Ant Group, Saudi Aramco&#8217;s Prosperity7, and multiple Chinese state-backed funds. In March 2025 alone, Zhipu received $257 million from three state-backed investors: Hangzhou City Investment (~$137M), Zhuhai&#8217;s Huafa Group (~$69M), and a Chengdu Hi-Tech zone fund (~$41.5M). Per <a href="https://www.usnews.com/news/technology/articles/2025-03-19/chinese-ai-firm-zhipu-raises-257-million-in-state-backed-funding-spree">Reuters, March 19, 2025</a>.</p><p>[12] WireScreen analysis cited in The Wire China, &#8220;<a href="https://www.thewirechina.com/2025/02/16/what-is-zhipu-ai/">What is Zhipu AI?</a>,&#8221; February 16, 2025. WireScreen was co-founded by David Barboza, a Pulitzer Prize-winning former NYT journalist. The 15.4% figure reflects beneficial ownership calculations based on Chinese corporate filings; methodology involves analytical judgments about which entities qualify as &#8220;state entities.&#8221;</p><p>[13] OpenAI Global Affairs, &#8220;<a href="https://openaiglobalaffairs.substack.com/p/chinese-progress-at-the-front">Chinese Progress at the Front</a>,&#8221; Substack, June 25, 2025. OpenAI&#8217;s actual phrasing: &#8220;over $1.4 billion in state-backed investment.&#8221;</p><p>[14] <a href="https://www.federalregister.gov/documents/2025/01/15/2025-00636/framework-for-artificial-intelligence-diffusion">Federal Register document 2025-00704</a>, effective January 16, 2025. Zhipu received a Footnote 4 designation &#8212; the most stringent control level.</p><p>[15] DeepSeek R1 training cost comparison from DeepSeek&#8217;s own technical report. 
Qwen adoption statistics from Hugging Face Hub data and Andreessen Horowitz partner estimates, as reported in <a href="https://www.technologyreview.com/2026/02/12/1132811/whats-next-for-chinese-open-source-ai/">MIT Technology Review, February 12, 2026</a>, and <a href="https://www.understandingai.org/p/the-best-chinese-open-weight-models">Understanding AI</a>.</p><p>[16] <a href="https://huggingface.co/dealignai/Gemma-4-31B-JANG_4M-CRACK">Hugging Face model page</a>. dealignai describes itself as &#8220;Amateur ML Researcher&#8221; on its <a href="https://huggingface.co/dealignai">Hugging Face organization page</a>.</p><p>[17] Self-reported benchmarks from the model card. CRACK methodology description from the same. The 512 prompt pairs figure is dealignai&#8217;s specific configuration; the number is configurable in the broader ablation methodology. CRACK is a variant of the abliteration technique first popularized by Maxime Labonne in his Hugging Face blog post &#8220;<a href="https://huggingface.co/blog/mlabonne/abliteration">Uncensor any LLM with abliteration</a>&#8221; (June 2024), which turned the Arditi et al. research finding into a practical, reproducible tool. The engineering step from research paper to community tool is itself evidence of how fast the research-to-exploitation pipeline operates. These figures have not been independently verified.</p><p>[18] Andy Arditi, Oscar Balcells Obeso, Aaquib Syed, Daniel Paleka, Nina Panickssery (Rimsky), Wes Gurnee, and Neel Nanda, &#8220;Refusal in Language Models Is Mediated by a Single Direction,&#8221; NeurIPS 2024 (main conference poster). Published in Advances in Neural Information Processing Systems 37, pp. 136037&#8211;136083. <a href="https://arxiv.org/abs/2406.11717">arXiv: 2406.11717</a>. The 13-model claim and &#8220;single direction in the residual stream&#8221; are verbatim from the abstract.
A February 2026 follow-up (<a href="https://arxiv.org/abs/2602.02132">arXiv 2602.02132</a>) challenges the single-direction account, arguing for geometrically distinct refusal directions across categories. Important caveat: the 13 models studied were all RLHF- or DPO-aligned open-weight chat models. Whether the single-direction finding applies to models aligned via Constitutional AI (Anthropic&#8217;s approach) or other methods is untested. This does not affect the piece&#8217;s thesis &#8212; the vulnerability applies to all open-weight models, which are RLHF/DPO-aligned &#8212; but readers should not assume the finding transfers to closed models with different alignment architectures.</p><p>[19] Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, and Peter Henderson, &#8220;Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!&#8221; ICLR 2024. <a href="https://arxiv.org/abs/2310.03693">arXiv: 2310.03693</a>. The paper demonstrated safety degradation with as few as 10 adversarial examples; the &#8220;$0.20&#8221; cost figure refers to 100 examples via the OpenAI fine-tuning API.</p><p>[20] Xiangyu Qi, Boyi Wei, Nicholas Carlini, Yangsibo Huang, Tinghao Xie, Luxi He, Matthew Jagielski, Milad Nasr, Prateek Mittal, and Peter Henderson, &#8220;<a href="https://arxiv.org/abs/2412.07097">On Evaluating the Durability of Safeguards for Open-Weight LLMs</a>,&#8221; ICLR 2025. The RepNoise target: Domenic Rosati, Jan Wehner, Kai Williams, &#321;ukasz Bartoszcze, Jan Batzner, Hassan Sajjad, and Frank Rudzicz, &#8220;<a href="https://arxiv.org/abs/2405.14577">Representation Noising: A Defence Mechanism Against Harmful Finetuning</a>,&#8221; NeurIPS 2024.</p><p>[21] Tamirisa et al., &#8220;<a href="https://arxiv.org/abs/2408.00761">Tamper-Resistant Safeguards for Open-Weight LLMs</a>,&#8221; also published at ICLR 2025. Broken by the same Qi et al.
paper.</p><p>[22] Anna Googasian, Marisol Koyamparambil Mamachan, Diana Ngo, Aravind Srinivasan, and Jacob Dunefsky, &#8220;<a href="https://arxiv.org/abs/2507.02956">A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks</a>,&#8221; arXiv preprint, June 2025. This paper is a preprint and has not been published at a peer-reviewed venue. The Circuit Breakers target: Andy Zou, Long Phan, Justin Wang, Derek Duenas, Maxwell Lin, Maksym Andriushchenko, Rowan Wang, Zico Kolter, Matt Fredrikson, and Dan Hendrycks, &#8220;<a href="https://arxiv.org/abs/2406.04313">Improving Alignment and Robustness with Circuit Breakers</a>,&#8221; NeurIPS 2024.</p><p>[23] This is the same Qi et al. (2025) paper cited in notes 20&#8211;21. Carlini is the third author.</p><p>[24] <a href="https://github.com/p-e-w/heretic">Heretic</a>. dealignai&#8217;s Hugging Face profile lists 32+ model uploads, including Gemma 4 31B, Qwen 3.5-VL (0.8B through 397B), Mistral Small 4, Nvidia Nemotron, and MiniMax M2.5.</p><p>[25] Anthropic, &#8220;<a href="https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks">Detecting and preventing distillation attacks</a>,&#8221; February 24, 2026. The exact figures: &#8220;approximately 24,000 fraudulent accounts&#8221; and &#8220;over 16 million exchanges.&#8221;</p><p>[26] Ibid. MiniMax accounted for over 13 million of the approximately 16 million exchanges (~79%). The twenty-four-hour pivot and &#8220;nearly half their traffic&#8221; redirected are Anthropic&#8217;s characterization.</p><p>[27] Ibid.</p><p>[28] Export controls can, in principle, apply to technology transfer via API under the deemed export rule (EAR &#167; 734.13), which covers making controlled technology available to foreign nationals. Anthropic&#8217;s distillation complaint is essentially arguing that API-based extraction constitutes such a transfer. 
The legal question is not jurisdiction but enforcement: detecting and preventing millions of automated API calls from spoofed accounts across multiple jurisdictions is operationally impractical at scale.</p><p>[29] Bureau of Industry and Security, &#8220;<a href="https://www.federalregister.gov/documents/2025/01/15/2025-00636/framework-for-artificial-intelligence-diffusion">Framework for Artificial Intelligence Diffusion</a>,&#8221; Federal Register 2025-00636, effective January 15, 2025. The open-weight exemption and sliding threshold mechanism are from the rule text. &#8220;Sliding threshold&#8221; and &#8220;erosion by design&#8221; are the author&#8217;s analytical framing &#8212; the rule itself sets a fixed compute threshold (10^26 operations) for closed-weight model export controls, with open-weight models exempt below that level. As open models improve toward that threshold, the practical gap between controlled and uncontrolled capability narrows.</p><p>[30] NTIA, &#8220;<a href="https://www.ntia.gov/federal-register-notice/2024/dual-use-foundation-artificial-intelligence-models-widely-available-model-weights">Dual-Use Foundation Artificial Intelligence Models with Widely Available Model Weights</a>,&#8221; U.S. Department of Commerce, July 30, 2024. The report found that safety mitigations designed for closed models &#8220;do not reliably work&#8221; for models with widely available weights. Paraphrase, not a direct quote. 
Critically, the report&#8217;s overall recommendation was against restricting the availability of open-weight models, favoring monitoring over mandated restrictions.</p><p>[31] UK AI Security Institute, &#8220;<a href="https://www.aisi.gov.uk/blog/managing-risks-from-increasingly-capable-open-weight-ai-systems">Managing risks from increasingly capable open-weight AI systems</a>,&#8221; 2025.</p><p>[32] &#8220;Governance for the governed&#8221; is a framework developed across prior AI Realist pieces &#8212; see &#8220;Access, Disable, Destroy&#8221; (coercion stack analysis) and &#8220;Register, Disclose, Pay&#8221; (EU AI copyright enforcement).</p><p>[33] Daniel Howley, &#8220;<a href="https://finance.yahoo.com/sectors/technology/article/meta-launches-muse-spark-ai-model-as-part-of-its-ai-turnaround-171109510.html">Meta launches Muse Spark AI model as part of its AI turnaround</a>,&#8221; Yahoo Finance, April 8, 2026. Muse Spark is the first model from Meta&#8217;s Superintelligence Lab (MSL), led by Scale AI founder Alexandr Wang. The article states: &#8220;Unlike prior AI models, Meta isn&#8217;t making Muse Spark open source, but rather says it hopes to make future versions of the model open.&#8221; B-tier source (named journalist).</p><p>[34] <a href="https://artificialintelligenceact.eu/article/53/">EU AI Act, Article 53</a> (obligations for providers of general-purpose AI models). Open-weight models receive lighter obligations unless they exceed the systemic risk compute threshold (10^25 FLOPs).</p><p>[35] White House, &#8220;Artificial Intelligence Action Plan,&#8221; July 2025.
Analysis per Stanford HAI and Skadden, Arps.</p><p>[36] China&#8217;s dual approach: mandatory security assessments under the &#8220;Interim Administrative Measures for Generative Artificial Intelligence Services&#8221; (effective August 15, 2023) require alignment with &#8220;core socialist values.&#8221; Simultaneously, Chinese labs aggressively release models internationally under permissive licenses (DeepSeek R1 under MIT, GLM 5.1 under MIT, Qwen under Apache 2.0).</p><p>[37] Bloomsbury Intelligence and Security Institute, &#8220;<a href="https://bisi.org.uk/reports/ai-safety-lessons-from-digital-piracy">AI Safety: Lessons from Digital Piracy</a>,&#8221; 2025.</p><p>[38] Flo Crivello (<a href="https://x.com/Altimor/status/2041199915943215163">@Altimor</a>), CEO of Lindy AI, X post, approximately April 7, 2026. Crivello reported that Arcee Trinity Large Thinking was the first open-source model to beat Claude Sonnet 4.6 on Lindy&#8217;s internal production evaluations. Lindy&#8217;s evals are proprietary, application-specific (agentic workflows), and not independently verified. Independent benchmarks (SWE-bench Verified, Vals AI, LM Market Cap) show Trinity trailing Sonnet 4.6 on coding and general tasks. B-/C+ tier source: named practitioner with real production deployment, but narrow evals with zero external citations. The model is <a href="https://huggingface.co/arcee-ai/Trinity-Large-Thinking">Arcee AI&#8217;s 398B MoE</a> (13B active), released under Apache 2.0. Disclosure: the author was Chief Evangelist at Arcee AI until November 2025.</p><p>[39] Author&#8217;s assessment based on the trajectory from DeepSeek R1 (January 2025) through Qwen 3 (April 2025) through GLM 5.1 (March 2026). Each release narrowed the capability gap with Western closed models on standardized benchmarks.</p><p>[40] The structural asymmetry follows from Arditi et al. 
(2024): alignment is encoded in a low-dimensional subspace that can be identified and removed without degrading general capability. Installing alignment requires high-dimensional optimization (RLHF at scale); removing it requires low-dimensional surgery (abliteration). This cost asymmetry is inherent to the current alignment architecture, not an implementation failure.</p>]]></content:encoded></item><item><title><![CDATA[Train, Deploy, Write Down]]></title><description><![CDATA[The AI industry&#8217;s most expensive assets depreciate faster than anything in corporate history. The accounting rules haven&#8217;t noticed.]]></description><link>https://www.airealist.ai/p/train-deploy-write-down</link><guid isPermaLink="false">https://www.airealist.ai/p/train-deploy-write-down</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Tue, 07 Apr 2026 16:50:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!woGT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!woGT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!woGT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 424w, 
https://substackcdn.com/image/fetch/$s_!woGT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 848w, https://substackcdn.com/image/fetch/$s_!woGT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 1272w, https://substackcdn.com/image/fetch/$s_!woGT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!woGT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8182685,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193482903?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!woGT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 424w, https://substackcdn.com/image/fetch/$s_!woGT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 848w, https://substackcdn.com/image/fetch/$s_!woGT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 1272w, https://substackcdn.com/image/fetch/$s_!woGT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7cee814-f400-4b7d-8aa1-a13b54b199dc_2816x1584.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>On March 11, 2026, OpenAI quietly retired GPT-5.1 from ChatGPT.[1] The model had been live for four months. Its replacement, GPT-5.4, was already shipping. Between August 2025 and March 2026, OpenAI released five major versions of GPT-5 &#8212; from 5.0 through 5.4, roughly one every six weeks &#8212; plus their mini and nano variants, each superseding the last.[2] No version of the current product lasted longer than 4 months before being deprecated or demoted to fallback status.</p><p>The pre-trained base model underneath all five versions likely cost north of $100 million to produce.[3] The base itself was still generating derivative value. But each post-trained variant &#8212; the product the customer actually used &#8212; lived and died on a cycle shorter than a single fiscal quarter.</p><p>There is no accounting standard designed for this.</p><h2>The gap nobody is measuring</h2><p>The AI industry runs on a three-stage training pipeline. Stage one is pre-training: the massive, months-long compute job that produces a foundation model &#8212; the raw intelligence. Epoch AI estimates these runs have grown in cost at 2.4&#215; per year since 2016, with the most expensive now exceeding $1 billion.[4] Stage two is post-training: reinforcement learning, alignment tuning, and capability refinement that turns the foundation into a product. Stage three is distillation: compressing the large model&#8217;s capabilities into smaller, cheaper variants for high-volume deployment.</p><p>The financial structure of this pipeline is what the accounting standards fail to capture. 
The pre-training run is the nine- or ten-figure expenditure, but it functions as a platform, not a product. OpenAI&#8217;s GPT-5 base has generated at least 5 major derivative versions over the past 7 months and counting. Google&#8217;s Gemini 2.5 Pro serves as the teacher model from which Gemini Flash and Flash Lite are distilled &#8212; a relationship Google&#8217;s own technical report confirms.[5] Anthropic&#8217;s Opus and Sonnet tiers share a similar architecture, though the company has not publicly disclosed whether Sonnet is distilled from Opus or trained independently at a smaller scale. <strong>The pre-training base is the platform; the post-training variants are the products; and the products depreciate at a rate that no GAAP framework was designed to capture.</strong></p><p>Under current US GAAP, companies face a classification problem with no clean answer.[6] If a company treats a training run as research and development, the full cost &#8212; $200 million, $500 million, whatever the bill &#8212; hits the income statement in the quarter it&#8217;s incurred. The balance sheet never sees it. If the company instead tries to capitalize the training run as software &#8212; spreading the cost over multiple years, the way it would with a conventional software project &#8212; the accounting rules create a catch-22: you can only start capitalizing costs after the major technical uncertainties are resolved, but for a frontier model, the uncertainty about whether it will achieve target performance isn&#8217;t resolved until training is substantially complete.[7] The FASB tightened this test in September 2025, and for companies selling model access as a service rather than using it internally, an even stricter standard applies&#8212;one in which the capitalization trigger is essentially indistinguishable from training completion. 
As EY flatly noted in December 2025, the guidance &#8220;does not provide specific guidance for AI software development.&#8221;[8]</p><p>The result is a classification vacuum. Most frontier labs expense training costs as R&amp;D, which means the model-layer depreciation problem hits the income statement immediately, arguably a conservative treatment. The larger risk sits one layer down: the hardware purchased to run the training is capitalized as property and equipment and depreciated over five to six years. And the gap between those two treatments &#8212; the model that lives three to twelve months, sitting on hardware that depreciates over six years &#8212; is where the financial fiction lives. The three-stage pipeline genuinely extends the economic utility of both the model and the hardware. A pre-training base that generates five derivative versions over eighteen months is better capital efficiency than one model, one training run, one product. But the accounting standards don&#8217;t reflect the pipeline&#8217;s structure: they apply a single useful-life estimate to hardware that serves three distinct workloads with three distinct depreciation curves.</p><h2>How the pipeline actually works</h2><p>The shift that enabled the current iteration speed was not primarily algorithmic. It was the transition from human-gated to machine-gated feedback.</p><p>The old regime &#8212; reinforcement learning from human feedback, or RLHF &#8212; required human labelers to rank model outputs, which were then used to train a reward model that guided the RL loop. The human labeling step was the bottleneck. 
Each post-training iteration required a fresh campaign of thousands of ranked comparisons, took weeks, cost real money in annotator time, and scaled with headcount rather than compute.[9] This is why the gap between GPT-4 and GPT-4o was measured in months, not weeks.</p><p>The new regime &#8212; reinforcement learning with verifiable rewards (RLVR) &#8212; replaces human labelers with automated verifiers. Does the code compile? Do the tests pass? Is the math correct? Did the tool call return the right result? The verification runs at machine speed.[10] No humans in the loop. The iteration cadence is now constrained by compute availability rather than labeler throughput. That is what makes it feasible to have five GPT-5 versions in seven months. With RLHF, each would have required a new annotation campaign. With RLVR, you run more GPU-hours.</p><p>The compute profile of this post-training loop matters for the hardware story. Regardless of which specific RL algorithm a lab uses &#8212; OpenAI has not disclosed its choice &#8212; all RL post-training methods share a structural characteristic: the generation phase consumes the majority of the compute budget.[11] Every iteration of the loop requires the model to generate multiple complete responses to each training prompt, score them, and then compute a gradient update. The generation step is inference. The gradient update is training. In practice, post-training is an inference-dominated workload wearing training&#8217;s clothes &#8212; the GPUs spend most of their time generating tokens sequentially, bottlenecked by how fast they can read model weights from memory, with the gradient update that actually changes the model&#8217;s weights consuming a small fraction of the total compute.[12]</p><p>The economic consequence is precise. Pre-training requires the latest and most expensive accelerators &#8212; tight GPU synchronization across thousands of chips, maximum interconnect bandwidth, cutting-edge precision formats. 
There is no substitute for frontier silicon at this stage. But post-training can, in principle, run its generation phase on previous-generation hardware, because the generation phase is inference. An H100 generating RL training samples does not need to be a Blackwell B200. It just needs enough memory bandwidth to run the model forward at a reasonable speed.</p><h2>The hardware cascade: real today, threatened tomorrow</h2><p>This is where the training pipeline structure intersects with the hardware depreciation problem that hyperscalers have been managing &#8212; or mismanaging &#8212; through accounting policy.</p><p>The implicit financial model behind extended useful lives is what the industry calls the computing cascade. New GPUs enter service for frontier pre-training. After twelve to eighteen months, the next GPU generation arrives, and the older chips migrate to post-training workloads and inference. After another cycle, they cascade further to batch processing, fine-tuning, and edge deployment. The cascade model is the financial justification for depreciating a GPU over five or six years, even though its frontier training life is eighteen months at best.[13]</p><p>The cascade is real. CoreWeave reported in late 2025 that its five-year-old A100 GPUs remained fully booked.[13] A100s that originally sold for $15,000&#8211;$25,000 trade on the secondary market at $8,000&#8211;$18,000, depending on variant &#8212; not zero.[14] The inference workload that absorbs cascaded hardware is genuinely large and growing: by most estimates, inference will consume 80% of AI compute cycles by 2030.[15] For labs running RLVR post-training loops, the generation cluster provides a natural home for last-generation silicon.</p><p>But two forces are converging to compress the cascade window.</p><p>First, post-training compute is scaling toward parity with pre-training. 
Epoch AI estimates that OpenAI scaled its RL compute roughly tenfold between o1 and o3, and projects that post-training compute will soon match pre-training budgets.[16] When the RL generation cluster needs as many GPUs as the pre-training cluster, the &#8220;old GPUs can handle it&#8221; thesis strains. The workload is still inference-dominated, but the scale demands frontier throughput.</p><p>Second, purpose-built decode architectures are attacking the GPU&#8217;s specific weakness at the silicon level. Cerebras&#8217;s WSE-3 uses 44GB of on-chip SRAM to eliminate the high-bandwidth memory (HBM) bottleneck that makes sequential token generation slow on conventional GPUs. Nvidia itself acknowledged the threat by acquiring Groq &#8212; the inference chip startup whose LPU architecture targets the same bottleneck &#8212; for $20 billion in December 2025 and launching the Groq 3 at GTC 2026.[17] As Cerebras&#8217;s CTO noted after watching the keynote: Jensen &#8220;essentially acknowledged that GPUs can&#8217;t really compete&#8221; in the high-value, high-speed inference segment above 400 tokens per second. These architectures are optimized for exactly the workload that the cascade model depends on old GPUs to serve, but the threat extends beyond production inference. Because RLVR post-training is itself inference-dominated, the same silicon that accelerates serving users could accelerate the generation cluster inside the training loop. A lab that runs GRPO-style group generation on purpose-built decode hardware and reserves conventional GPUs solely for gradient updates would need far fewer general-purpose training chips for post-training. The cascade doesn&#8217;t just lose its Tier 3 (inference) market; it also loses most of Tier 2 (post-training). 
Meanwhile, the hyperscalers themselves are investing in custom silicon for their own inference workloads, further reducing their demand for cascaded GPUs without any of that silicon reaching the broader secondary market.</p><p>The cascade works today. KKR has offered the strongest version of this argument: temporary overbuilds behave like rolling upgrades rather than stranded assets, because new AI workloads absorb excess capacity and power constraints naturally limit the potential for overbuild.[35] That may prove correct for physical infrastructure broadly &#8212; data centers, power systems, network fabric. It does not address the trained model itself, nor the hardware specifically built for AI workloads that Amazon has already begun writing down. The cascade&#8217;s financial logic may not survive the next two hardware cycles. And the depreciation schedules that depend on it extend five to six years into the future.</p><h2>The Great Hyperscaler Divergence</h2><p>The clearest evidence that the depreciation question is unresolved &#8212; not just theoretically but in practice, inside the companies making the largest bets &#8212; is that two hyperscalers took opposite actions in the same quarter under identical technological conditions.</p><p>In January 2025, Amazon shortened the useful life of a subset of its servers and networking equipment from six years to five years. 
The company&#8217;s SEC filing stated the reason explicitly: &#8220;the increased pace of technology development, particularly in the area of artificial intelligence and machine learning.&#8221;[18] The nine-month impact through September 2025 was an increase in depreciation expense of $889 million.[19] Amazon also retired some equipment early, taking $920 million in accelerated depreciation charges in Q4 2024.[20] Total operating income impact for 2025: approximately $1.3 billion.</p><p>In the same month, Meta extended the estimated useful life of its servers and network assets to 5.5 years &#8212; its third extension in three years.[21] The expected reduction in 2025 depreciation: approximately $2.9 billion.[22] Meta&#8217;s extension history runs from four years (pre-2022) through 4.5 and five years (2022) to the current 5.5 years, each step coinciding precisely with an acceleration in AI capital expenditure.[23]</p><p>Same quarter. Same AI hardware cycle. Same underlying technology. Amazon absorbed a $1.3 billion hit to reported income on a subset of its AI-specific servers. Meta gained $2.9 billion across its full server fleet &#8212; a figure based on assets in service as of December 31, 2024, with the benefit compounding as Meta deploys $115&#8211;135 billion in new capital through 2026.[23] The $4.2 billion swing between these two decisions is not an operational difference &#8212; it is a pure accounting choice. Amazon cited AI obsolescence as the reason for the shortening. Meta cited asset durability as the reason for the extension.</p><p>The specificity of Amazon&#8217;s reversal matters. This was not a blanket reassessment &#8212; it targeted a subset of servers, the ones most exposed to the AI hardware upgrade cycle. Amazon had already extended useful lives twice, from four to five years in 2022 and from five to six in 2023, capturing billions in depreciation savings along the way. 
The reversal is the admission that the earlier extensions were too aggressive for hardware running AI workloads and that the engineering reality caught up with the accounting assumption. The &#8220;subset&#8221; language also implies that Amazon&#8217;s general-purpose servers may still justify longer lives while its AI-specific fleet does not. If auditors accept that distinction, it establishes the precedent that AI hardware and general server hardware require different depreciation schedules, a precedent no other hyperscaler has yet adopted.</p><p>Alphabet had already extended from four to six years in January 2023, reducing FY2023 depreciation by $3.9 billion and increasing net income by $3.0 billion.[24] Microsoft was the first mover, extending from four to six years in 2022.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xih4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xih4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 424w, https://substackcdn.com/image/fetch/$s_!Xih4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 848w, https://substackcdn.com/image/fetch/$s_!Xih4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Xih4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xih4!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png" width="1200" height="779.6703296703297" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:946,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:180372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193482903?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xih4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 424w, https://substackcdn.com/image/fetch/$s_!Xih4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 848w, 
https://substackcdn.com/image/fetch/$s_!Xih4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 1272w, https://substackcdn.com/image/fetch/$s_!Xih4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b3cf8-1158-4e96-92ba-233fb1bd85d4_2400x1560.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The aggregate annual depreciation reduction across Meta, Alphabet, and Microsoft from their extensions is difficult to calculate 
precisely because the benefit compounds as new assets enter at the extended life. But the directional impact is unambiguous: billions of dollars in reported operating income each year reflect accounting policy, not economic reality. And Amazon&#8217;s reversal &#8212; the only hyperscaler to move in the opposite direction &#8212; is the market signal that the cascade model is under stress where it matters most: in the infrastructure built specifically for AI workloads.</p><h2>What the post-training pipeline means for IP</h2><p>The same post-training capabilities that depreciate in months are also the ones most exposed to extraction by competitors &#8212; and the extraction is already happening at an industrial scale.</p><p>Pre-training lays the foundation for the model&#8217;s general intelligence. It cannot be stolen through an API. No amount of querying will extract the model&#8217;s weights, architecture, or training data. The hundred-million-dollar investment in pre-training is protected because the asset is accessible only through the interface the lab chooses to expose.</p><p>Post-training is different. The capabilities that RLVR optimizes &#8212; chain-of-thought reasoning, tool use, code generation, structured problem-solving &#8212; are expressed in the model&#8217;s outputs. They are visible to anyone with API access. And they are exactly the capabilities that distillation can extract.</p><p>In February 2026, Anthropic accused three Chinese AI labs &#8212; DeepSeek, Moonshot AI, and MiniMax &#8212; of running coordinated distillation campaigns against Claude. 
The scale was industrial: over 16 million exchanges through approximately 24,000 fraudulent accounts, using proxy services that operated networks of tens of thousands of accounts simultaneously.[25] The three campaigns targeted Claude&#8217;s most commercially valuable capabilities: DeepSeek focused on reasoning and chain-of-thought extraction; Moonshot targeted coding and vision; and MiniMax targeted agentic tool use and orchestration.[26] Google separately reported distillation attacks on Gemini using more than 100,000 prompts. OpenAI made similar accusations to House lawmakers.[27]</p><p>The economics of distillation theft map precisely onto the training pipeline&#8217;s cost structure. A lab spends months and hundreds of millions on pre-training, which the attacker cannot access. It then spends weeks and millions on post-training, producing the capabilities the customer pays for. The attacker queries the API at inference pricing &#8212; cents per response &#8212; and harvests exactly the post-training outputs that the RL loop optimized. The cost of stealing is structurally proportional to the cost of inference, not to the cost of training.</p><p>Anthropic explicitly drew the connection to export controls: restricting China&#8217;s access to advanced chips limits both direct model training and the scale of illicit distillation.[28] The structural argument is that export controls target Stage 1 (pre-training requires massive compute). Distillation attacks bypass Stage 1 entirely by extracting Stages 2 and 3 from American labs at inference cost. Distillation is not the only path to closing the capability gap &#8212; independent efficiency gains are real, as DeepSeek V3&#8217;s near-frontier performance at a claimed $5.6 million in compute demonstrates. 
But distillation is faster and cheaper for targeted capability extraction, and it scales with API access rather than with compute budgets.</p><p>The domains where RLVR works best &#8212; code, math, tool use &#8212; are the domains most vulnerable to distillation, because they produce structured, evaluable outputs that are rich training signals. The same property that makes a capability rapidly improvable through automated verification makes it rapidly extractable through scaled API querying. The RLVR revolution, which accelerated Western labs&#8217; iteration speed, simultaneously expanded the attack surface for capability theft.[37]</p><h2>What the standards miss and why it matters for valuations</h2><p>The accounting standards were not designed for any of this.</p><p>A pre-training base that costs $200 million and serves as the foundation for 12 to 18 months of derivative products lacks a clear classification. If expensed as R&amp;D, it disappears from the balance sheet immediately, thereby understating the asset base and front-loading costs. If capitalized as internal-use software and amortized over the GAAP-standard three to five years, it overstates the asset&#8217;s life by two to four times. If treated as a platform with derivative products amortized separately, there is no GAAP mechanism to implement this.</p><p>A post-training variant that costs single-digit millions in compute and lives four months before deprecation has no depreciation schedule that makes sense. In practice, no company has established a precedent for amortizing capitalized software over a single fiscal quarter, and auditors would challenge any such attempt.</p><p>A distilled model &#8212; a mini or nano variant &#8212; has near-zero standalone training cost but derives its entire value from the teacher model&#8217;s capabilities. Its useful life is tied to the parent variant&#8217;s survival. No standard addresses this dependency chain.</p><p>The regulatory landscape is blank. 
No SEC rule, staff accounting bulletin, or formal guidance addresses AI cost classification.[29] The FASB issued an Invitation to Comment on intangible asset recognition in December 2024, receiving 43 responses &#8212; including BDO&#8217;s explicit flag that the treatment of large language model training costs is unresolved.[30] The IASB identified AI as a potential test case for its IAS 38 modernization, but solutions are not expected before 2027 at the earliest.[31] The PCAOB has begun targeting AI companies for inspection, but has issued no specific guidance on auditing AI cost classification.[32]</p><p>Dario Amodei offered a framing in mid-2025 that illuminates the problem from the operator&#8217;s perspective &#8212; though it is a management construct rather than a recognized accounting measure. Each model, he argued, can be profitable as a standalone unit: a 2023 model costing $100 million generates $200 million in revenue over its deployment life. But the company trains a $1 billion model in 2024, then a $10 billion model in 2025 &#8212; so the entity-level P&amp;L shows escalating losses even as each vintage is individually profitable.[33] This is structurally analogous to a pharmaceutical company that recoups each drug&#8217;s development costs but spends faster on the next pipeline candidate than revenue from the last one arrives. The difference is that pharma development cycles are 10 to 15 years and are protected by 20-year patents. AI model cycles are three to twelve months and protected by nothing except the cost of pre-training, which is eroding due to distillation attacks.</p><p>Three implications follow. First, any company capitalizing training costs with useful lives exceeding twelve months should trigger immediate scrutiny &#8212; the empirical evidence on variant lifespans suggests much faster economic depreciation. 
This applies most acutely to private AI companies and startups whose valuations implicitly assume the trained model is a durable asset; for public hyperscalers that expense training as R&amp;D, the risk concentrates in the hardware layer. The vacuum also affects M&amp;A: purchase price allocation requires identifying and valuing acquired intangible assets, but no established methodology exists for valuing a trained model with a three-to-twelve-month competitive life.[36]</p><p>Second, the divergence between Amazon&#8217;s depreciation reversal and Meta&#8217;s extension is not a settled debate &#8212; it is an ongoing disagreement over whether the cascade model holds, playing out in real time in SEC filings. The verified annual depreciation impact across three companies alone &#8212; Alphabet&#8217;s $3.9 billion, Meta&#8217;s $2.9 billion, and Amazon&#8217;s $1.3 billion reversal &#8212; demonstrates that the accounting policy choice swings billions in reported income each year. The gap widens as these companies guide a collective $650 billion in 2026 capital expenditure onto balance sheets carrying extended useful lives.[38]</p><p>Third, the circular financing patterns in the AI ecosystem &#8212; where Nvidia invests in OpenAI, which buys Nvidia chips, which generates Nvidia revenue &#8212; echo the vendor financing loops that accelerated the telecom bust, where equipment makers like Cisco and Lucent lent to customers to buy their own products, creating self-referential demand that inflated revenue until the credit collapsed.[34]</p><h2>What breaks</h2><p>The thesis that AI training costs represent a systematically mispriced depreciation risk would be wrong under three conditions. First, if FASB issues AI-specific capitalization guidance that establishes useful-life standards aligned with actual competitive shelf life &#8212; unlikely before 2028, given the current regulatory calendar, but conceivable. 
Second, if the frontier model release cadence slows to eighteen months or longer, making three-to-five-year depreciation schedules defensible &#8212; possible if scaling laws plateau, but inconsistent with the post-training acceleration trend. Third, if a secondary market for trained model weights emerges that establishes residual value for superseded models &#8212; currently nonexistent for proprietary models, though the open-weight ecosystem provides a partial analogue.</p><p>Amazon&#8217;s reversal is the most honest signal in the market. A company that extended useful lives from four to six years, captured billions in depreciation savings, and then reversed course &#8212; explicitly citing AI-driven obsolescence &#8212; is telling you what its own engineers concluded when they examined the hardware the models actually run on. The question is whether the companies still extending useful lives will follow Amazon&#8217;s lead or wait until the write-down is forced.</p><p>The AI industry has built an extraordinary pipeline for converting capital into intelligence. Pre-training creates the platform. Post-training creates the product. Distillation distributes the product at scale. Each stage has a different cost, useful life, and vulnerability to obsolescence and theft. The accounting standards do not recognize any of these distinctions. The depreciation schedules applied to the hardware assume a six-year cascade. The training costs are either invisible (expensed as R&amp;D) or overstated (capitalized at lives that don&#8217;t match reality). And the post-training IP that makes the models commercially valuable &#8212; the capabilities customers actually pay for &#8212; can be extracted at industrial scale by competitors paying inference prices.</p><p>Train, deploy, write down. The question is not whether the write-down comes. 
It is whether it arrives as a managed accounting adjustment or as a market repricing that nobody&#8217;s balance sheet was built to absorb.</p><div><hr></div><h3>Notes</h3><p>[1] OpenAI, &#8220;Model Release Notes,&#8221; updated March 11, 2026. &#8220;As of March 11, 2026, GPT-5.1 models are no longer available in ChatGPT.&#8221; https://help.openai.com/en/articles/9624314-model-release-notes</p><p>[2] GPT-5 launched August 7, 2025; GPT-5.1 November 12, 2025; GPT-5.2 December 11, 2025; GPT-5.3-Codex February 5, 2026; GPT-5.4 March 5, 2026; GPT-5.4 mini/nano approximately March 18, 2026. Sources: Wikipedia entries for GPT-5, GPT-5.1, GPT-5.2; OpenAI blog posts for 5.3 and 5.4 releases. https://help.openai.com/en/articles/6825453-chatgpt-release-notes</p><p>[3] Sam Altman stated at an MIT event in April 2023 that the cost of training GPT-4 was &#8220;more than&#8221; $100 million. Dario Amodei said on the &#8220;In Good Company&#8221; podcast with Norges Bank CEO Nicolai Tangen (July 2024) that current frontier models cost approximately $100 million, with models in training at the time costing &#8220;more like a billion.&#8221; Epoch AI estimates that GPT-5 used less total pre-training compute than GPT-4.5 (which Epoch estimates at approximately $200 million), offset by substantially scaled post-training. The $100 million+ figure for a GPT-5-class pre-training run is a reasonable lower bound based on these independent estimates. https://epochai.substack.com/p/why-gpt-5-used-less-training-compute</p><p>[4] Ben Cottier, Robi Rahman, Loredana Fattorini, Nestor Maslej, and David Owen, &#8220;The rising costs of training frontier AI models,&#8221; arXiv:2405.21015 (2024). The 2.4&#215; annual growth rate (90% CI: 2.0&#215; to 2.9&#215;) is based on analysis of 45 frontier models using amortized hardware CapEx plus energy costs. The paper projects costs will exceed $1 billion by 2027. 
The Epoch AI trends dashboard, updated through early 2026, reports a 3.5&#215;/year growth rate for frontier language models specifically since 2020. https://arxiv.org/abs/2405.21015</p><p>[5] Google DeepMind, &#8220;Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities,&#8221; technical report. &#8220;The smaller models in the Gemini 2.5 series &#8212; Flash size and below &#8212; use distillation,&#8221; using a k-sparse approximation of the teacher model&#8217;s next-token prediction distribution. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf</p><p>[6] Three GAAP standards potentially apply to AI model training costs. ASC 730 (Research and Development) requires all R&amp;D costs to be expensed as incurred. ASC 350-40 (Internal-Use Software) permits capitalization during the &#8220;application development stage&#8221; for software that a company builds for its own use. ASC 985-20 (Software to Be Sold, Leased, or Otherwise Marketed) permits capitalization only after &#8220;technological feasibility&#8221; is established &#8212; a higher bar &#8212; and applies to companies selling model access (OpenAI, Anthropic). The consensus across all Big Four firms is that generative AI applications are &#8220;a form of software&#8221; subject to software development frameworks, not general R&amp;D. But which framework applies&#8212;and when capitalization can begin&#8212;remains judgment-intensive.</p><p>[7] FASB ASU 2025-06, issued September 18, 2025. The ASU eliminates the three-stage (preliminary project, application development, post-implementation) framework for internal-use software and replaces it with a &#8220;probable-to-complete&#8221; threshold, with an exclusion for &#8220;significant development uncertainty.&#8221; Effective for fiscal years beginning after December 15, 2027. 
Summaries: Deloitte, &#8220;FASB Amends Guidance on the Accounting for and Disclosure of Software Costs&#8221; (September 18, 2025), https://www.deloitte.com/us/en/services/audit-assurance/accounting-standards/fasb-amends-guidance-software-costs.html; EY, &#8220;To the Point: FASB modernizes guidance on accounting for software costs&#8221; (December 11, 2025); Forvis Mazars, &#8220;FASB&#8217;s Improvements to Accounting for Internal-Use Software&#8221; (December 2025); KPMG, &#8220;Handbook: Software and website costs&#8221; (February 2026).</p><p>[8] EY, &#8220;Technical Line: Software costs&#8221; (December 11, 2025). Direct quote: &#8220;The guidance does not provide specific guidance for AI software development.&#8221; https://www.ey.com/en_us/insights/assurance/to-the-point-fasb-modernizes-guidance-on-accounting-for-software-costs</p><p>[9] The RLHF pipeline requires human annotators to produce ranked comparisons of model outputs, which are used to train a reward model. OpenAI&#8217;s InstructGPT paper (Ouyang et al., 2022) describes the three-stage RLHF process in detail. The human annotation step is inherently time-limited and does not scale with compute availability. https://arxiv.org/abs/2203.02155</p><p>[10] Sebastian Raschka, &#8220;The State of Reinforcement Learning for LLM Reasoning,&#8221; April 2025. RLVR uses &#8220;rewards derived from rules-based or deterministic verifiers&#8221; &#8212; compilers for code, symbolic checkers for math, and tool execution results for agentic tasks. https://magazine.sebastianraschka.com/p/the-state-of-llm-reasoning-model-training</p><p>[11] All RL algorithms for LLM post-training &#8212; PPO, GRPO, REINFORCE variants, and their derivatives &#8212; require the model to generate completions as the first step of each training iteration. The generation step is a forward pass (inference). The gradient update step is training. OpenAI has not disclosed which specific RL algorithm it uses for GPT-5 or its variants. 
The structural claim that post-training is inference-dominated holds across all published algorithms.</p><p>[12] Cameron R. Wolfe, &#8220;Group Relative Policy Optimization (GRPO),&#8221; November 2025. GRPO was introduced by DeepSeek in the DeepSeekMath paper (Shao et al., arXiv:2402.03300, 2024) and used in DeepSeek-R1. It eliminates the critic model used in PPO and instead generates G completions per prompt (typically 8&#8211;16), comparing rewards within the group to compute advantages. https://cameronrwolfe.substack.com/p/grpo The ratio of generation to gradient-update compute varies by algorithm and configuration. ROLL documentation (Alibaba): &#8220;GRPO trades increased inference cost (multiple samples per prompt) for simpler architecture and more stable training.&#8221; Inference cost scales linearly with the group size parameter. The ratio of inference to training time depends on group size, model size, and completion length, but generation dominates in all published configurations. Nathan Lambert (Interconnects) notes that RL post-training naturally decomposes into clusters for acting, generation, and learning, with policy gradient updates communicating less frequently than pre-training&#8217;s constant gradient synchronization. https://www.interconnects.ai/p/what-comes-next-with-reinforcement</p><p>[13] The computing cascade model is described by multiple industry analysts. See Introl, &#8220;Secondary GPU Markets: Buying and Selling Used AI Hardware,&#8221; March 2026: &#8220;Value cascade: Years 1&#8211;2 for frontier training, 3&#8211;4 for inference, 5&#8211;6 for batch workloads.&#8221; CoreWeave H100s from 2022 contract expirations reported as rebooking at 95% of original pricing. https://introl.com/blog/secondary-gpu-markets-buying-selling-used-hardware-guide-2025</p><p>[14] Introl, &#8220;Secondary GPU Markets,&#8221; March 2026. 
https://introl.com/blog/secondary-gpu-markets-buying-selling-used-hardware-guide-2025 A100 40GB variants: $8,000&#8211;$12,000; A100 80GB variants: $12,000&#8211;$18,000 (from $15,000&#8211;$25,000+ new). H100 on-demand cloud pricing has declined approximately 70% from 2024 peaks.</p><p>[15] Multiple industry projections converge on inference consuming 75&#8211;80% of AI compute cycles by 2030. Epoch AI data shows only 30% of OpenAI&#8217;s compute spending in 2024 went to inference, suggesting the shift is still underway. https://epoch.ai/data-insights/openai-compute-spend</p><p>[16] Epoch AI, &#8220;Why GPT-5 used less training compute than GPT-4.5 (but GPT-6 probably won&#8217;t),&#8221; September 2025. &#8220;OpenAI scaled RL by 10&#215; from o1 to o3.&#8221; The analysis projects that &#8220;tripling post-training compute will soon be akin to tripling the entire compute budget &#8212; so current growth rates likely can&#8217;t be sustained for much more than a year.&#8221; These are Epoch estimates based on public data and The Information&#8217;s spending projections, not OpenAI disclosures. https://epochai.substack.com/p/why-gpt-5-used-less-training-compute</p><p>[17] Cerebras WSE-3 uses 44GB of on-chip SRAM, eliminating the HBM bandwidth bottleneck for the sequential decode phase. Nvidia acquired Groq in a $20 billion licensing deal in December 2025 and launched the Groq 3 LPU at GTC 2026, integrating Groq&#8217;s SRAM-rich architecture into its inference stack. Cerebras CTO Sean Lie&#8217;s &#8220;essentially acknowledged&#8221; quote from SDxCentral, &#8220;Cerebras spins Nvidia&#8217;s Groq tie-up as proof its wafer-scale bet was right,&#8221; March 2026. 
https://www.sdxcentral.com/analysis/cerebras-spins-nvidias-groq-tieup-as-proof-its-waferscale-bet-was-right/ OpenAI has separately signed a $10 billion compute deal with Cerebras for inference capacity through 2028 (Reuters, January 2026), and OpenAI internal teams have attributed Codex performance limitations to Nvidia GPU hardware for inference workloads (Reuters, February 2026). For background on the disaggregated inference architecture, see &#8220;Acquired, Absorbed, Disaggregated&#8221; (The AI Realist). The application of disaggregated inference to RLVR&#8217;s generation phase is a structural inference from the workload profile &#8212; no published example exists of a lab running RL group generation on Cerebras or Groq as of April 2026.</p><p>[18] Amazon 10-Q for the quarter ended September 30, 2025 (filed October 31, 2025): &#8220;Effective January 1, 2025, we changed our estimate of the useful lives of a subset of our servers and networking equipment from six years to five years.&#8221; &#8220;The shorter useful lives are due to the increased pace of technology development, particularly in the area of artificial intelligence and machine learning.&#8221; https://www.sec.gov/Archives/edgar/data/1018724/000101872425000036/amzn-20250331.htm</p><p>[19] Amazon 10-Q, Q3 2025. https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0001018724&amp;type=10-Q&amp;dateb=&amp;owner=include&amp;count=10 Nine-month impact: increase in depreciation and amortization expense of $889 million and reduction in net income of $677 million, primarily impacting the AWS segment.</p><p>[20] Amazon 10-K FY2024. 
https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&amp;CIK=0001018724&amp;type=10-K&amp;dateb=&amp;owner=include&amp;count=10 &#8220;We recorded approximately $920 million of accelerated depreciation and related charges for the quarter ended December 31, 2024 related to these decisions.&#8221; Combined with the $700 million anticipated decrease in 2025 operating income from the useful-life change and $600 million from continuing accelerated depreciation on early-retired equipment, the total 2025 impact is approximately $1.3 billion.</p><p>[21] Meta 10-K FY2024 (SEC filing): &#8220;In January 2025, we completed an assessment of the useful lives of certain servers and network assets, which resulted in an increase in their estimated useful life to 5.5 years, effective beginning fiscal year 2025.&#8221; https://www.sec.gov/Archives/edgar/data/0001326801/000132680125000017/meta-20241231.htm. Meta&#8217;s extension history: four years (pre-2022) &#8594; 4.5 years (Q2 2022) &#8594; five years (Q4 2022) &#8594; 5.5 years (January 2025). Per Meta 10-K FY2022: &#8220;The financial impact of the changes in estimates [in 2022] was a reduction in depreciation expense of $860 million.&#8221;</p><p>[22] Meta 10-K FY2024 (same filing as footnote [21]): &#8220;Based on the servers and network assets placed in service as of December 31, 2024, we expect this change in accounting estimate will reduce our full-year 2025 depreciation expense by approximately $2.9 billion.&#8221;</p><p>[23] The timing correlation is precise. Meta&#8217;s Q2 2022 extension coincided with the pivot to AI infrastructure spending following the metaverse-driven stock decline. The January 2025 extension to 5.5 years occurred in the same filing that guided 2025 capital expenditure to $60&#8211;65 billion, subsequently raised to $66&#8211;72 billion; actual 2025 capex was $72.2 billion. Meta&#8217;s Q4 2025 earnings (January 29, 2026) guided 2026 capital expenditure to $115&#8211;135 billion. 
https://www.sec.gov/Archives/edgar/data/0001326801/000162828026003832/meta-12312025xexhibit991.htm Each extension reduces the depreciation burden on an expanding capital base, compounding the income-statement benefit.</p><p>[24] Alphabet Q4 2023 earnings release (SEC-filed): &#8220;In January 2023, we completed an assessment of the useful lives of our servers and network equipment and adjusted the estimated useful life of our servers from four years to six years and the estimated useful life of certain network equipment from five years to six years. This change in accounting estimate was effective beginning in fiscal year 2023, and the effect was a reduction in depreciation expense of $3.9 billion and an increase in net income of $3.0 billion.&#8221; https://www.sec.gov/Archives/edgar/data/1652044/000165204424000014/googexhibit991q42023.htm</p><p>[25] Anthropic, &#8220;Detecting and preventing distillation attacks,&#8221; February 23, 2026. &#8220;We have identified industrial-scale campaigns by three AI laboratories &#8212; DeepSeek, Moonshot, and MiniMax &#8212; to illicitly extract Claude&#8217;s capabilities to improve their own models. These labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts.&#8221; https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks</p><p>[26] Anthropic, &#8220;Detecting and preventing distillation attacks&#8221; (see footnote [25] for URL). DeepSeek: 150,000+ exchanges targeting reasoning capabilities and chain-of-thought extraction. Moonshot: 3.4 million exchanges targeting coding and vision. MiniMax: 13 million exchanges targeting agentic coding, tool use, and orchestration. MiniMax was detected while still active, before the model it was training had been released.</p><p>[27] Google Threat Intelligence Group disclosed in February 2026 that it identified and disrupted distillation and model extraction attacks on Gemini using more than 100,000 prompts. 
OpenAI submitted a letter to House lawmakers earlier in February 2026 accusing DeepSeek of distillation. TechCrunch, &#8220;Anthropic accuses Chinese AI labs of mining Claude as US debates AI chip exports,&#8221; February 24, 2026. https://techcrunch.com/2026/02/23/anthropic-accuses-chinese-ai-labs-of-mining-claude-as-us-debates-ai-chip-exports/</p><p>[28] Anthropic, &#8220;Detecting and preventing distillation attacks&#8221; (see footnote [25] for URL): &#8220;Distillation attacks reinforce the rationale for export controls: restricted chip access limits both direct model training and the scale of illicit distillation.&#8221;</p><p>[29] The SEC Investor Advisory Committee voted in December 2025 to recommend AI disclosure requirements, citing that only 40% of S&amp;P 500 companies provide AI-related disclosures. SEC Chair Paul Atkins rejected prescriptive rules. Crowell &amp; Moring, &#8220;Investor Advisory Committee Recommends SEC Disclosure Guidelines for Artificial Intelligence,&#8221; 2025.</p><p>[30] FASB Invitation to Comment on intangible asset recognition, December 2024. BDO&#8217;s response flagged the AI training cost gap. Bloomberg Tax, &#8220;Accounting Groups Differ on Tracking Intangible Assets in AI Era,&#8221; 2025. https://news.bloombergtax.com/financial-accounting/accounting-groups-differ-on-tracking-intangible-assets-in-ai-era The FASB and IASB held a joint meeting on intangible assets in 2025; both acknowledged the issue but took no immediate action. IAS Plus, &#8220;Intangible assets,&#8221; 2025.</p><p>[31] The IASB commenced its IAS 38 modernization project in April 2024 but has not published an exposure draft. 
ACCA&#8217;s AB Magazine reported in June 2025 that the IASB is exploring recognition of internally generated intangibles, including AI-related assets, but any new standard is years away.</p><p>[32] PCAOB 2025 inspection priorities included a focus on &#8220;audits of issuers with significant investment in artificial intelligence technologies.&#8221; PCAOB, &#8220;Staff Report Outlines 2025 Inspection Priorities,&#8221; 2025. No specific guidance on AI cost classification has been issued.</p><p>[33] Dario Amodei, in conversation with Stripe co-founder John Collison on the &#8220;Cheeky Pint&#8221; podcast, reported August 2025. The per-model profitability framing: &#8220;In 2023, you train a model that costs $100 million, and then you deploy it in 2024, and it makes $200 million of revenue. Meanwhile, in 2024, you also train a model that costs $1 billion.&#8221; The company-level P&amp;L shows escalating losses even as each vintage&#8217;s inference revenue exceeds its training cost. This framing is contested &#8212; it is structurally analogous to pharmaceutical per-drug profitability claims when the company-level entity is unprofitable &#8212; but it accurately describes the training cost dynamics.</p><p>[34] The 1990s telecom overbuild involved more than $500 billion invested in fiber optic infrastructure. Global Crossing went from a $47 billion valuation to bankruptcy. Total telecom equity losses exceeded $2 trillion. Cisco took a $2.25 billion inventory write-off in April 2001. The vendor-financing mechanism was critical to the bust: equipment makers like Cisco and Lucent provided financing to customers to purchase their products, creating self-referential demand that inflated revenue until the credit collapsed. The AI parallel (Nvidia investing in AI labs that buy Nvidia chips) shares this structural feature, though the AI ecosystem has more diverse end-use cases than dark fiber. Fortune, &#8220;AI dot-com bubble parallels,&#8221; September 2025. 
MOI Global, &#8220;Parallels Between the Hyperscalers and the Telecom Firms of the 1990s,&#8221; https://moiglobal.com/parallels-between-the-hyperscalers-and-the-telecom-firms-of-the-1990s/. Strategy+Business, &#8220;Why Cisco Fell: Outsourcing and Its Perils.&#8221;</p><p>[35] KKR, &#8220;Beyond the Bubble: Why AI Infrastructure Will Compound Long after the Hype,&#8221; November 2025. https://www.kkr.com/insights/ai-infrastructure KKR argues that AI infrastructure overbuilds are more likely to behave as rolling upgrades than stranded assets, because new workloads absorb excess capacity. The argument is strongest for physical infrastructure (data centers, power) and weakest for the trained model itself, which has no secondary market and no residual value once superseded.</p><p>[36] ASC 805, Business Combinations, requires acquirers to identify and measure intangible assets separately from goodwill. For an acquisition of an AI company, the trained model is the primary intangible asset &#8212; but standard valuation methods (relief-from-royalty, multi-period excess earnings) require an estimate of useful life and future cash flows. With competitive shelf lives of three to twelve months and no established market comparables, the valuation exercise is unusually speculative. This has direct implications for PE firms acquiring AI companies: the purchase price allocated to the trained model may require aggressive write-down within months of the transaction closing.</p><p>[37] Anthropic, &#8220;Detecting and preventing distillation attacks&#8221; (see footnote [25] for URL). Anthropic reports building &#8220;several classifiers and behavioral fingerprinting systems to identify suspicious distillation attack patterns in API traffic,&#8221; along with enhanced verification for accounts and safeguards to reduce the efficacy of outputs for illicit distillation. 
The defensive capacity is real but the asymmetry favors the attacker: each account ban triggers a replacement from the hydra cluster, and distillation traffic can be mixed with legitimate customer requests to evade pattern detection. Independent efficiency gains also close the gap &#8212; DeepSeek V3 achieved near-frontier performance at a claimed $5.6 million in compute &#8212; but the distillation path remains faster and cheaper for targeted capability extraction.</p><p>[38] Aggregate 2026 capital expenditure guidance as of Q4 2025 earnings: Amazon $200 billion (Q4 2025 earnings call, February 5, 2026); Alphabet $175&#8211;185 billion (Q4 2025 earnings); Meta $115&#8211;135 billion (Q4 2025 earnings, January 29, 2026); Microsoft fiscal year 2026 guidance implies approximately $80 billion based on quarterly run rates. These are total capex figures, not exclusively AI-related &#8212; Amazon&#8217;s figure includes fulfillment and logistics infrastructure &#8212; but the majority of incremental spending across all four companies is AI-driven. Silicon Republic, &#8220;Investors worried after Big Tech plans $650bn spend in 2026,&#8221; February 6, 2026. 
https://www.siliconrepublic.com/business/big-tech-650bn-capital-expense-bill-2026-meta-amazon-google-microsoft</p>]]></content:encoded></item><item><title><![CDATA[What to Buy for Local LLMs (April 2026)]]></title><description><![CDATA[A practitioner&#8217;s guide to inference and training, why they&#8217;re not the same machine, and why you may still need cloud]]></description><link>https://www.airealist.ai/p/what-to-buy-for-local-llms-april</link><guid isPermaLink="false">https://www.airealist.ai/p/what-to-buy-for-local-llms-april</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Fri, 03 Apr 2026 15:55:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!iykK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iykK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iykK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 424w, https://substackcdn.com/image/fetch/$s_!iykK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 848w, 
https://substackcdn.com/image/fetch/$s_!iykK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 1272w, https://substackcdn.com/image/fetch/$s_!iykK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iykK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5597243,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193079086?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iykK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 424w, 
https://substackcdn.com/image/fetch/$s_!iykK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 848w, https://substackcdn.com/image/fetch/$s_!iykK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 1272w, https://substackcdn.com/image/fetch/$s_!iykK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb4e8714-67d0-43dc-8c88-43bcb221f896_2816x1584.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p>
<p>I just published a piece on why NVIDIA&#8217;s product segmentation created this market.[1] This is the practical companion. No thesis, no structural argument &#8212; just what works, what doesn&#8217;t, and what it costs. All prices in USD as of April 2026; EUR and GBP prices are roughly comparable at current exchange rates. I&#8217;ll update this guide quarterly.</p><p>A GPU is not a computer. Every NVIDIA recommendation below assumes a workstation around it. Prebuilt single-GPU RTX 5090 systems run $5,000 to $8,000 complete &#8212; often cheaper than the GPU alone at street prices.[2] Professional workstations cost more. Dual-GPU builds start around $7,600. Apple and AMD mini PCs are complete systems; NVIDIA GPUs are not. I&#8217;ll state total system costs throughout.</p><p><strong>Software.</strong> For inference: Ollama or LM Studio on Apple Silicon (both wrap llama.cpp&#8217;s Metal backend; Ollama is adding MLX). Ollama or llama.cpp with CUDA on NVIDIA single-GPU. vLLM for multi-GPU serving. For training: Unsloth (built on Hugging Face&#8217;s TRL and PEFT ecosystem) on CUDA; mlx-lm for LoRA, QLoRA, and full fine-tuning on Apple Silicon. Download GGUF models from Hugging Face Hub for Ollama and llama.cpp; safetensors for vLLM and transformers. All model sizes in this guide assume Q4_K_M quantization &#8212; the standard, quality-optimized 4-bit format &#8212; unless otherwise noted.</p><h1>Inference</h1><p>Inference is memory-bandwidth-bound. The hardware that generates tokens fastest is the hardware that reads model weights from memory fastest. Capacity determines which models you can run. Bandwidth determines how fast they run.</p><h3>Under 30B: RTX 5090</h3><p>Nothing touches it. 
32 gigabytes of GDDR7 at 1,792 GB/s.[3] A dense 30B model fits and runs at 60 to 90 tokens per second at short to moderate context &#8212; the bandwidth ceiling for 32B Q4 decode is about 94 tok/s. Speeds drop at longer context as KV cache competes for bandwidth. MoE architectures are dramatically faster: Hardware Corner measured a 30B MoE at 234 tok/s because only 3B active parameters are read per token.[3] If your workload fits in 32 gigabytes, buy this and stop reading. At long context (64K+), KV cache grows fast &#8212; verify that model weights plus KV cache fit before committing. $3,500 to $4,800 at current street prices; the DRAM shortage has made these hard to find. System cost: $5,000 to $8,000.[4] If 24GB is enough for your models, a used RTX 4090 at $1,500 to $2,200 remains the best-value NVIDIA card &#8212; 1,008 GB/s bandwidth, mature CUDA support, and a total system cost under $4,000.</p><h3>70B: Mac Studio M4 Max</h3><p>The 70B sweet spot. 128 gigabytes of unified memory at 546 GB/s.[5] A Q4 Llama 3.3 70B runs at 8 to 15 tokens per second &#8212; closer to 15 at short context, dropping toward 8 at longer conversations. Speculative decoding with a small draft model can roughly double effective throughput in llama.cpp, though results depend on how well the draft model matches the target. $3,499 with 512GB SSD, $3,699 with 1TB (the practical minimum for storing multiple large models).[6] Complete system &#8212; plug in power and a display, and you&#8217;re running. The M3 Ultra (819 GB/s, starting at $3,999 for 96GB) is faster per token, but the M4 Max is the value pick at this tier. Most practitioners use llama.cpp&#8217;s Metal backend via Ollama or LM Studio. 
Ollama is transitioning to an MLX backend, with a preview showing 57% faster prefill and 93% faster generation on supported models.[7]</p><h3>70B on a budget: Mac mini cluster</h3><p>Four Mac Mini M4 Pro units (48GB each) connected via Thunderbolt 5 pool 192 gigabytes of shared memory for $6,400 to $7,200.[8] EXO Labs demonstrated Nemotron 70B at 4-8 tokens per second and Qwen2.5Coder-32B at 18 tokens per second on M4 Pro clusters.[9] The entire cluster draws about 200 watts under full load &#8212; less than a single RTX 5090. The catch: you need direct Thunderbolt 5 cable connections between nodes (no TB5 switches exist yet), and inter-node latency makes this better suited for batch inference than interactive chat. macOS 26.2&#8217;s RDMA support drops inter-node latency from about 300 microseconds to under 50, but that&#8217;s still orders of magnitude slower than on-chip memory access.[10] If your budget is $7,000 and you want 70B inference with room to grow, a cluster is viable. For the best single-machine experience at 70B, the Mac Studio M4 Max at $3,699 is still the answer.</p><h3>70B with CUDA: RTX PRO 6000</h3><p>The CUDA answer to the 70B tier. 96 gigabytes of GDDR7 at 1.8 TB/s &#8212; nearly identical bandwidth to the RTX 5090 but with three times the VRAM.[11] A 70B Q4 model fits on a single card with over 50GB of headroom for long context and concurrent users. For team serving (4+ users via vLLM), that headroom matters &#8212; each concurrent user at 8K context adds 2-4 gigabytes of KV cache.</p><p>No NVLink &#8212; dual-card setups communicate over PCIe Gen 5. A dual PRO 6000 gives you 192GB total for running 70B in FP16 or fitting very large models, but the PCIe interconnect creates the same bottleneck as dual 5090s for cross-GPU workloads. 
A single-card PRO 6000 avoids the bottleneck entirely and handles 70B Q4 with room to spare.</p><p>A complete single-GPU professional workstation runs about $22,000; a dual-GPU one, about $30,000 to $33,000.[12] At these prices, the honest comparison is a year of B200 cloud time. Buy a PRO 6000 if you need always-on 96GB CUDA locally &#8212; for team inference, compliance-constrained training, or workflows where cloud latency or data residency rules it out.</p><h3>Multi-GPU NVIDIA (no NVLink)</h3><p><strong>Dual RTX 5090 (64GB).</strong> Two RTX 5090s give you 64 gigabytes of VRAM and access to vLLM&#8217;s tensor parallelism over PCIe.[13] NVLink was last available on the RTX 3090; the two GPUs communicate over PCIe x8/x8, a bottleneck for large models. A 70B Q4 model fits in 64GB but runs at a pace comparable to or slower than a single Mac Studio M4 Max &#8212; per-layer PCIe synchronization overhead eats up the raw bandwidth advantage. Where dual 5090s shine is inference on 30 to 40B models that benefit from parallelism, or training (see below). System cost: $9,000 to $12,000, or about $7,600 for a prebuilt system with GPUs at list price.[14]</p><h3>200B+: Mac Studio M3 Ultra</h3><p>Still the current Ultra &#8212; Apple skipped the M4 generation. 256 gigabytes at 819 GB/s. Llama 3.1 405B fits in Q4 (~235 GB). DeepSeek V3 671B fits only at aggressive quantization (1.5-2-bit dynamic quants via Unsloth, ~192-226GB) &#8212; functional but with measurable quality loss.[15] About $5,999 on the base chip with 1TB SSD. 
The M5 Ultra is expected mid-2026 with potentially 1,200+ GB/s bandwidth &#8212; if you can wait two to three months, wait.[16] Jeff Geerling tested a four-unit M3 Ultra cluster connected via Thunderbolt 5 RDMA, pooling 1.5 terabytes of unified memory and running large MoE models at 28 to 32 tokens per second.[17] macOS 26.2 enables RDMA natively, though clusters max out at four units in a full mesh (no TB5 switches).[18] Apple recently removed the 512GB option and raised the 256GB upgrade price from $1,600 to $2,000 &#8212; a signal of a DRAM shortage.[19]</p><h3>Budget and niche</h3><p><strong>AMD Strix Halo.</strong> 128 gigabytes for $2,000 in a mini PC.[20] Bandwidth is lower (212 GB/s measured), which makes dense 70B models painfully slow at 3 to 5 tokens per second. But Mixture-of-Experts models change the math: Llama 4 Scout (109B total, 17B active MoE) manages an estimated 10 to 20 tokens per second.[21] Vulkan via llama.cpp now outperforms AMD&#8217;s own ROCm on Strix Halo.[22] If you&#8217;re on a budget and your workloads are MoE-heavy, this is the most memory per dollar you can buy.</p><p><strong>DGX Spark.</strong> 128 gigabytes of LPDDR5x, 273 GB/s, $4,699.[23] Hard to recommend for most practitioners. For inference, a Mac Studio M4 Max delivers twice the bandwidth at a lower price. For training, a PRO 6000 is faster, and the cloud is cheaper &#8212; at about $5 per GPU-hour, $4,699 buys over 900 hours of B200 time. The Spark&#8217;s only defensible use case is always-on, local CUDA with 128GB of memory when the cloud is not an option (air-gapped environments, compliance constraints, or workflows that require continuous local iteration at 70B+ model scales). 
The EXO Labs hybrid setup (Spark for prefill, Mac Studio for decode) showed a 2.8&#215; speedup on an 8B model, but 70B+ results have not been published.[24]</p><h3>Coming soon</h3><p>The Mac Studio M5 Ultra is expected in mid-2026, potentially with 1,200+ GB/s and up to 256GB.[16] AMD&#8217;s Strix Point is also expected to be released in late 2026 with improved bandwidth. The rumored RTX 5090 Super with 48GB GDDR7 would change the NVIDIA story at the 70B tier &#8212; but the DRAM shortage makes a 2026 launch unlikely.</p><h1>Training: Supervised Fine-Tuning (SFT)</h1><p>SFT &#8212; training a model on input-output pairs to follow instructions, adopt a style, or learn a domain &#8212; is the most common local training task. Memory scales with model size, quantization, and method: full fine-tuning loads the entire model and its optimizer states; LoRA freezes the base weights and trains small adapter layers; QLoRA additionally quantizes the frozen weights to 4-bit &#8212; cutting VRAM by 60-80%.</p><h3>8B to 40B: RTX 5090</h3><p>QLoRA fine-tuning of an 8B model takes 7 to 16 gigabytes of VRAM with Unsloth depending on LoRA rank, context length, and batch size &#8212; 7GB at rank 16 with short context, 14 to 16GB at rank 64 with 8K context.[25] Unsloth&#8217;s &#8220;full fine-tuning&#8221; mode &#8212; all parameters trained, but base weights stored in 4-bit &#8212; uses 20 to 24 gigabytes, which is workable on 32GB.[25] Traditional FP16 full fine-tuning of 8B (model + AdamW optimizer states + gradients) needs 48 to 64 gigabytes and does not fit on a single 5090. Unsloth&#8217;s Blackwell-optimized kernels deliver about 2&#215; the training speed of standard implementations.[26] System cost: $5,000 to $8,000.</p><p>NVIDIA&#8217;s own benchmarks show QLoRA fine-tuning of models up to 40B parameters on a single RTX 5090.[27] Full SFT of 40B does not fit in 32GB. 
This is the ceiling for single-GPU consumer SFT.</p><h3>50B to 70B: PRO 6000, dual 5090, or cloud</h3><p>A single PRO 6000 with 96 gigabytes can QLoRA a 70B model at about 38 gigabytes peak VRAM &#8212; about 4 hours for a standard fine-tune.[28] The DGX Spark&#8217;s 128GB also handles 70B QLoRA, though lower bandwidth makes it 30-50% slower, and at $4,699, a cloud GPU is cheaper unless you need to stay local. Full SFT of 70B requires about 300GB total (model weights, AdamW optimizer states, and gradients), which is more than 2&#215; H100 80GB (160GB) or a single B200 (192GB) can hold without CPU offload; treat it as a cloud-only job on a multi-GPU node with DeepSpeed ZeRO-3.[29] If you don&#8217;t own a PRO 6000 (a complete professional workstation runs about $22,000), renting a cloud GPU for a few hours is cheaper for occasional fine-tuning.</p><p>Two 5090s (64GB combined) with DeepSpeed ZeRO can train QLoRA models up to 50-60B &#8212; beyond the single-GPU ceiling but limited by PCIe interconnect overhead. Not practical for full SFT of 70B (optimizer states don&#8217;t fit). System cost: $9,000 to $12,000.[30]</p><h3>Apple Silicon and AMD</h3><p><strong>MLX LoRA.</strong> It works. mlx-lm supports LoRA and QLoRA natively.[31] mlx-tune adds an Unsloth-compatible training API on top of MLX, supporting SFT, DPO, GRPO, and multi-modal fine-tuning, letting you prototype locally on Apple Silicon before scaling to cloud GPUs. The 128 to 256 gigabytes of unified memory on a Mac Studio lets you load larger SFT models than any consumer NVIDIA card. The ecosystem is thinner: no Unsloth (yet &#8212; &#8220;coming very soon&#8221;), no DeepSpeed. If your workflow is LoRA on a custom dataset, MLX handles it well. If you need production-quality GRPO, DPO, or the latest training techniques, you need CUDA.[32]</p><p><strong>AMD.</strong> Functional but not recommended as your primary path. ROCm supports PyTorch training, and Unsloth offers AMD compatibility through its Core library. 
The driver-kernel maturity gap means more debugging than training.[33]</p><h2>Training: Reinforcement Learning (GRPO, DPO)</h2><p>RL fine-tuning is harder on hardware than SFT. GRPO (the technique behind DeepSeek R1) generates multiple completions per prompt, scores them, and updates the policy model &#8212; requiring 1.5 to 2&#215; the memory of equivalent SFT because the model must hold both the policy weights and the generated sequences simultaneously. DPO loads a quantized reference model alongside the training model, adding 2-4 gigabytes to an 8B model with modern implementations. Both require CUDA for production-quality training as of April 2026 &#8212; TRL&#8217;s DPO trainer technically runs on any PyTorch backend, including ROCm, but optimization and stability are not there yet for serious workloads.</p><h3>8B to 30B: RTX 5090 or cloud</h3><p>GRPO on an 8B model via Unsloth uses 14-18 gigabytes &#8212; well within the 5090&#8217;s 32GB.[34] DPO on 8B is similar. This is the entry point for local RL. System cost: $5,000 to $8,000.</p><p>GRPO on a 14B model pushes into 22-28 gigabytes, leaving the 5090 with little headroom for longer sequences. A 30B GRPO run may not fit at all depending on sequence length and batch size. The DGX Spark&#8217;s 128GB handles 30B GRPO with room to spare &#8212; but a cloud B200 does it faster and cheaper unless you&#8217;re running these jobs frequently enough to justify the $4,699.[35]</p><h3>70B: PRO 6000 (marginal) or cloud</h3><p>GRPO or DPO on 70B needs 80-100 gigabytes for the policy model, generated sequences, and optimizer states. No consumer device handles this. A single PRO 6000 (96GB) may fit 70B GRPO at the lower end of that range, but has no headroom &#8212; and at $22,000+ for the workstation, cloud is almost always the better answer. A dual PRO 6000 over PCIe gives you 192GB but adds interconnect overhead, and the price tag becomes astronomical. A B200 (192GB) or 2&#215; H100 (160GB combined) handle it cleanly. 
Budget $15 to $50 per GRPO run, depending on provider pricing ($3 to $6 per GPU-hour).[36]</p><h3>Coming soon for training</h3><p>Unsloth lists &#8220;MLX training coming very soon&#8221; as of March 2026 &#8212; if this ships, Apple Silicon gains GRPO and SFT through Unsloth&#8217;s optimized kernels, narrowing the CUDA gap significantly for parameter-efficient methods.[37] On the NVIDIA side, the PRO 6000 with 96GB remains the local SFT limit; for RL beyond 8B, most practitioners are better served by the cloud.</p><h2>Cloud</h2><p>For workloads that exceed local hardware &#8212; 70B+ full fine-tuning, 70B RL, multi-GPU distributed training, or high-concurrency production inference &#8212; the B200 is the default. The discipline that makes cloud work: don&#8217;t leave it idle, and don&#8217;t use it for debugging.</p><p><strong>B200 for training and inference.</strong> 192GB HBM3e at 8,000 GB/s, with NVLink for multi-GPU scaling. A single B200 handles 70B QLoRA and 70B GRPO; an 8-GPU node handles 70B full SFT. For inference, the B200 delivers up to 4.9&#215; the throughput of a PRO 6000 and wins on cost-per-token despite the higher hourly rate.[38] Pricing: $3 to $6 per GPU-hour on neo-cloud providers, higher on hyperscalers.</p><p><strong>The workflow that saves money:</strong> get everything ready locally &#8212; code, data, configuration, hyperparameters tested on a small model &#8212; then ship the job to the cloud. A $5/hour B200 running for 4 focused hours costs $20. The same B200 left running for 24 hours while you debug a data loading issue costs $120. The difference between a $20 fine-tune and a $500 fine-tune is almost entirely local preparation.[39]</p><p><strong>H100 and H200 remain viable.</strong> H100 at $2.50 to $3.00 per GPU-hour is adequate for 8B to 30B SFT and RL. 
H200 at $3.00 to $4.00 per GPU-hour with 141GB HBM3e is the value option for 70B QLoRA when B200 availability is tight.[40]</p><p><strong>Providers.</strong> RunPod, Lambda, Vast.ai, Together AI, and the hyperscalers all offer B200 and H100/H200 instances. Pricing, availability, and minimum commitments change faster than this guide can be updated. For occasional training jobs, spot instances work. For sustained inference, reserved capacity or on-prem is more economical, which is where the local hardware recommendations above take over.[41]</p><h2>The asymmetry</h2><p>The same machine is rarely best for both inference and training. The Mac Studio dominates inference because bandwidth is king, and Apple ships more of it per dollar than anyone. The RTX 5090 dominates local training because there&#8217;s no equivalent to CUDA&#8217;s ecosystem. And for anything beyond what fits in 32 or 96 gigabytes of VRAM, the B200 is the default &#8212; as long as you treat cloud time as a production resource, not a sandbox.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hh8s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hh8s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 424w, https://substackcdn.com/image/fetch/$s_!Hh8s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 848w, 
https://substackcdn.com/image/fetch/$s_!Hh8s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 1272w, https://substackcdn.com/image/fetch/$s_!Hh8s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hh8s!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png" width="1200" height="857.967032967033" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:1041,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:429099,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193079086?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hh8s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 424w, 
https://substackcdn.com/image/fetch/$s_!Hh8s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 848w, https://substackcdn.com/image/fetch/$s_!Hh8s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 1272w, https://substackcdn.com/image/fetch/$s_!Hh8s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ad81baf-8a1b-49ad-9644-1b4d14281daf_2400x1716.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong>What you get at each price point.</strong></figcaption></figure></div><p>For practitioners who need local, always-on inference, a 128 GB+ Apple Silicon Mac is the best option today. For training, a local GPU will force you to compromise on model sizes and algorithms, and rely on a B200 in the cloud for jobs that don&#8217;t fit.</p><p>The irony &#8212; training on the company whose product segmentation created the inference vacuum, then inferring on the company that filled it &#8212; is the subject of the companion piece.[42]</p><div><hr></div><h3>Notes</h3><p>[1] &#8220;Your Parents Paid,&#8221; <a href="https://www.airealist.ai/">The AI Realist</a>. Paid post.</p><p>[2] Prebuilt single-GPU RTX 5090 workstation pricing, April 2026: MSI Aegis $3,599 (sold out &#8212; cheaper than standalone GPUs); Skytech $5,300; CyberPower/Maingear $4,400&#8211;$5,300; Alienware Area-51 $5,300 (discounted). ArsenalPC MES2X dual RTX 5090 base: $7,602. Professional workstations (Dell Precision, Lenovo ThinkStation) start higher. Sources: <a href="https://www.tomshardware.com/">Tom&#8217;s Hardware</a>; <a href="https://videocardz.com/">VideoCardz</a>; <a href="https://petronella.com/">Petronella AI Workstation Guide</a>.</p><p>[3] <a href="https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/">NVIDIA RTX 5090</a>: 32GB GDDR7, 512-bit bus, 1,792 GB/s. Bandwidth ceiling for decode: 1,792 GB/s &#247; model weight read per token. Dense 32B Q4_K_M (~19GB): ceiling ~94 tok/s. MoE 30B with 3B active (~2GB read): ceiling ~896 tok/s. <a href="https://www.hardware-corner.net/rtx-5090-llm-benchmarks/">Hardware Corner RTX 5090 LLM benchmarks</a> (Q4_K_XL via llama-bench): Qwen3 8B at 145&#8211;185 tok/s TG, Qwen3 30B A3B MoE at 234 tok/s (4K context) declining to ~110 tok/s (32K), Qwen3 32B dense at 52 tok/s (147K extreme context). 
<a href="https://localllm.in/blog/best-gpus-llm-inference-2025">LocalLLM.in</a>/RunPod report 213 tok/s on 8B models.</p><p>[4] RTX 5090 street prices as of April 2026: Newegg FE at $3,695, Amazon at $3,899, custom AIB models $4,500&#8211;4,800 (<a href="https://wccftech.com/">WCCFTech</a>, BestValueGPU tracker). The DRAM shortage has driven prices well above the $1,999 list price. Prebuilt RTX 5090 workstations: $5,000&#8211;8,000 complete. <a href="https://videocardz.com/">VideoCardz</a> notes that standalone RTX 5090 pricing has approached the cost of entire prebuilt systems. EU: &#8364;3,800&#8211;5,200. UK: &#163;3,200&#8211;4,000.</p><p>[5] <a href="https://www.apple.com/shop/buy-mac/mac-studio">Apple Mac Studio M4 Max</a>: 128GB unified memory requires the 16-core CPU / 40-core GPU chip variant. 546 GB/s memory bandwidth.</p><p>[6] Mac Studio M4 Max 128GB pricing confirmed by <a href="https://petapixel.com/">PetaPixel</a> review (March 2025) and <a href="https://www.bhphotovideo.com/">B&amp;H Photo</a> (April 2026): $3,499 with 512GB SSD, $3,699 with 1TB SSD. EU: &#8364;4,099 / &#8364;4,299. UK: &#163;3,599 / &#163;3,799. Build-to-order upgrade from the $1,999 base (36GB).</p><p>[7] <a href="https://ollama.com/">Ollama</a> MLX preview: March 2026. Performance claims from Ollama blog. llama.cpp Metal remains the current default for most Mac users.</p><p>[8] <a href="https://www.apple.com/shop/buy-mac/mac-mini">Mac Mini M4 Pro 48GB</a>: $1,599 with 512GB SSD, $1,799 with 1TB. Four units: $6,396&#8211;$7,196 plus ~$200 for Thunderbolt 5 cables. The 24GB base ($1,399) is cheaper but limits total pooled memory to 96GB &#8212; not enough for 70B. M4 Pro bandwidth: 273 GB/s per node.</p><p>[9] EXO Labs Mac Mini cluster benchmarks: Qwen2.5Coder-32B at 18 tok/s, Nemotron-70B at 4&#8211;8 tok/s on M4 Pro nodes. Five-node cluster total power: ~200W under full load. 
Sources: AIBase; <a href="https://medium.com/">Medium/Faizan Saghir</a> (January 2026).</p><p>[10] macOS 26.2 RDMA over Thunderbolt 5: inter-node latency ~300&#956;s &#8594; under 50&#956;s (Awesome Agents, March 2026). Requires Recovery Mode boot (<code>rdma_ctl enable</code>). No TB5 switches exist &#8212; direct cabling only. <a href="https://github.com/exo-explore/exo">EXO Labs</a> is the primary clustering software.</p><p>[11] <a href="https://www.nvidia.com/en-us/products/workstations/professional-desktop-gpus/rtx-pro-6000/">RTX PRO 6000 Blackwell Workstation Edition</a> (<a href="https://www.nvidia.com/content/dam/en-zz/Solutions/data-center/rtx-pro-6000-blackwell-workstation-edition/workstation-blackwell-rtx-pro6000-workstation-edition-nvidia-us-3519208-web.pdf">datasheet PDF</a>): 96GB GDDR7 ECC, 1.8 TB/s bandwidth, PCIe Gen 5 x16. No NVLink support &#8212; multi-GPU setups communicate over PCIe only, same bottleneck as dual 5090s. MSRP ~$8,565; retail $8,000&#8211;$9,200 as of March 2026. Server Edition: 1.6 TB/s, passive cooling. Also: <a href="https://www.thundercompute.com/blog/nvidia-rtx-pro-6000-pricing">Thunder Compute</a>; <a href="https://lenovopress.lenovo.com/lp2263-thinksystem-nvidia-rtx-pro-6000-blackwell-server-edition-pcie-gen5-gpu">Lenovo Press</a>.</p><p>[12] PRO 6000 workstation pricing: <a href="https://www.apy-groupe.com/">APY</a> (France) configures a single RTX PRO 6000 + Threadripper Pro 9965WX + 128GB ECC + 1TB at &#8364;20,305 HT (~$22,000). Dual adds ~&#8364;8,000 for the second GPU: total &#8364;28,000&#8211;30,000 HT (~$30,000&#8211;33,000). BOXX APEXX T4 PRO-X priced similarly. Cloud comparison: a B200 at $5/hr &#215; 24/7 &#215; 30 days = $3,600/month. The dual PRO 6000 breaks even in 8&#8211;9 months of continuous use at these prices.</p><p>[13] Dual RTX 5090: the RTX 5090 does NOT support NVLink (removed since RTX 3090). Two GPUs communicate over PCIe. X670E/X870E consumer boards bifurcate to x8/x8 with two GPUs. 
<a href="https://github.com/vllm-project/vllm">vLLM</a> supports tensor parallelism over PCIe for inference; <a href="https://github.com/microsoft/DeepSpeed">DeepSpeed</a> ZeRO supports it for training.</p><p>[14] Dual RTX 5090 system cost: <a href="https://www.arsenalpc.com/">ArsenalPC</a> MES2X base at $7,602. At current GPU street prices ($3,500&#8211;4,800 each), DIY or custom builds run $9,000&#8211;12,000. RTX 5090 TDP: 575W; dual GPUs need 1,500W+ PSU. Consumer AM5 boards bifurcate to x8/x8; Threadripper provides full x16/x16 but adds $4,500+ to CPU cost.</p><p>[15] Mac Studio M3 Ultra 256GB: base (28-core, 96GB, 1TB) at $3,999 + $2,000 memory upgrade = ~$5,999. Higher chip (32-core/80-core) adds $1,500. 512GB option discontinued March 2026. GGUF file sizes for DeepSeek V3.1 671B: Q8_0 = 713GB, Q4_K_M = 405GB, Q3_K_M = 320GB, Q2_K = 246GB, UD-IQ1_S = 192GB (<a href="https://huggingface.co/unsloth/DeepSeek-V3.1-GGUF">unsloth/DeepSeek-V3.1-GGUF</a>). At 256GB, the 671B model requires 1.5 to 2-bit dynamic quantization to fit with room for KV cache and OS. Llama 3.1 405B at Q4_K_M = ~235GB fits comfortably. Also: <a href="https://appleinsider.com/">AppleInsider</a>; <a href="https://videocardz.com/">VideoCardz</a>.</p><p>[16] M5 Ultra: not yet announced. Projection based on M5 Max (614 GB/s on MacBook Pro) and UltraFusion architecture. Expected mid-2026 per <a href="https://www.macworld.com/">Macworld</a>/Bloomberg. M4 Ultra was never released; Apple skipped to M5 Ultra.</p><p>[17] Jeff Geerling, &#8220;1.5 TB of VRAM on Mac Studio &#8212; RDMA over Thunderbolt 5,&#8221; <a href="https://www.jeffgeerling.com/">jeffgeerling.com</a>, December 2025. Four Mac Studios, 1.5TB total. DeepSeek V3.1 671B at 32.5 tok/s, Qwen3 235B at 31.9 tok/s.</p><p>[18] macOS 26.2 RDMA: Requires Recovery Mode boot (<code>rdma_ctl enable</code>). Apple TN3205. 
No TB5 switches &#8212; clusters require direct full-mesh wiring, limiting practical size to four units.</p><p>[19] Apple removed the 512GB memory option from Mac Studio in March 2026 and raised the 256GB upgrade price. <a href="https://videocardz.com/">VideoCardz</a>; <a href="https://www.macrumors.com/">MacRumors</a>.</p><p>[20] AMD Ryzen AI Max+ 395 (Strix Halo): 128GB LPDDR5x. <a href="https://frame.work/">Framework Desktop</a> at $1,999 (US); ~&#8364;2,200 (EU). Also Beelink GTR9 Pro, GMKtec EVO-X2.</p><p>[21] MoE performance on Strix Halo: community benchmarks (<a href="https://www.hardware-corner.net/gpu-ranking-local-llm/">Hardware Corner GPU ranking</a>, <a href="https://level1techs.com/">Level1Techs</a>). Llama 4 Scout is 109B total / 17B active per token. At Q4, active weights per token read &#8776; 10GB; theoretical max at 212 GB/s &#8776; 21 tok/s. Practical speeds with overhead: ~10&#8211;20 tok/s. Treat as directional estimates.</p><p>[22] AMD Vulkan via <a href="https://github.com/ggerganov/llama.cpp">llama.cpp</a>: AMD used Vulkan for GTC 2026 DGX Spark comparisons. Community testers confirmed Vulkan RADV outperforms ROCm HIP on Strix Halo.</p><p>[23] <a href="https://www.nvidia.com/en-us/products/workstations/dgx-spark/">DGX Spark</a>: 128GB LPDDR5x, 273 GB/s, NVIDIA Grace Blackwell. $4,699 (increased from $3,999 at launch due to memory-shortage surcharge).</p><p>[24] <a href="https://blog.exolabs.net/">EXO Labs</a>, &#8220;Combining NVIDIA DGX Spark + Apple Mac Studio for 4x Faster LLM Inference,&#8221; October 2025. Measured speedup: 2.8&#215; over Mac Studio alone. Model tested: Llama-3.1 8B &#8212; 70B+ not published.</p><p>[25] <a href="https://unsloth.ai/">Unsloth</a> QLoRA VRAM for 8B: ~7GB at rank 16, batch 1, 2K context; ~12&#8211;16GB at rank 64, batch 2, 8K context. Unsloth&#8217;s &#8220;full fine-tuning&#8221; stores base weights in 4-bit but trains all parameters &#8212; uses ~20&#8211;24GB for 8B. 
This is NOT traditional FP16 full SFT, which would require ~48&#8211;64GB (16GB model + 32GB AdamW states + gradients). <a href="https://developer.nvidia.com/blog/">NVIDIA Developer Blog</a>, November 2025.</p><p>[26] Unsloth Blackwell-optimized kernels: 2&#215; training speed vs standard implementations. <a href="https://developer.nvidia.com/blog/">NVIDIA Developer Blog</a>, November 2025.</p><p>[27] &#8220;Fine-tune models with as many as 40 billion parameters on a single Blackwell GPU.&#8221; <a href="https://developer.nvidia.com/blog/">NVIDIA Developer Blog</a>, November 2025. QLoRA on all linear layers.</p><p>[28] Spheron benchmark: QLoRA fine-tuning of Llama-3.1 70B at 38GB peak VRAM. ~4 hours on A100 80GB; PRO 6000 (96GB) should be comparable. <a href="https://blog.spheron.network/">Spheron blog</a>, February 2026.</p><p>[29] Full SFT of 70B in FP16: ~300GB total (model + optimizer states + gradients). That is more than 2&#215; H100 80GB (160GB) or a single B200 (192GB) can hold without CPU offload; plan on a multi-GPU node with <a href="https://github.com/microsoft/DeepSpeed">DeepSpeed</a> ZeRO-3. Cloud is the practical option.</p><p>[30] Dual RTX 5090 for training: <a href="https://github.com/microsoft/DeepSpeed">DeepSpeed</a> ZeRO Stage 2/3 enables model sharding over PCIe. QLoRA on 50&#8211;60B feasible; full SFT of 70B is not. PCIe x8/x8 creates ~15&#8211;30% throughput penalty vs NVLink.</p><p>[31] <a href="https://github.com/ml-explore/mlx-lm">mlx-lm</a> supports LoRA, QLoRA, and full fine-tuning. Also supports distributed fine-tuning via mx.distributed.</p><p>[32] MLX SFT limitations as of April 2026: no <a href="https://unsloth.ai/">Unsloth</a> integration (&#8220;coming very soon&#8221;), no DeepSpeed. SFT and LoRA work well. The 128&#8211;256GB unified memory on Mac Studio enables SFT on larger models than any consumer NVIDIA card.</p><p>[33] AMD ROCm training: <a href="https://github.com/unslothai/unsloth">Unsloth Core</a> supports AMD GPUs. PyTorch + ROCm is functional. 
Community reports 10&#8211;20% more debugging overhead vs CUDA.</p><p>[34] GRPO on 8B via <a href="https://unsloth.ai/">Unsloth</a>: ~14&#8211;18GB (model + generated sequences + optimizer states). Fits on RTX 5090 with headroom. <a href="https://www.marktechpost.com/">MarkTechPost</a>, March 2026.</p><p>[35] DGX Spark for RL: 128GB allows GRPO on models up to ~30B. <a href="https://unsloth.ai/">Unsloth</a> Docker supports Spark natively (CUDA 12.x, PyTorch, <a href="https://github.com/huggingface/trl">TRL</a>, GRPO). Bandwidth limitation (~273 GB/s) slows throughput vs PRO 6000, but memory capacity enables model sizes the 5090 cannot touch.</p><p>[36] 70B GRPO memory requirement: policy model (~38GB in 4-bit), generated sequences, optimizer states, gradient buffers. Total: 80&#8211;100GB depending on sequence length and batch size. B200 (192GB) handles this on a single GPU. 2&#215; H100 (160GB combined) is the alternative. At $3&#8211;6/GPU-hr (neo-cloud), a 4&#8211;8 hour GRPO run = $12&#8211;48.</p><p>[37] <a href="https://unsloth.ai/">Unsloth Studio changelog</a>, March 2026: &#8220;macOS: Currently supports chat and Data Recipes. MLX training is coming very soon.&#8221;</p><p>[38] B200 (192GB HBM3e on SXM5; some providers list 180GB variants): 8,000 GB/s bandwidth. CloudRift benchmarks (February 2026): up to 4.9&#215; RTX PRO 6000 throughput in 8-GPU configurations. Pricing as of April 2026: <a href="https://www.runpod.io/">RunPod</a> $4.99/GPU-hr on-demand; <a href="https://spheron.network/">Spheron</a> $6.03 on-demand, $2.18 spot; <a href="https://lambdalabs.com/">Lambda</a> and neo-cloud providers $3&#8211;6/GPU-hr; hyperscalers (AWS, GCP, Azure) $6&#8211;12/GPU-hr on-demand. 
Average across 22 providers: $4.76/GPU-hr (<a href="https://getdeploying.com/">getdeploying.com</a>).</p><p>[39] The local-then-cloud workflow: test on a small model locally (8B on RTX 5090, or reduced batch on Apple Silicon), verify end-to-end, then scale on cloud hardware.</p><p>[40] H100 SXM: ~$2.50&#8211;3.00/GPU-hr. H200: ~$3.00&#8211;4.00/GPU-hr. H200 delivers 1.8&#8211;2.1&#215; H100 throughput on long-context inference. CloudRift benchmarks.</p><p>[41] Provider comparison changes faster than this guide. Current pricing: <a href="https://www.runpod.io/">RunPod</a>, <a href="https://lambdalabs.com/">Lambda</a>, <a href="https://vast.ai/">Vast.ai</a>, <a href="https://together.ai/">Together AI</a>.</p><p>[42] &#8220;Your Parents Paid,&#8221; <a href="https://www.airealist.ai/">The AI Realist</a>.</p>]]></content:encoded></item><item><title><![CDATA[Your Parents Paid]]></title><description><![CDATA[NVIDIA built the world&#8217;s most profitable hardware company by treating its consumer GPUs as a recruitment pipeline. 
Now many recruits are buying Macs.]]></description><link>https://www.airealist.ai/p/your-parents-paid</link><guid isPermaLink="false">https://www.airealist.ai/p/your-parents-paid</guid><dc:creator><![CDATA[Julien Simon]]></dc:creator><pubDate>Fri, 03 Apr 2026 15:54:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ipfq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ipfq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ipfq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 424w, https://substackcdn.com/image/fetch/$s_!Ipfq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 848w, https://substackcdn.com/image/fetch/$s_!Ipfq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 1272w, https://substackcdn.com/image/fetch/$s_!Ipfq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ipfq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8707322,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.airealist.ai/i/193075250?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ipfq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 424w, https://substackcdn.com/image/fetch/$s_!Ipfq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 848w, https://substackcdn.com/image/fetch/$s_!Ipfq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 1272w, https://substackcdn.com/image/fetch/$s_!Ipfq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2b404107-503d-4a6d-a9fc-710c0142a227_2816x1584.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Jensen Huang stood before 20,000 developers at GTC 2026 and said something remarkable about the product line that made NVIDIA a household name. &#8220;GeForce is NVIDIA&#8217;s greatest marketing campaign,&#8221; he told the crowd. &#8220;We attract future customers starting long before you could afford to pay for it yourself. Your parents paid.&#8221; He paused, then repeated it: &#8220;Your parents paid for you to be NVIDIA customers. And every single year, they paid up. 
Year after year after year until someday you became an amazing computer scientist and became a proper customer, a proper developer.&#8221; Then the kicker: &#8220;This is the house that GeForce made.&#8221;[1]</p><p>The audience laughed. They weren&#8217;t supposed to take notes. The product specs tell a different story from the keynote.</p><h2>The house that GeForce built and its tenants</h2><p>In fiscal year 2026, NVIDIA&#8217;s datacenter segment generated $193.7 billion in revenue, roughly 90% of the company&#8217;s total revenue of $215.9 billion.[2] Gaming, the segment that includes GeForce, contributed $16 billion. Seven percent. The company&#8217;s gross margin for the full year was 71.1%.[3] NVIDIA didn&#8217;t just build the house that GeForce made: it evicted GeForce from the master bedroom, converted it to an Airbnb, and moved to a penthouse funded by H100s.</p><p>That financial reality shapes every product NVIDIA ships in ways Jensen didn&#8217;t mention on stage. NVIDIA&#8217;s consumer product line is not engineered to serve its most demanding users. It is engineered to ensure that its most demanding users become datacenter customers. The RTX 5090 has 32 gigabytes of video memory. The next NVIDIA product with enough memory to run a 70-billion-parameter model costs four times as much. The product after that costs more than ten times as much. This is not a gap in the lineup. It is the lineup.[4]</p><p>NVIDIA didn&#8217;t lose the local inference market. It designed a product line that made winning it someone else&#8217;s job.</p><h2>Three layers of segmentation</h2><p>The mechanism has three parts. Each independently routes demand toward NVIDIA&#8217;s highest-margin products. Together, they create a segmentation architecture so precise that it may have inadvertently handed Apple and AMD the fastest-growing consumer AI use case.</p><p>The first layer is the VRAM ceiling. 
The RTX 5090, launched in January 2025, pairs 32 gigabytes of GDDR7 memory with a 512-bit memory bus delivering 1,792 GB/s of bandwidth &#8212; a 78% generational improvement that makes it the highest-bandwidth consumer GPU ever built for workloads that fit in memory.[5] NVIDIA did increase VRAM by a third, from 24 gigabytes on the RTX 4090. The problem is that model sizes have increased faster. A 70-billion-parameter model quantized to 4-bit precision requires roughly 35-40 gigabytes for weights alone, more with long context. It does not fit. A 120-billion-parameter Mixture-of-Experts model requires 60-70 gigabytes. It does not fit. The emerging class of frontier open-weight models &#8212; DeepSeek R1 at 671 billion parameters and Llama 3.1 at 405 billion &#8212; requires memory measured in the hundreds of gigabytes. None of them fit.[6]</p><p>The 32-gigabyte ceiling is not a technical constraint. Samsung&#8217;s 3-gigabyte GDDR7 modules are in mass production. NVIDIA&#8217;s own Founders Edition design video inadvertently showed the RTX 5090 PCB labeled with 3-gigabyte module part numbers.[7] The RTX 5090 laptop variant already ships with 3-gigabyte modules.[8] GamersNexus confirmed during its teardown of the RTX PRO 6000 that the same GB202 die &#8212; identical silicon, slightly more cores enabled &#8212; supports 96 gigabytes using thirty-two 3-gigabyte chips.[9] A 48-gigabyte consumer card is well within NVIDIA&#8217;s engineering capability: the silicon supports it, the modules exist, and the laptop ships with them. NVIDIA chose not to ship it.</p><p>The reason is arithmetic. The RTX PRO 6000, with 96 gigabytes of GDDR7 ECC on the same GB202 die, costs $7,999 to $8,900.[10] Same silicon with 10% more cores. Triple the memory. Four times the price. If NVIDIA shipped a 48-gigabyte RTX 5090, it would cannibalize the professional tier. If it shipped a 64-gigabyte variant, it would threaten the economics of cloud GPU rental. 
Every gigabyte of GDDR7 allocated to a $2,000 consumer card is a gigabyte not generating revenue in an $8,000 workstation card or a $25,000 datacenter GPU. At current DRAM prices &#8212; which surged 171% year-over-year by the third quarter of 2025 &#8212; the allocation math is unambiguous.[11]</p><p>The second layer is the interconnect restriction. The RTX 3090, launched in September 2020, was the last GeForce card to include NVLink, the high-speed GPU-to-GPU interconnect that allows two cards to share memory.[12] When NVIDIA removed it from the RTX 4090, Jensen explained that the I/O area had been repurposed &#8220;to cram in as much AI processing as we could.&#8221;[13] The same decision persisted through Blackwell. Neither the RTX 5090 nor the RTX PRO 6000 has NVLink.[14] The technology exists exclusively on datacenter GPUs &#8212; the H100 at 900 GB/s bidirectional, the B200 at 1,800 GB/s &#8212; which cost $25,000 and up per card.</p><p>Without NVLink, multi-GPU setups on consumer hardware communicate over PCIe 5.0 x16 at roughly 64 GB/s &#8212; about one-fourteenth the bandwidth of H100 NVLink.[15] Tensor parallelism over PCIe still works &#8212; vLLM supports it, and a dual RTX 5090 can run 70B models &#8212; but the communication overhead is severe enough that independent benchmarks found a single RTX PRO 6000 outperforming multi-card consumer setups on large models, simply by avoiding the bottleneck.[16] For most practitioners, single-GPU memory remains the practical ceiling. That ceiling is 32 gigabytes on the RTX 5090 &#8212; or 96 gigabytes if you pay $8,000 for the RTX PRO 6000. The segmentation ladder, again.</p><p>Even a dual PRO 6000 setup &#8212; $16,000 and 192 gigabytes, matching a single B200&#8217;s memory capacity &#8212; delivers roughly a third to a fifth of the B200&#8217;s throughput at less than half its price. 
Even on a cost-per-token basis, the B200 wins by roughly 2&#215;, because GDDR7 over PCIe cannot compete with HBM3e over NVLink.[16]</p><p>The third layer is the bandwidth constraint. When NVIDIA did build a unified-memory device for local AI, it paired 128 gigabytes of memory with 273 GB/s of bandwidth. The DGX Spark &#8212; announced at CES 2025 for $3,000, shipping in October 2025 at $3,999, now $4,699 after a memory-shortage surcharge &#8212; has the capacity.[17] It does not have the speed. The bandwidth limitation likely reflects both the thermal envelope of a 1.1-liter desktop enclosure and the economics of LPDDR5x &#8212; but whatever the cause, the effect is the same. Token generation in LLM inference is memory-bandwidth-bound: the model reads its entire weight matrix from memory for every token produced. At 273 GB/s, the DGX Spark generates tokens at roughly half the rate of a Mac Studio M4 Max (546 GB/s) and a third the rate of a Mac Studio M3 Ultra (819 GB/s).[18]</p><p>John Carmack (yes, that John Carmack) tested his unit in October 2025 and posted the results: &#8220;DGX Spark appears to be maxing out at only 100 watts power draw, less than half of the rated 240 watts, and it only seems to be delivering about half the quoted performance.&#8221;[19] Awni Hannun, lead developer of Apple&#8217;s MLX framework, independently confirmed similar results &#8212; roughly 60 teraflops in matrix operations, well below expectations.[20] A CES 2026 software update improved matters, with NVIDIA claiming up to 2.6&#215; speedups on optimized configurations that use speculative decoding and aggressive quantization. Typical workloads saw 1.3 to 1.4&#215;.[21]</p><p>The Spark reveals NVIDIA&#8217;s priorities. It gives you the memory and the CUDA ecosystem but not the bandwidth, ensuring that anyone who needs both capacity and speed still has to rent datacenter GPUs. 
Jensen positioned the Spark as a prototyping companion for DGX Cloud, which is exactly what a funnel would look like if it weighed two pounds and sat on your desk.[22]</p><p>A more creative use of the Spark came from outside NVIDIA. In October 2025, EXO Labs &#8212; a small open-source distributed inference project &#8212; wired two DGX Sparks to a Mac Studio M3 Ultra and split the inference workload between them. The Sparks handled prefill, the compute-intensive phase in which a long input prompt is processed via large matrix multiplications. The Mac handled decode, the bandwidth-heavy phase where tokens are generated one at a time. The result was a 2.8&#215; speedup over the Mac Studio alone, with each device contributing exactly the capability the other lacked: the Spark&#8217;s 100 teraflops of FP16 compute for prefill, the Mac&#8217;s 819 GB/s bandwidth for decode.[23] This is disaggregated inference &#8212; the same architectural principle that AWS and Cerebras announced at datacenter scale in March 2026, using Trainium for prefill and the Cerebras wafer-scale engine for decode.[24] EXO demonstrated it on consumer desktops connected by standard 10 Gigabit Ethernet for under $10,000.</p><p>The structural irony is precise. NVIDIA is building disaggregated inference into its next-generation Rubin CPX datacenter platform &#8212; compute-dense processors for prefill, HBM-rich GPUs for decode, and NVLink 6.0 connectivity.[25] The architecture NVIDIA is building its next datacenter generation around already works on a desk, across vendor boundaries, orchestrated by a twenty-person startup in London. The Spark isn&#8217;t a bad standalone product. It&#8217;s half of an excellent hybrid, and the other half is a Mac.</p><h2>The DRAM shortage locked it in</h2><p>The segmentation strategy might have softened over time &#8212; a 48-gigabyte RTX 5090 Super was widely rumored for 2026 &#8212; if the memory market hadn&#8217;t intervened. 
DRAM contract prices surged 171% year-over-year by Q3 2025, driven by datacenter demand for DDR5 and high-bandwidth memory cannibalizing total wafer capacity.[26] NVIDIA reportedly cut GeForce GPU production by 30 to 40 percent in early 2026.[27] The 16-gigabyte RTX 5060 Ti was at risk of discontinuation because rising memory costs were making low-margin consumer SKUs uneconomical.[28]</p><p>The shortage converted a product strategy into a supply constraint. At current prices, memory accounts for the majority of the bill-of-materials cost on high-end consumer GPUs.[29] Every 3-gigabyte GDDR7 module allocated to a hypothetical $2,000 consumer card is a module that could instead ship in an $8,000 professional card or a $25,000 datacenter product. NVIDIA&#8217;s allocation committee &#8212; if such a thing exists &#8212; would have to be economically irrational to prioritize the consumer tier. The shortage is expected to persist through 2027 at minimum, with some analysts projecting normalization no earlier than 2028.[30]</p><p>NVIDIA&#8217;s product segmentation creates a vacuum. The DRAM shortage prevents NVIDIA from closing it. The 32-gigabyte ceiling is now both a choice and a constraint.</p><h2>What filled the vacuum</h2><p>Apple didn&#8217;t set out to build the best local inference platform. Most practitioners still run models that fit in 32 gigabytes, and for them, the RTX 5090 is unmatched. But the capability frontier is moving toward 70B and above, and the sovereignty use case concentrates at exactly those model sizes: the models powerful enough to handle sensitive medical, legal, and financial workloads are the models that don&#8217;t fit on a consumer NVIDIA card. The unified memory architecture that makes Apple Silicon exceptional for large language models was designed for a different problem entirely &#8212; eliminating the CPU-GPU memory copy overhead that drained laptop battery life and slowed video editing workflows. 
But the same design that lets Final Cut Pro share memory buffers seamlessly between CPU and GPU also means that a Mac Studio with 128 gigabytes of unified memory has, functionally, 128 gigabytes of VRAM. No bus to cross. No copy overhead. Every byte is accessible to both the CPU and the GPU&#8217;s matrix multiplication units at full bandwidth.[31]</p><p>The numbers are specific. The Mac Studio M3 Ultra delivers 819 GB/s across its memory bus &#8212; three times that of the DGX Spark, and faster per dollar than anything NVIDIA sells below the datacenter tier.[32] The Mac Studio M4 Max offers 128 gigabytes at 546 GB/s for $3,699 &#8212; twice the Spark&#8217;s bandwidth at a lower price.[33] The MacBook Pro M5 Max, shipping since early 2026, offers 128 gigabytes of unified memory and 614 GB/s of bandwidth in a laptop form factor.[34] Apple&#8217;s M5 generation added dedicated Neural Accelerators in every GPU core &#8212; purpose-built matrix-multiplication hardware that delivers 3.3 to 4.1 times faster prompt processing than the M4 generation on equivalent workloads.[35] Token generation, the bandwidth-bound phase, improved by 19 to 27 percent &#8212; closely matching the 28% memory bandwidth increase between the base M5 and base M4.[36] Two different mechanisms, one confirmation: for decode-heavy inference, bandwidth is the bottleneck, and Apple is shipping more of it every year.</p><p>The software ecosystem matured with startling speed. Apple&#8217;s MLX framework, released in December 2023, reached version 0.31.1 with roughly biweekly releases and 23,900 GitHub stars.[37] Most Mac practitioners today run models through llama.cpp&#8217;s Metal backend &#8212; hardware-agnostic, NVIDIA-independent, but not Apple-controlled. 
In March 2026, Ollama &#8212; the most popular tool for running LLMs locally &#8212; began transitioning its Apple Silicon backend from llama.cpp to MLX, with a preview release showing 57% faster prefill and 93% faster token generation on initial supported models.[38] The full rollout is expected in Q2 2026. When it arrives, the default path for running an open-weight model on a Mac will increasingly route through Apple&#8217;s own inference framework.</p><p>Whether Apple planned this matters less than what it did next. Multiple sources describe the LLM advantage as initially coincidental, a side effect of laptop chip architecture decisions.[39] But Apple has since leaned in hard. The M3 Ultra was explicitly marketed as running &#8220;LLMs with over 600 billion parameters.&#8221;[40] M5 added dedicated matrix multiplication hardware. macOS 26.2 enables Thunderbolt 5 clustering of multiple Mac Studios for combined memory pools exceeding a terabyte.[41] The trajectory has shifted from architectural accident to competitive strategy. Apple can afford to sell 128 gigabytes of GPU-accessible memory at consumer prices because it has no datacenter GPU business to cannibalize. The structural asymmetry is the advantage: NVIDIA must protect $194 billion in datacenter revenue; Apple must protect nothing.</p><p>AMD attacked from a different direction. The Ryzen AI Max+ 395, codenamed Strix Halo, packs 128 gigabytes of LPDDR5x unified memory into a mini PC that costs $2,000 &#8212; less than half the DGX Spark, less than half the equivalent Apple Silicon.[42] The bandwidth is lower: 256 GB/s theoretical, roughly 212 GB/s measured, which makes dense 70-billion-parameter models painfully slow at 3-5 tokens per second.[43] But the emerging class of Mixture-of-Experts architectures &#8212; where only a fraction of the total parameters are active per token &#8212; plays to Ryzen&#8217;s strengths. 
A 30-billion-parameter MoE model with 3 billion active parameters runs at around 50 tokens per second. Llama 4 Scout, with 109 billion total parameters, manages roughly 15 tokens per second.[44] Usable.</p><p>The software story is rougher. AMD&#8217;s ROCm stack remains a source of friction. Vulkan, the open graphics API, now outperforms ROCm on Strix Halo for many llama.cpp workloads. AMD itself used Vulkan for its GTC 2026 benchmark comparisons against the DGX Spark.[45] This effectively sidesteps AMD&#8217;s software maturity problem for inference &#8212; the one workload where CUDA&#8217;s moat is thinnest. Qualcomm&#8217;s Snapdragon X Elite brings similar unified LPDDR5x memory to Windows laptops, though benchmark data at 70B+ scales remains limited.[46]</p><h2>The ecosystem compounds</h2><p>The deeper consequence is not that Apple and AMD are selling hardware. It is that each sale weakens CUDA&#8217;s gravitational pull at the inference layer.</p><p>CUDA&#8217;s dominance in AI is real and earned. PyTorch, DeepSpeed, Unsloth, TRL &#8212; virtually every training framework is optimized for NVIDIA first, with alternatives months or years behind.[47] Porting a codebase from CUDA to ROCm typically requires modifying 15 to 20 percent of the code and three to six months of optimization work.[48] For training, the moat is deep and getting deeper.</p><p>But inference is not training. Running a pretrained model does not require custom CUDA kernels. It requires loading weights into memory and multiplying matrices &#8212; operations that llama.cpp, MLX, and Vulkan handle on any hardware. Every developer who downloads Ollama on a Mac Studio, every startup that deploys a Ryzen AI Max+ mini PC for edge inference, every enterprise that builds a compliant local cluster has learned to run models without CUDA. They haven&#8217;t left the NVIDIA ecosystem for training. 
But they&#8217;ve discovered that inference &#8212; the workload that will eventually dwarf training in market size &#8212; doesn&#8217;t require it.[49] This doesn&#8217;t eliminate NVIDIA dependency; it bifurcates it. Training stays on CUDA. Inference increasingly doesn&#8217;t. The question is which half of the workflow grows faster.</p><p>This is the pattern I described in &#8220;Open Source, Closed Orbit&#8221;: NVIDIA&#8217;s ecosystem strategy works by routing community adoption through hardware-dependent infrastructure.[50] The Black Hole pulls everything toward NVIDIA silicon. Local inference, running through hardware-agnostic frameworks, is the first workload category where the gravity is measurably weakening. Not because anyone built a better CUDA. Because the workload doesn&#8217;t need CUDA at all.</p><p>The compounding accelerates when privacy is factored into the calculation. Forty-four percent of organizations cite data privacy as the top barrier to LLM adoption.[51] HIPAA violations can result in fines of up to $2.1 million per incident. The EU Data Act took effect in September 2025. The US CLOUD Act&#8217;s compelled disclosure provision means that any inference workload running on a US cloud provider&#8217;s infrastructure is, in principle, accessible to a US court order &#8212; regardless of where the server sits physically.[52] For a European hospital, a defense contractor, or a financial institution running models on patient data, contract terms, or trading signals, local inference is not a cost optimization. It is a compliance requirement. For individual practitioners and small teams, a Mac Studio solves this today. For enterprises with regulatory audit requirements, local hardware is necessary but not sufficient &#8212; fleet management, monitoring, and certification infrastructure are still missing from Apple&#8217;s offering.</p><p>NVIDIA&#8217;s product line prices that compliance requirement into the segmentation ladder. 
A CTO who needs private 70-billion-parameter inference has three NVIDIA options: a 32-gigabyte RTX 5090 that cannot run the model, a $4,699 DGX Spark that can run it slowly, or cloud GPU rental that puts the data on someone else&#8217;s infrastructure &#8212; defeating the purpose. The fourth option is a $3,699 Mac Studio that runs 70B locally at usable speed with no data leaving the building. The sovereignty premium &#8212; the additional cost of keeping inference private &#8212; is not set by the physics of silicon. It is set by NVIDIA&#8217;s product segmentation. Apple and AMD make it cheaper because they have no datacenter business pushing practitioners toward the cloud.[53]</p><h2>What Jensen would say</h2><p>Jensen would not dispute the segmentation. He announced it. His rebuttal would be more precise: the DGX Spark gives you 128 gigabytes with full CUDA compatibility, 200 Gbps RDMA networking for clustering, and a direct path to DGX Cloud &#8212; the entire stack, on your desk, for $4,699. The bandwidth limitation is a trade-off for thermals and form factor, not a deliberate throttle. And cloud GPU rental at $0.69 per hour for an RTX 5090 makes local ownership unnecessary for most practitioners.</p><p>The first two points are defensible. The Spark is a genuine product with a genuine use case &#8212; CUDA prototyping at model scales that don&#8217;t fit on consumer GPUs. The RDMA clustering is technically impressive, though multi-Spark clustering benchmarks for 70B+ inference have not been independently published. The third point &#8212; cloud rental &#8212; deserves scrutiny. A cloud RTX 5090 at $0.69 per hour costs about $600 per month at 24/7 utilization, or about $6,000 per year with a savings plan.[54] A Mac Studio M4 Max costs $3,699 once. The break-even for always-on local inference is measured in months, not years. 
A January 2026 study found consumer hardware breaking even against API pricing in 15 to 118 days at moderate volume.[55]</p><p>Cloud rental is cheaper for intermittent use; local hardware is cheaper for anything resembling a production workload. The caveat is organizational: buying a Mac is a hardware decision, but deploying it as inference infrastructure means retraining an engineering team that learned on CUDA and integrating devices that most IT departments have never managed at scale. The economics push practitioners toward owning hardware. NVIDIA&#8217;s product line pushes them toward renting someone else&#8217;s.</p><h2>What would have to break</h2><p>The segmentation thesis breaks down under three conditions.</p><p>First, NVIDIA ships a consumer GPU with 48 gigabytes or more of VRAM before the M5 Ultra arrives. A rumored RTX 5090 Super with 48 gigabytes of GDDR7 would close the gap for 70-billion-parameter models. If it arrives at the $2,000 to $2,500 price point with the RTX 5090&#8217;s 1,792 GB/s bandwidth, the value proposition against Apple Silicon reverses for that model tier. The DRAM shortage makes this unlikely before late 2026 at the earliest, but it remains the most direct competitive response.[56]</p><p>Second, NVIDIA re-enables NVLink or an equivalent high-speed interconnect on consumer cards. This would allow practitioners to pool VRAM across multiple GPUs at datacenter-comparable speeds. The incentive against this is structural: every consumer NVLink bridge sold is an H100 not rented. NVIDIA has moved in the opposite direction for three consecutive GPU generations.</p><p>Third, the CUDA moat extends into inference. If NVIDIA ships inference-specific optimizations &#8212; through TensorRT-LLM, NIM, or a CUDA-exclusive quantization format &#8212; that make the performance gap between CUDA and llama.cpp/MLX too large to ignore, practitioners return to NVIDIA hardware regardless of memory capacity. 
The DGX Spark&#8217;s CES 2026 software update, which delivered meaningful speedups through TensorRT-LLM and speculative decoding, suggests NVIDIA is pursuing this path.[57] But the update also demonstrated the strategy&#8217;s limitation: software optimizations can improve throughput within the bandwidth constraint, but cannot eliminate the constraint itself. At 273 GB/s, no amount of software makes the Spark faster than hardware with three times the bandwidth.</p><p>The most likely outcome is coexistence. NVIDIA dominates training and high-throughput production inference in the datacenter. Apple dominates personal and small-team local inference through memory capacity and ecosystem maturity. AMD competes on price at the entry tier. The local inference market grows despite NVIDIA&#8217;s product line, not because of it &#8212; because that product line is optimized for a $194 billion datacenter business that dwarfs any revenue a 48-gigabyte consumer card could generate.</p><p>Intuition, if not logic, points to a place Apple hasn&#8217;t been since discontinuing its Xserve rack-mounted servers in 2011.[58] Unified memory, Thunderbolt 5 clustering, MLX, and a silicon advantage at the inference layer add up to a server product that competes with DGX &#8212; not on training, but on private inference at enterprise scale. Tim Cook&#8217;s Apple is unlikely to re-enter the server market. But Cook&#8217;s potential successor is John Ternus, the SVP of Hardware Engineering, who already oversees the silicon and devices, and now the design teams that would build them.[59]</p><p>Jensen was right about one thing. This is the house that GeForce made. He just didn&#8217;t mention that some tenants had moved out, bought a Mac, and stopped paying rent.</p><div><hr></div><h3>Notes</h3><p>[1] Jensen Huang, GTC 2026 keynote, March 16, 2026, SAP Center, San Jose. 
Transcript confirmed by Yahoo Finance, heise.de, 36kr, and <a href="https://www.rev.com/">Rev.com</a>.</p><p>[2] NVIDIA Q4 FY2026 earnings press release (Form 8-K, EX-99.1), filed February 25, 2026, <a href="https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&amp;CIK=1045810&amp;type=8-K">SEC EDGAR</a>. Datacenter revenue: $193.737 billion. Total revenue: $215.938 billion.</p><p>[3] NVIDIA CFO Commentary (Form 8-K, EX-99.2), filed February 25, 2026. Full-year GAAP gross margin: 71.1%. Non-GAAP: 71.3%. The Q3 FY2026 quarterly margin was 73.4%; the full-year figure was lower due to a $4.5 billion H20 inventory charge in Q1 related to China export restrictions.</p><p>[4] NVIDIA RTX 5090: 32GB, $3,500&#8211;4,800 street as of April 2026 (Newegg FE at $3,695, Amazon at $3,899, custom AIB models to $4,800; DRAM shortage has driven prices well above the $1,999 list price). NVIDIA RTX PRO 6000: 96GB, $7,999&#8211;8,900. NVIDIA H100 SXM: 80GB, approximately $25,000&#8211;40,000.</p><p>[5] NVIDIA GeForce RTX 5090 specifications: 32GB GDDR7 on 512-bit bus, 1,792 GB/s bandwidth. RTX 4090 delivered 1,008 GB/s on a 384-bit bus. Improvement: 78%. <a href="https://videocardz.com/">VideoCardz</a>; NVIDIA product page.</p><p>[6] Model sizes at Q4 quantization (approximate): Llama 3.3 70B &#8776; 35&#8211;40GB; Nemotron 3 Super 120B &#8776; 60GB; DeepSeek R1 671B &#8776; 336GB; Llama 3.1 405B &#8776; 203GB. Rule of thumb: BF16 &#8776; 2GB per billion parameters; Q4 &#8776; 0.5GB per billion parameters, plus overhead for KV cache. Note: NVIDIA&#8217;s NVFP4 format (available only on Blackwell GPUs via TensorRT-LLM) can compress a 70B model to approximately 18GB, fitting within the RTX 5090&#8217;s 32GB &#8212; but at a noticeable quality penalty compared to Q4, particularly on reasoning tasks. 
This is a partial escape hatch, not a full solution.</p><p>[7] VideoCardz analysis, citing @unikoshardware: NVIDIA Founders Edition design video showed RTX 5090 PCB labeled with K4VCF322ZC &#8212; a Samsung 3GB GDDR7 module part number. Samsung 2GB and 3GB GDDR7 modules share identical BGA footprints. B-tier source for PCB detail; Samsung module pin compatibility confirmed by Samsung semiconductor product catalog (A-tier).</p><p>[8] RTX 5090 Laptop GPU ships with 24GB (8&#215; 3GB GDDR7 modules). NVIDIA product specifications.</p><p>[9] GamersNexus RTX PRO 6000 Blackwell teardown, June 24, 2025. Confirmed 32 memory positions populated with Samsung 3GB GDDR7 modules (32 &#215; 3GB = 96GB). Die markings confirmed GB202-870-A1 variant. Note: a 48GB desktop consumer card using sixteen 3GB modules would increase DRAM power draw relative to the current sixteen 2GB configuration. Whether the existing VRM and thermal solution accommodate this without modification is unconfirmed &#8212; but the laptop SKU ships with 3GB modules at lower TDP, and the PRO 6000 runs thirty-two 3GB modules at 600W. The constraint is commercial, not physical.</p><p>[10] NVIDIA RTX PRO 6000 Blackwell: 96 GB GDDR7 ECC, 24,064 CUDA cores (188/192 SMs enabled), GB202-870-A1 die. $7,999 retail (Newegg as of March 2026); some configurations to $8,900. The PRO 6000 serves genuine non-AI workstation markets &#8212; CAD, simulation, film VFX &#8212; where ECC memory, ISV certification, and long-lifecycle support justify a premium over consumer cards. The 4&#215; price premium over the RTX 5090 is not pure segmentation, but the memory capacity gap (96GB vs. 32GB) is the feature most relevant to AI inference practitioners, and that gap is a product design choice. 
NVIDIA RTX PRO Blackwell GPU Architecture Whitepaper V1.0; <a href="https://www.thundercompute.com/blog/nvidia-rtx-pro-6000-blackwell-pricing">Thundercompute</a> pricing analysis (February 2026).</p><p>[11] TrendForce Q3 2025 DRAM contract pricing data, reported by XDA Developers: overall DRAM contract prices 171.8% higher year-over-year.</p><p>[12] NVIDIA GeForce RTX 3090, launched September 2020, supported NVLink via the NVLink Bridge accessory. Confirmed: NVIDIA product specifications; Best Buy product listing.</p><p>[13] Jensen Huang, press gaggle following RTX 4090 launch event, September 20, 2022. Reported by Chuong Nguyen, <a href="https://www.windowscentral.com/hardware/nvidia-geforce-rtx-4090-why-no-nvlink">Windows Central</a>, September 21, 2022. Verbatim: &#8220;The reason why we took [NVLink] out was that we needed the I/Os for something else, and so we use the I/O area to cram in as much AI processing as we could.&#8221;</p><p>[14] RTX 5090: no NVLink. ASUS TUF RTX 5090 spec page: &#8220;NVLink/Crossfire Support: No.&#8221; RTX PRO 6000 Blackwell: no NVLink. The official NVIDIA RTX PRO 6000 datasheet lists PCIe 5.0 x16 with no mention of NVLink; Thundercompute teardown analysis confirms communication is limited to the PCIe bus.</p><p>[15] PCIe 5.0 x16: approximately 64 GB/s bidirectional. H100 NVLink: 900 GB/s bidirectional. Ratio: 14&#215;. B200 NVLink: 1,800 GB/s. NVIDIA datacenter GPU specifications.</p><p>[16] CloudRift benchmarks (October 2025, February 2026) comparing RTX 4090, RTX 5090, RTX PRO 6000, H100, H200, and B200 across multiple model sizes using vLLM. For large models requiring multi-GPU tensor parallelism, the single PRO 6000 outperformed multi-card consumer setups because its 96GB avoided PCIe communication entirely. The benchmarker noted: &#8220;consumer-grade GPUs lack NVLink, and tensor parallelism requires extensive PCIe communication, which becomes a bottleneck.&#8221; Dual RTX PRO 6000 vs. 
single B200: both have 192GB, but B200 delivers up to 4.9&#215; the throughput of a single PRO 6000 in 8-GPU configurations at 8K+8K context. For a 2-GPU PRO 6000 setup, the gap narrows to roughly 3&#215; on short-context workloads (bandwidth ratio: B200 at 8,000 GB/s vs. dual PRO 6000 at ~3,000 GB/s after PCIe overhead) and widens to ~5&#215; on long-context workloads. <a href="https://cloudrift.ai/blog/benchmarking-rtx-gpus-for-llm-inference">cloudrift.ai</a>; <a href="https://cloudrift.ai/blog/benchmarking-b200">cloudrift.ai</a>.</p><p>[17] DGX Spark: Announced as &#8220;Project DIGITS&#8221; at CES 2025 (January 6, 2025) at &#8220;starting at $3,000.&#8221; Shipped October 15, 2025 at $3,999 (delayed from original May target). Price raised to $4,699 on February 23, 2026, per <a href="https://forums.developer.nvidia.com/">NVIDIA Developer Forums</a> announcement citing &#8220;worldwide constraints in memory supply.&#8221; Wccftech; WinBuzzer; NVIDIA Developer Forums.</p><p>[18] DGX Spark hardware specifications: 128GB LPDDR5x, 273 GB/s memory bandwidth. NVIDIA DGX Spark User Guide (<a href="https://docs.nvidia.com/dgx-spark/">docs.nvidia.com</a>). Mac Studio M4 Max: 546 GB/s (Apple specifications). Mac Studio M3 Ultra: 819 GB/s (Apple specifications). Bandwidth ratios: M4 Max/Spark = 2.0&#215;; M3 Ultra/Spark = 3.0&#215;.</p><p>[19] John Carmack, <a href="https://x.com/ID_AA_Carmack/status/1982831774850748825">X post</a>, October 27, 2025. Verbatim: &#8220;DGX Spark appears to be maxing out at only 100 watts power draw, less than half of the rated 240 watts, and it only seems to be delivering about half the quoted performance.&#8221; Note: 240W is the external power supply rating. NVIDIA documents the SoC TDP at 140W. Carmack&#8217;s comparison was directionally correct; the TDP distinction is worth noting.</p><p>[20] Awni Hannun, GitHub gist with DGX Spark microbenchmark results, October 2025. Approximately 60 TFLOPS in BF16 matrix operations. 
Independent tester Lance Cleveland reproduced approximately 70 TFLOPS using Hannun&#8217;s methodology.</p><p>[21] NVIDIA Developer Blog, January 2026: &#8220;New Software and Model Optimizations Supercharge NVIDIA DGX Spark.&#8221; Headline claim: up to 2.6&#215; speedup. This peak figure applies to Qwen-235B on a dual DGX Spark configuration using NVFP4 and speculative decoding. Typical single-unit workloads (Qwen3-30B, Stable Diffusion 3.5 Large) saw 1.3&#8211;1.4&#215; improvements. StorageReview; HotHardware.</p><p>[22] NVIDIA positions the Spark alongside DGX Cloud: the product page features &#8220;DGX Spark + DGX Cloud&#8221; workflow integration. NVIDIA product page (<a href="https://www.nvidia.com/en-us/products/workstations/dgx-spark/">nvidia.com/dgx-spark</a>).</p><p>[23] EXO Labs, &#8220;Combining NVIDIA DGX Spark + Apple Mac Studio for 4x Faster LLM Inference with EXO 1.0,&#8221; <a href="https://blog.exolabs.net">blog.exolabs.net</a>, October 15, 2025. Configuration: two DGX Sparks (128GB, 273 GB/s, 100 TFLOPS FP16 each) + one Mac Studio M3 Ultra (256GB, 819 GB/s, 26 TFLOPS FP16). Benchmark: Llama-3.1 8B FP16, 8,192-token prompt, 32 output tokens. Measured speedup: 2.8&#215; over Mac Studio alone. The blog post headline claims &#8220;4&#215;&#8221; but this is a theoretical projection for longer contexts; the measured result is 2.8&#215;. Tom&#8217;s Hardware and Simon Willison both reported the measured figure. Note: all benchmark data originates from EXO Labs; no independent reproduction has been published. The 8B model used fits on each device individually; performance at 70B+ scales requiring combined memory has not been published.</p><p>[24] AWS and Cerebras disaggregated inference partnership announced March 13, 2026. Trainium3 chips handle compute-bound prefill; Cerebras CS-3 wafer-scale engines (44GB SRAM, 21+ PB/s internal bandwidth) handle bandwidth-bound decode. Connected via Amazon Elastic Fabric Adapter. 
Cerebras press release; AWS announcement.</p><p>[25] NVIDIA Rubin CPX announced GTC 2026. Compute-dense Rubin CPX processors for prefill, standard Rubin GPUs with HBM4 for decode, connected via NVLink 6.0. NVIDIA Developer Blog, &#8220;NVIDIA Rubin CPX Accelerates Inference Performance and Efficiency for 1M+ Token Context Workloads,&#8221; March 2026.</p><p>[26] See note 11.</p><p>[27] Production cut: reported by Overclock3D, PC Gamer, Windows Central, and Igor&#8217;sLAB, all citing BoBantang/Benchlife. NVIDIA has not officially confirmed this figure. Igor&#8217;sLAB: &#8220;The reports of a significant reduction in GeForce GPU production are based exclusively on unofficial sources and have not been confirmed.&#8221;</p><p>[28] Overclock3D: NVIDIA reportedly considering discontinuing the 16GB RTX 5060 Ti variant due to GDDR7 cost escalation.</p><p>[29] At current DRAM prices, multiple analysts estimate that memory accounts for 70&#8211;80% of the bill of materials cost for high-VRAM consumer GPUs (the GPU die plus VRAM combined). Historically, VRAM accounted for 30&#8211;40% of the BOM. The inflation-era figure is specific to the current supply crisis. BuySellRam; Quasa.io analysis; VideoCardz.</p><p>[30] IDC, TeamGroup, and Counterpoint Research project DRAM shortages through 2027. Intel CEO and IEEE Spectrum analysis of new fab timelines suggest 2028 or beyond for full normalization. SK Hynix plans to boost DRAM production 8&#215; in 2026, which TweakTown notes &#8220;still won&#8217;t be enough.&#8221;</p><p>[31] Apple Silicon unified memory architecture: CPU, GPU, and Neural Engine share a single memory pool with zero-copy access. No discrete VRAM; all system memory is GPU-accessible. Apple technical documentation.</p><p>[32] Mac Studio with M3 Ultra: up to 256GB unified memory, 819 GB/s memory bandwidth. Starting at $5,599 for the 192GB configuration. 
Apple product specifications (<a href="https://www.apple.com/mac-studio/">apple.com</a>).</p><p>[33] Mac Studio with M4 Max: up to 128GB unified memory, 546 GB/s memory bandwidth. The 128GB configuration requires the 16-core CPU / 40-core GPU chip variant and is $3,499 with a 512GB SSD, $3,699 with a 1TB SSD (apple.com; confirmed by B&amp;H Photo and PetaPixel review, March 2025). EU: &#8364;4,099 / &#8364;4,299. UK: &#163;3,599 / &#163;3,799.</p><p>[34] MacBook Pro with M5 Max: up to 128GB unified memory, 614 GB/s memory bandwidth. <a href="https://www.apple.com/newsroom/">Apple newsroom</a>, March 2026. Apple product specifications.</p><p>[35] Apple Machine Learning Research, &#8220;Exploring LLMs with MLX and the Neural Accelerators in the M5 GPU,&#8221; published November 19, 2025. Prompt processing (time-to-first-token) improvement: 3.33&#215; to 4.06&#215; across six tested models. Token generation improvement: 19&#8211;27%. Benchmarks conducted on base M5 vs. base M4 MacBook Pro (both 24GB configurations).</p><p>[36] Base M5 memory bandwidth: 153 GB/s. Base M4: 120 GB/s. Improvement: 28%. The 19&#8211;27% token-generation improvement, corresponding to a 28% bandwidth increase, confirms the memory-bandwidth-bound nature of LLM decode. Apple ML Research, ibid.</p><p>[37] MLX GitHub repository (<a href="https://github.com/ml-explore/mlx">github.com/ml-explore/mlx</a>): 23,900 stars as of March 2026. Version 0.31.1. Release frequency: approximately biweekly. MLX was first released in December 2023.</p><p>[38] Ollama v0.19.0: released March 27, 2026 (GitHub tag); blog post March 30, 2026 (<a href="https://ollama.com/blog/mlx">ollama.com/blog/mlx</a>). Performance claims: prefill 1,154 &#8594; 1,810 tok/s (57% improvement); decode 58 &#8594; 112 tok/s (93% improvement). These are Ollama-published figures. The MLX backend is described as a &#8220;preview&#8221; &#8212; at launch, only Qwen3.5-35B-A3B is supported. llama.cpp remains the backend for all other models. 
Full rollout expected Q2 2026. Methodological note: the benchmark compared NVFP4 quantization (MLX) against Q4_K_M (llama.cpp); part of the improvement reflects the difference in quantization format, not solely the backend change.</p><p>[39] Multiple sources describe Apple Silicon&#8217;s LLM advantage as initially incidental. Cult of Mac: &#8220;How Apple accidentally made the best AI computer.&#8221; XDA Developers: &#8220;Apple has a sleeper advantage when it comes to local LLMs.&#8221; One investment analyst quoted by a Substack: &#8220;The Mac mini M4 may be the most underanalyzed product in Apple&#8217;s lineup from an AI strategy perspective.&#8221;</p><p>[40] Apple Newsroom, March 2025: M3 Ultra announcement explicitly stated the chip enables running &#8220;LLMs with over 600 billion parameters.&#8221; Apple product marketing (<a href="https://www.apple.com/newsroom/">apple.com/newsroom</a>).</p><p>[41] macOS 26.2 Thunderbolt 5 clustering: enables pooled inference memory across multiple Mac Studios via RDMA. Demonstrated by EXO Labs and community builders. Awesome Agents reported Mac Studio clusters running trillion-parameter models for approximately $40,000 in hardware.</p><p>[42] AMD Ryzen AI Max+ 395 (Strix Halo): 128GB LPDDR5x unified memory, 256 GB/s theoretical bandwidth. Framework Desktop: $1,999 for 128GB configuration. Also available from Beelink GTR9 Pro and GMKtec EVO-X2 at similar prices. 31+ OEM devices announced at CES 2026. AMD product specifications; Framework blog.</p><p>[43] Measured bandwidth: approximately 212 GB/s (LLM Tracker benchmarks). Dense 70B model performance at Q4: 3&#8211;5 tok/s. LLM Tracker; Hardware Corner benchmarks.</p><p>[44] MoE model performance on Ryzen AI Max+ 395: 30B MoE at Q8 &#8776; 50 tok/s; Llama 4 Scout 109B &#8776; 15 tok/s. LLM Tracker; community benchmarks. 
These figures are from community testing and should be treated as approximate.</p><p>[45] AMD used Vulkan llama.cpp for GTC 2026 benchmark comparisons against DGX Spark. Community testers found that Vulkan via the RADV driver outperforms ROCm HIP on Strix Halo for many llama.cpp workloads. GitHub llama.cpp Vulkan performance discussions; AMD blog.</p><p>[46] Qualcomm Snapdragon X Elite: ARM-based SoC with LPDDR5x unified memory (up to 64GB on current configurations). The unified memory architecture is conceptually similar to Apple Silicon &#8212; all memory is GPU-accessible &#8212; but current configurations max out at 64GB, half the Apple and AMD offerings. Benchmark coverage for large LLM inference (70B+) on Snapdragon X Elite is sparse as of publication. The platform is primarily positioned for Windows laptops, not desktop workstations.</p><p>[47] CUDA training ecosystem dominance: PyTorch defaults to CUDA. DeepSpeed, Unsloth, and TRL require CUDA. Apple Silicon has MLX LoRA for basic SFT but lacks GRPO support. AMD ROCm is functional but substantially less mature. This is the consensus among practitioners, documented across multiple sources.</p><p>[48] CUDA-to-ROCm porting effort: 15&#8211;20% codebase modification, 3&#8211;6 months optimization, 10&#8211;20% initial performance penalty. HyperFRAME Research; Introl analysis.</p><p>[49] Jensen Huang, Q4 FY2026 earnings call, February 2026: stated &#8220;the agentic AI inflection point has arrived&#8221; and projected inference would eventually dwarf training in market size. NVIDIA earnings transcript.</p><p>[50] &#8220;Open Source, Closed Orbit: The Hardware Monopolist&#8217;s Guide to Owning Open Source,&#8221; The AI Realist (<a href="https://www.airealist.ai">www.airealist.ai</a>). 
Framework: NVIDIA&#8217;s &#8220;Black Hole&#8221; model (centripetal, routing ecosystem gravity back to NVIDIA hardware) versus Hugging Face&#8217;s &#8220;Sun&#8221; model (centrifugal, hardware-agnostic).</p><p>[51] Privacy as a barrier to LLM adoption: 44% of organizations cited data privacy as the top concern in enterprise LLM deployment surveys. Multiple analyst reports corroborate this range; the specific 44% figure is from Cisco&#8217;s 2024 Data Privacy Benchmark Study, the most recent large-sample study available. HIPAA penalty: maximum $2.1 million per violation category per year under the HITECH Act tiered penalty structure.</p><p>[52] US CLOUD Act compelled disclosure provision (18 U.S.C. &#167; 2713): requires providers of electronic communication or remote computing services subject to US jurisdiction to produce data in their &#8220;possession, custody, or control&#8221; regardless of data location. For a detailed trace of the legal pathway and its implications for cloud-hosted AI workloads, see &#8220;Access, Disable, Destroy,&#8221; The AI Realist (<a href="https://www.airealist.ai">www.airealist.ai</a>). EU Data Act (Regulation 2023/2854) entered into application on September 12, 2025.</p><p>[53] The sovereignty premium framing draws on the cost comparison structure throughout this piece. NVIDIA options for private 70B inference: RTX 5090 (32GB, cannot run the model), DGX Spark ($4,699, runs at ~3&#8211;5 tok/s on 70B), cloud rental ($2&#8211;5/hr, data leaves the building). Apple option: Mac Studio M4 Max ($3,699 with 1TB SSD; runs 70B at Q4, ~8&#8211;15 tok/s; data stays local). 
The price delta between the cheapest NVIDIA option that works (Spark at $4,699) and the Apple option ($3,699) is $1,000, and the performance delta (2&#8211;3&#215; faster on Apple at the bandwidth-bound decode step) means the effective cost of NVIDIA sovereignty is higher than the sticker price suggests.</p><p>[54] Cloud RTX 5090: $0.69/hr on RunPod community cloud (March 2026 pricing). At 24/7 utilization: $0.69 &#215; 24 &#215; 30 &#8776; $497/month, or approximately $6,000/year. <a href="https://www.runpod.io/pricing">RunPod</a>.</p><p>[55] Knoop and Holtmann, &#8220;Private LLM Inference on Consumer Blackwell GPUs: A Practical Guide for Cost-Effective Local Deployment in SMEs,&#8221; <a href="https://arxiv.org/abs/2601.09527">arXiv</a>, January 2026. Found consumer GPU electricity-only inference costs of $0.001&#8211;0.04 per million tokens; break-even against API pricing at 15&#8211;118 days at moderate volume (30 million tokens/day).</p><p>[56] RTX 5090 Super with 48GB GDDR7: widely rumored based on Samsung 3GB GDDR7 module availability and PCB compatibility. Launch reportedly slipped to Q3 2026 or later due to DRAM supply constraints. GameGPU; TweakTown; VideoCardz. Unconfirmed by NVIDIA.</p><p>[57] See note 21. The CES 2026 software update for DGX Spark focused on TensorRT-LLM optimizations and speculative decoding &#8212; CUDA-exclusive techniques that do not benefit Apple Silicon or AMD platforms.</p><p>[58] Apple Xserve: rack-mounted 1U server sold from 2002 to January 31, 2011. When a customer complained about the discontinuation, Steve Jobs replied, &#8220;Hardly anyone was buying them.&#8221; Apple suggested migrating to the Mac Pro Server or the Mac mini Server. Apple does run server-side inference today via Private Cloud Compute (PCC), announced at WWDC 2024 &#8212; but PCC serves Apple&#8217;s own services (Apple Intelligence), not enterprise customers. A rack-mounted inference product for sale would be a fundamentally different market entry. 
<a href="https://en.wikipedia.org/wiki/Xserve">Wikipedia</a>; Macworld, November 5, 2010.</p><p>[59] John Ternus, Apple SVP Hardware Engineering, age 50. Bloomberg (Mark Gurman, March 2026), NYT (January 2026), and multiple outlets identify him as the leading candidate to succeed Tim Cook as CEO. In January 2026, Cook expanded Ternus&#8217;s role to include oversight of hardware and software design teams, robotics, and product marketing &#8212; in addition to his existing responsibility for all hardware engineering, including iPhone, iPad, Mac, and AirPods. Ternus was the face of the MacBook Neo launch, a role Cook has historically reserved for himself.</p>]]></content:encoded></item></channel></rss>