08 May 2026 Weekly AI Drop 3 min read

Weekly AI Drop #3

Google silently installed an AI model on a billion devices. SpaceX rented its supercomputer to Anthropic. A 4-person Miami startup says it is outperforming Claude at under 5% of the cost.

In 60 seconds:

Chrome silently installed Gemini Nano (4GB on-device model) on a billion machines.
Anthropic × SpaceX: Anthropic rented all of Colossus 1 (300 MW, 220k GPUs) — Claude rate limits already doubled.
Subquadratic (SubQ) came out of stealth: 12M token context, 52× faster than FlashAttention, <5% of Claude’s cost.
Anthropic launched 10 financial-services agent templates for M365 (GL Reconciler, KYC, Month-End Close…).

A 4-person startup from Miami says it just outperformed Claude at under 5% of the cost. Google silently downloaded an AI model onto a billion devices. And SpaceX just rented its supercomputer to Anthropic.

Happy Friday and welcome to this week in AI. If you missed last week’s drop, catch up there first — context compounds.

🕵️ Google secretly installed AI on your computer

Chrome has been downloading a 4GB AI model called Gemini Nano onto your device. Without asking. Without telling you. If you delete it, Chrome re-downloads it automatically. 1 billion Chrome users. 4 billion GB pushed silently across the internet.
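The 4 billion GB figure is simple multiplication. Here is a quick sketch of the arithmetic, assuming every one of the 1 billion users receives exactly one 4 GB download and using decimal units (1 EB = 1 billion GB). The variable names are illustrative only.

```python
# Back-of-the-envelope check of the "4 billion GB" figure.
# Assumptions (ours, not Google's): one download per user, decimal units.
CHROME_USERS = 1_000_000_000     # 1 billion users, as stated above
MODEL_SIZE_GB = 4                # Gemini Nano download size, as stated above
GB_PER_EXABYTE = 1_000_000_000   # decimal convention: 1 EB = 10^9 GB

total_gb = CHROME_USERS * MODEL_SIZE_GB
total_eb = total_gb / GB_PER_EXABYTE

print(f"{total_gb:,} GB = {total_eb:g} EB")  # prints "4,000,000,000 GB = 4 EB"
```

Re-downloads after deletion would push the real total higher, and devices that never receive the model would pull it lower.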

👊🏻 So what: On-device LLMs are coming. That is a big step away from today’s centralized AI, much like the shift from mainframes to personal computers.

🤝 Anthropic x SpaceX: the deal nobody saw coming

Anthropic just signed a deal to rent all compute capacity at SpaceX’s Colossus 1 data center. 300+ megawatts. 220,000 GPUs. Available within the month.

👊🏻 So what: Anthropic was capacity-constrained. Now they’re not. Limits have already gone up for Claude Code and API users.

Claude Code rate limits doubled. Peak-hour throttling removed for Pro and Max. API rate limits for Opus models up 900% on output tokens. The number one complaint about Claude was “I hit my limit.” That just got a lot harder to do.
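“Doubled” and “up 900%” are very different jumps. Here is a small sketch that converts a percentage increase into a multiplier, assuming “up 900%” means a 900% increase over the old limit (our reading, not a confirmed figure). The function name is hypothetical.

```python
def multiplier_from_percent_increase(percent: float) -> float:
    """Convert 'up N%' into the factor the old limit is multiplied by."""
    return 1 + percent / 100


# "Doubled" is a 100% increase; the Opus output-token figure is quoted as 900%.
claude_code = multiplier_from_percent_increase(100)
opus_output = multiplier_from_percent_increase(900)

print(claude_code, opus_output)  # prints "2.0 10.0"
```

Under that reading, the Opus output-token limit is ten times the old one, not nine.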

🧪 SubQ: a 4-person startup just set a whole new bar

Subquadratic came out of stealth with SubQ. New architecture. 12 million token context window. 52x faster than FlashAttention at 1M tokens. Less than 5% of the cost of Claude.

👊🏻 So what: Still needs independent verification. But if these numbers hold, the economics of long-context AI just changed permanently. A team of 4. $29M seed. Watch this space.

🏦 Claude launched financial services agents

10 ready-to-run agent templates: GL Reconciler, KYC Screener, Valuation Reviewer, Month-End Closer, Statement Auditor, and more. Ships as plugins, works across Microsoft 365.

👊🏻 So what: This is Anthropic going vertical. Not “AI for finance.” Actual agents doing actual finance work. If you’re in finance or financial services and not piloting this, your competitor is, and you are falling behind.

The real gap isn’t between companies that read the news and those that don’t. It’s between leaders who put these tools in their teams’ hands TODAY and those still waiting for permission from IT. 🤦🏼‍♂️

Bottom line for leaders: two things just got dramatically harder to ignore. One is on-device AI (your laptop is now an inference machine). The other is vertical agents (Claude doing month-end close, not “AI for finance”). If your AI strategy still reads “explore tools,” it’s already outdated. The framework I use to help leaders close that gap is Executive AI Fitness, and for teams that need to go deeper, an AI Operator Sprint that ships running workflows inside your business.

🌶️ Which one was the spiciest? Google’s silent install? The SpaceX x Anthropic plot twist? 👇