The Week AI Marked Its Own Homework

Anthropic shipped Claude Opus 4.8 just 41 days after 4.7, OpenAI started selling bigger ChatGPT ad units, Google quietly let users label their own "Preferred" sources inside AI answers, and Marc Benioff defended Salesforce's habit of marketing AI it hasn't shipped yet. The AI platforms spent the week grading their own work — and the marketing stack noticed.

1 Anthropic shipped Claude Opus 4.8 in 41 days — its fastest cycle yet

On May 28, Anthropic released Claude Opus 4.8 just 41 days after Opus 4.7, the company's fastest flagship cadence. The release added a 2.5x faster "fast mode" at 3x lower cost than prior models, dynamic workflows capped at 1,000 subagents, and adjustable effort settings letting customers tune cost vs. depth. Opus 4.8 scored 69.2% on SWE-Bench Pro, ahead of GPT-5.5 and Gemini 3.1 Pro on multiple benchmarks.

The marketing-stack implication isn't about coding benchmarks; it's about contract terms. If your CDP, MMM, or agentic creative vendor is built on Claude (and a growing number are), the underlying model just got faster and cheaper without a price change. That's a real margin gift. Use the next vendor renewal to push for those savings to flow through, especially on per-token or per-agent pricing. The model layer is deflating; your software bill shouldn't be inflating against it.

2 OpenAI started testing bigger, richer ad units in ChatGPT

OpenAI began testing upgraded ad formats inside ChatGPT, including larger images, customisable call-to-action buttons, and dedicated e-commerce layouts featuring pricing and customer reviews. The new units sit on top of the self-serve Ads Manager OpenAI opened to all US businesses on May 5, when it dropped the $50K minimum, added CPC bidding from $3–$5, and released a Conversions API.

This is the second act of ChatGPT advertising and it's the one that matters. Phase one was access — "any US business can run an ad." Phase two is performance — "the unit actually drives commerce." If you're a DTC brand, e-comm units inside ChatGPT are the closest thing to a Google Shopping equivalent in an answer engine. Test with a small budget now; the CPMs at $25 and CPC bids around $3 won't last forever.
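For a rough sense of when the CPM buy beats the CPC bid, the arithmetic can be sketched as below. This is a minimal illustration, not OpenAI's pricing logic: the function names are hypothetical, and the only inputs taken from the text are the $25 CPM and the roughly $3 CPC bid. The click-through rate is something you would have to measure in your own test.

```python
def effective_cpc(cpm: float, ctr: float) -> float:
    """Cost per click when buying on a CPM basis.

    cpm: cost per 1,000 impressions, in dollars.
    ctr: click-through rate as a fraction (0.01 means 1%).
    """
    if ctr <= 0:
        raise ValueError("ctr must be positive")
    return cpm / (1000 * ctr)


def breakeven_ctr(cpm: float, cpc: float) -> float:
    """CTR at which a CPM buy costs the same per click as a CPC bid."""
    if cpc <= 0:
        raise ValueError("cpc must be positive")
    return cpm / (1000 * cpc)


CPM = 25.0      # figure quoted in the text
CPC_BID = 3.0   # low end of the quoted $3-$5 range

ctr_needed = breakeven_ctr(CPM, CPC_BID)
print(f"Break-even CTR: {ctr_needed:.2%}")

# Hypothetical measured CTR of 1%, purely for illustration.
print(f"Effective CPC at 1% CTR: ${effective_cpc(CPM, 0.01):.2f}")
```

Under these figures, a unit that clears a CTR of a little under 1% is cheaper per click on CPM than on a $3 CPC bid; below that, the CPC bid is the safer buy.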

3 Google let users pick their own "Preferred" sources in AI answers

Google rolled out new AI Search features letting users designate favoured websites whose links receive visible "Preferred" badges inside AI-generated responses, alongside new "Highly Cited" labels and article carousels intended to surface original reporting. Google said users are twice as likely to click links carrying the Preferred badge.

This is Google's quiet response to the trust problem of AI Overviews. By letting consumers nominate sources, Google offloads editorial trust to the user while keeping the answer in its own surface. For brands and publishers, the practical move is a new earned-media job: convince loyal readers to mark you Preferred. SEO has spent twenty years optimising for the algorithm; the next playbook adds optimising for the reader's badge.

4 Marc Benioff defended marketing AI Salesforce hasn't shipped yet

Salesforce CEO Marc Benioff spent the week defending the company's practice of marketing AI capabilities that aren't broadly deployed by customers, arguing forward-looking promotion has long been standard across tech. The same week, Fortune reported Salesforce is keeping engineering teams "slim" thanks to AI, while still hiring aggressively into sales.

The Benioff defense is unintentionally honest about where the AI hype cycle is in B2B SaaS: the demos run, the deployments lag. Marketers buying "agentic" anything should ask for two numbers before signing: how many customers in your industry are running this in production, and what the median time is from signed contract to live deployment. If the answers are "early adopters" and "depends," you're buying a demo, not a product.

5 DarkIris shipped a video platform on ByteDance's Seedance 2.0

On May 29, DarkIris Inc. (Nasdaq: DKI) launched video.aideptus.com, an AI video generation platform built on ByteDance's Seedance 2.0 model. Use cases: short-video production, pre-visualisation, and pre-production for advertising and marketing agencies looking to compress concept-to-asset workflows.

The DarkIris launch matters less for what it is than for what it represents: Seedance is now the underlying engine for third-party advertising-creative tools, the same way Stable Diffusion underpinned a generation of static-image generators. If your agency stack has standardised on Runway or Veo for video, Seedance is now a credible third option, particularly for short-form Asia-facing content. Diversify the model layer in your creative pipeline before any single provider gets to set the price.

6 Google tested branded vs. non-branded controls inside AI Max

Quietly during the week, Google rolled out tests of new branded search controls in AI Max campaigns, finally giving advertisers a way to separate branded vs. non-branded traffic types — a sore point since AI Max launched. Marketers had been raising the issue for months on Twitter/X and in earnings transcripts.

The fix is small in feature terms and huge in measurement terms. Without a brand/non-brand split, AI Max has been over-crediting itself with conversions that would have happened anyway on brand queries. The new controls let savvy buyers strip out that distortion and see what AI Max actually drove on its own. Re-baseline your AI Max budget once the controls hit your account — the incremental number is going to look smaller, and that's the honest one.
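The re-baselining step can be sketched as simple arithmetic. The numbers below are invented for illustration, the class and function names are hypothetical, and the sketch makes one loud assumption: that every brand-query conversion would have happened anyway. Real incrementality on brand terms is rarely exactly zero, so treat the non-brand figure as a conservative view rather than a precise one.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    spend: float       # dollars spent on this traffic type
    conversions: int   # conversions the platform attributes to it


def cpa(segment: Segment) -> float:
    """Cost per acquisition for one segment."""
    if segment.conversions <= 0:
        raise ValueError("segment has no conversions")
    return segment.spend / segment.conversions


def rebaseline(brand: Segment, non_brand: Segment) -> dict:
    """Compare the blended CPA with the non-brand-only CPA.

    Assumption: brand conversions are treated as non-incremental.
    """
    blended = Segment(
        spend=brand.spend + non_brand.spend,
        conversions=brand.conversions + non_brand.conversions,
    )
    return {
        "blended_cpa": cpa(blended),
        "non_brand_cpa": cpa(non_brand),
        "brand_share_of_conversions": brand.conversions / blended.conversions,
    }


# Illustrative numbers only, not from any real account.
report = rebaseline(
    brand=Segment(spend=2_000.0, conversions=200),
    non_brand=Segment(spend=8_000.0, conversions=200),
)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

In this made-up account the blended CPA is $25 while the non-brand CPA is $40, and half of the reported conversions came from brand queries. That gap is the distortion the new controls let you see.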

7 Ravelin says 44% of enterprise merchants are integrating AI shopping agents

Fresh research from Ravelin dropped this week: 44% of enterprise e-commerce merchants say they're already integrating AI shopping agents into their experience — not piloting, integrating. That's a step-change from the ~10% running through pilots in late 2025.

If almost half your competitive set already has an AI shopping agent (or a partnership with one) live on the site, your roadmap question isn't "should we?" It's "which?" And the secondary question is the operational one: how do your support, returns, and CX teams interact with a chat surface that's now also a sales surface? The merchandising-and-CX silo finally has to merge, and brands that hold the old org chart will lose the chat to better-integrated competitors.

8 The compliance wall is forming — Colorado AI Act, EU rules close in

Two deadlines moved closer this week and most marketers haven't priced them in. The Colorado AI Act becomes enforceable in June 2026 and covers consequential decisions. The EU's high-risk AI rules take effect in August 2026. For US brands selling into either, AI agents making consequential decisions (pricing, eligibility, credit) need documented bias testing, transparency notices, and human-review paths.

This is the boring half of agentic marketing, the half that becomes a problem only when the regulator emails. The good news: most of what's required is documentation that should already exist (training data sources, model evaluation, opt-out flows). The bad news: most marketing teams don't have it. Get a one-pager per AI surface in your stack covering inputs, decisions, and audit trail. Teams that have one in June will spare themselves a fortune in August; teams that don't will pay it.

9 Twitch came back to brand budgets with measurement to match

Ahead of TwitchCon Rotterdam this weekend, Twitch rolled out new real-time measurement capabilities for advertisers, giving buyers per-stream attention data and stronger creator-led attribution. Coupled with the platform's continuing creator commerce push, Twitch is positioning itself as a serious mid-funnel buy, not just a top-of-funnel awareness bet.

Twitch has spent three years being underbought relative to its attention share. The new measurement story is the missing piece for advertisers who couldn't get TV-grade metrics back from a creator-led stream. If your brand has a Gen Z target and your budget split still skews to YouTube and TikTok by 5:1, this is the cycle to test reallocation. The CPMs are still soft and the measurement is finally maturing.

10 The week's pattern: AI platforms are now grading themselves — demand the receipts

String the week together and one pattern dominates: AI platforms shipping their own capability upgrades (Opus 4.8), their own ad units (OpenAI), their own quality labels (Google Preferred), and their own forward-looking marketing claims (Salesforce). Each platform is, in effect, marking its own homework.

The marketer's defensive playbook is unchanged but increasingly urgent: independent measurement, third-party verification, and apples-to-apples incrementality tests. If a platform's own data says it's the best place for your money, it usually is — for the platform. Build a budget allocation that runs at least one external test per quarter against the platform's own claims. The platforms that are right will look fine; the ones that aren't will reveal themselves. Either way, you'll be ready.
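One way to run that external check is a plain holdout test: expose one group to the platform's ads, withhold them from a comparable control group, and compare conversion rates. The sketch below is a minimal version under stated assumptions. The function name and all counts are hypothetical, the groups are assumed to be randomly assigned, and the significance check is a standard two-proportion z-test rather than any platform's own methodology.

```python
import math


def holdout_lift(test_conv: int, test_n: int,
                 control_conv: int, control_n: int) -> dict:
    """Relative lift, incremental conversions, and a z-score
    from a randomised test/control split."""
    test_rate = test_conv / test_n
    control_rate = control_conv / control_n
    if control_rate == 0:
        raise ValueError("control group has no conversions")

    # Pooled rate and standard error for a two-proportion z-test.
    pooled = (test_conv + control_conv) / (test_n + control_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / test_n + 1 / control_n))

    return {
        "relative_lift": (test_rate - control_rate) / control_rate,
        "incremental_conversions": (test_rate - control_rate) * test_n,
        "z_score": (test_rate - control_rate) / std_err,
    }


# Illustrative counts only: 10,000 users per group.
result = holdout_lift(test_conv=300, test_n=10_000,
                      control_conv=250, control_n=10_000)
print(f"Relative lift: {result['relative_lift']:.1%}")
print(f"Incremental conversions: {result['incremental_conversions']:.0f}")
print(f"z-score: {result['z_score']:.2f}")
```

In this invented example the platform could report 300 conversions, but only about 50 are incremental over the control group. The number to bring to the budget conversation is the incremental one, alongside whether the z-score clears the threshold you chose before the test started.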

Sources

  1. Introducing Claude Opus 4.8 — Anthropic
  2. Anthropic releases Opus 4.8 with new 'dynamic workflow' tool — TechCrunch
  3. OpenAI tests upgraded ad formats inside ChatGPT — The Agile Brand Guide
  4. OpenAI ChatGPT Ads Manager self-serve launch — Flyweel
  5. Google Marketing Live 2026: News and announcements (Preferred & Highly Cited) — Google
  6. Marc Benioff: Salesforce keeping engineering slim, still hiring sales — Fortune
  7. DarkIris Announces Commercial Launch of AIGC Video Platform on Seedance 2.0 — FinancialContent
  8. Google tests branded search controls in AI Max & Ravelin agentic e-commerce data — The Agile Brand Guide
  9. Navigating the Agentic Frontier: IAB Tech Lab 2026 Roadmap (compliance context) — IAB Tech Lab
  10. Top Daily Marketing Stories Today — May 30, 2026 (Twitch measurement & Colorado AI Act) — Marketing Agent Blog
