Big Trends & Ecosystem Shifts 🌎
Project Glasswing's first update shows that Claude Mythos Preview, running with ~50 partners, surfaced 10,000+ high- and critical-severity vulnerabilities in a single month. With under 1% patched so far, the bottleneck has flipped from finding bugs to fixing them. Notably, Mythos is the only LLM that was able to find most of these vulnerabilities in the first place.
Microsoft is canceling most internal Claude Code licenses, moving thousands of engineers to GitHub Copilot CLI by June 30. Claude Code simply got too popular internally, undermining adoption of Microsoft's own tool while token costs continued to skyrocket.
Dueling appearances this week put the two labs in opposing camps on the question everyone is asking: does AI gut white-collar work or supercharge it? Anthropic leaned into the disruption framing; OpenAI pushed optimism. Expect "jobs" to harden into a positioning battle as much as a research debate.
Developer Tools 🛠️
A new natural-experiment study finds Copilot availability boosted contributions 28–40%, but the gains skew toward incremental and maintenance work over substantive, capability-creating code. Translation: AI clears the backlog well, the hard architectural calls less so. A useful counterweight to the raw productivity numbers.
Artificial Analysis and IBM launched ITBench-AA, the first benchmark for agentic enterprise IT work, starting with Kubernetes incident response. Every frontier model scored under 50%, and more digging often hurt, as over-eager agents flagged false root causes. A rare unsaturated benchmark, and a reality check on "autonomous" SRE.
Best Upcoming Events

🗽 Next week is #NYTechWeek! We found the best engineering event for you to attend each day:
Monday: Slack x Amplitude: AI at Work
Tuesday: Future of DevEx
Wednesday: AI Builders Night
Thursday: Agents at Scale
Friday: Finance Agents Demo Night
See you there!

