Editor's Note
Welcome to The Future of DevEx - your weekly look at how AI is changing the way we build software.
This week showed how fast developer workflows are being rebuilt around agent-native infrastructure. GitHub is evolving into the operating system for AI agents, new funding rounds are supercharging dev tools, and research is forcing a rethink of model architecture itself. There’s a ton to dig into - here’s what matters most.

Developer Tools 🛠️

At Universe 2025, GitHub unveiled Agent HQ, a hub for deploying and managing AI agents that integrate directly with repos and Actions. With Copilot Workflows, you can now assign coding tasks to Copilot from tools like Slack or Linear. This means fewer context switches and a smoother way to assign, track, and complete repetitive tasks.

New Relic introduced agent-aware observability, plus an MCP server that gives GitHub Copilot, Claude, and ChatGPT direct access to telemetry.
Instead of searching dashboards or pinging ops, your AI tools can now answer questions like “what caused the CPU spike yesterday?” with real data.
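The mechanics are worth a quick sketch. An MCP server exposes tools the model can call against live data; a hypothetical telemetry tool (names and data shapes here are illustrative, not New Relic's actual API) might look like:

```python
from datetime import datetime, timedelta

# Hypothetical MCP-style tool: names and shapes are illustrative,
# not New Relic's actual API.
def find_cpu_spike(samples, threshold=80.0):
    """Return the start, end, and peak of the first window where
    CPU utilization stays above `threshold`, or None if no spike."""
    start = end = peak = None
    for ts, cpu in samples:
        if cpu > threshold:
            if start is None:
                start = ts
            end = ts
            peak = max(peak or 0.0, cpu)
        elif start is not None:
            break  # spike window ended
    if start is None:
        return None
    return {"start": start, "end": end, "peak_cpu": peak}

# Toy telemetry: one-minute CPU samples with a spike in the middle.
base = datetime(2025, 11, 5, 14, 0)
samples = [(base + timedelta(minutes=i), cpu)
           for i, cpu in enumerate([30, 35, 92, 97, 88, 40, 33])]

spike = find_cpu_spike(samples)
print(spike["peak_cpu"])  # 97
```

An agent wired to a tool like this can answer "what caused the CPU spike?" by calling it with real samples instead of guessing from training data.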

RapidFire AI lets you test multiple RAG configurations simultaneously — from chunking to reranking — even on a single laptop.
It drastically speeds up context engineering iteration. Devs can now quickly see which RAG setup delivers the best grounding without needing heavy infra or guesswork.
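As a rough sketch of what sweeping configurations looks like (a toy chunker and overlap score, not RapidFire AI's actual interface): try several chunk sizes and overlaps at once and keep the best-scoring one.

```python
from itertools import product

def chunk(text, size, overlap):
    """Split text into word chunks of `size` words, sharing `overlap` words."""
    words = text.split()
    step = max(1, size - overlap)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def score(chunks, query):
    """Toy relevance metric: best word-overlap between query and any chunk."""
    q = set(query.lower().split())
    return max(len(q & set(c.lower().split())) for c in chunks)

doc = ("Retrieval augmented generation grounds model answers in documents. "
       "Chunking strategy and reranking both change retrieval quality.")
query = "how does chunking change retrieval quality"

# Sweep (chunk_size, overlap) configs in one pass and rank them.
configs = list(product([6, 10, 14], [0, 2]))
results = {(s, o): score(chunk(doc, s, o), query) for s, o in configs}
best = max(results, key=results.get)
print(best, results[best])
```

A real run would swap the toy score for retrieval metrics against a labeled query set, but the shape of the experiment is the same: many configs, one comparison.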

Webflow’s App Gen moves the platform beyond static sites into dynamic, interactive web applications — from prompt to deployment, no handoffs or switching tools. Developers and designers can now generate production-grade apps like calendars, calculators, and job boards directly inside Webflow.

Big Trends & Ecosystem Shifts 🌎

A seven-year pact with AWS gives OpenAI reserved access to EC2 UltraServers and thousands of GB200/GB300 GPUs, with room to scale to millions of CPUs.
Expect OpenAI’s tools to become more persistent, stable, and baked into production environments.

Maybe Siri will finally get good? Apple will license Google’s Gemini for ~$1B/year, running it on Private Cloud Compute while handling sensitive queries on-device.
This cements LLMs as a fixture of everyday user interfaces. For mobile devs, it’s a wake-up call: conversational UI is about to be expected behavior across apps, not a novelty.

Built-in retrieval with citations lets models search your documents directly, with no separate RAG pipeline to stand up. This drastically lowers the barrier for enterprise teams looking to ground chatbots in private data. Engineers can now build grounded assistants faster without wrangling vector stores or custom code.
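The output shape is the useful part: an answer paired with the sources it came from. Here is a toy version using keyword overlap (purely illustrative, not any vendor's built-in retrieval API):

```python
# Toy grounded answer with citations: illustrates the output shape,
# not any vendor's actual built-in retrieval API.
docs = {
    "handbook.md": "Refunds are processed within 14 days of a request.",
    "faq.md": "Support is available weekdays from 9am to 5pm.",
}

def answer_with_citations(question):
    """Pick the document with the most word overlap and cite it."""
    q = set(question.lower().split())
    best = max(docs, key=lambda d: len(q & set(docs[d].lower().split())))
    return {"answer": docs[best], "citations": [best]}

result = answer_with_citations("how long do refunds take to process")
print(result["citations"])  # ['handbook.md']
```

Built-in retrieval replaces the `max`-over-keywords part with real search inside the model stack, but the contract your app consumes (answer plus citations) stays this simple.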

Research and model breakthroughs 🔬

Tsinghua University and WeChat AI introduced CALM (Continuous Autoregressive Language Models), predicting compressed token vectors instead of individual tokens.
The approach slashes generation steps while maintaining output quality, which could mean faster inference, lower cost, and more efficient models for constrained environments.
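The efficiency claim is simple arithmetic: if one continuous vector stands in for K tokens, the autoregressive step count drops roughly K-fold. A back-of-the-envelope sketch (K=4 is an illustrative compression ratio, not a figure from the paper):

```python
def generation_steps(num_tokens, tokens_per_vector=1):
    """Autoregressive steps needed when each step emits one vector
    covering `tokens_per_vector` tokens (ceiling division)."""
    return -(-num_tokens // tokens_per_vector)

baseline = generation_steps(1024)                        # one token per step
calm_like = generation_steps(1024, tokens_per_vector=4)  # one vector per 4 tokens
print(baseline, calm_like)  # 1024 256
```

Fewer steps means fewer sequential forward passes, which is where the latency and cost savings would come from.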

New research showed that diffusion-based LLMs outperform autoregressive ones when training data is scarce. This challenges the long-standing dominance of autoregressive generation and opens doors for smaller teams to train performant models with limited data. Expect easier fine-tuning and better performance on niche tasks.

Kimi K2 Thinking is an open-source agent model designed for long-horizon, multi-step problems, maintaining coherent reasoning across tool use and context shifts. It gives developers more transparent access to a reasoning-focused model, which is great for debugging and customizing workflows.

Emerging Dev Tools 🚀

Backed by Microsoft, Databricks, and Nvidia, Inception is building new-generation diffusion models focused on structured outputs like code. It’s an important signal that “diffusion is only for images” is outdated. If successful, it could offer better control over codegen outputs with less hallucination than autoregressive models.

They’re building long-context agents optimized for decision making and grounded reasoning.
If successful, this could trickle down into dev workflows that involve ops planning, tradeoff analysis, and strategic tool routing.

Open Source Spotlight 🔍

Strix is an autonomous AI agent that behaves like a real attacker — executing code, exploiting vulnerabilities, and delivering PoC demos on the spot. Unlike traditional scanners, Strix uses dynamic analysis to confirm actual exploit paths in real time.

The project has picked up momentum over the past few weeks and already sits at 10K+ GitHub stars. If you're tired of waiting weeks for manual pen tests, check it out.

Got a Question?

Got a burning engineering question? Just hit reply and we’ll tackle it. And if you enjoyed this issue, consider sharing it with your teammates.

Want to share your dev tool, research drops, or hot takes?

Submit your story here — we review every submission and highlight the best in future issues!

Till next time,

Future of DevEx
