May 4, 2026

Modverse #54: AMD AI DevDay, New Modular Offices, and a Community That Keeps Shipping

Caroline Frasca

Community

There was a lot to celebrate in April: the community shipped GPU renderers, FFmpeg bindings, raylib wrappers, BLAS routines, and a 2D graphics API, just to name a few. The team connected with tons of developers at AMD AI DevDay and our joint meetup with AMD, two new Modular offices opened on two different continents, and Gemma 4 launched with same-day support on NVIDIA and AMD. Here’s the April roundup.

Community Innovations

From GPU rendering to bioinformatics tooling to game dev bindings, here's what developers built with MAX and Mojo this month:

  • From OpenCL to Mojo, Part 2: Integrating into an Existing Project: Max Chistokletov is back with a follow-up to his series from March, where he replaced Darktable's OpenCL image processing kernels with Mojo GPU kernels. Part 2 covers the integration work required to wire those kernels into a production codebase, which turns out to be harder than writing the kernel itself.
  • Mojo for Robotics: Porting GPU Navigation Kernels: A project exploring what it takes to port Mojo GPU navigation kernels to Jetson and AMD Strix Halo, two hardware targets with very different characteristics. The discussion covers cross-platform tradeoffs and the cost of portability. Check out the scaffolding repo on GitHub.
  • MAV: FFmpeg Bindings for Mojo: MAV (Mojo Audio Video) brings FFmpeg bindings to Mojo, opening up video and audio processing workflows to the language. The project was presented at the April community meeting. Get started with the bindings on GitHub.
  • Wgpu-mojo: wgpu-native Bindings: Hundo1018 built bindings that let Mojo programs invoke wgpu-native directly for GPU rendering. The initial demo renders a triangle, but the path toward Mojo-native graphics tooling is now open. Find the project on GitHub.
wgpu-mojo: RGB gradient triangle rendered via Mojo GPU pipeline
cairo-mojo: vector graphics output rendered from pure Mojo
  • mojoBLAS v0.1.0: Pure Mojo BLAS Implementation: shivasankarka shipped a v0.1.0 of mojoBLAS, a pure Mojo implementation of BLAS routines, the linear algebra primitives underlying most numerical computing. Find the code on GitHub.
  • HDF5 Bindings v0.2.0: shivasankarka also updated Mojo's HDF5 bindings to work with Mojo 26.2. HDF5 is the standard data format for scientific computing and large ML datasets. Explore the code on GitHub.
  • Mojo-bindgen: Automatic C Binding Generation: MoSafi2 built a tool that generates Mojo bindings from C headers using libclang, similar to rust-bindgen but targeting Mojo. This tooling makes the C ecosystem accessible without manual wrapping work. Dig into the code on GitHub.
  • Mojo-raylib: Complete raylib v6 Bindings: Inspired by Mr. Azozin's recent video, kivicode published fully code-generated bindings for raylib v6 in Mojo, covering the full API of the popular 2D/3D game development library. Find the project on GitHub.
  • Book: Mojo By Example Updated for v0.26.3: "Mojo By Example" has been updated to track the latest stable release, bringing it back up to date for developers picking up the language.
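For readers new to BLAS, the routines mojoBLAS implements are a small, standardized set of linear algebra kernels. The sketch below is a plain Python illustration (not mojoBLAS code) of what the Level 3 routine GEMM computes: C := alpha·A·B + beta·C. Real BLAS implementations block and vectorize this loop nest for cache and SIMD efficiency.

```python
def gemm(alpha, A, B, beta, C):
    """Naive GEMM: C := alpha * (A @ B) + beta * C, with A (m x k),
    B (k x n), C (m x n) as nested lists. Illustration only; a real
    BLAS tiles these loops for cache reuse and vectorization."""
    m, k = len(A), len(A[0])
    n = len(B[0])
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]
            C[i][j] = alpha * acc + beta * C[i][j]
    return C
```

With alpha=1 and beta=0 this reduces to an ordinary matrix multiply, which is why GEMM is the workhorse behind most numerical computing and ML workloads.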

💡 Building something with MAX or Mojo? Share it in the Community Showcase and we may feature it here.

Modular Making Waves

  • Gemma 4 launched day zero on Modular Cloud. When Google DeepMind released Gemma 4, Modular was ready. MAX-powered endpoints for all Gemma 4 variants (including the 31B dense model and the 26B A4B MoE) went live the same day, with 15% higher throughput than vLLM on NVIDIA B200 and no loss in accuracy. Read the full story on the Modular blog.
  • New blog series: Software Pipelining for GPU Kernels. Part 1 covers the core challenge of overlapping memory transfers with compute to keep GPU hardware busy. If you've been following the Structured Mojo Kernels series, this is a good next read. Dig in with Part 1.
  • Structured Mojo Kernels, Part 4: Portability and the Road Ahead. The final post in the series covers how Modular's kernel abstractions handle cross-hardware portability, with the same kernel code targeting NVIDIA and AMD GPUs without rewrites. Read Part 4 here.
  • TileTensor, Part 1: Safer, More Efficient GPU Kernels. A new series introducing TileTensor, Modular's abstraction for structuring tensor data in GPU kernel development. The first post covers the core design and why it makes kernels both safer and faster to write. Start with Part 1.
  • How Frontier Coding Agents Built a Video Diffusion Pipeline on MAX. Claude, Cursor, and Codex built a working video diffusion pipeline on MAX using Mojo AI coding skills, without any of the GPU kernel code being written directly by a human. A useful case study in what AI-assisted GPU development looks like today. Read the full post on the Modular blog.
  • Inside MAX Serve: From Prompt to Response. A new video walkthrough of how MAX Serve handles the full lifecycle of an inference request, from token arrival through scheduling, batching, and GPU execution. Watch it on YouTube.
  • Edinburgh and San Francisco offices are open. Modular’s new Edinburgh office sits inside the Bayes Centre, where AI research and industry teams work alongside each other. The team also opened the doors to a Jackson Square location in San Francisco. Read the office announcement.
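The core idea in the software pipelining series above, overlapping data movement with compute, can be sketched off-GPU. The Python below is a CPU-side analogy using double buffering (all names are illustrative, not from the blog series): while tile i is being computed, tile i+1 is fetched on a background thread, so transfer and compute overlap instead of running back to back.

```python
import threading

def pipelined(tiles, load, compute):
    """Double-buffered pipeline: overlap load(tile[i+1]) with
    compute(tile[i]). Analogous to GPU kernels prefetching the next
    tile into shared memory while the current tile is processed."""
    if not tiles:
        return []
    results = []
    buf = load(tiles[0])  # prime the pipeline: first load is exposed
    for i in range(len(tiles)):
        nxt = {}
        loader = None
        if i + 1 < len(tiles):
            # Kick off the next transfer before touching the current tile.
            loader = threading.Thread(
                target=lambda: nxt.update(data=load(tiles[i + 1])))
            loader.start()
        results.append(compute(buf))  # runs while the loader thread works
        if loader:
            loader.join()
            buf = nxt["data"]
    return results
```

On a real GPU the "loader thread" is replaced by asynchronous copy hardware (e.g. async copies into shared memory), but the scheduling pattern, issue the next transfer before consuming the current buffer, is the same.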

Open Source Contributions

If you've recently had your first PR merged, message Caroline Frasca in the forum to claim your Modular swag! Check out the recently merged contributions from our amazing community members:

Modular News & Events: Stay Connected

  • AMD AI DevDay - Modular sponsored AMD AI DevDay and hosted a reception with AMD the night before the event. Chris Lattner gave a luminary talk, and the Modular team chatted with tons of enthusiastic developers and AI practitioners.
Chris Lattner and Filip Holec, the winner of our raffle for a Radeon graphics card
The Modular booth at AMD AI DevDay
  • Chris Lattner at AMD AI DevDay - On April 30, Chris Lattner took the stage at AMD's AI DevDay. His talk showcased FLUX.2 running on the AMD MI355X with Modular: 3.8x faster than torch.compile, 1024x1024 images generated in under 3.5 seconds, and a deployment container under 700MB. The event brought together AMD engineers, ecosystem partners, and AI developers for tech talks and hands-on workshops.
    Chris Lattner’s luminary talk at AMD AI DevDay
  • Mojo Africa Meetup in Uyo, Nigeria - Community organizer Ekemini Samuel hosted a community meetup in Uyo, Nigeria, through the local Mojo Africa community.
    Thanks to Ekemini Samuel for the photo.
  • April community meeting - The April 27th meeting featured two projects: a presentation on Mojo support on Tensara, the competitive GPU programming platform, and a walkthrough of MAV, the FFmpeg bindings for Mojo audio and video. Watch the recording on YouTube.
  • Mojo 1.0 is close - Work toward a stable Mojo 1.0 is well underway. Follow the nightly release threads and the Mojo roadmap to stay up to date.
  • Modular Developer Meetup: Seoul - A Modular developer meetup is coming to Seoul, South Korea. Find details and register on Luma.
  • Modular Community Meeting (May) - The monthly community meeting continues in May, virtual and open to everyone. Register on Luma and add yourself to the agenda if you’d like to present a project.
