Blog

Democratizing AI Compute Series

Go behind the scenes of the AI industry with Chris Lattner

Latest

News

Community

Modverse #53: Community Builds, Research Milestones, and a Growing Ecosystem

This edition captures everything happening across the Modular ecosystem, from developers building with MAX and Mojo🔥 to the broader impact Modular is having across AI infrastructure. Here's a look at what's been happening lately.

March 6, 2026 / Inaara Walji

News

Engineering

Structured Mojo Kernels Part 1 - Peak Performance, Half the Code

GPU programming has always demanded precision, but the cost of that precision keeps rising. A production matmul kernel written in C++ spans 3,000–5,000 lines of tightly coupled code where a misplaced barrier silently corrupts results. That complexity gatekeeps hardware that should be available to far more developers, and it's a direct product of how GPUs have evolved: with each architecture generation, more of the orchestration burden has shifted onto the programmer.

March 4, 2026 / Fabio Riccardi, Modular Kernel Team

News

Engineering

The Claude C Compiler: What It Reveals About the Future of Software

Compilers occupy a special place in computer science. They're a canonical course in computer science education. Building one is a rite of passage. It forces you to confront how software actually works, by examining languages, abstractions, hardware, and the boundary between human intent and machine execution.

February 18, 2026 / Chris Lattner

News

Company

BentoML Joins Modular

Today, BentoML is joining Modular.

February 10, 2026 / Chris Lattner, Chaoyu Yang, Tim Davis

News

Engineering

The Five Eras of KVCache

vLLM, SGLang, TensorRT-LLM, and MAX Serve are all built on top of increasingly sophisticated KV cache management. This post explores the evolution and role of the KV cache in these inference engines.

February 5, 2026 / Brian Zhang

News

Product

Modular 26.1: A Big Step Towards More Programmable and Portable AI Infrastructure

Today we’re releasing Modular 26.1, a major step toward making high-performance AI computing easier to build, debug, and deploy across heterogeneous hardware. This release is focused squarely on developer velocity and programmability—helping advanced AI teams reduce time to market for their most important innovations.

January 29, 2026 / Modular Team

News

Community

How to Beat Unsloth's CUDA Kernel Using Mojo—With Zero GPU Experience

Traditional GPU programming has a steep learning curve. The performance gains are massive, but the path to get there (CUDA, PTX, memory hierarchies, occupancy tuning) stops most developers before they start. Mojo aims to flatten that curve: Python-like syntax, systems-level performance, no interop gymnastics, and the same performance gains.

January 14, 2026 / David Robertson

News

Community

🔥 Modular 2025 Year in Review

Our four-part series documenting the path to record-breaking matrix multiplication performance became essential reading for anyone serious about LLM optimization. The series walks through every optimization step—from baseline implementations to advanced techniques like warp specialization and async copies—showing you exactly how to extract maximum performance from cutting-edge hardware.

December 19, 2025 / Michael Dunn-OConnor

News

Product

The path to Mojo 1.0

While we are excited about this milestone, it of course won’t be the end of Mojo development! Some commonly requested capabilities for more general systems programming, such as a robust async programming model and support for private members, won’t be completed for 1.0. Read below for more information on that!

December 5, 2025 / Modular Team

News

Community

Modverse #52: Advancing AI Together — Community Projects & Platform Milestones

The Modular universe is buzzing! From next-level community projects to recognition across the AI and developer space, here’s the latest from our growing ecosystem.

December 3, 2025 / Inaara Walji
