Blog

Democratizing AI Compute Series

Go behind the scenes of the AI industry with Chris Lattner

Latest

News

Product

Modular 25.7: Faster Inference, Safer GPU Programming, and a More Unified Developer Experience

Today, we’re excited to release Modular Platform 25.7, an update that deepens our vision of a unified, high-performance compute layer for AI. With a fully open MAX Python API, an experimental next-generation modeling API, expanded hardware support for NVIDIA Grace superchips, and a safer, more capable Mojo GPU programming experience, this release moves us closer to an ecosystem where developers spend less time fighting infrastructure and more time advancing what AI can do.

November 20, 2025 / Modular Team

News

Company

"TTS 1 Max" (powered by Modular Platform) Ranked #1 Speech Model on Artificial Analysis

Today, the "Inworld TTS 1 Max" model (powered by the Modular Platform) holds the #1 position on the Artificial Analysis speech leaderboard!

November 7, 2025 / Modular Team

News

Community

PyTorch and LLVM in 2025 — Keeping up With AI Innovation

Along with several teammates, I had the privilege of attending two recent developer events in the AI software stack: PyTorch Conference 2025 (October 22-23) in San Francisco and LLVM Developers' Meeting (October 28-29) in Santa Clara. In this post, I’ll share some observations that stood out among all the conference sessions and conversations I had with developers.

November 6, 2025 / Michael Dunn-OConnor

News

Engineering

Achieving State-of-the-Art Performance on AMD MI355 — in Just 14 Days

In late August, AMD and TensorWave reached out to collaborate on a presentation for AMD’s Media Tech Day—they asked if we could demo MAX on AMD Instinct™ MI355 on September 16th. There was just one problem: no one at Modular had access to an MI355.

October 17, 2025 / Tracy Sharpe, Anand Pratap Singh, Prince Jain, Abdul Dakkak

News

Company

Modular Raises $250M to scale AI's Unified Compute Layer

Modular Raises $250M in Third Round to Unify AI Compute

September 24, 2025 / Modular Team

News

Product

Modular 25.6: Unifying the latest GPUs from NVIDIA, AMD, and Apple

We’re excited to announce Modular Platform 25.6, a major milestone in our mission to build AI’s unified compute layer. This release is our clearest proof yet of that vision: one compute layer that spans from laptops to the world’s most powerful datacenter GPUs. The platform now delivers:

September 22, 2025 / Modular Team

News

Series

Matrix Multiplication on Blackwell: Part 4 - Breaking SOTA

In this blog post, we’ll continue our journey to build a state-of-the-art (SOTA) matmul kernel on NVIDIA Blackwell by exploring the cluster launch control (CLC) optimization. At the end of the post we’ll improve our performance by another 15% and achieve 1772 TFLOPs, exceeding that of the current SOTA.

September 19, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang, Abdul Dakkak

News

Community

Modverse #51: Modular x Inworld x Oracle, Modular Meetup Recap and Community Projects

The Modular community has been buzzing this month, from our Los Altos Meetup talks and fresh engineering docs to big wins with Inworld and Oracle. Catch the highlights, new tutorials, and open-source contributions in this edition of Modverse.

September 19, 2025 / Caroline Frasca

News

Series

Matrix Multiplication on Blackwell: Part 3 - The Optimizations Behind 85% of SOTA Performance

In this post, we continue this journey and discuss how to leverage the 2SM technique along with pipelining to increase our performance by about 5x and reach 85% of state-of-the-art (SOTA) performance.

September 12, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang, Abdul Dakkak

News

Series

Matrix Multiplication on Blackwell: Part 2 - Using Hardware Features to Optimize Matmul

In this post, we continue our journey and improve our performance by more than 50x over our initial kernel benchmark. Along the way, we explain more GPU programming concepts and leverage novel Blackwell features.

September 5, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang, Abdul Dakkak

Build the future of AI with Modular

View Editions
  • Get started guide

    Install MAX with a few commands and deploy a GenAI model locally.

    Read Guide
  • Browse open models

    500+ models, many optimized for lightning-fast performance

    Browse models