
Blog


Democratizing AI Compute Series

Go behind the scenes of the AI industry with Chris Lattner

Latest

News

Series

Matrix Multiplication on Blackwell: Part 4 - Breaking SOTA

In this blog post, we’ll continue our journey to build a state-of-the-art (SOTA) matmul kernel on NVIDIA Blackwell by exploring the cluster launch control (CLC) optimization. By the end of the post, we’ll improve our performance by another 15% and achieve 1772 TFLOPs, exceeding the current SOTA.

September 19, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang, Abdul Dakkak

News

Community

Modverse #51: Modular x Inworld x Oracle, Modular Meetup Recap and Community Projects

The Modular community has been buzzing this month, from our Los Altos Meetup talks and fresh engineering docs to big wins with Inworld and Oracle. Catch the highlights, new tutorials, and open-source contributions in this edition of Modverse.

September 19, 2025 / Caroline Frasca

News

Series

Matrix Multiplication on Blackwell: Part 3 - The Optimizations Behind 85% of SOTA Performance

In this post, we continue this journey and discuss how to leverage the 2SM technique along with pipelining to increase our performance by about 5x and get within 85% of state-of-the-art (SOTA).

September 12, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang, Abdul Dakkak

News

Series

Matrix Multiplication on Blackwell: Part 2 - Using Hardware Features to Optimize Matmul

In this post, we continue our journey and improve our performance by more than 50x over our initial kernel benchmark. Along the way, we explain more GPU programming concepts and leverage novel Blackwell features.

September 5, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang, Abdul Dakkak

News

Series

Matrix Multiplication on Blackwell: Part 1 - Introduction

This series of blog posts will show how to: 1. Write a high-performance GPU kernel on Blackwell whose performance is competitive with that of NVIDIA's cuBLAS implementation. 2. Leverage Mojo's special features to make the kernel as simple as possible.

August 28, 2025 / Ali Taha, Jiexiang Liu, Hengjie Wang

News

Community

Modverse #50: Modular Platform 25.5, Community Meetups, and Mojo's Debut in the Stack Overflow Developer Survey

This past month brought a wave of community projects and milestones across the Modular ecosystem! Modular Platform 25.5 landed with Large Scale Batch Inference, leaner packages, and new integrations that make scaling AI easier than ever. It’s already powering production deployments like SF Compute’s Large Scale Inference Batch API, cutting costs by up to 80% while supporting more than 15 leading models.

August 21, 2025 / Caroline Frasca

News

Product

Modular Platform 25.5: Introducing Large Scale Batch Inference

Modular Platform 25.5 is here, and introduces Large Scale Batch Inference: a highly asynchronous, at-scale batch API built on open standards and powered by Mammoth. We're launching this new capability through our partner SF Compute, enabling high-volume AI performance with a fast, accurate, and efficient platform that seamlessly scales workloads across any hardware.

August 5, 2025 / Modular Team

News

Company

SF Compute and Modular Partner to Revolutionize AI Inference Economics

Modular has partnered with SF Compute to address a fundamental asymmetry in the AI ecosystem: while model capabilities advance exponentially, the economic structures governing compute costs remain anchored in legacy paradigms. 

July 31, 2025 / Modular Team, SF Compute Team

News

Product

AI Agents for AWS Marketplace

Modular Inc. announces MAX High-Performance GenAI Serving and MAX Code Repo Agent now available in AWS Marketplace's new AI Agents and Tools category, delivering 10x performance improvements and streamlined AI deployment for enterprises.

July 16, 2025 / Modular Team

News

Community

Modverse #49: Modular Platform 25.4, Modular 🤝 AMD, and Modular Hack Weekend

Between a global hackathon, a major release, and standout community projects, last month was full of progress across the Modular ecosystem! Modular Platform 25.4 launched on June 18th, alongside the announcement of our official partnership with AMD, bringing full support for AMD Instinct™ MI300X and MI325X GPUs. You can now deploy the same container across both AMD and NVIDIA hardware with no code changes, no vendor lock-in, and no additional configuration!

July 9, 2025 / Caroline Frasca

