Blog

🚨

New

Qualcomm to Acquire Modular

June 24, 2026

🚨

New

ModCon 2026: Modular’s Developer Conference

June 17, 2026

🚨

New

Modular 26.4: SOTA MoE Serving, Model Bringup via Agent Skills, Mojo 1.0 Beta 2 and More

June 18, 2026

Latest

🚨

News

Company

Modular Raises $250M to scale AI's Unified Compute Layer

Modular Raises $250M in Third Round to Unify AI Compute

September 24, 2025

Modular Team

Read

🚨

News

Product

Modular 25.6: Unifying the latest GPUs from NVIDIA, AMD, and Apple

We’re excited to announce Modular Platform 25.6 – a major milestone in our mission to build AI’s unified compute layer. With 25.6, we’re delivering the clearest proof yet of our mission: a unified compute layer that spans from laptops to the world’s most powerful datacenter GPUs. The platform now delivers:

September 22, 2025

Modular Team

Read

🚨

News

Series

Matrix Multiplication on Blackwell: Part 4 - Breaking SOTA

In this blog post, we’ll continue our journey to build a state-of-the-art (SOTA) matmul kernel on NVIDIA Blackwell by exploring the cluster launch control (CLC) optimization. At the end of the post we’ll improve our performance by another 15% and achieve 1772 TFLOPs, exceeding that of the current SOTA.

September 19, 2025

Ali Taha

Jiexiang Liu

Hengjie Wang

Abdul Dakkak

Read

🚨

News

Community

Modverse #51: Modular x Inworld x Oracle, Modular Meetup Recap and Community Projects

The Modular community has been buzzing this month, from our Los Altos Meetup talks and fresh engineering docs to big wins with Inworld and Oracle. Catch the highlights, new tutorials, and open-source contributions in this edition of Modverse.

September 19, 2025

Caroline Frasca

Read

🚨

News

Series

Matrix Multiplication on Blackwell: Part 3 - The Optimizations Behind 85% of SOTA Performance

In this post, we continue on this journey and discuss how to leverage the 2SM technique along with pipelining to increase our performance about 5x and get within 85% of state-of-the-art (SOTA).

September 12, 2025

Ali Taha

Jiexiang Liu

Hengjie Wang

Abdul Dakkak

Read

🚨

News

Series

Matrix Multiplication on Blackwell: Part 2 - Using Hardware Features to Optimize Matmul

In this post we are going to continue our journey and improve our performance by more than 50x our initial kernel benchmark. Along the way we are going to explain more GPU programming concepts and leverage novel Blackwell features.

September 5, 2025

Ali Taha

Jiexiang Liu

Hengjie Wang

Abdul Dakkak

Read

🚨

News

Series

Matrix Multiplication on Blackwell: Part 1 - Introduction

This series of blog posts will showcase how one can: 1. Write a high-performance GPU kernel on Blackwell that offers performance competitive to that of NVIDIA's cuBLAS implementation. 2. Shows how one can leverage Mojo's special features to make the kernel as simple as possible.

August 28, 2025

Ali Taha

Jiexiang Liu

Hengjie Wang

Read

🚨

News

Community

Modverse #50: Modular Platform 25.5, Community Meetups, and Mojo's Debut in the Stack Overflow Developer Survey

This past month brought a wave of community projects and milestones across the Modular ecosystem!Modular Platform 25.5 landed with Large Scale Batch Inference, leaner packages, and new integrations that make scaling AI easier than ever. It’s already powering production deployments like SF Compute’s Large Scale Inference Batch API, cutting costs by up to 80% while supporting more than 15 leading models.

August 21, 2025

Caroline Frasca

Read

🚨

News

Product

Modular Platform 25.5: Introducing Large Scale Batch Inference

Modular Platform 25.5 is here, and introduces Large Scale Batch Inference: a highly asynchronous, at-scale batch API built on open standards and powered by Mammoth. We're launching this new capability through our partner SF Compute, enabling high-volume AI performance with a fast, accurate, and efficient platform that seamlessly scales workloads across any hardware.

August 5, 2025

Modular Team

Read

🚨

News

Company

SF Compute and Modular Partner to Revolutionize AI Inference Economics

Modular has partnered with SF Compute to address a fundamental asymmetry in the AI ecosystem: while model capabilities advance exponentially, the economic structures governing compute costs remain anchored in legacy paradigms.

July 31, 2025

Modular Team

SF Compute Team

Read

No items found within this category

We couldn’t find anything. Try changing or resetting your filters.

Build the future of AI with Modular

Get started - FREE

View Editions

Sign up today
Signup to our Cloud Platform today to get started easily.
Sign Up
Browse open models
Browse our model catalog, or deploy your own custom model
Browse models

Blog

Latest

Sign up for our newsletter