All Articles


Company

Modular Raises $250M to Scale AI's Unified Compute Layer

Modular Raises $250M in Third Round to Unify AI Compute

September 24, 2025


Modular Team


Product

Modular 25.6: Unifying the latest GPUs from NVIDIA, AMD, and Apple

We’re excited to announce Modular Platform 25.6, a major milestone in our mission to build AI’s unified compute layer. With 25.6, we’re delivering the clearest proof yet of that mission: a unified compute layer that spans from laptops to the world’s most powerful datacenter GPUs.

September 22, 2025


Modular Team


Community

Modverse #51: Modular x Inworld x Oracle, Modular Meetup Recap, and Community Projects

The Modular community has been buzzing this month, from our Los Altos Meetup talks and fresh engineering docs to big wins with Inworld and Oracle. Catch the highlights, new tutorials, and open-source contributions in this edition of Modverse.

September 19, 2025


Caroline Frasca


Engineering

Matrix Multiplication on Blackwell: Part 4 - Breaking SOTA

In this blog post, we’ll continue our journey to build a state-of-the-art (SOTA) matmul kernel on NVIDIA Blackwell by exploring the cluster launch control (CLC) optimization. By the end of the post, we’ll have improved performance by another 15%, reaching 1772 TFLOPs and exceeding the current SOTA.

September 19, 2025


Ali Taha

Jiexiang Liu

Hengjie Wang

Abdul Dakkak


Engineering

Matrix Multiplication on Blackwell: Part 3 - The Optimizations Behind 85% of SOTA Performance

In this post, we continue the journey and discuss how to leverage the 2SM technique, along with pipelining, to increase our performance by about 5x and get within 85% of state-of-the-art (SOTA) performance.

September 12, 2025


Ali Taha

Jiexiang Liu

Hengjie Wang

Abdul Dakkak


Engineering

Matrix Multiplication on Blackwell: Part 2 - Using Hardware Features to Optimize Matmul

In this post, we continue our journey and improve performance to more than 50x that of our initial kernel benchmark. Along the way, we explain more GPU programming concepts and leverage novel Blackwell features.

September 5, 2025


Ali Taha

Jiexiang Liu

Hengjie Wang

Abdul Dakkak


Engineering

Matrix Multiplication on Blackwell: Part 1 - Introduction

This series of blog posts showcases how one can: 1. Write a high-performance GPU kernel on Blackwell with performance competitive with NVIDIA's cuBLAS implementation. 2. Leverage Mojo's special features to keep the kernel as simple as possible.

August 28, 2025


Ali Taha

Jiexiang Liu

Hengjie Wang


Community

Modverse #50: Modular Platform 25.5, Community Meetups, and Mojo's Debut in the Stack Overflow Developer Survey

This past month brought a wave of community projects and milestones across the Modular ecosystem! Modular Platform 25.5 landed with Large Scale Batch Inference, leaner packages, and new integrations that make scaling AI easier than ever. It’s already powering production deployments like SF Compute’s Large Scale Inference Batch API, cutting costs by up to 80% while supporting more than 15 leading models.

August 21, 2025


Caroline Frasca


Product

Modular Platform 25.5: Introducing Large Scale Batch Inference

Modular Platform 25.5 is here, introducing Large Scale Batch Inference: a highly asynchronous, at-scale batch API built on open standards and powered by Mammoth. We're launching this new capability through our partner SF Compute, enabling high-volume AI performance with a fast, accurate, and efficient platform that seamlessly scales workloads across any hardware.

August 5, 2025


Modular Team


Company

SF Compute and Modular Partner to Revolutionize AI Inference Economics

Modular has partnered with SF Compute to address a fundamental asymmetry in the AI ecosystem: while model capabilities advance exponentially, the economic structures governing compute costs remain anchored in legacy paradigms.

July 31, 2025


Modular Team

SF Compute Team
