Blog


Democratizing AI Compute Series

Go behind the scenes of the AI industry with Chris Lattner

Latest

News / Engineering

Evaluating Llama Guard with MAX 24.6 and Hugging Face

Imagine unlocking a world of open innovation while ensuring secure, reliable, and enterprise-ready Gen AI deployments—MAX 24.6 enables enterprise AI teams to seamlessly run a vast range of cutting-edge AI models from Hugging Face on NVIDIA GPUs.

December 19, 2024 / Bill Welense

News / Engineering

Build a Continuous Chat Interface with Llama 3 and MAX Serve

Build a chat application with Llama 3 and MAX Serve.

December 17, 2024 / Ehsan M. Kermani

News / Product

Introducing MAX 24.6: A GPU Native Generative AI Platform

MAX 24.6 release blog featuring MAX GPU.

December 17, 2024 / Modular Team

News / Engineering

MAX GPU: State of the Art Throughput on a New GenAI Platform

Measuring state-of-the-art GPU performance compared to vLLM on Modular's MAX 24.6.

December 17, 2024 / Max Hutchinson, Tyler Kenney

News / Engineering

Understanding SIMD: Infinite Complexity of Trivial Problems

A deep dive into the complexities of optimizing code for SIMD instruction sets across multiple platforms.

October 25, 2024 / Ash Vardanian

News / Community

Community Spotlight: Writing Mojo with Cursor

October 10, 2024 / Julian Acero, Caroline Frasca

News / Engineering

Hands-on with Mojo 24.5

Get hands-on with Mojo 24.5 and learn how to apply new language features in your code.

October 1, 2024 / Ehsan M. Kermani

News / Product

MAX 24.5 - With SOTA CPU Performance for Llama 3.1

We’re excited to announce the release of MAX 24.5, which ships with significant improvements to Llama 3.1 CPU performance, new Python graph API bindings, our biggest update to Mojo ever, industry-standard packaging, and a clarified license.

September 13, 2024 / Modular Team

News / Engineering

Announcing stack-pr: an open source tool for managing stacked PRs on GitHub

We are pleased to announce the release of a new tool aimed at simplifying the management of stacked pull requests (PRs) on GitHub: stack-pr. This tool is still in its early development days, but we are excited to share it with the community and welcome your contributions.

July 23, 2024 / Mikhail Zolotukhin

News / Engineering

Debugging in Mojo🔥

Developer tooling is a big priority for Mojo and MAX: we want to vastly improve the debugging experience compared to the traditional Python, C++, and CUDA stack. Machine learning often requires inspecting the state of a program after a long-running process, which demands more control than "print debugging" gives you. Over time this tooling will extend to GPUs, allowing you to step through CPU code into GPU calls with the same developer experience.

July 16, 2024 / Jack Clayton, Walter Erquinigo


Build the future of AI with Modular

  • Get started guide

    Install MAX with a few commands and deploy a GenAI model locally.

  • Browse open models

    500+ models, many optimized for lightning-fast performance.