Download MAX

A high-performance inference framework for AI

MAX provides powerful libraries and tools to develop, optimize and deploy AI on GPUs fast.

Quickstart with MAX

Talk to our engineers

Why developers use MAX

Incredible Performance
MAX was built from the ground up to deliver out-of-the-box performance for AI workloads.  See how we measure performance.
Hardware Portability
MAX provides portability across CPU+GPU generations, and gives you incredible utilization benefits - driving real compute cost savings.
Complete Control
Optimize your model's performance, write custom ops, or build your own model. MAX gives you full control over every layer of the stack.

Deploy Gen AI in seconds with MAX

AI Agents

Scale AI Agents seamlessly with enterprise-grade speed, reliability, and efficiency.

RAG & CAG

Enhance decision-making, drive automation, and optimize enterprise operations for efficiency.

Chatbots

Automate conversations and enhance user engagement and satisfaction.

Batch processing

Optimize GPU utilization, reduce latency and improve throughput for AI workloads.

Code Generation

Generate accurate, efficient, and reliable code that aligns with coding standards.

AI Inference

Build a scalable, cost effective inference infrastructure for your AI models.

Develop custom GPU research with MAX

ADVANCED DEVELOPERS

MAX for Research

Advanced tools and libraries for model, kernel, and hardware developers to deliver even more precise control. Some of the many tools include:

Build custom graphs
Control single to multi-gpu scaling
Program heterogenous compute
Write custom GPU code
Low-level host and device control

Works with AI code editors

Plug into any AI Coding Editor

The Modular Platform works great with any AI code editor. Cursor, Claude Code, Windsurf - all supported with streamlined setup.

Deploy GenAI Now: Step by step examples

AI Weather Agent

Build an intelligent weather assistant with MAX Serve, FastAPI and NextJS.

Get the code

Continuous Chat App

Build a functional chat application using Llama 3 and MAX Serve.

Get the code

Use Open WebUI

Use Open WebUI with MAX Serve for RAG and Web Search.

Get the code

Offline Inference with MAX

Pair MAX with Hugging Face to perform inference locally and efficiently.

Get the code

Generate embeddings

Generate Embeddings with MAX Serve, using an OpenAI-compatible API.

Get the code

OpenAI Function Calling

Creating AI Agents with MAX Serve and OpenAI Function Calling

Get the code

FREE for everyone

Paid support for scaled enterprise deployments

What developers are saying about MAX

“I'm excited, you're excited, everyone is excited to see what's new in Mojo and MAX and the amazing achievements of the team at Modular.”

Eprahim

“Max installation on Mac M2 and running llama3 in (q6_k and q4_k) was a breeze! Thank you Modular team!”

“I'm excited, you're excited, everyone is excited to see what's new in Mojo and MAX and the amazing achievements of the team at Modular.”

Eprahim

“Max installation on Mac M2 and running llama3 in (q6_k and q4_k) was a breeze! Thank you Modular team!”

“I'm excited, you're excited, everyone is excited to see what's new in Mojo and MAX and the amazing achievements of the team at Modular.”

Eprahim

“Max installation on Mac M2 and running llama3 in (q6_k and q4_k) was a breeze! Thank you Modular team!”

“I'm excited, you're excited, everyone is excited to see what's new in Mojo and MAX and the amazing achievements of the team at Modular.”

Eprahim

“Max installation on Mac M2 and running llama3 in (q6_k and q4_k) was a breeze! Thank you Modular team!”

“I am focusing my time to help advance @Modular. I may be starting from scratch but I feel it’s what I need to do to contribute to #AI for the next generation.”

mytechnotalent

“What @modular is doing with Mojo and the MaxPlatform is a completely different ballgame.”

scrumtuous

“Mojo and the MAX Graph API are the surest bet for longterm multi-arch future-substrate NN compilation”

pagilgukey

“I'm very excited to see this coming together and what it represents, not just for MAX, but my hope for what it could also mean for the broader ecosystem that mojo could interact with.”

strangemonad

“I tried MAX builds last night, impressive indeed. I couldn't believe what I was seeing... performance is insane.”

drdude81

“The more I benchmark, the more impressed I am with the MAX Engine.”

justin_76273

“I am focusing my time to help advance @Modular. I may be starting from scratch but I feel it’s what I need to do to contribute to #AI for the next generation.”

mytechnotalent

“What @modular is doing with Mojo and the MaxPlatform is a completely different ballgame.”

scrumtuous

“Mojo and the MAX Graph API are the surest bet for longterm multi-arch future-substrate NN compilation”

pagilgukey

“I'm very excited to see this coming together and what it represents, not just for MAX, but my hope for what it could also mean for the broader ecosystem that mojo could interact with.”

strangemonad

“I tried MAX builds last night, impressive indeed. I couldn't believe what I was seeing... performance is insane.”

drdude81

“The more I benchmark, the more impressed I am with the MAX Engine.”

justin_76273

“I am focusing my time to help advance @Modular. I may be starting from scratch but I feel it’s what I need to do to contribute to #AI for the next generation.”

mytechnotalent

“What @modular is doing with Mojo and the MaxPlatform is a completely different ballgame.”

scrumtuous

“Mojo and the MAX Graph API are the surest bet for longterm multi-arch future-substrate NN compilation”

pagilgukey

“I'm very excited to see this coming together and what it represents, not just for MAX, but my hope for what it could also mean for the broader ecosystem that mojo could interact with.”

strangemonad

“I tried MAX builds last night, impressive indeed. I couldn't believe what I was seeing... performance is insane.”

drdude81

“The more I benchmark, the more impressed I am with the MAX Engine.”

justin_76273

“I am focusing my time to help advance @Modular. I may be starting from scratch but I feel it’s what I need to do to contribute to #AI for the next generation.”

mytechnotalent

“What @modular is doing with Mojo and the MaxPlatform is a completely different ballgame.”

scrumtuous

“Mojo and the MAX Graph API are the surest bet for longterm multi-arch future-substrate NN compilation”

pagilgukey

“I'm very excited to see this coming together and what it represents, not just for MAX, but my hope for what it could also mean for the broader ecosystem that mojo could interact with.”

strangemonad

“I tried MAX builds last night, impressive indeed. I couldn't believe what I was seeing... performance is insane.”

drdude81

“The more I benchmark, the more impressed I am with the MAX Engine.”

justin_76273

Start building with Modular

Get started - Docs

A high-performance inference framework for AI

Why developers use MAX

Deploy Gen AI in seconds with MAX

Develop custom GPU research with MAX

MAX for Research

Works with AI code editors

Plug into any AI Coding Editor

Deploy GenAI Now: Step by step examples

FREE for everyone

What developers are saying about MAX

Quick start resources