MAX Platform accelerates the pace of AI.

It's programmable.

We rebuilt the modern AI software stack, from the ground up, to boost any AI pipeline, on any hardware.

Programmable, performant & portable

Full programmability

MAX is built from the ground up on Mojo, a language that combines the usability of Python, the safety of Rust, and the performance of C, so AI engineers can unlock the full potential of AI hardware.

Unparalleled performance

MAX unlocks state-of-the-art performance for your AI models. Its next-generation compiler lets you extend and optimize your AI pipelines without rewriting them.

Seamless portability

Seamlessly move your models and AI pipelines to any hardware target, maximizing your performance-to-cost ratio and avoiding vendor lock-in.

Unparalleled latency & cost savings

MAX unlocks state-of-the-art latency and throughput for your AI pipeline, including generative models, helping you quickly productionize AI pipelines and realize massive cost savings on your cloud bill.

Modular is 1.7x faster than TensorFlow when running Stable Diffusion-UNet on CPU.

Modular is 1.7x faster than PyTorch when running Stable Diffusion-UNet on CPU.

Do these numbers seem too good to be true? View the results in more detail, then sign up to compare locally.
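To check a speedup figure like this on your own machine, one approach is to time both runtimes on the same model and inputs, then divide the latencies. The sketch below is a minimal, framework-agnostic harness that uses only the Python standard library. The names `median_latency` and `speedup`, and the two stand-in workloads, are hypothetical and are not part of MAX, TensorFlow, or PyTorch.

```python
import statistics
import time
from typing import Callable, List


def median_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock time, in seconds, of one call to fn."""
    for _ in range(warmup):
        fn()  # untimed calls so caches and lazy initialization settle first
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("latencies must be positive")
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Hypothetical stand-in workloads. In a real comparison, each callable
    # would run the same model on the same inputs in a different runtime.
    def run_baseline() -> int:
        return sum(i * i for i in range(200_000))

    def run_candidate() -> int:
        return sum(i * i for i in range(100_000))

    base = median_latency(run_baseline)
    cand = median_latency(run_candidate)
    print(f"candidate is {speedup(base, cand):.2f}x faster than baseline")
```

The median is used rather than the mean so that a few slow outlier runs do not skew the result. For a fair comparison, keep batch size, precision, and thread settings identical across both runtimes.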

An integrated AI developer experience

The Modular Accelerated Xecution (MAX) platform is a unified set of tools and libraries that provides everything you need to deploy low-latency, high-throughput, real-time AI inference pipelines into production.

MAX components

Incredibly easy to get started

# Python: load a model and run inference
from max import engine

# Load your model
session = engine.InferenceSession()
model = session.load(MODEL_PATH)

# Prepare the inputs, then run an inference
outputs = model.execute(**inputs)

# Mojo: define a model graph with max.graph
from max.graph import Dim, Module, MOTensor

@value
struct LLM:
    var params: ModelParams

    fn build(inout self, inout m: Module):
        var g = m.graph(
            "llm",
            TypeTuple(
                MOTensor(DType.float32, Dim.dynamic(), Dim.dynamic())
            )
        )
        ...
        g.output((reshape(next_token, self.batch)))

# Mojo: chain several models in one pipeline
from max.engine import InferenceSession

var sess = InferenceSession()
var txt_enc = sess.load_model('txt-encoder')
var img_dec = sess.load_model('img-decoder')
var img_dif = sess.load_model('img-diffuser')
var latent = ...
for step in range(n_steps):
    var prev = latent
    # Assign to the outer latent instead of shadowing it with a new var
    latent = execute(img_dif, latent)
    var pred = ...
    latent = ...

var decoded = execute(img_dec, latent)
var pixels = decoded.to_numpy()
var img = Image.fromarray(pixels, 'RGB')
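
The Python snippet above calls `model.execute(**inputs)` without showing what `inputs` contains. The `**` operator unpacks a dictionary into keyword arguments, so each key is matched against an input name. The sketch below illustrates that call pattern with a stand-in class so it runs without MAX installed. `StandInModel`, the input names `input_ids` and `attention_mask`, and the dictionary-shaped return value are all assumptions for illustration, not the MAX API.

```python
from typing import Dict, List


class StandInModel:
    """Hypothetical stand-in for a loaded model. This is NOT the MAX API."""

    def __init__(self, input_names: List[str]) -> None:
        self.input_names = input_names

    def execute(self, **inputs: List[float]) -> Dict[str, List[float]]:
        # Reject calls that omit a declared input or pass an unknown one.
        missing = [name for name in self.input_names if name not in inputs]
        unexpected = [name for name in inputs if name not in self.input_names]
        if missing or unexpected:
            raise ValueError(f"missing={missing}, unexpected={unexpected}")
        # Return a trivial "output" (one sum per input) so the call path
        # can be exercised end to end.
        return {"output": [sum(inputs[name]) for name in self.input_names]}


# Assumed input names; a real model defines its own.
model = StandInModel(["input_ids", "attention_mask"])

# Keys must match the model's input names exactly.
inputs = {
    "input_ids": [101, 2023, 102],
    "attention_mask": [1, 1, 1],
}

# Equivalent to model.execute(input_ids=[...], attention_mask=[...])
outputs = model.execute(**inputs)
print(outputs)
```

Building the dictionary separately from the call keeps input preparation (tokenizing, resizing, batching) apart from inference, which makes it easier to reuse the same inputs across runtimes.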

Why Modular?

01

Our team has built most of the world’s existing AI infrastructure, including TensorFlow, PyTorch, ONNX, and XLA, and we’ve built and scaled dev tools like Swift, LLVM, and MLIR. Now we’re focused on rebuilding AI infrastructure for the world.

02

To unlock the next wave of AI innovation, we took a first-principles approach to building the lowest layers of the AI stack. We can't keep piling more layers of complexity on top of existing solutions that are already overcomplicated.

03

We build technology that meets you where you are. We don’t require you to rewrite your models, workflows, or application code, grapple with confusing converters, or be a hardware expert to take advantage of bleeding-edge technology.

Try MAX right now

Up and running in 5 minutes.