
Product

MODULAR PLATFORM
  • MAX Framework: GenAI serving framework
  • Mojo Language: The best GPU & CPU performance
  • Mammoth: Scale intelligently to any cluster

DEPLOYMENT OPTIONS
  • Editions: All the ways you can use Modular

SOLUTIONS
  • AI Agents: Build agent workflows
  • RAG & CAG: AI retrieval and controlled generation
  • Chatbots: Conversations and interactions
  • Code Generation: Work with top open code gen models
  • Batch Processing: Improve resource utilization
  • AI Inference: Fast, scalable AI inference
  • Research: Model & kernel development
Resources
  • Docs: Get up and running. Fast.
  • Models: 500+ supported open models
  • Tutorials: Build amazing things
  • Recipes: Step-by-step guides
  • GPU Puzzles: Learn GPU programming
Company

  • About: Build AI for anyone, anywhere.
  • Careers: We're currently hiring!
  • Culture: What we believe
  • Contact Us: Request a demo


Get early access to Modular Dedicated Endpoints & Enterprise

  • Built for advanced use cases

  • Control of the entire vertical stack

  • Easy to deploy for everyone


Our sales team reads each request and will reach out with next steps.


~70% faster compared to vanilla vLLM

Igor Poletaev - Chief Science Officer - Inworld

"Our collaboration with Modular is a glimpse into the future of accessible AI infrastructure. Our API now returns the first 2 seconds of synthesized audio on average ~70% faster compared to vanilla vLLM based implementation, at just 200ms for 2 second chunks. This allowed us to serve more QPS with lower latency and eventually offer the API at a ~60% lower price than would have been possible without using Modular’s stack."

Read study

Slashed our inference costs by 80%

Evan Conrad - CEO - San Francisco Compute

"Modular’s team is world class. Their stack slashed our inference costs by 80%, letting our customer dramatically scale up. They’re fast, reliable, and real engineers who take things seriously. We’re excited to partner with them to bring down prices for everyone, to let AI bring about wide prosperity."

Read study

Confidently deploy our solution across NVIDIA and AMD

Evan Owen - CTO, Qwerky AI

"Modular allows Qwerky to write our optimized code and confidently deploy our solution across NVIDIA and AMD solutions without the massive overhead of re-writing native code for each system."

Read study

MAX Platform supercharges this mission

Bratin Saha - VP of Machine Learning & AI Services, AWS

"At AWS we are focused on powering the future of AI by providing the largest enterprises and fastest-growing startups with services that lower their costs and enable them to move faster. The MAX Platform supercharges this mission for our millions of AWS customers, helping them bring the newest GenAI innovations and traditional AI use cases to market faster."

Read study

Supercharging and scaling

Dave Salvator - Director, AI and Cloud, NVIDIA

"Developers everywhere are helping their companies adopt and implement generative AI applications that are customized with the knowledge and needs of their business. Adding full-stack NVIDIA accelerated computing support to the MAX platform brings the world’s leading AI infrastructure to Modular’s broad developer ecosystem, supercharging and scaling the work that is fundamental to companies’ business transformation."

Read study

Build, optimize, and scale AI systems on AMD

Vamsi Boppana - SVP of AI, AMD

"We're truly in a golden age of AI, and at AMD we're proud to deliver world-class compute for the next generation of large-scale inference and training workloads… We also know that great hardware alone is not enough. We've invested deeply in open software with ROCm, empowering developers and researchers with the tools they need to build, optimize, and scale AI systems on AMD. This is why we are excited to partner with Modular… and we’re thrilled that we can empower developers and researchers to build the future of AI."

Read study

Latest from our blog:


Achieving State-of-the-Art Performance on AMD MI355 — in Just 14 Days

Copyright © 2025 Modular Inc

Terms, Privacy & Acceptable Use