Blog

Democratizing AI Compute Series

Go behind the scenes of the AI industry with Chris Lattner

Latest

News

Series

Modverse #46: MAX 25.1, MAX Builds, and Democratizing AI Compute

We recently introduced MAX 25.1, a major leap forward in AI development. This release enhances agentic and LLM workflows, introduces MAX Builds as a central hub for GenAI models and application recipes, and debuts a new GPU programming interface. Developers can now take advantage of GPU-accelerated embeddings, OpenAI-compatible function calling, structured output generation, and high-performance LLM optimizations like paged attention and prefix caching for improved efficiency.

February 27, 2025 / Caroline Frasca

News

Series

CUDA is the incumbent, but is it any good? (Democratizing AI Compute, Part 4)

Answering the question of whether CUDA is “good” is much trickier than it sounds.

February 20, 2025 / Chris Lattner

News

Product

MAX 25.1 - Introducing MAX Builds

February 18, 2025 / Modular Team

News

Series

How did CUDA succeed? (Democratizing AI Compute, Part 3)

If we as an ecosystem hope to make progress, we need to understand how the CUDA software empire became so dominant.

February 12, 2025 / Chris Lattner

News

Product

Paged Attention & Prefix Caching Now Available in MAX Serve

February 6, 2025 / Ehsan M. Kermani

News

Series

What exactly is “CUDA”? (Democratizing AI Compute, Part 2)

February 5, 2025 / Chris Lattner

News

Engineering

Agentic Building Blocks: Creating AI Agents with MAX Serve and OpenAI Function Calling

January 30, 2025 / Ehsan M. Kermani

News

Series

DeepSeek's Impact on AI (Democratizing AI Compute, Part 1)

Part 1 of an article exploring the future of hardware acceleration for AI beyond CUDA, framed in the context of DeepSeek's release.

January 30, 2025 / Chris Lattner

News

Engineering

Use MAX with Open WebUI for RAG and Web Search

Learn how quickly MAX and Open WebUI get you up and running with RAG, web search, and Llama 3.1 on GPU.

January 23, 2025 / Bill Welense

News

Engineering

Hands-on with Mojo 24.6

Mojo 24.6 introduces key improvements in argument conventions, memory management, and reference tracking, enhancing code clarity and safety with features like 'mut' for mutable arguments, 'origins' for references, and new collection types.

January 21, 2025 / Ehsan M. Kermani

Build the future of AI with Modular

View Editions
  • Get started guide

    Install MAX with a few commands and deploy a GenAI model locally.

    Read Guide
  • Browse open models

    500+ models, many optimized for lightning-fast performance

    Browse models