Blog


Democratizing AI Compute Series

Go behind the scenes of the AI industry with Chris Lattner


News

Engineering

How to set up a Mojo🔥 development environment with Docker containers

How do you guarantee that your software is portable, runs reliably, and scales easily in production environments? The short answer is: use containers. Container technologies like Docker and Kubernetes are popular tools for building and deploying software applications, but until recently they were considered exotic infrastructure for IT/Ops experts.

September 28, 2023 / Shashank Prasanna


News

Engineering

An easy introduction to Mojo🔥 for Python programmers

Learning a new programming language is hard. You have to learn new syntax, keywords, and best practices, all of which can be frustrating when you’re just starting. In this blog post, I want to share a gentle introduction to Mojo from a Python programmer’s perspective.

August 8, 2023 / Shashank Prasanna


News

Engineering

Modular natively supports dynamic shapes for AI workloads

Today’s AI infrastructure is difficult to evaluate, so many teams converge on simple, quantifiable metrics like QPS, latency, and throughput. This is one reason why today’s AI industry is rife with bespoke tools that provide high performance on benchmarks but have significant usability challenges in real-world AI deployment scenarios.

June 22, 2023 / Eric Johnson, Kate Caldwell


News

Engineering

The world's fastest unified matrix multiplication

In this post, we describe Modular’s approach to solving this problem and its game-changing benefits, including a new standard in state-of-the-art (SOTA) performance on CPU compared to existing solutions.

April 20, 2023 / Abdul Dakkak, Chad Jarvis, Eric Johnson, Hengjie Wang, Ian Tramble


News

Engineering

AI’s compute fragmentation: what matrix multiplication teaches us

AI is powered by a virtuous circle of data, algorithms (“models”), and compute. Growth in one pushes needs in the others and can dramatically affect the developer experience in areas like usability and performance. Today, we have more data and more AI model research than ever before, but compute isn’t scaling at the same speed due to … well, physics.

March 23, 2023 / Eric Johnson, Abdul Dakkak, Chad Jarvis


News

Engineering

If AI serving tech can’t solve today’s problems, how do we scale into the future?

The technological progress that has been made in AI over the last ten years is breathtaking — from AlexNet in 2012 to the recent release of ChatGPT, which has taken large foundational models and conversational AI to another level.

December 8, 2022 / Eric Johnson, Tim Davis


News

Engineering

Part 2: Increasing development velocity of giant AI models

The first four requirements address one fundamental problem with how we've been using MLIR: weights are constant data, but shouldn't be managed like other MLIR attributes. Until now, we've been trying to place a square peg into a round hole, creating a lot of wasted space that's costing us development velocity (and, therefore, money for users of the tools).

November 10, 2022 / Abdul Dakkak, Eric Johnson


News

Engineering

Increasing development velocity of giant AI models

Machine learning models are getting larger and larger — some might even say, humongous. The world’s most advanced technology companies have been in an arms race to see who can train the largest model (MUM, OPT, GPT-3, Megatron), while other companies focused on production systems have scaled their existing models to great effect. Through all the excitement, what’s gone unsaid is the myriad of practical challenges larger models present for existing AI infrastructure and developer workflows.

August 12, 2022 / Eric Johnson

  • Series

    Democratizing AI Compute Series

    Go behind the scenes of the AI industry in this blog series by Chris Lattner. Trace the evolution of AI compute, dissect its current challenges, and discover how Modular is raising the bar with the world’s most open inference stack.

    11 part series

  • Series

    Matrix Multiplication on Blackwell

    Learn how to write a high-performance GPU kernel on Blackwell that offers performance competitive to that of NVIDIA's cuBLAS implementation while leveraging Mojo's special features to make the kernel as simple as possible.

    4 part series


Build the future of AI with Modular

View Editions
  • Get started guide

    Install MAX with a few commands and deploy a GenAI model locally.

    Read Guide
  • Browse open models

    500+ models, many optimized for lightning-fast performance.

    Browse models