MAX on GPU waiting list

Be the first to get lightning-fast inference speed on your GPUs. Be the envy of your competitors and lower your compute spend.

Powerful CPU+GPU Programming

Mojo is a Pythonic language for blazing-fast CPU+GPU execution on the MAX Platform.

Write fast, portable code that's easy to maintain

Mojo is an innovative, high-performance programming language designed for writing systems-level code for AI workloads.

awesome.🔥
def run_inference():
    # Create the graph.
    graph = create_inference_graph()
    # Place the graph on a GPU if available; fall back to CPU if not.
    device = CPU() if accelerator_count() == 0 else Accelerator()
    session = InferenceSession(devices=[device])
    # Compile the graph.
    model = session.load(graph)
    # Perform the calculation on the target device.
    result = model.execute()[0]
# A function that takes ownership of a String.
fn take_ownership(owned y: String):
    y = String("Bye!")  # y is an lvalue here
    print(y)

fn main():
    var x = String("hello, world!")
    take_ownership(x^)  # x^ is passed as an rvalue
    # print(x)  # This would be a compile-time error,
    # because x's ownership has been transferred.
# exp is parameterized by the stored
# SIMD data type and register width.
def exp[dt: DType, elts: Int](x: SIMD[dt, elts]) -> SIMD[dt, elts]:
    x = clamp(x, -88.3762626647, 88.37626266)
    k = floor(x * INV_LN2 + 0.5)
    r = k * NEG_LN2 + x
    return ldexp(_exp_taylor(r), k)
trait Stacklike:
    alias EltType: CollectionElement

    def pop(mut self) -> Self.EltType:
        pass

struct MyStack[T: CollectionElement](Stacklike):
    ...
    var list: Self.list_type

    def pop(mut self) -> Self.EltType:
        return self.list.pop()
@compiler.register("add_one_custom", num_dps_outputs=1)
struct AddOneCustom:
    @staticmethod
    fn execute[
        synchronous: Bool,
        target: StringLiteral,  # e.g. "CUDA" or "CPU"
    ](
        out: ManagedTensorSlice,
        x: ManagedTensorSlice[out.type, out.rank],
        ctx: MojoCallContextPtr,
    ):
        @parameter
        @always_inline
        fn func[width: Int](idx: IndexList[x.rank]) -> SIMD[x.type, width]:
            return x.load[width](idx) + 1

        foreach[func, synchronous, target](out, ctx)
  • What is Mojo?

  • Advanced memory safety

  • Compile-time parameterization

  • Generalized types

  • GPU integration

Mojo features:

Progressive types

Zero-cost abstractions

Ownership + borrow checker

Portable parametric algorithms

Language-integrated auto-tuning
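
The features above come together in Mojo's compile-time parameters: values in square brackets are resolved at compile time, so each instantiation is specialized with zero runtime cost. A minimal, hypothetical sketch (the `splat` name and `main` driver are illustrative, not from the MAX codebase):

```mojo
# Hypothetical example of a portable parametric algorithm:
# `width` is a compile-time parameter, so the compiler emits a
# specialized version of the function for each width requested.
fn splat[width: Int](value: Int) -> SIMD[DType.int32, width]:
    # Broadcast a scalar into a SIMD vector of the given width.
    return SIMD[DType.int32, width](value)

fn main():
    var v4 = splat[4](7)    # specialized for 4 lanes
    var v8 = splat[8](7)    # a separate specialization for 8 lanes
    print(v4)
    print(v8)
```

Because the vector width is a parameter rather than a runtime argument, the same source compiles to the natural register width of each target, which is what makes the algorithm portable.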

Write CPU+GPU code together, unlocking incredible performance

Pythonic GPU code that combines the best of C++, Rust, and CUDA.

The same code runs everywhere; no ROCm or CUDA required.

Vendor independent GPU programmability

No vendor lock-in

Customizable graph compiler

Leading performance out of the box, with the optionality for full control

Developers love Mojo  

“I'm excited, you're excited, everyone is excited to see what's new in Mojo and MAX and the amazing achievements of the team at Modular.”

Eprahim

“Max installation on Mac M2 and running llama3 in (q6_k and q4_k) was a breeze! Thank you Modular team!”

NL

“Mojo destroys Python in speed. 12x faster without even trying. The future is bright!”

svpino

“Tired of the two language problem. I have one foot in the ML world and one foot in the geospatial world, and both struggle with the "two-language" problem. Having Mojo - as one language all the way through is be awesome.”

fnands

"C is known for being as fast as assembly, but when we implemented the same logic on Mojo and used some of the out-of-the-box features, it showed a huge increase in performance... It was amazing."

Aydyn

“It’s fast which is awesome. And it’s easy. It’s not CUDA programming...easy to optimize.”

dorjeduck



“The Community is incredible and so supportive. It’s awesome to be part of.”

benny.n

“A few weeks ago, I started learning Mojo 🔥 and MAX. Mojo has the potential to take over AI development. It's Python++. Simple to learn, and extremely fast.”

svpino

“What @modular is doing with Mojo and the MaxPlatform is a completely different ballgame.”

scrumtuous

"Mojo gives me the feeling of superpowers. I did not expect it to outperform a well-known solution like llama.cpp."

Aydyn

“I am focusing my time to help advance @Modular. I may be starting from scratch but I feel it’s what I need to do to contribute to #AI for the next generation.”

mytechnotalent

“Mojo and the MAX Graph API are the surest bet for longterm multi-arch future-substrate NN compilation”

pagilgukey

“I'm very excited to see this coming together and what it represents, not just for MAX, but my hope for what it could also mean for the broader ecosystem that mojo could interact with.”

strangemonad

“I tried MAX builds last night, impressive indeed. I couldn't believe what I was seeing... performance is insane.”

drdude81

“The more I benchmark, the more impressed I am with the MAX Engine.”

justin_76273
