What’s new in Mojo SDK v0.5?

November 14, 2023

Shashank Prasanna

AI Developer Advocate

Mojo SDK v0.5 is now available for download and includes exciting new features. In this blog post, I’ll discuss what these features are and how to use them with code examples. ICYMI, in last week’s Modular community livestream, we dove deep into all things Mojo SDK v0.5 with live demos of the examples shared in this blog post, while answering your questions live! If you missed it, you should check out the recording here:

And don't forget to register for Modular’s first-ever annual conference, ModCon, happening on Dec 4th in San Francisco. Register now if you want to meet the team in person, get access to workshops and panels, epic swag, and even more surprises! 🔥

Update your Mojo🔥 SDK

Before we dive into examples and new features, make sure you’re running the latest version. If you already have Mojo SDK v0.4 installed, run modular update mojo to get the latest release. If you don’t have Mojo or have an earlier version, follow the getting started instructions in the documentation. For a complete list of what’s new, what’s changed, and what’s fixed in this release, I recommend reviewing the changelog. In this blog post I’ll focus on the following five features:

  1. Keyword parameters and keyword arguments
  2. Automatic parameterization of functions
  3. Tensor enhancements: load and save to file, print() works on tensor types
  4. String enhancements: new count(), find() and replace() functions
  5. Benchmark enhancements: new print() function to print benchmark reports, and ability to take non-capturing functions 

Let’s take a closer look at these features with examples. All the code examples shared below are available in a single Jupyter Notebook here. To access it:

Bash

git clone https://github.com/modularml/mojo.git
cd mojo/examples/blogs-videos/

If you want to follow along, open whats_new_v0.5.ipynb in Visual Studio Code or JupyterLab and run each cell as I discuss the features below.

New feature: Keyword parameters and keyword arguments

Mojo SDK v0.3 first introduced Python-style keyword arguments, letting you specify argument default values and pass values by keyword argument name. In the current v0.5 release, you can do the same with keyword parameters. If you’re a Python user who’s new to Mojo, you might ask: “Aren’t parameters and arguments the same thing?” In Mojo🔥 they mean different things. Unlike Python, Mojo is a compiled language, and parameters in Mojo represent compile-time values or types, whereas arguments represent runtime values. In code, we differentiate them by putting compile-time parameters in square brackets and runtime arguments in parentheses.
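To make the distinction concrete, here’s a minimal sketch; the repeat function is a hypothetical example I wrote purely for illustration, not part of the SDK. The parameter count is fixed at compile time, while the argument msg is supplied at runtime:

Mojo

fn repeat[count: Int](msg: String):
    # count is a compile-time parameter (square brackets)
    # msg is a runtime argument (parentheses)
    for i in range(count):
        print(msg)

fn main():
    repeat[3]("Hello")  # count is bound when the function is compiled

main()

Because count is a parameter, the compiler can specialize repeat for each value of count, whereas msg could just as well come from user input at runtime.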

Below I have a simple Mojo struct called SquareMatrix. A struct in Mojo is similar to a class in Python, but Mojo structs are static and compile-time bound, unlike Python classes, which are dynamic and allow changes at runtime. Here the SquareMatrix struct, as the name suggests, creates a square matrix by restricting the shape of the Tensor type during initialization. When initialized, it creates a square matrix with dimension dim, a compile-time value, and fills the matrix with val, a runtime value. SquareMatrix also defines a function called print() to print the underlying tensor.

Mojo

from tensor import Tensor
from algorithm import vectorize
from sys.info import simdwidthof

struct SquareMatrix[dtype: DType = DType.float32, dim: Int = 4]:
  var mat: Tensor[dtype]

  fn __init__(inout self, val: SIMD[dtype,1] = 5):
    self.mat = Tensor[dtype](self.dim,self.dim)
    alias simd_width = simdwidthof[dtype]()
    @parameter
    fn fill_val[simd_width: Int](idx: Int) -> None:
        self.mat.simd_store(idx, self.mat.simd_load[simd_width](idx).splat(val))
    vectorize[simd_width, fill_val](self.mat.num_elements())

  fn __getitem__(self,x:Int,y:Int)->SIMD[dtype,1]:
    return self.mat[x,y]

  fn print(self):
    print(self.mat)

Let’s instantiate SquareMatrix with default keyword parameters and print its results. Notice that we didn’t specify any parameters or arguments.

Mojo

SquareMatrix().print()

Output:

Bash

Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)

If you take a closer look at the SquareMatrix definition, you’ll see that we use the new keyword parameter feature to specify default parameter values: struct SquareMatrix[dtype: DType = DType.float32, dim: Int = 4]

  • dtype: with default value DType.float32
  • dim: with default value 4

Since we didn’t provide any keyword arguments or parameters, all the default values were used, i.e. val = 5.0, dtype = DType.float32, and dim = 4.

Notice also that in SquareMatrix’s print() function we’re calling print(self.mat), where self.mat is a Tensor type. In this release, the print() function works on Tensor types!

Let’s try a few different combinations of inputs. We can optionally specify only keyword arguments:

Mojo

SquareMatrix(10).print()
#or
SquareMatrix(val=10).print()

Output:

Bash

Tensor([[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0]], dtype=float32, shape=4x4)

Or specify a combination of keyword parameters and keyword arguments:

Mojo

SquareMatrix[DType.float64](10).print()

Output:

Bash

Tensor([[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0],
[10.0, 10.0, 10.0, 10.0]], dtype=float64, shape=4x4)

And just like with Python arguments, you can specify both positional and keyword parameters:

Mojo

SquareMatrix[DType.float64,dim=3](1).print()

Output:

Bash

Tensor([[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0],
[1.0, 1.0, 1.0]], dtype=float64, shape=3x3)

You can also specify keyword arguments in the __getitem__() dunder method:

Mojo

let sm = SquareMatrix()
sm.print()

print()
print('Keyword argument in __getitem__()')
print(sm[x=0, y=3])

Output:

Bash

Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)

Keyword argument in __getitem__()
5.0

New feature: Automatic parameterization of functions

Mojo also adds support for automatic parameterization of functions. If a function takes an argument whose type has parameters, those parameters are automatically added to the function as function parameters. For example:

fn multiply(sm: SquareMatrix, val: SIMD[sm.dtype,1])

Is equivalent to:

fn multiply[dtype: DType = DType.float32, dim: Int = 4](sm: SquareMatrix[dtype, dim], val: SIMD[dtype,1])

Also, notice that an argument’s input parameters can now be referenced within the signature of the function, e.g. sm.dtype, since those parameters are automatically added to our function. This enables you to write clean-looking code. This feature is better explained with an example, so let’s implement a multiply function that takes a SquareMatrix sm and a floating point value val as function arguments, scales all the values in the matrix by val, i.e. sm*val, and returns the scaled matrix.

Mojo

from math import mul
fn multiply(sm: SquareMatrix, val: SIMD[sm.dtype,1]) -> Tensor[sm.dtype]:
    alias simd_width: Int = simdwidthof[sm.dtype]()
    let result_tensor = Tensor[sm.dtype](sm.mat.shape())

    @parameter
    fn vectorize_multiply[simd_width: Int](idx: Int) -> None:
        result_tensor.simd_store[simd_width](idx, mul[sm.dtype,simd_width](sm.mat.simd_load[simd_width](idx),val))
    vectorize[simd_width, vectorize_multiply](sm.mat.num_elements())
    return result_tensor

fn main():
    let sm = SquareMatrix(5)
    let res = multiply(sm,100.0)
    print(res)
main()

The multiply function above is automatically parameterized with the parameters of SquareMatrix, so we don’t have to specify them. To access SquareMatrix parameters we can use the SquareMatrix variable: sm.dtype, sm.dim

Output:

Bash

Tensor([[500.0, 500.0, 500.0, 500.0],
[500.0, 500.0, 500.0, 500.0],
[500.0, 500.0, 500.0, 500.0],
[500.0, 500.0, 500.0, 500.0]], dtype=float32, shape=4x4)

New feature: Tensor and String enhancements

The Tensor type in the Mojo standard library allows us to work with n-dimensional arrays, and in this release it supports loading and saving tensors to disk as bytes. String manipulation also gets much easier with the new count(), find() and replace() functions. As before, let’s take a look at an example to see how to use them.
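Before extending SquareMatrix, here’s a quick standalone sketch of the three String functions in isolation; the string value is just an arbitrary example. count() returns the number of occurrences of a substring, find() returns the index of the first occurrence (or -1 if there is none), and replace() returns a new String with every occurrence substituted:

Mojo

fn main():
    let s: String = "hello.world.data"
    print(s.count("."))                # number of times "." occurs in s
    print(s.find("world"))             # index of the first match, or -1
    print(s.replace("world", "mojo"))  # a new String with "world" replaced

main()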

We’ll extend our SquareMatrix struct with two new capabilities: a prepare_filename() function, which demonstrates the new String features, and save() and load() functions, which demonstrate saving and loading tensors. The code below shows only the functions we added to SquareMatrix; for the full example, refer to the notebook that accompanies this blog post.

Mojo

from tensor import Tensor
from algorithm import vectorize
from time import now
from memory import memcpy

struct SquareMatrix[dtype: DType = DType.float32, dim: Int = 4]:
  var mat: Tensor[dtype]

# ...
# ...
# ...

  fn prepare_filename(self, fname: String)->String:
    var fpath = fname
    if fpath.count('.') < 2:
        fpath += '.data'
    fpath = fpath.replace(".","_"+self.mat.spec().__str__()+".")
    if fpath.find('/') == -1:
        fpath = './'+fpath
    return fpath

  fn save(self, fname: String='saved_matrix') raises -> String:
    let fpath = self.prepare_filename(fname)
    self.mat.tofile(fpath)
    print('File saved:',fpath)
    return fpath

  @staticmethod
  fn load[dtype: DType,dim: Int](fpath:String) raises -> Tensor[dtype]:
    let load_mat = Tensor[dtype].fromfile(fpath)
    let new_tensor = Tensor[dtype](dim,dim)
    memcpy(new_tensor.data(),load_mat.data(),load_mat.num_elements())
    _ = load_mat
    return new_tensor
    

Let’s start by saving a Tensor:

Mojo

let m = SquareMatrix()
m.print()
let fpath = m.save('saved_matrix')

Output:

Bash

Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)
File saved: ./saved_matrix_4x4xfloat32.data

The save() function takes in a file name, calls the prepare_filename() function to convert the file name into a file path with an extension, and saves the tensor to disk using the tofile() function. Note: tofile() does not preserve the Tensor shape, so it’s saved as a 1-dimensional tensor. If you know the shape ahead of time, you can reshape it to the original shape, as we do in the load() function.

In prepare_filename() we use the new count(), find() and replace() functions. We use

  • count() to count occurrences of . in the string, to check whether the provided filename already has an extension
  • replace() to replace . with _ + the tensor spec (shape and dtype) + ., producing file names like saved_matrix_4x4xfloat32.data
  • find() to check whether the path contains a /; if it doesn’t, we add ./ to the beginning of the string to save the tensor in the current directory

Note: I created the prepare_filename() function purely to demonstrate the new String features. There are likely simpler and more efficient ways to do the same without using count(), replace() and find() the way I do.

Now, let’s load the file we just saved and reshape it to the original shape in SquareMatrix’s load() function. 

Mojo

print('Loading Tensor from file:',fpath)
print()
let load_mat = SquareMatrix.load[DType.float32,4](fpath)
print(load_mat)

Output:

Bash

Loading Tensor from file: ./saved_matrix_4x4xfloat32.data

Tensor([[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0],
[5.0, 5.0, 5.0, 5.0]], dtype=float32, shape=4x4)

We defined our load() function as a static method that takes the type and dimensions as parameters and the file path as an argument, and uses Tensor’s fromfile() function to load the tensor. To reshape the tensor, we create a new tensor with the desired dimensions and copy the data into it.

New Feature: Benchmark enhancements

To demonstrate the benchmark module’s new reporting feature, we’ll use a computationally intensive example that calculates the row-wise mean() of a matrix with a few rows and a large number of columns.

First, we’ll compute this naively using nested loops and then in a performant way by vectorizing across columns and parallelizing across rows. We’ll print benchmark reports for both using the new report printing feature and show speedups.

Mojo

from tensor import Tensor
from random import rand
import benchmark
from time import sleep
from algorithm import vectorize, parallelize
from sys.info import simdwidthof

alias dtype = DType.float32
alias simd_width = simdwidthof[DType.float32]()

fn row_mean_naive[dtype: DType](t: Tensor[dtype]) -> Tensor[dtype]:
    var res = Tensor[dtype](t.dim(0),1)
    for i in range(t.dim(0)):
        for j in range(t.dim(1)):
            res[i] += t[i,j]
        res[i] /= t.dim(1)
    return res

fn row_mean_fast[dtype: DType](t: Tensor[dtype]) -> Tensor[dtype]:
    var res = Tensor[dtype](t.dim(0),1)
    @parameter
    fn parallel_reduce_rows(idx1: Int)->None:
        @parameter
        fn vectorize_reduce_row[simd_width: Int](idx2: Int) -> None:
            res[idx1] += t.simd_load[simd_width](idx1*t.dim(1)+idx2).reduce_add()
        vectorize[2*simd_width,vectorize_reduce_row](t.dim(1))
        res[idx1] /= t.dim(1)
    parallelize[parallel_reduce_rows](t.dim(0),t.dim(0))
    return res

fn main():
    let t = rand[dtype](1000,100000)
    var result = Tensor[dtype](t.dim(0),1)

    @parameter
    fn bench_mean():
        _ = row_mean_naive(t)
    
    @parameter
    fn bench_mean_fast():
        _ = row_mean_fast(t)

    let report = benchmark.run[bench_mean]()
    let report_fast = benchmark.run[bench_mean_fast]()
    report.print()
    report_fast.print()
    print("Speed up:",report.mean()/report_fast.mean())

main()

Benchmark can now print easy-to-read reports with mean time, total time, iterations, and other useful benchmark details. On my Apple M2 Pro with 12 cores, I see a ~52x speedup with the vectorized and parallelized implementation vs. the naive nested-loop implementation, both in pure Mojo.

Output:

Bash

---------------------
Benchmark Report (s)
---------------------
Mean: 0.360315
Total: 1.080945
Iters: 3
Warmup Mean: 0.36441600000000002
Warmup Total: 0.72883200000000004
Warmup Iters: 2
Fastest Mean: 0.360315
Slowest Mean: 0.360315

---------------------
Benchmark Report (s)
---------------------
Mean: 0.006859210256410256
Total: 1.3375459999999999
Iters: 195
Warmup Mean: 0.010933
Warmup Total: 0.021866
Warmup Iters: 2
Fastest Mean: 0.0068472000000000003
Slowest Mean: 0.0068707272727272723

Speed up: 52.530099899367947

But wait, there is more!

In this blog post, I shared several new features, but there’s more! This release also includes enhancements to the SIMD type, TensorShape and TensorSpec, and a host of bug fixes. Check out the changelog for a full list of what’s new, what’s changed, and what’s fixed in this release: https://docs.modular.com/mojo/changelog.html

And, don’t forget to watch the recording of our Mojo SDK v0.5 demo and Q&A livestream with Modular engineers. If you prefer in-person meetings to virtual livestreams, come to ModCon to meet the Modular team and leading AI experts to discuss the future of AI development and deployment at our annual developer conference.  

Until next time! 🔥

Shashank Prasanna

AI Developer Advocate