Learning a new programming language is hard. You have to learn new syntax, keywords, and best practices, all of which can be frustrating when you're just starting. In this blog post, I want to share a gentle introduction to Mojo from a Python programmer's perspective. Rather than focus on language details such as Mojo's programming model and syntax, which you can find in the Mojo programming manual, I'll take an example-driven approach that will gently guide you to a land of Mojo familiarity. The example used in this blog post is available in the Mojo Playground, so if you haven't already signed up for Mojo, do so now!

### Mojo🔥: A familiar approach

Mojo should feel very familiar to any Python programmer, as it shares Python's syntax. But there are a few important differences that you'll see as we port a simple Python program to Mojo. The first thing you'll notice is that Mojo really shines in the performance department. "But Python is no slouch. NumPy is really fast!" you might say, and you'd be right. However, if you look under the hood of NumPy's elegant Python API, you'll see that all the computationally intensive code is written in C/C++, which is where its performance comes from.

With Mojo, you can write high-level code like Python and leverage Mojo's lower-level features to explicitly manage memory, add types, etc., to get the performance of C (or better!). This means you get the best of both worlds in Mojo and don't have to write your algorithms in multiple languages.

Before we get started, here are a couple of housekeeping items:

- Mojo is still very early in its development, and the language and tooling aren't ready to support the migration of large Python projects. We expect that Python users will initially port small, computationally demanding sections of their code to Mojo and then migrate more significant parts of their code base over time as the language and tooling mature. We are adding many new language features each week, and you should follow the regular Changelog to get updates.
- The Mojo Playground environment is not always stable. You will be able to reproduce the output results of the calculations but the performance (execution time) may vary, and you may not see the performance shown below. The goal of this blog post is to introduce you to Mojo, not to benchmark its performance.

### From Python to Mojo: A simple example

Let's start with a simple example that calculates the Euclidean distance between two vectors. This is mathematically expressed as the L2-norm of the difference vector $||\vec{a}-\vec{b}||$, where $\vec{a}$ and $\vec{b}$ are two n-dimensional vectors. I'll discuss the implementation details in the algorithm section below. Euclidean distance calculation is one of the most fundamental computations in scientific computing and machine learning, used in algorithms like k-nearest neighbors and similarity search. In this example, you'll see how you can get faster-than-NumPy performance on this task with Mojo, using high-dimensional vectors. It's a computationally intensive problem, so we'll build a solution from scratch, starting with Python, and bring it over to Mojo to improve performance.

My goal with this example is not to build the fastest program for this task, but to introduce Mojo and its syntax from a Python programmer's perspective.

#### Where do I run this example? - The Mojo Playground

The code in this blog post can be copied and pasted into a new Jupyter notebook on Mojo Playground. First, head over to playground.modular.com to access or sign up to access Mojo on a hosted JupyterLab server. Once you have your playground open, create a new Notebook. Paste each code block in this blog post into a new Jupyter Notebook cell and press the Run button on the menu bar or hit Shift+Enter on your keyboard to run the cell and see the output. We're working to make the code in this blog post available as a Notebook in Mojo Playground soon!


#### Algorithm implementation details

Calculating the Euclidean distance is fairly straightforward:

- Calculate the element-wise difference between two vectors to create a difference vector
- Square each element in the difference vector
- Sum up all the squared elements of the difference vector
- Take the square root of the sum
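As a minimal worked sketch of these four steps, here is the calculation on two tiny 3-element vectors in plain Python. The vectors and variable names are made up for illustration; they were chosen so the result is easy to check by hand.

```python
import math

# Two small example vectors (illustrative values, not the data used in this post)
a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]

# Step 1: element-wise difference
diff = [x - y for x, y in zip(a, b)]   # [-3.0, -4.0, 0.0]

# Step 2: square each element of the difference vector
squared = [d * d for d in diff]        # [9.0, 16.0, 0.0]

# Step 3: sum the squared elements
total = sum(squared)                   # 25.0

# Step 4: take the square root of the sum
distance = math.sqrt(total)            # 5.0

print(distance)
```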

These 4 steps are illustrated in the diagram below:


In our implementation, the dimension of the vector *n* is the number of elements in our array or list. In pure Python, you'd write it down like this:

#### Euclidean distance in pure Python

First, let's set a baseline by running and benchmarking pure Python performance for the Euclidean distance calculation. To verify the distance calculation is numerically accurate across Python and Mojo implementations, we'll create two random NumPy arrays of 10 million elements each and re-use them throughout the example. For the pure Python implementation, we'll convert these NumPy arrays into Python lists, so we only use data structures native to Python.

**Mojo Playground tip:** Add *%%python* at the top of a Jupyter cell to instruct the Mojo Jupyter kernel to run that cell as interpreted Python code rather than compiled Mojo code.

First, let's create 2 random vectors with 10,000,000 elements using the code below.
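A minimal sketch of this setup step, assuming NumPy is installed. The helper `make_vectors`, the seed, and the variable names `anp`, `bnp`, `alist`, and `blist` are illustrative assumptions, and the sketch uses a smaller `n` so it runs quickly; set `n = 10_000_000` to match the size described in this post.

```python
import numpy as np

def make_vectors(n, seed=0):
    """Return two random float64 NumPy arrays of length n (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # seeded so the vectors are reproducible
    return rng.random(n), rng.random(n)

# Smaller than the 10,000,000 used in the post, to keep the sketch quick
n = 100_000
anp, bnp = make_vectors(n)

# Plain Python lists for the pure Python implementation
alist = anp.tolist()
blist = bnp.tolist()

print(len(alist), type(alist[0]))
```

In a Mojo Playground notebook, a cell like this would start with `%%python` so that it runs as interpreted Python.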


Now, we're ready to calculate the Euclidean distance in pure Python.
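One way to write and time such a function is sketched below, using only the standard library. The name `euclidean_distance_py`, the smaller vector size, and the use of `random` for test data are assumptions for illustration; this is not the exact code behind the timings quoted in this post.

```python
import math
import random
import time

def euclidean_distance_py(a, b):
    """Euclidean distance between two equal-length lists of floats."""
    s = 0.0
    for i in range(len(a)):
        d = a[i] - b[i]   # element-wise difference
        s += d * d        # accumulate the squared difference
    return math.sqrt(s)   # square root of the sum

# Stand-in data: smaller than the 10,000,000 elements used in the post
random.seed(0)
n = 100_000
alist = [random.random() for _ in range(n)]
blist = [random.random() for _ in range(n)]

start = time.perf_counter()
dist = euclidean_distance_py(alist, blist)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"distance = {dist:.5f}, time = {elapsed_ms:.2f} ms")
```

The index-based loop is deliberate: it keeps the Python version close to the shape of a loop you would later port to Mojo.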


The pure Python implementation takes about 790 ms to run. Take note of the Euclidean distance value of 1290.91809; we'll use it to verify that the subsequent implementations are numerically accurate.

#### Python + NumPy implementation

To be fair to Python, Python programmers rarely use native Python data structures for machine learning and scientific computing. The de facto standard for such use cases is the NumPy package, which provides an n-dimensional array data structure and optimized functions that operate on it. Since we already created random NumPy vectors in the previous step, we'll use the same *numpy* arrays and calculate the Euclidean distance using NumPy's vectorized *numpy.linalg.norm* function, which computes the norm of the difference vector. We measure the execution time of the NumPy implementation below.
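A minimal sketch of the NumPy version, assuming NumPy is installed. The wrapper name `euclidean_distance_np`, the seed, and the smaller vector size are illustrative assumptions; `numpy.linalg.norm` with its default arguments returns the L2-norm of a 1-dimensional array.

```python
import time
import numpy as np

def euclidean_distance_np(a, b):
    """Euclidean distance as the L2-norm of the difference vector."""
    return np.linalg.norm(a - b)

# Stand-in data: smaller than the 10,000,000 elements used in the post
rng = np.random.default_rng(0)
n = 100_000
anp = rng.random(n)
bnp = rng.random(n)

start = time.perf_counter()
dist = euclidean_distance_np(anp, bnp)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"distance = {dist:.5f}, time = {elapsed_ms:.2f} ms")
```

A single `perf_counter` measurement is noisy; for steadier numbers you could repeat the call several times and take the minimum.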


The time it took to calculate the Euclidean distance, with the exact same value of 1290.91809, went from ~790 ms to ~24 ms. That's about 30 times faster, thanks to NumPy's C/C++ implementation under the hood.

Can we run it faster with Mojo? Let's find out!

#### Our first Mojo implementation

Mojo offers Python's usability with optional low-level control like C. Let's start with a Python-like implementation in Mojo and see what performance we get. First, we need a data structure for our vectors. Mojo offers a Tensor data structure that lets us work with n-dimensional arrays. For this example, we'll create two 1-dimensional Tensors and copy the NumPy array data into them.


Let's dissect this piece of Mojo code. First, you'll notice that we have new variable declarations, *let* and *var*, which may look odd at first glance since this is not familiar Python syntax. Mojo offers optional (except in some cases, more on that later) variable declarations to declare variables as immutable with let (i.e. cannot be modified after creation) or mutable with var (i.e. can be modified). There are two benefits to using variable declarations: type safety and performance. Second, you'll also notice that the Tensor function has both square brackets [] and round brackets () with this format:

*Function[parameters](arguments)*

In Mojo, "parameters" represent compile-time values. In this example, we're telling the compiler that Tensor is a container for 64-bit floating point values. Arguments in Mojo represent runtime values; in this case, we're passing n=10000000 to Tensor's constructor to instantiate a 1-dimensional array of 10 million values.

Finally, in the for-loop we copy the NumPy array values into the Mojo Tensors. We're now ready to calculate the Euclidean distance in Mojo.

**Calculating the Euclidean distance in Mojo**

Let's bring our Python example over to Mojo and make a few changes to it. Below is our Mojo function for calculating Euclidean distance. Can you spot the few key differences vs. the Python function?


Notice that this is very similar to our Python code, except that we've added types to the function arguments, *a: Tensor[DType.float64], b: Tensor[DType.float64]*, and a return type, *Float64*. Unlike Python, Mojo is a compiled language. Even though you can still use flexible types like in Python, Mojo lets you declare types so the compiler can optimize the code based on those types and improve performance.

Here, the *DType.float64* parameter of our *Tensor* specifies that it contains 64-bit floating point values. The *Float64* return type represents a Mojo *SIMD* type, which is a low-level scalar value in a machine register. We also declare the variable *s* with the var keyword, which tells the Mojo compiler that *s* is a mutable variable of type *Float64*. Now we're ready to benchmark our Mojo code.


The execution time dropped to ~70 ms from ~790 ms in pure Python, which is about 11x faster. That is still slower than Python+NumPy's ~24 ms, but pretty good considering we didn't have to rewrite our function in C/C++. But we're not done yet! We're leaving a lot of performance on the table that we can recover with a few more minor code changes. Let's see how.

#### Speeding up our Mojo🔥 code!

Just like in Python, def functions in Mojo are dynamic and flexible, and types are optional, which makes it easier to port Python functions to Mojo. However, there are a few key differences in how arguments are processed. In Python, arguments to functions are references to objects, and if they are modified, the changes are visible outside the function. In Mojo, def functions make a copy of all arguments, which introduces overhead when dealing with large Tensors like ours. Therefore, to speed up our code further we need to:

- Pass Tensor values by reference so no copies are made
- Introduce strict typing and declare all variables
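To make the Python half of that comparison concrete, the short sketch below shows that a Python function receives a reference to the caller's list, so in-place changes are visible outside the function. The second function makes an explicit copy first, which is only an analogy for the extra work a copy implies, not a model of how Mojo implements it. All function names here are hypothetical.

```python
def zero_first(values):
    """Mutates the caller's list: Python passes a reference to the object."""
    values[0] = 0.0

def zero_first_copy(values):
    """Copies the list first, so the caller's data is left untouched."""
    local = list(values)   # O(n) copy: the cost grows with the size of the input
    local[0] = 0.0
    return local

data = [1.0, 2.0, 3.0]
zero_first(data)
print(data)                # [0.0, 2.0, 3.0] -- the change is visible to the caller

original = [1.0, 2.0, 3.0]
copied = zero_first_copy(original)
print(original, copied)    # [1.0, 2.0, 3.0] [0.0, 2.0, 3.0]
```

With 10 million elements per vector, that kind of copy on every call is exactly the overhead worth avoiding.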

Here's our updated Mojo function that addresses both points:


The first change you'll notice is that *def* has been replaced by *fn*. In Mojo, *fn* functions enforce strict type checking and variable declarations. By default, *fn* arguments and return values must have declared types, and *fn* arguments are immutable. While *def* allows you to write more dynamic code, *fn* functions can improve performance by lowering the overhead of figuring out data types at runtime, and they help you avoid a variety of potential runtime errors. You can read more about the difference between *fn* and *def* in the Mojo programming manual.

Since all variables in *fn* functions have to be declared, we also declare *n* and *dist* with let. Now we're ready to benchmark our updated code.


Our Mojo code's execution time dropped to ~13 ms. That's almost 2x faster than the NumPy implementation, which is written in C/C++, and 60x faster than the pure Python implementation. Let's take a look at the Python and Mojo code side by side so you can appreciate how little you had to change the code to see the performance improvements.

### Conclusion

There is a lot more to discuss about Mojo. For now, I hope you found this blog post to be a quick and gentle introduction to Mojo from a Python programmer's perspective. There are more things to try to speed up our code, including better ways to allocate memory, vectorization, multi-core parallelization, and more. We'll explore these topics in upcoming blog posts. The full Jupyter notebook is available on Mojo Playground, so head over to the Playground and run the example yourself!

Now it's your turn! How would you improve this code? Do you have ideas for other examples? We'd love to hear from you! Join our awesome community on Discord and share your Mojo journey with us on social media. Until next time 🔥!