May 15, 2024

What Does Joe Pamer, AI and PL expert, Want From Mojo?

Joe Pamer recently joined Modular to lead the Mojo engineering team. Joe is a veteran of Meta, Apple, and Microsoft and has led large-scale infrastructure projects, but can never seem to escape programming languages. We thought it would be fun to dive into Joe’s background and perspective, and how he sees the opportunity ahead for Mojo, MAX, and the industry at large.

Mojo is a new programming language that is a member of the Python family.  It is as easy to use as Python, but is distinguished by its high performance and ability to scale into GPU programming and other specialized domains like AI.

Chris Lattner: You’ve made major contributions to a wide variety of different programming languages (PL) including F#, TypeScript, Swift, Hack, Python (etc!) that cross-cut static and dynamic languages. What have you learned across these journeys and what makes you excited about Mojo?

Joe Pamer: While I keep bouncing between opposite ends of the design spectrum, I've honestly never been particularly adamant about any of the “great debates”: static vs. dynamic, functional vs. imperative, etc. Life’s too short!

To me, an ideal programming language allows you to adapt your style and approach to the situation, and what attracted me to all of the ones I’ve worked on is that they’ve strived to balance opposing design constraints. I like meeting the programmer where they’re at, and giving them new tools while you’re doing so. Maybe that’s the most important lesson I’ve learned: the best designs are rarely found at one end of the spectrum or the other.

For example: what’s better, static or dynamic? It depends! Static languages are great when you know exactly what you need to do, and want to do it in the most efficient, rigorous way possible - writing to the metal. Advanced typing features also help manage complexity as codebases scale. But sometimes you just need to experiment, sketch out a solution, orchestrate, or just hack. You don’t need to frame your approach up-front; you want to explore and are willing to accept the necessary tradeoffs. When that’s the case, when you’re figuring things out as you go, dynamic languages really shine.

What’s funny is that when you start with one you often begin to crave the other. Python is ubiquitous for the feature engineering and modeling parts of a pipeline, because it is great for experimentation and iteration.  But then you get into production and performance always becomes an issue, so you have to switch to C++ or Rust.  For serving or inference parts of a pipeline, C++ does make sense because you need extreme scale... but then your requirements change and its lack of flexibility means that you need to rewrite or throw everything away at some point.

Trying to find a balanced solution to the ML pipeline problem is what attracted me to Mojo: you can have it both ways, without compromises. It's like having the power of C++ and the expressiveness of Python elegantly fused together in a way that feels instinctive and powerful.

CL: You’ve scaled AI projects at Meta and are an expert in these systems. What are the biggest challenges you’ve faced in the past, and how do you think MAX will be able to help?

The challenge of “big AI” is that you’re facing all the challenges of modern software development at scale while moving at tremendous pace and under tremendous market pressure. We’re all surfing past the edge of the wave and coping as best we can, facing the same problems over and over again.

First, you’re up against the laws of physics. Everything is evolving so rapidly out from under you that your infrastructure is perpetually and fundamentally stressed. Compute resources, network capacity, energy capacity - you’re constantly bouncing against the very limits of what’s possible with existing technology, pushing the envelope. Engineering for efficiency is key, but it’s so hard to manage across all of these different systems and the tradeoffs are tricky - it’s all zero sum.

Next, modern ML pipelines are massive heterogeneous systems, all running different software stacks on different hardware. C++ and Python weren’t designed for this, and we have a crazy set of libraries and frameworks (PyTorch, TensorRT, cuBLAS, etc.) that try to paper over the problems. So you’re left to constantly battle this complexity, and every link in the chain is a potential point of failure. Even worse is that these are all opaque boxes, so when one of them breaks, you're stuck looking for a workaround instead of being able to fix things yourself.

Finally, modern AI is a multidisciplinary endeavor: you need ML engineers, software engineers, data scientists, and mathematicians to be able to work together, but they work very differently and often have contradictory needs. Traditional languages create a tooling gap for your team because they are literally speaking different languages, and cannot collaborate on a shared codebase that spans research to production. This complexity makes it more difficult to get AI into your products, and results in a slower innovation cycle.

I don’t think these problems - efficiency, heterogeneity, and productivity - can be solved separately. The tools we currently use were designed for the last generation of problems, and their lack of cohesion is preventing us from moving forward. That’s what drew me to Mojo/MAX. It’s tightly integrated, performant, and maximizes optionality at the upper and lower levels of the stack. It also allows whole classes of disparate developers to work their way, but using the same tools. It lets you get on with your life, and focus on building great AI products, not managing infrastructure.

CL: Where do you want to see Mojo in two years, how about ten years?

Mojo is at an exciting point in its development - it’s come so far in the past year, and is evolving rapidly.  The community is exploding in size, and we’re seeing people build new things every day.

We want Mojo to live up to its potential, so we're building out the core technology, rounding out even better tooling (including package management, a long-standing pain point for Python developers), and building Mojo into a scalable open ecosystem. While it is very important to us that Mojo be a good member of the Python family, Mojo is capable of scaling into many more use-cases, and we intend to continue growing the Mojo ecosystem on its own terms - just as C++ grew into a superset of C.

This last goal connects to my ten-year vision in that I don’t see Mojo as “just” a language for AI - I think it can be great for many applications. Over the next decade I hope we will see Mojo expand into many domains beyond AI, with an identity all its own. It’s already a joy to use and I only see us getting better and more compelling to a wider audience of developers over time.

One thing I know for certain: for either our short or long-term goals to happen, we need to foster a large and empowered community, and this is only possible through open source. Mojo has a lot of really innovative technology under the hood that we’re looking forward to sharing with the world.  We recently open sourced the standard library, and our first Mojo Community Meeting is next week!

CL: As a “city person” you’ve experienced both the SF and the NY scenes, why choose New York?

SF is a wonderful city and a great place to work, but NYC is “home” to me and has always been close to my heart, so I’m unabashedly biased about this one 🙂.

I’d say the tech sector in NYC is pretty vibrant right now, but it certainly wasn’t always this way. I love that NYC has a diverse set of professions, instead of being a tech-centric monoculture like SF or even Seattle. This lack of a professional monoculture here really changes the way people in this field relate to their work. Combined with the typical NYC hustle, the tech scene here feels “richer,” more vibrant and more energetic in a way that’s unique to the city.

One thing I'm very thankful for is that Modular is a remote-first company, which allows me to work from the city that I love. We already had an awesome group of engineers here (and up and down the east coast) when I joined, so it's never felt lonely either. I'm excited to keep growing out our presence here, and to contribute back to the local scene.

CL: What do you think about emojis in programming languages?

Naturally, I 🤔 they're 👍️ if ➡️👤 🙏 to 🛟 🛰️ ➕ ⌨️ ⏩⏱️… ❗️ 🔮, too... LOL.

CL: Awesome, it is such an exciting time Joe. Thank you for driving Mojo to the next level!

Before we jump, I just wanted to mention that we’re investing even more in the Mojo compiler, language, and tooling, and in building out the next generation of Mojo libraries. We recently posted a bunch of new roles, so if you’re interested in building the future for AI infrastructure and software, please check it out!

-Joe Pamer

Until next time! 🔥

Joe Pamer, Mojo Distinguished Engineering Lead
Chris Lattner, Co-Founder & CEO