The world's fastest unified AI inference engine. Get models into production, faster.
The Modular Engine executes all of your TensorFlow and PyTorch models with no model rewriting or conversions. Bring your model as-is and deploy it anywhere, across server and edge, with unparalleled usability and performance.
Train in any framework,
deploy anywhere
Consolidate the bespoke AI toolchains you use today and dramatically simplify your AI deployment.
Framework optionality
Easily deploy models trained in any framework, such as TensorFlow or PyTorch, without retraining, conversion, or pre-optimization steps, using a unified set of APIs. No tricks, no hacks: the Engine just works, incredibly fast.
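As a purely illustrative sketch of what "a unified set of APIs" means in practice: the `modular_engine` package, its `load()` and `execute()` calls, and the file names below are hypothetical placeholders, not the product's published API. The point is the shape of the flow, one load/execute path regardless of training framework.

```python
import numpy as np
import modular_engine  # hypothetical package name, for illustration only

# Load models from different frameworks through the same entry point;
# no conversion or retraining step in between (paths are placeholders).
tf_model = modular_engine.load("resnet50_savedmodel/")   # TensorFlow SavedModel
pt_model = modular_engine.load("bert_base.torchscript")  # TorchScript module

# The same execute() call serves both models.
image = np.random.rand(1, 224, 224, 3).astype(np.float32)
print(tf_model.execute(image).shape)
```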
Compute portability
Seamlessly move your workloads to the best hardware for the job without rewriting or recompiling your models. Avoid lock-in and take advantage of price efficiencies and performance improvements without migration costs.
Maximize performance, minimize costs
Reduce latency, increase throughput, and improve resource efficiency across CPUs, GPUs, and accelerators. Productionize larger models and significantly reduce your computing costs.
Execute any model
with full compatibility
Never deal with model conversion challenges again. Run any model, with full support for native framework operators, dynamic shapes, low-precision inference, and your existing custom operators.
Works with your existing AI libraries and tools
Modular is designed to drop into your existing workflows and use cases. Our tools are... well... modular. They integrate with industry-standard infrastructure and open-source tools to minimize migration cost.
01.
Easily integrate the engine into your own custom server image, or use Modular's off-the-shelf NVIDIA Triton and TensorFlow Serving builds (see the client sketch after this list).
02.
Deploy the engine on-prem, in your own VPC on any major cloud provider, or get up and running faster with our hosted solutions.
03.
The Modular Inference Engine works with industry-standard open-source tooling like Prometheus and Grafana (see the metrics sketch below).
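For item 01, a Triton build can be queried with the standard open-source `tritonclient` library. A minimal sketch, assuming a server on `localhost:8000` serving a model named `resnet50` with tensors named `input` and `output` (the model and tensor names are assumptions for illustration; they depend on your deployment):

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to the Triton server's HTTP endpoint.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a request tensor matching the model's declared input.
infer_input = httpclient.InferInput("input", [1, 3, 224, 224], "FP32")
infer_input.set_data_from_numpy(
    np.random.rand(1, 3, 224, 224).astype(np.float32)
)

# Run inference and read back the output tensor.
response = client.infer(model_name="resnet50", inputs=[infer_input])
print(response.as_numpy("output").shape)
```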
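For item 03, here is a minimal sketch of how an inference service can expose metrics for Prometheus to scrape (and Grafana to chart), using the open-source `prometheus_client` library. The metric name, port, and simulated latency are stand-ins for illustration, not the engine's actual instrumentation:

```python
import random
import time

from prometheus_client import Histogram, start_http_server

# Assumed metric name; a real serving stack would expose its own metrics.
INFER_LATENCY = Histogram(
    "inference_latency_seconds",
    "End-to-end model inference latency",
)

# Serve a /metrics endpoint on port 9090 for Prometheus to scrape.
start_http_server(9090)

while True:
    # Record the duration of each (simulated) inference call.
    with INFER_LATENCY.time():
        time.sleep(random.uniform(0.005, 0.02))
```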
Ready to try a preview?
Contact us to get early access to the Modular Inference Engine.
Read the Modular Inference Engine docs