Flagship open-weight models running on Modular, the AI platform powered by the open-source MAX engine.
Closed video APIs lock you in: no fine-tuning, no on-prem, per-second pricing that punishes scale. Run Wan 2.2, LTX, and Hunyuan-class models on Modular instead — frontier quality, open weights, and the same deployment posture as the rest of your inference stack.
Run Wan 2.2, LTX-2.3, MiniMax (Hailuo), and Hunyuan-class video models on Modular. Open weights, no API lock-in, swap models whenever the frontier moves.
The same compiled artifact runs on NVIDIA B200 and AMD MI355X. Bid for capacity across clouds and silicon, and never get locked into a single vendor.
Modular compiles each video model end-to-end — DiT, scheduler, VAE — so you get more clips per GPU and the cost per second of generated video drops.
Run in Modular Cloud or bring it into your own VPC. No per-second pricing, no egress fees, no API surcharge stacked on top of your hardware spend.
Our team helps you tune prompts, fine-tune on your data, and ship to production. Direct access from POC through scale — not a ticket queue.
Wan 2.2 vs. Diffusers, end-to-end on B200
OpenResponses-style across image & video
NVIDIA B200 + AMD MI355X
Model Spotlight
Open and advanced large-scale video generation — text-to-video and image-to-video on a single MoE graph. Compiled end-to-end on Modular for 3.7× faster inference vs. Diffusers, with cinematic motion and aesthetic control built in.
Text → Video
Prompting for video is both an art and a science. Our solution engineers can help make sure your app interprets each user's intent as accurately as possible. We have mastered image-after-image consistency, hero character alignment, style consistency, and more.
Check out inkwell.modular.com for an example of prompt consistency.
Image → Video
At production volume, video generation lives and dies on GPU efficiency: every 5-second clip is dollars of inference, not cents. Modular runs the same open-weight video models on NVIDIA B200 and AMD MI355X with impressive throughput, so you keep the quality and cut the cost per clip without changing the model.
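To make the cost-per-clip arithmetic concrete, here is a minimal sketch. The GPU price and the baseline generation time are hypothetical placeholders, not measured or quoted figures; only the 3.7× speedup comes from the Wan 2.2 vs. Diffusers comparison on this page. Substitute your own numbers.

```python
def cost_per_clip(gpu_hourly_usd: float, seconds_per_clip: float) -> float:
    """Inference cost in USD of generating one clip on one GPU."""
    return gpu_hourly_usd * seconds_per_clip / 3600.0


# Hypothetical inputs, chosen only to illustrate the arithmetic.
GPU_HOURLY_USD = 6.00      # assumed rental price for one GPU, per hour
BASELINE_SECONDS = 600.0   # assumed baseline wall-clock time for one 5-second clip
SPEEDUP = 3.7              # Wan 2.2 speedup vs. Diffusers quoted on this page

baseline = cost_per_clip(GPU_HOURLY_USD, BASELINE_SECONDS)
compiled = cost_per_clip(GPU_HOURLY_USD, BASELINE_SECONDS / SPEEDUP)

print(f"baseline: ${baseline:.2f} per clip")
print(f"compiled: ${compiled:.2f} per clip")
```

With these assumed inputs, the cost falls from $1.00 to about $0.27 per clip. The model and its output are unchanged; only the GPU time per clip shrinks.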


Schedule a demo of Modular and explore a custom end-to-end deployment built around your models, hardware, and performance goals.
Distributed, large-scale online inference endpoints
Highest performance to maximize ROI and minimize latency
Deploy in Modular Cloud or your own cloud
View all features with a custom demo

Book a demo
Talk with our sales lead Jay!
30-minute demo. Evaluate with your workloads. Ask us anything.
Book a demo for a personalized walkthrough of Modular in your environment. Learn how teams use it to simplify systems and tune performance at scale.
Custom 30-minute walkthrough of our platform
Cover specific model or deployment needs
Flexible pricing to fit your specific needs

Run any open-source model in 5 minutes, then benchmark it. Scale it to millions yourself (for free!).
Install Mojo and get up and running in minutes. A simple install, familiar tooling, and clear docs make it easy to start writing code immediately.
Playground
Generate images and videos from a single browser playground. No credit card required.