Modular: Video Generation at Full Speed on Any Hardware

Frontier video, open weights, your infra

Closed video APIs lock you in: no fine-tuning, no on-prem, per-second pricing that punishes scale. Run Wan 2.2, LTX, and Hunyuan-class models on Modular instead — frontier quality, open weights, and the same deployment posture as the rest of your inference stack.

Models

Open-source models

Run Wan 2.2, LTX-2.3, MiniMax (Hailuo), and Hunyuan-class video models on Modular. Open weights, no API lock-in, swap models whenever the frontier moves.

Hardware

One binary, every GPU

The same compiled artifact runs on NVIDIA B200 and AMD MI355X. Bid for capacity across clouds and silicon, and never get locked into a single vendor.

Performance

Compiled for throughput

Modular compiles each video model end-to-end — DiT, scheduler, VAE — so you get more clips per GPU and the cost per second of generated video drops.
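
For readers new to diffusion video models, here is a minimal sketch of how the three stages named above fit together in one generation loop. Everything in it is a toy stand-in: `toy_dit`, `euler_step`, `toy_vae_decode`, and `generate` are hypothetical names, not Modular's API or any model's real implementation.

```python
from typing import List

Latent = List[float]


def toy_dit(latent: Latent, t: float) -> Latent:
    """Stand-in for the diffusion transformer (DiT): returns one velocity per value."""
    return list(latent)  # hypothetical model: velocity equals the current latent


def euler_step(latent: Latent, velocity: Latent, t: float, t_next: float) -> Latent:
    """Stand-in scheduler: a single Euler update from time t to t_next."""
    dt = t_next - t
    return [x + dt * v for x, v in zip(latent, velocity)]


def toy_vae_decode(latent: Latent, upsample: int = 2) -> List[float]:
    """Stand-in VAE decoder: expands each latent value into `upsample` output values."""
    return [x for x in latent for _ in range(upsample)]


def generate(noise: Latent, steps: int = 4) -> List[float]:
    """Run the three stages in order: denoise loop (DiT + scheduler), then decode."""
    latent = list(noise)
    times = [1.0 - i / steps for i in range(steps + 1)]  # 1.0 down to 0.0
    for t, t_next in zip(times, times[1:]):
        velocity = toy_dit(latent, t)
        latent = euler_step(latent, velocity, t, t_next)
    return toy_vae_decode(latent)


if __name__ == "__main__":
    print(generate([1.0, -2.0]))
```

The DiT and scheduler run once per denoising step while the VAE runs once at the end, so an end-to-end compile has to cover both the loop body and the final decode.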

Deployment

Self-host or Modular Cloud

Run in Modular Cloud or bring it into your own VPC. No per-second pricing, no egress fees, no API surcharge stacked on top of your hardware spend.

Support

Solution engineers on call

Our team helps you tune prompts, fine-tune on your data, and ship to production. Direct access from POC through scale — not a ticket queue.

Model Spotlight

Wan 2.2

Open and advanced large-scale video generation — text-to-video and image-to-video on a single MoE graph. Compiled end-to-end on Modular for 3.7× faster inference vs. Diffusers, with cinematic motion and aesthetic control built in.

Effective MoE architecture — high-noise and low-noise denoising experts enlarge capacity without raising compute.
Cinematic-level aesthetics — controllable lighting, composition, contrast, and color tone from curated training labels.
Complex motion generation — trained on 65.6% more images and 83.2% more videos than Wan 2.1 for richer motion and semantics.
High-definition hybrid TI2V — 5B variant hits 720p @ 24fps on consumer GPUs via 16×16×4 compression.
Top performance — leads benchmarks against open- and closed-source video generation models.
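
The MoE bullet above can be illustrated with a small sketch: each denoising step is routed to exactly one expert based on how noisy the latent still is, so adding a second expert adds parameters without adding per-step compute. The function names, the 0.875 boundary, and the 1000-step timestep scale are illustrative assumptions, not values taken from this page.

```python
from typing import List


def pick_expert(timestep: int, boundary: float = 0.875,
                num_train_timesteps: int = 1000) -> str:
    """Choose one expert per step: high-noise early, low-noise late.

    `boundary` and `num_train_timesteps` are assumed, illustrative values.
    """
    threshold = boundary * num_train_timesteps
    return "high_noise" if timestep >= threshold else "low_noise"


def route(timesteps: List[int]) -> List[str]:
    """Return the single active expert for every denoising step."""
    return [pick_expert(t) for t in timesteps]


if __name__ == "__main__":
    schedule = [999, 900, 875, 874, 500, 0]  # timesteps run from noisy to clean
    for t, expert in zip(schedule, route(schedule)):
        print(t, expert)
```

Only one expert runs at any step, so per-step compute matches a single model even though two sets of weights exist.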

Run Wan 2.2 in Modular →

Wan 2.2 · I2V

Text → Video

Work with our engineers to architect incredible video prompts

Prompting for video is an art and a science. Let our solution engineers help your app interpret each user's intent as faithfully as possible. We have mastered the art of image-after-image consistency, hero character alignment, style consistency, and more.

Check out inkwell.modular.com for an example of prompt consistency.

Suburban cul-de-sac at dusk, 1978. A teenage girl in tube socks roller-skates in lazy figure-eights under a streetlamp just flickering on, holding an astronaut stuffy. Shot on grainy Super 8 with faded warm tones, gentle gate weave, slight overexposure on the streetlamp. Static wide shot, slow 8-second zoom-in.

Hyper-stylized 3D render, Cinema 4D commercial aesthetic. A chubby rubbery astronaut in a glossy buttercup yellow suit with chrome cobalt-blue visor. Floating around: melting smiley popsicle, chrome banana, googly eye, Jell-O cube, lavender Croc, disco-ball cloud. Locked-off center, slow push-in.

Underwater documentary shot: an octopus in a coral reef shifts its skin from red-orange to mottled white in real time, then ripples its mantle and jets backward. Sunlight filters down in dappled god rays. A small metal astronaut sculpture sits 1/4 buried in the sand in the foreground corner of the screen - a small metal award. Camera holds steady with slight handheld drift. National Geographic color science. A school of silver fish passes through the background. Crystal clear water, vivid coral oranges and purples.

Extreme close-up of an 80-year-old woman's face as she begins to laugh — wrinkles around her eyes deepen, her head tilts back slightly. Soft north-facing window light from camera left, deep shadow on her right side. Static shot, shallow depth of field, 85mm lens. Skin texture sharp and natural.

Image → Video

Animate stills — keep the subject, gain the motion.

At production volume, video generation lives and dies on GPU efficiency: every 5-second clip is dollars of inference, not cents. Modular runs the same open-weight video models on NVIDIA B200 and AMD MI355X with more throughput per GPU, so you keep the quality and cut the cost per clip without changing the model.

One binary, every accelerator — same compiled artifact runs on NVIDIA B200 and AMD MI355X, so you bid for GPU capacity instead of getting locked in.
More throughput per GPU — Modular's compiled runtime means fewer GPUs at the same QPS, which drops cost per clip without changing the model.
Self-host, no per-call markup — deploy in Modular Cloud or your own VPC. No per-second pricing, no egress fees, no API surcharge on top of your hardware spend.
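
The arithmetic behind the throughput bullet above, as a rough sketch: cost per clip scales with how long the hardware is busy per clip, and fleet size scales with the same number. The hourly price, clip times, and traffic target are made-up placeholders chosen so the two cases differ by 3.7×. They are not measured Modular results, and the helper names are hypothetical.

```python
import math


def cost_per_clip(node_hourly_usd: float, seconds_per_clip: float) -> float:
    """Dollars of hardware time consumed by one clip."""
    return node_hourly_usd * seconds_per_clip / 3600.0


def nodes_needed(target_clips_per_sec: float, seconds_per_clip: float) -> int:
    """Nodes required to sustain a clip rate, assuming one clip at a time per node."""
    return math.ceil(target_clips_per_sec * seconds_per_clip)


if __name__ == "__main__":
    node_hourly_usd = 36.0     # placeholder price for one multi-GPU node
    baseline_seconds = 370.0   # placeholder: generation time per clip, baseline
    compiled_seconds = 100.0   # placeholder: same clip at a 3.7x speedup
    target_rate = 0.5          # placeholder: clips per second to sustain

    for label, seconds in (("baseline", baseline_seconds),
                           ("compiled", compiled_seconds)):
        print(label,
              round(cost_per_clip(node_hourly_usd, seconds), 2),
              nodes_needed(target_rate, seconds))
```

Plug in your own hardware price and measured clip times. The ratio between the two cases is what carries over, not the absolute dollar figures.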

Try in our playground →

Get started with Modular

Request a demo
Schedule a demo of Modular and explore a custom end-to-end deployment built around your models, hardware, and performance goals.
- Distributed, large-scale online inference endpoints
- Highest performance to maximize ROI and minimize latency
- Deploy in Modular Cloud or your cloud
- View all features with a custom demo
Book a demo
Talk with our sales lead Jay!
30min demo. Evaluate with your workloads. Ask us anything.

Talk to us!
Book a demo for a personalized walkthrough of Modular in your environment. Learn how teams use it to simplify systems and tune performance at scale.
- Custom 30 min walkthrough of our platform
- Cover specific model or deployment needs
- Flexible pricing to fit your specific needs
Book a demo
Talk with our sales lead Jay!
Start using MAX
(FREE)
Run any open source model in 5 minutes, then benchmark it. Scale it to millions yourself (for free!).
Install MAX
What is MAX?
Start using Mojo
(FREE)
Install Mojo and get up and running in minutes. A simple install, familiar tooling, and clear docs make it easy to start writing code immediately.
Install Mojo🔥
What is Mojo🔥?
