
Generative Media on Modular

Video generation at full speed on any hardware.

Flagship open-weight models running on Modular, the AI platform powered by the open-source MAX engine.

Available on Modular
LTX-2.3
Wan 2.1
Wan 2.2
MiniMax (Hailuo)

Frontier video, open weights, your infra

Closed video APIs lock you in: no fine-tuning, no on-prem deployment, and per-second pricing that punishes scale. Run Wan 2.2, LTX-2.3, and Hunyuan-class models on Modular instead, with frontier quality, open weights, and the same deployment posture as the rest of your inference stack.

Models
Open-source models

Run Wan 2.2, LTX-2.3, MiniMax (Hailuo), and Hunyuan-class video models on Modular. Open weights and no API lock-in, so you can swap models whenever the frontier moves.

Hardware
One binary, every GPU

The same compiled artifact runs on NVIDIA B200 and AMD MI355X. Bid for capacity across clouds and silicon, and never get locked into a single vendor.

Performance
Compiled for throughput

Modular compiles each video model end-to-end — DiT, scheduler, VAE — so you get more clips per GPU and the cost per second of generated video drops.
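For readers newer to diffusion video pipelines, the three stages named above fit together roughly as follows. This is a toy sketch, not Modular's or MAX's API: every function name is hypothetical, and the "models" are stand-in arithmetic chosen only to show the control flow (DiT prediction, scheduler update, VAE decode).

```python
from typing import Callable, List

Latent = List[float]


def generate_clip(
    latent: Latent,
    timesteps: List[float],
    dit: Callable[[Latent, float], Latent],
    scheduler_step: Callable[[Latent, Latent, float, float], Latent],
    vae_decode: Callable[[Latent], List[float]],
) -> List[float]:
    """Run the three stages in order for every denoising step, then decode once."""
    for t, t_next in zip(timesteps, timesteps[1:]):
        prediction = dit(latent, t)                              # stage 1: DiT forward pass
        latent = scheduler_step(latent, prediction, t, t_next)   # stage 2: scheduler update
    return vae_decode(latent)                                    # stage 3: VAE decode


# Hypothetical stand-ins so the loop can be run without any model weights.
TARGET = [0.5, -0.5]  # the "clean" latent the toy denoiser steers toward


def toy_dit(latent: Latent, t: float) -> Latent:
    # Velocity along the line from the clean latent to the current latent.
    return [(x - c) / t for x, c in zip(latent, TARGET)]


def toy_scheduler_step(latent: Latent, velocity: Latent, t: float, t_next: float) -> Latent:
    # Plain Euler update from noise level t to t_next.
    return [x + (t_next - t) * v for x, v in zip(latent, velocity)]


def toy_vae_decode(latent: Latent) -> List[float]:
    # Stand-in for decoding latents to pixels.
    return [2.0 * x for x in latent]


clip = generate_clip([2.0, -4.0], [1.0, 0.5, 0.0], toy_dit, toy_scheduler_step, toy_vae_decode)
print(clip)  # [1.0, -1.0]
```

The DiT and scheduler run once per denoising step while the VAE runs once per clip, so the loop body is where most of the per-clip time is likely to go.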

Deployment
Self-host or Modular Cloud

Run in Modular Cloud or bring it into your own VPC. No per-second pricing, no egress fees, no API surcharge stacked on top of your hardware spend.

Support
Solution engineers on call

Our team helps you tune prompts, fine-tune on your data, and ship to production. Direct access from POC through scale — not a ticket queue.

3.7×

Wan 2.2 vs. Diffusers, end-to-end on B200

1 API

OpenResponses-style across image & video

2+ vendors

NVIDIA B200 + AMD MI355X

Model Spotlight

Wan 2.2

Open and advanced large-scale video generation — text-to-video and image-to-video on a single MoE graph. Compiled end-to-end on Modular for 3.7× faster inference vs. Diffusers, with cinematic motion and aesthetic control built in.

  • Effective MoE architecture: high-noise and low-noise denoising experts enlarge capacity without raising inference compute.
  • Cinematic-level aesthetics: controllable lighting, composition, contrast, and color tone from curated training labels.
  • Complex motion generation: trained on 65.6% more images and 83.2% more videos than Wan 2.1, for richer motion and semantics.
  • High-definition hybrid TI2V: the 5B variant generates 720p at 24 fps on consumer GPUs, using a VAE with 16×16×4 compression.
  • Top performance: leads benchmarks against open- and closed-source video generation models.
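Two of the points above are easy to make concrete. The sketch below is illustrative only: the function names are hypothetical, the 0.875 boundary is an assumed value rather than a documented setting, and the latent-grid arithmetic ignores channel count and any special handling of the first frame. It shows why routing between two experts adds capacity without adding per-step compute (only one expert runs at each step), and what 16×16×4 compression means for a 5-second 720p clip.

```python
from math import ceil
from typing import List, Tuple


def pick_expert(t: float, boundary: float = 0.875) -> str:
    """Choose the expert for normalized noise level t (1.0 = pure noise).

    The boundary value is an assumption for illustration.
    """
    return "high_noise" if t >= boundary else "low_noise"


def route_schedule(timesteps: List[float]) -> List[str]:
    """One expert per step: N steps cost N forward passes, not 2N."""
    return [pick_expert(t) for t in timesteps]


def latent_grid(frames: int, height: int, width: int,
                st: int = 4, sh: int = 16, sw: int = 16) -> Tuple[int, int, int]:
    """Latent (time, height, width) after sh x sw spatial and st temporal compression."""
    return (ceil(frames / st), ceil(height / sh), ceil(width / sw))


steps = [1.0 - i / 8 for i in range(8)]   # 1.0, 0.875, 0.75, ... 0.125
routing = route_schedule(steps)
print(routing.count("high_noise"), routing.count("low_noise"))  # 2 6

# 5 seconds at 24 fps, 1280x720 pixels.
print(latent_grid(frames=120, height=720, width=1280))  # (30, 45, 80)
```

Each latent position stands in for 16 × 16 × 4 = 1024 pixel positions, which is why the 5B variant's denoiser has far fewer positions to process than the raw frame count suggests.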
Run Wan 2.2 in Modular →
Wan 2.2 · I2V

Text → Video

Work with our engineers to architect incredible video prompts

Prompting for video is an art and a science. Our solution engineers help you make sure your app interprets each user's intent as well as it can. We have mastered image-after-image consistency, hero character alignment, style consistency, and more.

Check out inkwell.modular.com for an example of prompt consistency.

Suburban cul-de-sac at dusk, 1978. A teenage girl in tube socks roller-skates in lazy figure-eights under a streetlamp just flickering on, holding an astronaut stuffy. Shot on grainy Super 8 with faded warm tones, gentle gate weave, slight overexposure on the streetlamp. Static wide shot, slow 8-second zoom-in.
Hyper-stylized 3D render, Cinema 4D commercial aesthetic. A chubby rubbery astronaut in a glossy buttercup yellow suit with chrome cobalt-blue visor. Floating around: melting smiley popsicle, chrome banana, googly eye, Jell-O cube, lavender Croc, disco-ball cloud. Locked-off center, slow push-in.
Underwater documentary shot: an octopus in a coral reef shifts its skin from red-orange to mottled white in real time, then ripples its mantle and jets backward. Sunlight filters down in dappled god rays. A small metal astronaut sculpture sits 1/4 buried in the sand in the foreground corner of the screen - a small metal award. Camera holds steady with slight handheld drift. National Geographic color science. A school of silver fish passes through the background. Crystal clear water, vivid coral oranges and purples.
Extreme close-up of an 80-year-old woman's face as she begins to laugh — wrinkles around her eyes deepen, her head tilts back slightly. Soft north-facing window light from camera left, deep shadow on her right side. Static shot, shallow depth of field, 85mm lens. Skin texture sharp and natural.

Image → Video

Animate stills — keep the subject, gain the motion.

At production volume, video generation lives and dies on GPU efficiency: every 5-second clip is dollars of inference, not cents. Modular runs the same open-weight video models on NVIDIA B200 and AMD MI355X through a compiled runtime (3.7× faster than Diffusers for Wan 2.2 on B200), so you keep the quality and cut the cost per clip without changing the model.

  • One binary, every accelerator — same compiled artifact runs on NVIDIA B200 and AMD MI355X, so you bid for GPU capacity instead of getting locked in.
  • More throughput per GPU — Modular's compiled runtime means fewer GPUs at the same QPS, which drops cost per clip without changing the model.
  • Self-host, no per-call markup — deploy in Modular Cloud or your own VPC. No per-second pricing, no egress fees, no API surcharge on top of your hardware spend.
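The cost argument in the bullets above is simple arithmetic, sketched here. All prices and generation times are hypothetical placeholders, not Modular benchmarks or quotes; only the 3.7× factor comes from this page. The helper names are likewise illustrative.

```python
from math import ceil
from typing import Dict, Tuple


def cost_per_clip(gpu_hour_usd: float, seconds_per_clip: float) -> float:
    """GPU cost of one clip: hourly price times the fraction of an hour it occupies."""
    return gpu_hour_usd * seconds_per_clip / 3600.0


def gpus_needed(clips_per_second: float, seconds_per_clip: float) -> int:
    """GPUs required to sustain a target rate if each GPU makes one clip at a time."""
    return ceil(clips_per_second * seconds_per_clip)


def cheapest(offers: Dict[str, Tuple[float, float]]) -> str:
    """Pick the offer with the lowest cost per clip.

    offers maps a label to (gpu_hour_usd, seconds_per_clip).
    """
    return min(offers, key=lambda name: cost_per_clip(*offers[name]))


SPEEDUP = 3.7                 # Wan 2.2 vs. Diffusers on B200, as stated on this page
baseline_seconds = 900.0      # hypothetical uncompiled generation time per clip
price = 4.0                   # hypothetical USD per GPU-hour

before = cost_per_clip(price, baseline_seconds)
after = cost_per_clip(price, baseline_seconds / SPEEDUP)
print(f"cost per clip: ${before:.2f} -> ${after:.2f}")

# Fleet size needed to sustain one clip every two seconds.
print(gpus_needed(0.5, baseline_seconds), gpus_needed(0.5, baseline_seconds / SPEEDUP))

# Hypothetical per-vendor offers: (USD per GPU-hour, seconds per clip).
print(cheapest({"B200": (4.0, 240.0), "MI355X": (3.0, 300.0)}))
```

Because the same artifact runs on both vendors, the comparison in `cheapest` can be rerun whenever prices change, without re-qualifying the model.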
Try in our playground →

Get started with Modular

  • Request a demo

    Schedule a demo of Modular and explore a custom end-to-end deployment built around your models, hardware, and performance goals.

    • Distributed, large-scale online inference endpoints

    • High performance to maximize ROI and minimize latency

    • Deploy in Modular Cloud or your cloud

    • View all features with a custom demo

    Book a demo

    Talk with our sales lead Jay!

    30-minute demo. Evaluate with your workloads. Ask us anything.

  • Talk to us!

    Book a demo for a personalized walkthrough of Modular in your environment. Learn how teams use it to simplify systems and tune performance at scale.

    • Custom 30-minute walkthrough of our platform

    • Cover specific model or deployment needs

    • Flexible pricing to fit your specific needs


  • Start using MAX (free)

    Run any open-source model in 5 minutes, then benchmark it. Scale it to millions yourself (for free!).

  • Start using Mojo (free)

    Install Mojo and get up and running in minutes. A simple install, familiar tooling, and clear docs make it easy to start writing code immediately.

Playground

Try in the Console

Generate images and videos from a single browser playground. No credit card required.

Get Started with Modular Cloud