Now serving MiniMax-M3! Request access today. Read More →

Example Usage

    Model Details
    • Developed by
    • Model family
      my-kimi
    • Modality
      No items found.

    Why choose my kimi on Modular?

    • High performance, out of the box

      Run leading open models with strong default performance and the ability to optimize down to the kernel — extracting more from every GPU.

    • Lower Infrastructure Costs

      Deploy efficiently across NVIDIA and AMD hardware to reduce GPU count, increase throughput, and avoid expensive closed-model licensing.

    • Easy Integration

      Integrate through an OpenAI-compatible endpoint, swap models freely, and scale across clouds or hardware without redesigning your application stack.

    my kimi
    Want to self-host this model with our open source infrastructure?
    Read How

    🔥 Trending models

    Similar models

    No items found.

    Get started with Modular

    • Request a demo

      Schedule a demo of Modular and explore a custom end-to-end deployment built around your models, hardware, and performance goals.

      • Distributed, large-scale online inference endpoints

      • Highest-performance to maximize ROI and latency

      • Deploy in Modular cloud or your cloud

      • View all features with a custom demo

      Book a demo

      Talk with our sales lead Jay!

      30min demo.  Evaluate with your workloads.  Ask us anything.

    • Talk to us!

      Book a demo for a personalized walkthrough of Modular in your environment. Learn how teams use it to simplify systems and tune performance at scale.

      • Custom 30 min walkthrough of our platform

      • Cover specific model or deployment needs

      • Flexible pricing to fit your specific needs

      Book a demo

      Talk with our sales lead Jay!

    • Start using MAX

      ( FREE )

      Run any open source model in 5 minutes, then benchmark it. Scale it to millions yourself (for free!).

    • Start using Mojo

      ( FREE )

      Install Mojo and get up and running in minutes. A simple install, familiar tooling, and clear docs make it easy to start writing code immediately.