When diffusion models exploded, Flaviu and Ioana couldn’t believe how slow and expensive it was to generate images and video at scale. So, in 2023, they launched a side project with a simple but audacious ambition: to make generating an image as fast as finding one, and affordable for mass use.
The project spread by word of mouth as builders discovered a product that was not just faster, but dramatically cheaper than anything else, and therefore scalable to millions of workloads. Demand for an API arrived almost overnight, and the project became Runware: a vertically integrated media-inference platform that makes image, video and audio generation a scalable sub-second process via a single API.
Demand you can touch
Image and video dominate web and mobile consumption today, yet many platforms still make users wait minutes for simple prompts. The gap is obvious every time we speak to customers. Developers and enterprises want generative media in their products – marketing teams need on-brand visuals in seconds, marketplaces want dynamic imagery, and creative tools are racing to add video and audio.
We’ve seen this appetite first-hand at multiple developer events we’ve hosted throughout the year. Over the summer, we partnered with Runware on their first in-person hackathon, which brought together 150 builders, with the winning projects built on Runware’s stack. Another hackathon focused on generative media, this time in Berlin, drew the same energy.

Sub-second inference, built for scale
While the opportunity is tremendous, the media market has long been bottlenecked by three things: fragmented access and poor usability, latency that breaks user experiences, and unit costs that don’t make sense at scale.
Runware solves all three issues. On ease of use, it aggregates almost 300 model classes and 400k+ fine-tuned variants behind one consistent schema and endpoint, so teams can A/B test, route, or swap models without ripping up their code. On speed, it is consistently 30-40% faster than other AI media inference platforms. And on cost, Runware delivers up to 10x better price-performance than incumbents. This trifecta makes Runware unbeatable at scale, which is why large, fast-growing companies like Wix, Together.ai, ImagineArt, Quora, and Higgsfield are choosing Runware. Runware is already used by 200k+ developers and 300m+ end users worldwide, all of whom care about performance and budgets in equal measure.
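To make the “one schema, swappable models” idea concrete, here is a minimal sketch of the pattern. The endpoint URL, field names, and model identifiers below are illustrative assumptions, not Runware’s documented API; the point is simply that switching or A/B testing models becomes a one-string change rather than a re-integration.

```python
import requests

# Hypothetical endpoint and request schema, for illustration only --
# consult Runware's documentation for the real API shape.
ENDPOINT = "https://api.example-inference.com/v1/generate"

def generate_image(model_id: str, prompt: str) -> bytes:
    """Send the same request shape regardless of which model is used."""
    payload = {
        "taskType": "imageInference",   # assumed field name
        "model": model_id,              # the only thing that changes per model
        "prompt": prompt,
        "width": 1024,
        "height": 1024,
    }
    response = requests.post(ENDPOINT, json=payload, timeout=30)
    response.raise_for_status()
    return response.content

# A/B test or swap models by changing one identifier, not the integration.
image_a = generate_image("vendor-a/base-model", "a product shot of a ceramic mug")
image_b = generate_image("vendor-b/fine-tune-123", "a product shot of a ceramic mug")
```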
Lessons from bare metal
Building the best generative media platform required an entirely new approach – one that went all the way down the stack, right down to redesigning the metal. Flaviu and Ioana were the right people for this job, as they have been tinkering with hardware for decades. In their last venture together, they built and sold bare metal clusters to enterprises like Vodafone, Booking.com, and Rightmove, earning a reputation for squeezing extraordinary performance from commodity parts and for shipping reliable infrastructure under real-world load. That bias for practical engineering sits at the heart of Runware.
Runware’s competitive edge is further compounded by the “engine under the hood”. While competitors rely on commodity cloud layouts, Runware has built a vertically integrated “Sonic Inference Engine®” packaged in containerised pods that can be dropped next to low-cost power and dense compute. Runware’s inference pods are 10x cheaper and faster to build, and can be up and running in weeks. They tune everything from the kernel level up, pair the right model to the right silicon, and use their own PCIe switching and scheduling to keep GPUs saturated. The result is tight control over latency and a durable cost advantage that doesn’t vanish when the next chip generation arrives.
This vertical control matters because multimodal media inference demands a different playbook from text: architectures differ across image, video, and audio, and the bottleneck is compute rather than memory bandwidth.
One API to serve all AI
Runware’s coverage is broad and getting broader. Today, the platform supports the world’s leading image, video, and audio models, alongside a growing suite of features like upscaling, background removal, and captioning. Over the summer, the team added audio models, and text inference is on the roadmap so customers can power “voice + vision + text” experiences from a single place. That means customers can, for example, pair an audio track with a generated video in the same request.
Where Runware sits in the ecosystem is an equally important asset. The platform is a neutral home for closed and open models alike – the connective tissue between model producers and the builders who need them. As labs ship new releases and access policies change, Runware’s interchangeability lets customers chase quality or cost without re-platforming. For many, it is the first practical way to keep pace with a landscape that moves at internet speed.
The moment meets the team
Runware’s entire setup and success to date is a reflection of its founders. Flaviu and Ioana are relentless systems thinkers who know where performance actually comes from, and they move fast without breaking the things that matter. They have built the rare platform that delights developers, satisfies enterprise checklists, and bends the cost curve in the customer’s favour.
Over the coming months, we’ll be helping the Runware team scale their operations as they expand globally. The company is also investing in platform development, extending its custom AI inference platform and building on its Sonic Inference Engine®.
This is a huge, urgent market, and Runware has the right product at the right layer, built by the right team. We’re thrilled to be on the journey with Flaviu, Ioana and the entire crew, alongside Speedinvest and Comcast Ventures, and insiders including Insight Partners, a16z speedrun, Begin Capital, and Zero Prime Ventures.