How Aneta Handles Bursty GPU Workloads Without Overcommitting

  • 90% cost reduction
  • 200ms cold start times
  • 1hr migration time

The Problem

GPU Infrastructure That Couldn’t Keep Up—or Let Go

Aneta is a pre-seed startup building an intelligent ingestion and inference engine designed to help large language models handle more complex work. Starting with biotech and planning to expand into other verticals, Aneta’s platform dynamically pulls, structures, and surfaces high-value information for LLMs, transforming disorganized datasets into something useful and queryable.

But to make that happen, they needed infrastructure as flexible as their workloads. And that’s where Runpod came in.

As Aneta scaled their ingestion pipeline, they found themselves running hundreds or even thousands of GPUs in parallel for just a few weeks at a time—followed by stretches of zero usage. That kind of volatility didn’t sit well with most providers.

“For two, three, four weeks at a time, we’ll be running hundreds or thousands of GPUs in parallel. Then we could go two months not needing to run a single GPU,” said founder Luke. “It becomes pretty difficult when cloud providers either want a long-term commitment or they want to charge extreme fees for pay-as-you-go.”

Traditional GPU vendors pushed Aneta into a corner: commit to always-on pricing, or get crushed by unpredictable costs. Neither option worked for a small, early-stage company trying to move fast and spend responsibly.

“We spent more time in meetings trying to get GPU access than actually using the GPUs themselves.”

The Solution

Bursty Compute on Demand—Without the Penalties

With Runpod, Aneta finally found the middle ground they needed. They started with on-demand GPU pods and are now transitioning to serverless infrastructure, gaining the elasticity to match their usage curve without waste.

“Runpod changed how we ship because we no longer have to wonder if we have access to GPUs,” Luke said. “We save probably 90% on our infrastructure bill, mainly because we can use bursty compute whenever we need it.”
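
On the serverless side, that elasticity comes from the worker model: each worker is a containerized handler that Runpod invokes per queued request, scaling to zero between bursts. Here is a minimal sketch following Runpod’s documented handler pattern; the ingest logic inside is a hypothetical placeholder, not Aneta’s actual code:

```python
import runpod

def handler(job):
    """Process one ingest job pulled from the endpoint's queue."""
    document = job["input"].get("document", "")
    # Hypothetical stand-in for Aneta's real ingestion/structuring logic.
    return {"chars": len(document), "preview": document[:100]}

# Register the handler. Runpod spins workers up as the queue fills
# and back down to zero when it drains, so idle stretches cost nothing.
runpod.serverless.start({"handler": handler})
```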

Migrating to Runpod was simple: Aneta’s stack was already containerized, and Runpod’s Docker-friendly design made deployment a single click.

“It was just: here’s the image, click a button, and you’re up.”
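
For teams that prefer scripting over clicking, the same deploy can be expressed with the runpod Python SDK. A hedged sketch: the image name and GPU type below are illustrative placeholders, not Aneta’s actual values.

```python
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"

# Point Runpod at an existing Docker image and pick a GPU type.
# Both values are placeholders for illustration.
pod = runpod.create_pod(
    name="ingest-worker",
    image_name="yourorg/ingest-worker:latest",
    gpu_type_id="NVIDIA GeForce RTX 4090",
)
print(f"Pod {pod['id']} is provisioning")

# When the burst is over, release the hardware and stop the bill.
runpod.terminate_pod(pod["id"])
```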

With their ingest pipeline updating every 24 hours (soon to be every hour), that level of operational efficiency matters. The team needs to move fast, process frequently, and scale at will—without locking themselves into spend commitments they can’t predict.
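
A recurring pipeline like that maps naturally onto a serverless endpoint: a scheduler submits a burst of jobs, workers scale up to absorb it, and everything drops back to zero until the next cycle. The sketch below uses the SDK’s endpoint client; the endpoint ID and document list are hypothetical:

```python
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"

# Hypothetical endpoint ID for a deployed ingest worker.
endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")

def run_ingest_cycle(documents):
    """Fan documents out as async jobs; a scheduler calls this each cycle."""
    jobs = [endpoint.run({"input": {"document": doc}}) for doc in documents]
    # Block until each job finishes; workers scale down as the queue drains.
    return [job.output(timeout=600) for job in jobs]

if __name__ == "__main__":
    print(run_ingest_cycle(["example document 1", "example document 2"]))
```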

The Results

90% Savings, Sub-Second Boot Times, Faster Shipping

Runpod delivered immediate wins across every axis:

  • 💸 90% lower infrastructure cost thanks to burst-friendly pricing
  • ⚡ Sub-second boot times—vs. 10+ seconds from other providers
  • 📦 One-click deployment using existing Docker containers
  • 📈 Zero DevOps overhead, freeing the team to focus on product

“The more we used it, the more we realized Runpod just had better infrastructure.”

And the support? It didn’t feel like enterprise cloud-as-usual.

“You kind of get used to just having a high pain tolerance. With Runpod, things actually get addressed really fast.”

Conclusion

With plans to 4–5x their compute usage this year and expand across multiple verticals, Aneta’s infrastructure needs will only grow more dynamic. But with Runpod powering their ingest and inference engine, they’re well positioned to scale without compromise.

“For small teams with complex and unique GPU workloads, I think Runpod is the perfect infrastructure provider.”

About

Aneta is a pre-seed startup building an intelligent ingestion and inference engine designed to help large language models handle more complex work.

Industry

AI

Company size

Early-stage startup

Pain point

Bursty workloads led to cost inefficiencies with traditional cloud providers.
