We just raised our $20 million Seed round
Learn more

All in one cloud.
Develop, train, and scale AI
models with RunPod.
RunPod works with Startups, Academic Institutions, and Enterprises.
Trusted by OpenCV, Replika, Data Science Dojo, Jina, Defined.ai, Otovo, Abzu, Aftershoot, and KRNL.
1. Develop
Globally distributed GPU
cloud for your AI workloads
Deploy any GPU workload seamlessly, so you can focus less on
infrastructure and more on running ML models.
Check out the CLI docs
runpodctl -- zsh
> brew install runpod/runpodctl/runpodctl
> runpodctl project create
runpodctl -- zsh
> runpodctl project dev
Provisioning GPUs...
Installing dependencies...
Activating project environment...

Success! Test your changes locally by
connecting to the API server at:
> https://landing-page123.proxy.runpod.net

Instant hot-reloading for your local changes.

Run code in the cloud that's as seamless as running it locally. No need to push your container image every time you add a print statement.

Just hit the endpoint given to you by the CLI to test and deploy when you're confident everything works.
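As a sketch of that last step, here is one way to hit the dev endpoint from Python using only the standard library. The URL and the `/run` route are illustrative; use whatever address `runpodctl project dev` prints for your session.

```python
import json
import urllib.request

# Hypothetical URL -- substitute the one printed by `runpodctl project dev`.
DEV_ENDPOINT = "https://landing-page123.proxy.runpod.net/run"

def build_request(url, payload):
    """Build a JSON POST request for a dev endpoint (illustrative helper)."""
    data = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

if __name__ == "__main__":
    req = build_request(DEV_ENDPOINT, {"prompt": "hello"})
    # Requires a live dev session; the proxy forwards to your pod.
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode("utf-8"))
```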

Choose from 50+ templates ready out-of-the-box, or bring your own custom container.

Get set up instantly with PyTorch, TensorFlow, or any other preconfigured environment you might need for your machine learning workflow.

Along with managed and community templates, we also let you configure your own template to fit your deployment needs.
PyTorch
Deploy
TensorFlow
Deploy
Docker
Deploy
Runpod
Deploy
PyTorch
ID: twnw98clgxxf2z
$0.69/hour
200 GB Disk: 200 GB Pod Volume
Volume Path: /workspace
1 x A40
9 vCPU 50 GB RAM
CA
8654 Mbps
938 Mbps
963 MBps
0  1970-01-01T00:00:00.000Z  create pod network
1  1970-01-01T00:00:01.000Z  create 20GB network volume
2  1970-01-01T00:00:02.000Z  create container runpod/pytorch:3.10-2.0.0-117
3  1970-01-01T00:00:03.000Z  3.10-2.0.0-117 Pulling from runpod/pytorch
4  1970-01-01T00:00:04.000Z  Digest: sha256:2dbf81dd888d383620a486f83ad2ff47540c6cb5e02a61e74b8db03a715488d6
5  1970-01-01T00:00:05.000Z  Status: Image is up to date for runpod/pytorch:3.10-2.0.0-117
6  1970-01-01T00:00:06.000Z  start container

Spin up a GPU pod in seconds

It's a pain to wait upwards of 10 minutes for your pods to spin up. We've cut cold-boot time down to milliseconds, so you can start building within seconds of deploying your pods.

Powerful & Cost-Effective GPUs
for Every Workload

See all GPUs
Thousands of GPUs across 30+ Regions
Deploy any container on Secure Cloud. Public and private image repos are supported. Configure your environment the way you want.
Zero fees for ingress/egress
Global interoperability
99.99% Uptime
$0.05/GB/month Network Storage
nvidia
Starting from $3.39/hr

H100 PCIe
80GB VRAM
125GB RAM
12 vCPUs
$3.89/hr
Secure Cloud
$3.39/hr
Community Cloud
nvidia
Starting from $3.89/hr

H100 SXM
80GB VRAM
125GB RAM
16 vCPUs
$4.69/hr
Secure Cloud
$3.89/hr
Community Cloud
nvidia
Starting from $1.59/hr

A100 PCIe
80GB VRAM
117GB RAM
12 vCPUs
$1.89/hr
Secure Cloud
$1.59/hr
Community Cloud
nvidia
Starting from $1.69/hr

A100 SXM
80GB VRAM
125GB RAM
16 vCPUs
$2.29/hr
Secure Cloud
nvidia
Starting from $0.67/hr

A40
48GB VRAM
48GB RAM
9 vCPUs
$0.69/hr
Secure Cloud
$0.67/hr
Community Cloud
nvidia
Starting from $0.50/hr

L40
48GB VRAM
58GB RAM
16 vCPUs
$1.14/hr
Secure Cloud
nvidia
Starting from $1.19/hr

L40S
48GB VRAM
62GB RAM
8 vCPUs
$1.49/hr
Secure Cloud
$1.19/hr
Community Cloud
nvidia
Starting from $0.69/hr

RTX A6000
48GB VRAM
50GB RAM
8 vCPUs
$0.79/hr
Secure Cloud
$0.69/hr
Community Cloud
nvidia
Starting from $0.26/hr

RTX A5000
24GB VRAM
24GB RAM
4 vCPUs
$0.44/hr
Secure Cloud
$0.26/hr
Community Cloud
nvidia
Starting from $0.54/hr

RTX 4090
24GB VRAM
24GB RAM
6 vCPUs
$0.74/hr
Secure Cloud
$0.54/hr
Community Cloud
nvidia
Starting from $0.26/hr

RTX 3090
24GB VRAM
24GB RAM
4 vCPUs
$0.44/hr
Secure Cloud
$0.26/hr
Community Cloud
nvidia
Starting from $0.21/hr

RTX A4000 Ada
20GB VRAM
31GB RAM
4 vCPUs
$0.39/hr
Secure Cloud
$0.21/hr
Community Cloud
2. Scale
Scale your ML inference
with Serverless
Run your AI models with autoscaling, job queueing and
sub 250ms cold start time.
Deploy now
runpodctl -- zsh
> runpodctl project deploy
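The code you deploy is a handler that Serverless calls once per queued job. This is a minimal sketch assuming RunPod's Python SDK (`pip install runpod`); the uppercase transform is a stand-in for real model inference.

```python
def handler(job):
    """Process one queued job; job["input"] is the payload sent to the endpoint."""
    prompt = job["input"].get("prompt", "")
    return {"output": prompt.upper()}  # stand-in for real model inference

# On a RunPod worker you would start the job loop with the SDK (assumed installed):
#   import runpod
#   runpod.serverless.start({"handler": handler})
```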
Autoscale in seconds
Respond to user demand in real time with GPU workers that
scale from 0 to 100s in a minute.
[Chart: flex and active GPU workers scaling from 10 GPUs at 6:24AM to 100 GPUs at 11:34AM and back down to 20 GPUs at 1:34PM]
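The scale-from-zero behavior can be pictured with a toy scaling rule: size the worker pool to the request queue, clamped between zero and a cap. This is an illustrative sketch, not RunPod's actual scheduler; the parameter names are made up.

```python
def desired_workers(queue_depth, jobs_per_worker=2, min_workers=0, max_workers=100):
    """Toy autoscaling rule: one worker per `jobs_per_worker` queued jobs,
    clamped to [min_workers, max_workers]. Scales to zero when idle."""
    needed = -(-queue_depth // jobs_per_worker)  # ceiling division
    return max(min_workers, min(max_workers, needed))

print(desired_workers(0))     # idle: scale to zero
print(desired_workers(37))    # moderate load
print(desired_workers(5000))  # burst: clamped at the cap
```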
Usage Analytics
Real-time usage analytics for your endpoint with metrics on completed and failed requests. Useful for endpoints that have fluctuating usage profiles throughout the day.
See the console
Active
Requests
Completed:
2,277
Retried:
21
Failed:
9
Execution Time
Total:
1,420s
P70:
8s
P90:
19s
P98:
22s
Execution Time Analytics
Debug your endpoints with detailed metrics on execution time. Useful for hosting models that have varying execution times, like large language models. You can also monitor delay time, cold start time, cold start count, GPU utilization, and more.
See the console
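Percentile metrics like the P70/P90/P98 figures above summarize a latency distribution better than an average does. A minimal nearest-rank percentile, computed over made-up execution times:

```python
def percentile(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)."""
    ordered = sorted(samples)
    rank = -(-p * len(ordered) // 100) - 1          # ceiling division, 0-based
    return ordered[max(0, min(len(ordered) - 1, rank))]

execution_times = [2, 3, 5, 8, 8, 9, 12, 19, 20, 22]  # seconds (made-up sample)
print({p: percentile(execution_times, p) for p in (70, 90, 98)})
# → {70: 12, 90: 20, 98: 22}
```

High percentiles surface the slow tail (cold starts, long prompts) that an average hides.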
Real-Time Logs
Get descriptive, real-time logs to show you exactly what's happening across your active and flex GPU workers at all times.
See the console
worker logs -- zsh
2024-03-15T19:56:00.8264895Z INFO | Started job db7c79
2024-03-15T19:56:03.2667597Z
0% | | 0/28 [00:00<?, ?it/s]
12% |██ | 4/28 [00:00<00:01, 12.06it/s]
38% |████ | 12/28 [00:00<00:01, 12.14it/s]
77% |████████ | 22/28 [00:01<00:00, 12.14it/s]
100% |██████████| 28/28 [00:02<00:00, 12.13it/s]
2024-03-15T19:56:04.7438407Z INFO | Completed job db7c79 in 2.9s
2024-03-15T19:57:00.8264895Z INFO | Started job ea1r14
2024-03-15T19:57:03.2667597Z
0% | | 0/28 [00:00<?, ?it/s]
15% |██ | 4/28 [00:00<00:01, 12.06it/s]
41% |████ | 12/28 [00:00<00:01, 12.14it/s]
80% |████████ | 22/28 [00:01<00:00, 12.14it/s]
100% |██████████| 28/28 [00:02<00:00, 12.13it/s]
2024-03-15T19:57:04.7438407Z INFO | Completed job ea1r14 in 2.9s
2024-03-15T19:58:00.8264895Z INFO | Started job gn3a25
2024-03-15T19:58:03.2667597Z
0% | | 0/28 [00:00<?, ?it/s]
18% |██ | 4/28 [00:00<00:01, 12.06it/s]
44% |████ | 12/28 [00:00<00:01, 12.14it/s]
83% |████████ | 22/28 [00:01<00:00, 12.14it/s]
100% |██████████| 28/28 [00:02<00:00, 12.13it/s]
2024-03-15T19:58:04.7438407Z INFO | Completed job gn3a25 in 2.9s
Loved by the developer community
RunPod is built by developers, for developers. Our community of 10,000+ developers on Discord is here for support while you get started.
Join our Discord
CTO, LOVO AI
Hara Kang
"There are definitely providers who offer much cheaper pricing than RunPod. But every time, they have an inferior developer experience. If you're paying 50% less for a GPU elsewhere, that cost is coming out somewhere else, be it developer time or lack of reliability. For the value, RunPod provides competitive prices and we're willing to pay a premium to reduce the headache that normally comes with ML ops."
Case Study
CEO, Coframe
Josh Payne
"The setup process was great! Very quick and easy. RunPod had the exact GPUs we needed for AI inference and the pricing was very fair based on what I saw out on the market. The main value proposition for us was the flexibility RunPod offered. We were able to scale up effortlessly to meet the demand at launch."
Case Study
CPO, KRNL.ai
Giacomo Locci
"The cost savings on RunPod have been incredible. Since switching, our team has been able to focus on building the product instead of the infrastructure. We often have unpredictable demand from our users which makes it hard to manage our cloud costs. But with RunPod, we've been able to scale up and down quickly and painlessly. Great reliability in multiple regions and great customer support is why we've been with them for over a year now."
Case Study
Everything your app needs. All in
one cloud.
99.99%
guaranteed uptime
10PB+
network storage
4,185,619,130
requests
AI Inference
We handle millions of inference requests a day. Scale your machine learning inference while keeping costs low with RunPod serverless.
AI Training
Run machine learning training tasks that can take up to 7 days. Train on our available NVIDIA H100s and A100s or reserve AMD MI300Xs and AMD MI250s a year in advance.
Autoscale
Serverless GPU workers scale from 0 to n with 8+ regions distributed globally. You only pay when your endpoint receives and processes a request.
Bring Your Own Container
Deploy any container on our AI cloud. Public and private image repositories are supported. Configure your environment the way you want.
Zero Ops Overhead
RunPod handles all the operational aspects of your infrastructure from deploying to scaling. You bring the models, let us handle the ML infra.
Network Storage
Serverless workers can access network storage volumes backed by NVMe SSDs with up to 100Gbps network throughput. 100TB+ storage sizes are supported; contact us if you need 1PB+.
Easy-to-use CLI
Use our CLI tool to automatically hot reload local changes while developing, and deploy on Serverless when you’re done tinkering.
Secure & Compliant
RunPod AI Cloud is built on enterprise-grade GPUs with world-class compliance and security to best serve your machine learning models.
Lightning Fast Cold-Start
With Flashboot, watch your cold-starts drop to sub 250 milliseconds. No more waiting for GPUs to warm up when usage is unpredictable.
Launch your AI application in minutes
Start building with the most cost-effective platform for developing and scaling machine learning models.