RunPod - The Cloud Built for AI

New pricing: More AI power, less cost!

Learn more

All in one cloud.

Train, fine-tune and deploy AI
models with RunPod.

RunPod works with Startups, Academic Institutions, and Enterprises.

Develop

Globally distributed GPU
cloud for your AI workloads

Deploy any GPU workload seamlessly, so you can focus less on
infrastructure and more on running ML models.

PyTorch

ID: twnw98clgxxf2z

$2.89/hour

200 GB Disk: 200 GB Pod Volume

Volume Path: /workspace

1 x H100 PCIe

9 vCPU 50 GB RAM

8654 Mbps

938 Mbps

963 MBps

1970-01-01T00:00:00.000Z create pod network
1970-01-01T00:00:01.000Z create 20GB network volume
1970-01-01T00:00:02.000Z create container runpod/pytorch:3.10-2.0.0-117
1970-01-01T00:00:03.000Z 3.10-2.0.0-117 Pulling from runpod/pytorch
1970-01-01T00:00:04.000Z Digest: sha256:2dbf81dd888d383620a486f83ad2ff47540c6cb5e02a61e74b8db03a715488d6
1970-01-01T00:00:05.000Z Status: Image is up to date for runpod/pytorch:3.10-2.0.0-117
1970-01-01T00:00:06.000Z start container

Spin up a GPU pod in seconds

It's a pain to have to wait upwards of 10 minutes for your pods to spin up. We've cut the cold-boot time down to milliseconds, so you can start building within seconds of deploying your pods.

Spin up a pod

Choose from 50+ templates ready out-of-the-box, or bring your own custom container.

Get set up instantly with PyTorch, TensorFlow, or any other preconfigured environment you might need for your machine learning workflow.

Along with managed and community templates, we also let you configure your own template to fit your deployment needs.

PyTorch

TensorFlow

Docker

RunPod

Powerful & Cost-Effective GPUs
for Every Workload

See all GPUs

Thousands of GPUs across 30+ Regions

Deploy any container on Secure Cloud. Public and private image repos are supported. Configure your environment the way you want.

Zero fees for ingress/egress

Global interoperability

99.99% Uptime

$0.05/GB/month Network Storage

MI300X: 192GB VRAM, 283GB RAM, 24 vCPUs. Starting from $2.99/hr. Secure Cloud $2.99/hr.
H100 PCIe: 80GB VRAM, 188GB RAM, 16 vCPUs. Starting from $2.49/hr. Secure Cloud $2.69/hr, Community Cloud $2.49/hr.
A100 PCIe: 80GB VRAM, 117GB RAM, 8 vCPUs. Starting from $1.19/hr. Secure Cloud $1.64/hr, Community Cloud $1.19/hr.
A100 SXM: 80GB VRAM, 125GB RAM, 16 vCPUs. Starting from $1.89/hr. Secure Cloud $1.89/hr.
A40: 48GB VRAM, 50GB RAM, 9 vCPUs. Starting from $0.39/hr. Secure Cloud $0.39/hr, Community Cloud $0.47/hr.
L40: 48GB VRAM, 94GB RAM, 8 vCPUs. Starting from $0.99/hr. Secure Cloud $0.99/hr.
L40S: 48GB VRAM, 62GB RAM, 12 vCPUs. Starting from $0.79/hr. Secure Cloud $1.03/hr, Community Cloud $0.79/hr.
RTX A6000: 48GB VRAM, 50GB RAM, 8 vCPUs. Starting from $0.44/hr. Secure Cloud $0.76/hr, Community Cloud $0.44/hr.
RTX A5000: 24GB VRAM, 24GB RAM, 8 vCPUs. Starting from $0.22/hr. Secure Cloud $0.36/hr, Community Cloud $0.22/hr.
RTX 4090: 24GB VRAM, 27GB RAM, 6 vCPUs. Starting from $0.34/hr. Secure Cloud $0.69/hr, Community Cloud $0.34/hr.
RTX 3090: 24GB VRAM, 24GB RAM, 4 vCPUs. Starting from $0.22/hr. Secure Cloud $0.43/hr, Community Cloud $0.22/hr.
RTX A4000 Ada: 20GB VRAM, 31GB RAM, 4 vCPUs. Starting from $0.20/hr. Secure Cloud $0.38/hr, Community Cloud $0.20/hr.
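To compare options, a rough cost estimate can be worked out from the hourly rates and the $0.05/GB/month network storage price listed above. The sketch below is only an illustration: it assumes cost scales linearly with GPU-hours and stored GB-months, which this page does not spell out, and the names `estimate_cost`, `SECURE_CLOUD_RATES`, and `COMMUNITY_CLOUD_RATES` are hypothetical helpers, not part of any RunPod tool.

```python
# Hourly rates copied from the listing above (USD per GPU-hour).
# The dictionary layout and function name are illustrative, not a RunPod API.
SECURE_CLOUD_RATES = {"H100 PCIe": 2.69, "A100 PCIe": 1.64, "RTX 4090": 0.69}
COMMUNITY_CLOUD_RATES = {"H100 PCIe": 2.49, "A100 PCIe": 1.19, "RTX 4090": 0.34}
NETWORK_STORAGE_PER_GB_MONTH = 0.05


def estimate_cost(rate_per_hour, gpu_count, hours, storage_gb=0, months=0.0):
    """Return an estimated cost in USD for GPU time plus network storage.

    Assumes simple linear billing; actual billing rules may differ.
    """
    compute = rate_per_hour * gpu_count * hours
    storage = NETWORK_STORAGE_PER_GB_MONTH * storage_gb * months
    return round(compute + storage, 2)


if __name__ == "__main__":
    # One A100 PCIe on Community Cloud for 24 hours, 200 GB stored for one month.
    print(estimate_cost(COMMUNITY_CLOUD_RATES["A100 PCIe"], 1, 24, 200, 1))
```

Swapping in the Secure Cloud rate for the same GPU shows how much the choice of cloud changes the total for a given run.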

Scale

Scale ML inference
with Serverless

Run your AI models with autoscaling, job queueing, and
sub-250ms cold start times.

Deploy Now

Autoscale in seconds

Respond to user demand in real time with GPU workers that
scale from zero to hundreds in seconds.

[Chart: Flex Workers and Active Workers over a day, showing 10 GPUs at 6:24AM, 100 GPUs at 11:34AM, and 20 GPUs at 1:34PM]
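The idea of scaling workers with demand can be sketched as a simple queue-depth rule. This is a toy policy for illustration only and is not RunPod's actual scaling logic; the function `desired_workers` and its parameters (`jobs_per_worker`, `min_workers`, `max_workers`) are hypothetical.

```python
import math


def desired_workers(queued_jobs, jobs_per_worker, min_workers=0, max_workers=100):
    """Return how many workers to run for the current queue depth.

    Hypothetical policy: one worker per `jobs_per_worker` queued jobs,
    clamped to the range [min_workers, max_workers].
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    needed = math.ceil(queued_jobs / jobs_per_worker)
    return max(min_workers, min(needed, max_workers))


if __name__ == "__main__":
    # Queue depths sampled at three moments in a day (made-up numbers).
    for queued in (0, 37, 1000):
        print(queued, "queued ->", desired_workers(queued, jobs_per_worker=4))
```

With an empty queue the rule returns `min_workers`, so a minimum of zero lets the pool scale all the way down when there is no traffic.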

Usage Analytics

Real-time usage analytics for your endpoint with metrics on completed and failed requests. Useful for endpoints that have fluctuating usage profiles throughout the day.

See the console

Active

Requests
Completed: 2,277
Retried:
Failed:

Execution Time
Total: 1,420s
P70:
P90: 19s
P98: 22s

Execution Time Analytics

Debug your endpoints with detailed metrics on execution time. Useful for hosting models that have varying execution times, like large language models. You can also monitor delay time, cold start time, cold start count, GPU utilization, and more.
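Percentile metrics such as P70, P90, and P98 summarize how long most requests take. As a minimal sketch, the nearest-rank method below computes them from a list of execution times. The sample data is made up, and the console may use a different percentile definition; `percentile` is a hypothetical helper.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up execution times in seconds, for illustration only.
    times = [2, 3, 3, 4, 5, 7, 9, 11, 14, 30]
    for p in (70, 90, 98):
        print(f"P{p}: {percentile(times, p)}s")
```

A wide gap between P90 and P98 points to a small number of slow requests, which is typical when execution time varies a lot from request to request.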

See the console

Real-Time Logs

Get descriptive, real-time logs to show you exactly what's happening across your active and flex GPU workers at all times.

See the console

worker logs -- zsh

2024-03-15T19:56:00.8264895Z INFO | Started job db7c79
2024-03-15T19:56:03.2667597Z
0% | | 0/28 [00:00<?, ?it/s]
12% |██ | 4/28 [00:00<00:01, 12.06it/s]
38% |████ | 12/28 [00:00<00:01, 12.14it/s]
77% |████████ | 22/28 [00:01<00:00, 12.14it/s]
100% |██████████| 28/28 [00:02<00:00, 12.13it/s]
2024-03-15T19:56:04.7438407Z INFO | Completed job db7c79 in 2.9s
2024-03-15T19:57:00.8264895Z INFO | Started job ea1r14
2024-03-15T19:57:03.2667597Z
0% | | 0/28 [00:00<?, ?it/s]
15% |██ | 4/28 [00:00<00:01, 12.06it/s]
41% |████ | 12/28 [00:00<00:01, 12.14it/s]
80% |████████ | 22/28 [00:01<00:00, 12.14it/s]
100% |██████████| 28/28 [00:02<00:00, 12.13it/s]
2024-03-15T19:57:04.7438407Z INFO | Completed job ea1r14 in 2.9s
2024-03-15T19:58:00.8264895Z INFO | Started job gn3a25
2024-03-15T19:58:03.2667597Z
0% | | 0/28 [00:00<?, ?it/s]
18% |██ | 4/28 [00:00<00:01, 12.06it/s]
44% |████ | 12/28 [00:00<00:01, 12.14it/s]
83% |████████ | 22/28 [00:01<00:00, 12.14it/s]
100% |██████████| 28/28 [00:02<00:00, 12.13it/s]
2024-03-15T19:58:04.7438407Z INFO | Completed job gn3a25 in 2.9s
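Logs in this shape are easy to post-process. The sketch below pulls per-job durations out of the "Completed job" lines, assuming the line format matches the sample above exactly; the real log format may differ, and `job_durations` is a hypothetical helper.

```python
import re

# Matches lines such as "... INFO | Completed job db7c79 in 2.9s".
COMPLETED = re.compile(r"Completed job (?P<job>\w+) in (?P<seconds>\d+(?:\.\d+)?)s")


def job_durations(log_text):
    """Return {job_id: seconds} for every 'Completed job' line in the text."""
    durations = {}
    for line in log_text.splitlines():
        match = COMPLETED.search(line)
        if match:
            durations[match.group("job")] = float(match.group("seconds"))
    return durations


if __name__ == "__main__":
    sample = """\
2024-03-15T19:56:00.8264895Z INFO | Started job db7c79
2024-03-15T19:56:04.7438407Z INFO | Completed job db7c79 in 2.9s
2024-03-15T19:57:04.7438407Z INFO | Completed job ea1r14 in 2.9s
"""
    print(job_durations(sample))
```

Progress-bar lines and "Started job" lines do not match the pattern, so they are skipped.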

Everything your app needs. All in one cloud.

99.99%

guaranteed uptime

10PB+

network storage

5,953,275,022

requests

AI Inference

We handle millions of inference requests a day. Scale your machine learning inference while keeping costs low with RunPod serverless.

AI Training

Run machine learning training tasks that can take up to 7 days. Train on our available NVIDIA H100s and A100s or reserve AMD MI300Xs and AMD MI250s a year in advance.

Autoscale

Serverless GPU workers scale from 0 to n across 8+ globally distributed regions. You only pay when your endpoint receives and processes a request.

Bring Your Own Container

Deploy any container on our AI cloud. Public and private image repositories are supported. Configure your environment the way you want.

Zero Ops Overhead

RunPod handles all the operational aspects of your infrastructure, from deploying to scaling. You bring the models; let us handle the ML infra.

Network Storage

Serverless workers can access network storage volumes backed by NVMe SSDs with up to 100Gbps network throughput. Storage sizes of 100TB+ are supported; contact us if you need 1PB+.

Easy-to-use CLI

Use our CLI tool to automatically hot reload local changes while developing, and deploy on Serverless when you’re done tinkering.

Secure & Compliant

RunPod AI Cloud is built on enterprise-grade GPUs with world-class compliance and security to best serve your machine learning models.

Lightning Fast Cold-Start

With Flashboot, watch your cold starts drop below 250 milliseconds. No more waiting for GPUs to warm up when usage is unpredictable.

Pending Certifications

RunPod is in the process of obtaining SOC 2, ISO 27001, and HIPAA certifications. We aim to have all three by early Q4 2024.

Launch your AI application in minutes

Start building with the most cost-effective platform for developing and scaling machine learning models.

Get started
