RunPod Instant Clusters

Multi-Node GPU Clusters in Minutes, Not Months

Launch a cluster in minutes
Deploy multi-node GPU clusters in minutes, not months, with simple self-service provisioning through our intuitive console
Compare that to traditional solutions that take weeks to deploy
Eliminate procurement delays and infrastructure setup time
Run any Docker workload
Bring your own Docker containers or choose from our optimized templates for inference, training, and research workloads across all major AI frameworks
Billed by the second
Pay only for what you use, with precise per-second billing and no minimum commitments or upfront costs
Stop your cluster at any time
Complete freedom to terminate your cluster when not in use, with no termination fees or minimum runtime requirements — perfect for intermittent workloads

Instant Clusters at a Glance

Built for Speed, Scale, and Savings
H100 SXM
GPU Model Available
16-64
GPUs per Cluster
800-3200 Gbps
East-West Bandwidth
Per-second
Billing Precision
37 seconds
Avg. Cold-Boot Time with PyTorch
Run Slurm
Compatible with Slurm

Simple, Transparent Pricing

Pay Only for What You Use
H100 GPU
$3.58/hour
Per GPU, billed by the second
Launch a Cluster
H100 GPUs at $3.58/hr per GPU
Billed by the second
No long-term commitments
Scale up or down anytime
No hidden egress or operational fees
Cost Calculator
HARDWARE TYPES
NVIDIA H100
$3.58 GPU/hr
NVIDIA H200
Coming soon
NVIDIA B200
Coming soon
NVIDIA GB200
Coming soon
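For a quick back-of-the-envelope estimate, per-second billing is straightforward arithmetic. Here is a minimal Python sketch using the $3.58/hr H100 rate listed above; the exact per-second rounding RunPod applies to metering is an assumption for illustration.

```python
# Minimal cost sketch for per-second GPU billing.
# The $3.58/hr H100 rate comes from the pricing above; the exact
# metering and rounding behavior are assumptions for illustration.

H100_RATE_PER_HOUR = 3.58

def estimated_cost(num_gpus: int, runtime_seconds: int,
                   rate_per_hour: float = H100_RATE_PER_HOUR) -> float:
    """Estimate cluster cost: per-GPU hourly rate prorated to the second."""
    rate_per_second = rate_per_hour / 3600
    return num_gpus * runtime_seconds * rate_per_second

# Example: a 16-GPU cluster (2 nodes x 8 H100s) running for 90 minutes.
print(f"${estimated_cost(16, 90 * 60):.2f}")  # -> $85.92
```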

Instant Clusters vs Traditional Solutions

Why Instant Clusters Stand Out
Feature
RunPod Instant Clusters
Traditional Providers
Deployment Speed
Minutes
Days to Weeks
Billing Model
Per-second
Monthly/Annual Contracts
Minimum Commitment
None
3-36 Months
Cost per H100
$3.58/hr
$8+/hr
Instant Provisioning
Clusters available in minutes, not weeks, with no pre-approval or long sales processes
Per-Second Billing
Pay only for what you use — turn off your cluster anytime with no penalties
No Contracts or Minimums
Never commit to a single GPU second longer than you need
Self-Service Console
Fully automated provisioning with no sales calls or manual setup

Launch Your Cluster in Minutes

From Zero to Launch in Minutes
1
Select your cluster size
Choose how many nodes you need for your workload, from a single node up to 8 nodes with 8 GPUs each.
2
Configure your cluster
Set the number of nodes, GPUs per node, and other specifications based on your needs.
3
Deploy
Deploy your cluster with a single click and connect via SSH, Jupyter, or other methods based on your template.
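Once you're connected over SSH, a common first step on a multi-node cluster is verifying that all GPUs can talk to each other. A minimal sketch, assuming a PyTorch template and a launcher such as torchrun (which sets the RANK, WORLD_SIZE, and LOCAL_RANK environment variables used here); nothing in it is RunPod-specific.

```python
# Minimal multi-node connectivity check, assuming the processes are
# started with a launcher such as torchrun (which sets RANK,
# WORLD_SIZE, and LOCAL_RANK). Not RunPod-specific.
import os
import torch
import torch.distributed as dist

def main() -> None:
    # Join the process group across all nodes over NCCL.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Sanity check: every rank contributes 1 to an all-reduce, so the
    # result equals the total GPU count in the cluster.
    x = torch.ones(1, device="cuda")
    dist.all_reduce(x)
    if dist.get_rank() == 0:
        print(f"GPUs participating: {int(x.item())}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

On a two-node cluster you would run this on every node with something like `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d --rdzv_endpoint=<head-node-ip>:29500 check.py`; the script name and port are placeholders.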

Frequently Asked Questions

Clearing Up the Details
What's the difference between a GPU pod and an Instant Cluster?
A GPU pod is a single instance with one or more GPUs within the same node. An Instant Cluster consists of multiple nodes interconnected with high-speed networking, allowing for workloads that span across multiple machines. Clusters are ideal for large model inference and distributed training that exceeds the capacity of a single node.
How many nodes and GPUs can I access?
Anyone can access 2 nodes on-demand with up to 16 GPUs. To access larger clusters of up to 8 nodes (64 GPUs), you'll need to request a spend limit increase.
How does billing work for Instant Clusters?
Instant Clusters are billed by the second, just like our regular GPU pods. You're only charged for the compute time you actually use, with no minimum commitments or upfront costs. When you're done with your work, simply terminate the cluster to stop billing.
What network bandwidth is available?
Our data centers provide robust network connectivity, with WAN capacity ranging from 20 Gbps to 400 Gbps and east-west bandwidth between servers ranging from 800 Gbps to 3200 Gbps, depending on your configuration.
Is shared storage available across nodes?
RunPod offers native Network Storage integration where available, providing a shared filesystem layer that can be utilized across all nodes in your cluster. This is ideal for storing large models ranging from tens to hundreds of gigabytes close to your computing resources.
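As a sketch of how a shared filesystem is typically used, one node caches the weights once and every node loads them from the shared mount; the path below is a placeholder, not a guaranteed RunPod mount point.

```python
# Sketch: cache large model weights on the shared network volume so
# every node loads them locally instead of re-downloading. The mount
# path is a placeholder; check your cluster's storage configuration.
from pathlib import Path
import torch

SHARED = Path("/workspace/shared")  # hypothetical network-storage mount

def cached_weights(name: str, fetch):
    """Return weights from shared storage, fetching once if missing."""
    path = SHARED / f"{name}.pt"
    if not path.exists():
        SHARED.mkdir(parents=True, exist_ok=True)
        torch.save(fetch(), path)  # have a single node do this first
    return torch.load(path, map_location="cpu")
```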
Can I connect my cluster to AWS?
Yes, you can establish connections between your RunPod cluster and AWS environment through application layer mTLS, enabling secure bridging of workloads between platforms.
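For a sense of what application-layer mTLS looks like from the client side, here is a minimal sketch using Python's standard ssl module; the hostname, port, and certificate paths are placeholders, and the actual bridging setup depends on how the AWS side terminates TLS.

```python
# Sketch of an application-layer mTLS client using Python's stdlib.
# Hostname, port, and certificate paths are placeholders; real values
# depend on how the AWS side terminates TLS.
import socket
import ssl

CA_CERT = "ca.pem"          # CA that signed the server's certificate
CLIENT_CERT = "client.pem"  # this cluster's certificate
CLIENT_KEY = "client.key"   # and its private key

context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=CA_CERT)
context.load_cert_chain(certfile=CLIENT_CERT, keyfile=CLIENT_KEY)

with socket.create_connection(("service.example.internal", 8443)) as sock:
    # Both sides present certificates, so each end verifies the other.
    with context.wrap_socket(sock, server_hostname="service.example.internal") as tls:
        tls.sendall(b"ping\n")
        print(tls.recv(1024))
```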
Can I use Kubernetes with Instant Clusters?
Currently, Instant Clusters are not compatible with Kubernetes. The cluster environment is managed by RunPod's native orchestration system, eliminating the need for additional container orchestration tools or CNI configuration.
Do Instant Clusters support Slurm?
Yes, Instant Clusters fully support Slurm for workload management.
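As an example, a minimal batch script could look like the sketch below; sbatch parses the #SBATCH comment directives regardless of the shebang, so a Python script works directly, and the resource values here are placeholders (assuming GPU resources are configured on the cluster). Submit it with `sbatch script.py`.

```python
#!/usr/bin/env python3
#SBATCH --job-name=cluster-smoke-test
#SBATCH --nodes=2
#SBATCH --gpus-per-node=8
# Minimal Slurm batch script with a Python shebang: sbatch parses the
# #SBATCH comment lines above, then runs this script once on the
# allocation's first node. Resource values are placeholders.
import os

print("Job", os.environ.get("SLURM_JOB_ID"),
      "allocated nodes:", os.environ.get("SLURM_JOB_NODELIST"))
```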
Is there a minimum lease term?
No, there are absolutely no minimum lease terms for Instant Clusters. You have complete flexibility to deploy and terminate clusters as needed to support your workloads, with no long-term commitments or contract obligations.
Get started with RunPod today.
We handle millions of GPU requests a day. Scale your machine learning workloads while keeping costs low with RunPod.
Get Started