RunPod Instant Clusters

Multi-Node GPU Clusters in Minutes, Not Months

Launch a cluster in minutes
Deploy multi-node GPU clusters in minutes, not months, with simple self-service provisioning through our intuitive console
Compare that to traditional solutions that take weeks to deploy
Eliminate procurement delays and infrastructure setup time
Run any Docker workload
Bring your own Docker containers or choose from our optimized templates for inference, training, and research workloads across all major AI frameworks
Billed by the second
Pay only for what you use, with precise per-second billing and no minimum commitments or upfront costs
Stop your cluster at any time
Complete freedom to terminate your cluster when not in use, with no termination fees or minimum runtime requirements — perfect for intermittent workloads

Instant Clusters at a Glance

Built for Speed, Scale, and Savings
H100 SXM
GPU Model Available
16-64
GPUs per Cluster
800-3200 Gbps
East-West Bandwidth
Per-second
Billing Precision
37 seconds
Avg. Cold-Boot Time with PyTorch
Run Slurm
Compatible with Slurm

Simple, Transparent Pricing

Pay Only for What You Use
H100 GPU
$3.58/hour
Per GPU, billed by the second
Launch a Cluster
H100 GPUs at $3.58/hr per GPU
Billed by the second
No long-term commitments
Scale up or down anytime
No hidden egress or operational fees
Cost Calculator
HARDWARE TYPES
NVIDIA H100
$3.58 GPU/hr
NVIDIA H200
Coming soon
NVIDIA B200
Coming soon
NVIDIA GB200
Coming soon
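For a quick back-of-the-envelope estimate, per-second billing is straightforward arithmetic. Here is a minimal Python sketch using the $3.58/hr H100 rate listed above; the exact per-second rounding RunPod applies to metering is an assumption for illustration.

```python
# Minimal cost sketch for per-second GPU billing.
# The $3.58/hr H100 rate comes from the pricing above; the exact
# metering and rounding behavior are assumptions for illustration.

H100_RATE_PER_HOUR = 3.58

def estimated_cost(num_gpus: int, runtime_seconds: int,
                   rate_per_hour: float = H100_RATE_PER_HOUR) -> float:
    """Estimate cluster cost: per-GPU hourly rate prorated to the second."""
    rate_per_second = rate_per_hour / 3600
    return num_gpus * runtime_seconds * rate_per_second

# Example: a 16-GPU cluster (2 nodes x 8 H100s) running for 90 minutes.
print(f"${estimated_cost(16, 90 * 60):.2f}")  # -> $85.92
```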

Instant Clusters vs Traditional Solutions

Why Instant Clusters Stand Out
Feature
RunPod Instant Clusters
Traditional Providers
Deployment Speed
Minutes
Days to Weeks
Billing Model
Per-second
Monthly/Annual Contracts
Minimum Commitment
None
3-36 Months
Cost per H100
$3.58/hr
$8+/hr
Instant Provisioning
Clusters available in minutes, not weeks, with no pre-approval or long sales processes
Per-Second Billing
Pay only for what you use — turn off your cluster anytime with no penalties
No Contracts or Minimums
Never commit to a single GPU second longer than you need
Self-Service Console
Fully automated provisioning with no sales calls or manual setup

Launch Your Cluster in Minutes

From Zero to Launch in Minutes
1
Select your cluster size
Choose how many nodes you need for your workload, from a single node up to 8 nodes with 8 GPUs each.
2
Configure your cluster
Set the number of nodes, GPUs per node, and other specifications based on your needs.
3
Deploy
Deploy your cluster with a single click and connect via SSH, Jupyter, or other methods based on your template.
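Once you're connected over SSH, a common first step on a multi-node cluster is verifying that all GPUs can talk to each other. A minimal sketch, assuming a PyTorch template and a launcher such as torchrun (which sets the RANK, WORLD_SIZE, and LOCAL_RANK environment variables used here); nothing in it is RunPod-specific.

```python
# Minimal multi-node connectivity check, assuming the processes are
# started with a launcher such as torchrun (which sets RANK,
# WORLD_SIZE, and LOCAL_RANK). Not RunPod-specific.
import os
import torch
import torch.distributed as dist

def main() -> None:
    # Join the process group across all nodes over NCCL.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Sanity check: every rank contributes 1 to an all-reduce, so the
    # result equals the total GPU count in the cluster.
    x = torch.ones(1, device="cuda")
    dist.all_reduce(x)
    if dist.get_rank() == 0:
        print(f"GPUs participating: {int(x.item())}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

On a two-node cluster you would run this on every node with something like `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d --rdzv_endpoint=<head-node-ip>:29500 check.py`; the script name and port are placeholders.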

Frequently Asked Questions

Clearing Up the Details
What's the difference between a GPU pod and an Instant Cluster?
A GPU pod is a single instance with one or more GPUs within the same node. An Instant Cluster consists of multiple nodes interconnected with high-speed networking, allowing for workloads that span across multiple machines. Clusters are ideal for large model inference and distributed training that exceeds the capacity of a single node.
How many nodes and GPUs can I access?
Anyone can access 2 nodes on-demand with up to 16 GPUs. To access larger clusters of up to 8 nodes (64 GPUs), you'll need to request a spend limit increase.
How does billing work for Instant Clusters?
Instant Clusters are billed by the second, just like our regular GPU pods. You're only charged for the compute time you actually use, with no minimum commitments or upfront costs. When you're done with your work, simply terminate the cluster to stop billing.
What network bandwidth is available?
Our data centers provide robust network connectivity, with WAN capacity ranging from 20 Gbps to 400 Gbps and east-west bandwidth between servers ranging from 800 Gbps to 3200 Gbps, depending on your configuration.
Is shared storage available across nodes?
RunPod offers native Network Storage integration where available, providing a shared filesystem layer that can be utilized across all nodes in your cluster. This is ideal for storing large models ranging from tens to hundreds of gigabytes close to your computing resources.
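As a sketch of how a shared filesystem is typically used, one node caches the weights once and every node loads them from the shared mount; the path below is a placeholder, not a guaranteed RunPod mount point.

```python
# Sketch: cache large model weights on the shared network volume so
# every node loads them locally instead of re-downloading. The mount
# path is a placeholder; check your cluster's storage configuration.
from pathlib import Path
import torch

SHARED = Path("/workspace/shared")  # hypothetical network-storage mount

def cached_weights(name: str, fetch):
    """Return weights from shared storage, fetching once if missing."""
    path = SHARED / f"{name}.pt"
    if not path.exists():
        SHARED.mkdir(parents=True, exist_ok=True)
        torch.save(fetch(), path)  # have a single node do this first
    return torch.load(path, map_location="cpu")
```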
Can I connect my cluster to AWS?
Yes, you can establish connections between your RunPod cluster and AWS environment through application layer mTLS, enabling secure bridging of workloads between platforms.
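For a sense of what application-layer mTLS looks like from the client side, here is a minimal sketch using Python's standard ssl module; the hostname, port, and certificate paths are placeholders, and the actual bridging setup depends on how the AWS side terminates TLS.

```python
# Sketch of an application-layer mTLS client using Python's stdlib.
# Hostname, port, and certificate paths are placeholders; real values
# depend on how the AWS side terminates TLS.
import socket
import ssl

CA_CERT = "ca.pem"          # CA that signed the server's certificate
CLIENT_CERT = "client.pem"  # this cluster's certificate
CLIENT_KEY = "client.key"   # and its private key

context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=CA_CERT)
context.load_cert_chain(certfile=CLIENT_CERT, keyfile=CLIENT_KEY)

with socket.create_connection(("service.example.internal", 8443)) as sock:
    # Both sides present certificates, so each end verifies the other.
    with context.wrap_socket(sock, server_hostname="service.example.internal") as tls:
        tls.sendall(b"ping\n")
        print(tls.recv(1024))
```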
Can I use Kubernetes with Instant Clusters?
Currently, Instant Clusters are not compatible with Kubernetes. The cluster environment is managed by RunPod's native orchestration system, eliminating the need for additional container orchestration tools or CNI configuration.
Do Instant Clusters support Slurm?
Yes, Instant Clusters fully support Slurm for workload management.
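As an example, a minimal batch script could look like the sketch below; sbatch parses the #SBATCH comment directives regardless of the shebang, so a Python script works directly, and the resource values here are placeholders (assuming GPU resources are configured on the cluster). Submit it with `sbatch script.py`.

```python
#!/usr/bin/env python3
#SBATCH --job-name=cluster-smoke-test
#SBATCH --nodes=2
#SBATCH --gpus-per-node=8
# Minimal Slurm batch script with a Python shebang: sbatch parses the
# #SBATCH comment lines above, then runs this script once on the
# allocation's first node. Resource values are placeholders.
import os

print("Job", os.environ.get("SLURM_JOB_ID"),
      "allocated nodes:", os.environ.get("SLURM_JOB_NODELIST"))
```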
Is there a minimum lease term?
No, there are absolutely no minimum lease terms for Instant Clusters. You have complete flexibility to deploy and terminate clusters as needed to support your workloads, with no long-term commitments or contract obligations.
Get started with RunPod today.
We handle millions of GPU requests a day. Scale your machine learning workloads while keeping costs low with RunPod.
Get Started