Launch a cluster in minutes
Deploy multi-node GPU clusters in minutes, not months, with simple self-service provisioning through our intuitive console
Compare that to traditional solutions, which can take weeks to deploy
Eliminate procurement delays and infrastructure setup time
Run any Docker workload
Bring your own Docker containers or choose from our optimized templates for inference, training, and research workloads across all major AI frameworks
Billed by the second
Pay only for what you use, with precise per-second billing and no minimum commitments or upfront costs required
Stop your cluster at any time
Complete freedom to terminate your cluster when not in use, with no termination fees or minimum runtime requirements — perfect for intermittent workloads
Instant Clusters at a Glance
Built for Speed, Scale, and Savings
H100 SXM
GPU Model Available
16-64
GPUs per Cluster
800-3200 Gbps
East-West Bandwidth
Per-second
Billing Precision
37 seconds
Avg. Cold-Boot Time with PyTorch
Run Slurm
Compatible with Slurm
Simple, Transparent Pricing
Pay Only for What You Use
H100 GPUs at $3.58/hr each
Billed by the second
No long-term commitments
Scale up or down anytime
No hidden egress or operational fees
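As a rough illustration of how the per-second model prices a run at the listed H100 rate (a sketch; it assumes simple pro-rating of the hourly price with no other fees):

```python
# Per-second billing sketch: pro-rate the hourly GPU price.
# Rate taken from the pricing list above; other fees assumed zero.
H100_RATE_PER_HOUR = 3.58  # USD per GPU per hour

def cluster_cost(num_gpus: int, seconds: int) -> float:
    """Cost in USD of running `num_gpus` H100s for `seconds` seconds."""
    per_gpu_per_second = H100_RATE_PER_HOUR / 3600
    return round(num_gpus * per_gpu_per_second * seconds, 2)

# A 16-GPU cluster run for 90 minutes (5,400 seconds):
print(cluster_cost(16, 5400))  # → 85.92
```

Because billing stops the moment the cluster is terminated, a 37-minute run costs exactly 37 minutes of GPU time, not a rounded-up hour.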
Cost Calculator
HARDWARE TYPES
NVIDIA H100
$3.58 GPU/hr
NVIDIA H200
Coming soon
NVIDIA B200
Coming soon
NVIDIA GB200
Coming soon
Instant Clusters vs Traditional Solutions
Why Instant Clusters Stand Out
Feature | RunPod Instant Clusters | Traditional Providers |
---|---|---|
Deployment Speed | Minutes | Days to Weeks |
Billing Model | Per-second | Monthly/Annual Contracts |
Minimum Commitment | None | 3-36 Months |
Cost per H100 | $3.58/hr | $8+/hr |
Instant Provisioning
Clusters available in minutes, not weeks, with no pre-approval or long sales processes
Per-Second Billing
Pay only for what you use — turn off your cluster anytime with no penalties
No Contracts or Minimums
Never commit to a single GPU second longer than you need
Self-Service Console
Fully automated provisioning with no sales calls or manual setups
Launch Your Cluster in Minutes
From Zero to Launch in Minutes
1
Select your cluster size
Choose how many nodes you need for your workload, from 2 nodes up to 8 nodes with 8 GPUs each.
2
Configure your cluster
Set the number of nodes, GPUs per node, and other specifications based on your needs.
3
Deploy
Deploy your cluster with a single click and connect via SSH, Jupyter, or other methods based on your template.
Frequently Asked Questions
Clearing Up the Details
What is the difference between a GPU pod and an Instant Cluster?
A GPU pod is a single instance with one or more GPUs within the same node. An Instant Cluster consists of multiple nodes interconnected with high-speed networking, allowing for workloads that span across multiple machines. Clusters are ideal for large model inference and distributed training that exceeds the capacity of a single node.
What is the minimum and maximum cluster size?
Anyone can access 2 nodes on-demand with up to 16 GPUs. To access larger clusters up to 8 nodes (64 GPUs), you'll need to request a spend limit increase.
How is billing handled for Instant Clusters?
Instant Clusters are billed by the second, just like our regular GPU pods. You're only charged for the compute time you actually use, with no minimum commitments or upfront costs. When you're done with your work, simply terminate the cluster to stop billing.
What network bandwidth is available between nodes?
Our data centers provide robust network connectivity, with WAN capacity ranging from 20 Gbps to 400 Gbps and east-west bandwidth between servers ranging from 800 Gbps to 3,200 Gbps, depending on your configuration.
What storage solutions are available for large models?
RunPod offers native Network Storage integration where available, providing a shared filesystem layer that can be utilized across all nodes in your cluster. This is ideal for storing large models ranging from tens to hundreds of gigabytes close to your computing resources.
Can I connect my cluster to AWS?
Yes, you can establish connections between your RunPod cluster and AWS environment through application layer mTLS, enabling secure bridging of workloads between platforms.
Do you support Kubernetes or other container orchestration tools?
Currently, Instant Clusters are not compatible with Kubernetes. The cluster environment is managed by RunPod's native orchestration system, eliminating the need for additional container orchestration tools or CNI configuration.
Can I run Slurm on Instant Clusters?
Yes, Instant Clusters fully support Slurm for workload management.
Are there any minimum lease terms or contract requirements?
No, there are absolutely no minimum lease terms for Instant Clusters. You have complete flexibility to deploy and terminate clusters as needed to support your workloads, with no long-term commitments or contract obligations.
Get started with RunPod today.
We handle millions of GPU requests a day. Scale your machine learning workloads while keeping costs low with RunPod.
Get Started