Compare GPU Performance on AI Workloads
L40 48GB
Vs.
L40S 48GB
LLM Benchmarks
Benchmarks were run on RunPod GPUs using vLLM. For more details on vLLM, see the vLLM GitHub repository.
[Chart: Output Token Throughput (tok/s), Llama 8b Instruct, L40 48GB vs. L40S 48GB]
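The page does not say how the throughput figure is derived. A common definition, assumed here, is the total number of generated tokens divided by the wall-clock duration of the benchmark run. The sketch below illustrates only that arithmetic; the `RequestResult` record and the `output_token_throughput` helper are hypothetical names, not part of vLLM or RunPod, and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RequestResult:
    """One completed generation request (hypothetical record shape)."""
    output_tokens: int  # number of tokens generated for this request


def output_token_throughput(results, wall_clock_seconds):
    """Return total generated tokens divided by the run's wall-clock time.

    Assumed definition of output token throughput (tok/s); the actual
    benchmark may compute it differently.
    """
    if wall_clock_seconds <= 0:
        raise ValueError("wall_clock_seconds must be positive")
    total_tokens = sum(r.output_tokens for r in results)
    return total_tokens / wall_clock_seconds


# Made-up values purely for illustration; these are not measurements.
example = [RequestResult(128), RequestResult(256), RequestResult(128)]
print(output_token_throughput(example, 4.0))
```

Under this definition, comparing two GPUs fairly means holding the model, prompt set, and concurrency constant, so that only the hardware differs between runs.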
Get started with RunPod today.
We handle millions of GPU requests a day. Scale your machine learning workloads while keeping costs low with RunPod.