GPU type(s) | Memory | Flex price | Active price | Description |
---|---|---|---|---|
H200 PRO | 141 GB | $0.00155 | $0.00124 | Extreme throughput for huge models |
A100 | 80 GB | $0.00076 | $0.00060 | High-throughput GPU that is still very cost-effective |
H100 PRO | 80 GB | $0.00116 | $0.00093 | Extreme throughput for big models |
A6000, A40 | 48 GB | $0.00034 | $0.00024 | A cost-effective option for running big models |
L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | $0.00037 | Extreme inference throughput on LLMs like Llama 3 8B |
L4, A5000, 3090 | 24 GB | $0.00019 | $0.00013 | Great for small-to-medium sized inference workloads |
4090 PRO | 24 GB | $0.00031 | $0.00021 | Extreme throughput for small-to-medium models |
A4000, A4500, RTX 4000 | 16 GB | $0.00016 | $0.00011 | The most cost-effective for small models |
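To turn these rates into a budget, a minimal sketch is below. It assumes the listed Flex and Active prices are billed per second of GPU time (an assumption; confirm against your provider's billing documentation), and the `monthly_cost` helper is hypothetical, not a provider API.

```python
# Rates from the table above, in USD, assumed to be per second of GPU time.
RATES = {  # GPU: (flex_per_sec, active_per_sec)
    "H200 PRO": (0.00155, 0.00124),
    "A100": (0.00076, 0.00060),
    "H100 PRO": (0.00116, 0.00093),
    "A6000/A40": (0.00034, 0.00024),
    "L40/L40S/6000 Ada PRO": (0.00053, 0.00037),
    "L4/A5000/3090": (0.00019, 0.00013),
    "4090 PRO": (0.00031, 0.00021),
    "A4000/A4500/RTX 4000": (0.00016, 0.00011),
}

def monthly_cost(gpu: str, seconds_per_day: float,
                 active: bool = False, days: int = 30) -> float:
    """Estimated monthly spend for one worker busy `seconds_per_day` each day."""
    flex, act = RATES[gpu]
    rate = act if active else flex
    return rate * seconds_per_day * days

# Example: an A100 flex worker handling 2 hours of requests per day
# costs 0.00076 * 7200 * 30 = $164.16 per month.
print(round(monthly_cost("A100", 2 * 3600), 2))
```

The same arithmetic makes the trade-off concrete: a cheaper per-second rate on a slower GPU can still cost more per month if each request takes proportionally longer to serve.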