GPU Reference
Review memory tiers, architecture capability, quantization support, and throughput summaries for faster hardware selection.
Quantization tags are deployment-oriented: INT4 and INT8 mean that mainstream low-bit inference kernels are practical on the architecture, while FP16, BF16, and FP8 indicate precision tiers supported in hardware.
| GPU | Release date | VRAM | Memory bandwidth | Memory type | Architecture | Throughput summary | Quantization capability | Segment |
|---|---|---|---|---|---|---|---|---|
| GeForce RTX 2060 6GB | 2019-01-15 | 6GB | 336 GB/s | GDDR6 | Turing / SM 7.5 | FP16 52 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2060 SUPER 8GB | 2019-07-09 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 57 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2070 8GB | 2018-10-17 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 59.7 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2070 SUPER 8GB | 2019-07-09 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 73 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2080 8GB | 2018-09-20 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 80.5 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2080 SUPER 8GB | 2019-07-23 | 8GB | 496 GB/s | GDDR6 | Turing / SM 7.5 | FP16 89 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2080 Ti 11GB | 2018-08-20 | 11GB | 616 GB/s | GDDR6 | Turing / SM 7.5 | FP16 107.6 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 3060 12GB | 2021-01-12 | 12GB | 360 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | FP16 101 | FP16, BF16, INT4, INT8 | Consumer |
| GeForce RTX 3090 24GB | 2020-09-01 | 24GB | 936 GB/s | GDDR6X | Ampere / SM 8.0-8.6 | FP16 142 | FP16, BF16, INT4, INT8 | Consumer |
| GeForce RTX 4060 Ti 16GB | 2023-05-18 | 16GB | 288 GB/s | GDDR6 | Ada / SM 8.9 | AI 353 | FP16, BF16, INT4, INT8, FP8 | Consumer |
| GeForce RTX 4080 16GB | 2022-09-20 | 16GB | 717 GB/s | GDDR6X | Ada / SM 8.9 | FP8 389.6 / BF16 194.8 / FP16 194.8 | FP16, BF16, INT4, INT8, FP8 | Consumer |
| GeForce RTX 4090 24GB | 2022-09-20 | 24GB | 1008 GB/s | GDDR6X | Ada / SM 8.9 | FP8 660.6 / BF16 330.3 / FP16 330.3 | FP16, BF16, INT4, INT8, FP8 | Consumer |
| GeForce RTX 5050 8GB | 2025-06-24 | 8GB | 320 GB/s | GDDR6 | Blackwell / CC 10.0+ | AI 421 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5060 8GB | 2025-04-15 | 8GB | 448 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 614 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5060 Ti 16GB | 2025-04-15 | 16GB | 448 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 759 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5070 12GB | 2025-01-06 | 12GB | 672 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 988 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5070 Ti 16GB | 2025-01-06 | 16GB | 896 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 1406 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5080 16GB | 2025-01-06 | 16GB | 960 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 1801 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5090 32GB | 2025-01-06 | 32GB | 1792 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 3352 | FP16, BF16, INT4, FP8 | Consumer |
| NVIDIA A10 24GB | 2021-04-12 | 24GB | 600 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 125 / FP16 125 / INT8 250 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A100 40GB | 2020-05-14 | 40GB | 1555 GB/s | HBM2e | Ampere / SM 8.0-8.6 | BF16 312 / FP16 312 / INT8 624 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A100 80GB | 2020-11-16 | 80GB | 2039 GB/s | HBM2e | Ampere / SM 8.0-8.6 | BF16 312 / FP16 312 / INT8 624 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A2 16GB | 2021-11-09 | 16GB | 200 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 18 / FP16 18 / INT8 72 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A30 24GB | 2021-04-12 | 24GB | 933 GB/s | HBM2 | Ampere / SM 8.0-8.6 | BF16 165 / FP16 165 / INT8 330 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A40 48GB | 2020-10-16 | 48GB | 696 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 149.7 / FP16 149.7 / INT8 299.3 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA H100 80GB | 2022-03-22 | 80GB | 2000 GB/s | HBM3 | Hopper / SM 9.0 | FP8 1513 / BF16 756 / FP16 756 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA H200 141GB | 2023-11-13 | 141GB | 4800 GB/s | HBM3e | Hopper / SM 9.0 | FP8 1979 / BF16 989 / FP16 989 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L20 48GB | 2023-11-16 | 48GB | 864 GB/s | GDDR6 | Ada / SM 8.9 | FP8 239 / BF16 119.5 / FP16 119.5 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L4 24GB | 2023-03-21 | 24GB | 300 GB/s | GDDR6 | Ada / SM 8.9 | FP8 242 / BF16 121 / FP16 121 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L40 48GB | 2022-09-20 | 48GB | 864 GB/s | GDDR6 | Ada / SM 8.9 | FP8 362 / BF16 181 / FP16 181 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L40S 48GB | 2023-08-08 | 48GB | 864 GB/s | GDDR6 | Ada / SM 8.9 | FP8 733 / BF16 366 / FP16 366 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA RTX 6000 Ada 48GB | 2022-09-20 | 48GB | 960 GB/s | GDDR6 | Ada / SM 8.9 | FP8 421.2 / BF16 210.6 / FP16 210.6 | FP16, BF16, INT4, INT8, FP8 | Workstation |
| NVIDIA RTX A6000 48GB | 2020-10-05 | 48GB | 768 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 154.8 / FP16 154.8 / INT8 309.7 | FP16, BF16, INT4, INT8 | Workstation |
| NVIDIA T4 16GB | 2018-09-12 | 16GB | 320 GB/s | GDDR6 | Turing / SM 7.5 | FP16 65.1 / INT8 130 | FP16, INT4, INT8 | Datacenter |
| NVIDIA TITAN RTX 24GB | 2018-12-03 | 24GB | 672 GB/s | GDDR6 | Turing / SM 7.5 | FP16 130.5 / INT8 261 | FP16, INT4, INT8 | Consumer |
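
As an illustration of how the table might be used for hardware selection, the following is a minimal sketch that shortlists GPUs by minimum memory, required quantization tags, and (optionally) market segment. The `Gpu` dataclass, its field names, and the `shortlist` helper are hypothetical names introduced here, not part of any published tool, and only a handful of rows from the table are copied in. Ranking by memory bandwidth is an assumption made for the example; other workloads may care more about the throughput column.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Gpu:
    """One table row, reduced to the fields used for filtering (hypothetical schema)."""
    name: str
    vram_gb: int
    bandwidth_gbs: int
    tags: FrozenSet[str]
    segment: str


# A few rows copied from the table above; extend with further rows as needed.
GPUS = [
    Gpu("GeForce RTX 3090 24GB", 24, 936,
        frozenset({"FP16", "BF16", "INT4", "INT8"}), "Consumer"),
    Gpu("GeForce RTX 4090 24GB", 24, 1008,
        frozenset({"FP16", "BF16", "INT4", "INT8", "FP8"}), "Consumer"),
    Gpu("NVIDIA A100 80GB", 80, 2039,
        frozenset({"FP16", "BF16", "INT4", "INT8"}), "Datacenter"),
    Gpu("NVIDIA L4 24GB", 24, 300,
        frozenset({"FP16", "BF16", "INT4", "INT8", "FP8"}), "Datacenter"),
    Gpu("NVIDIA T4 16GB", 16, 320,
        frozenset({"FP16", "INT4", "INT8"}), "Datacenter"),
]


def shortlist(gpus: Iterable[Gpu], min_vram_gb: int,
              required_tags: Iterable[str],
              segment: Optional[str] = None) -> List[Gpu]:
    """Return GPUs that meet the memory, tag, and segment constraints.

    Results are sorted by memory bandwidth, highest first (an assumed
    ranking criterion for this sketch).
    """
    required = frozenset(required_tags)
    matches = [
        g for g in gpus
        if g.vram_gb >= min_vram_gb
        and required <= g.tags  # every required tag must be present
        and (segment is None or g.segment == segment)
    ]
    return sorted(matches, key=lambda g: g.bandwidth_gbs, reverse=True)
```

For example, `shortlist(GPUS, 24, {"FP8"})` keeps only the sample rows with at least 24GB of memory and an FP8 tag, while `shortlist(GPUS, 16, {"INT8"}, segment="Datacenter")` restricts the search to datacenter parts.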