GPU Reference

Review memory tiers, architecture capability, quantization support, throughput summaries, and deployment positioning for faster hardware selection.

Quantization tags are deployment-oriented: INT4 and INT8 mean that mainstream low-bit inference kernels are practical on the architecture, while FP8, BF16, and FP16 indicate precision tiers supported in hardware.
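As a rough illustration of how the two kinds of tags differ, the sketch below splits an architecture's tag set into hardware precision tiers and low-bit kernel tags. The tag sets are copied from the entries in this reference; the names `ARCH_TAGS` and `split_tags` are hypothetical and not part of any library.

```python
# Tag sets per architecture, copied from the entries in this reference.
# ARCH_TAGS and split_tags are hypothetical names used only for illustration.
ARCH_TAGS = {
    "Turing": ("FP16", "INT4", "INT8"),
    "Ampere": ("FP16", "BF16", "INT4", "INT8"),
    "Ada": ("FP16", "BF16", "INT4", "INT8", "FP8"),
    "Hopper": ("FP16", "BF16", "INT4", "INT8", "FP8"),
    # The Blackwell entries in this reference do not list an INT8 tag.
    "Blackwell": ("FP16", "BF16", "INT4", "FP8"),
}

# Tags that describe precision tiers supported in hardware.
PRECISION_TIERS = {"FP16", "BF16", "FP8"}
# Tags that mean mainstream low-bit inference kernels are practical.
LOW_BIT_KERNELS = {"INT4", "INT8"}


def split_tags(architecture):
    """Split an architecture's tags into the two groups described above."""
    tags = ARCH_TAGS[architecture]
    return {
        "precision_tiers": [t for t in tags if t in PRECISION_TIERS],
        "low_bit_kernels": [t for t in tags if t in LOW_BIT_KERNELS],
    }


if __name__ == "__main__":
    for arch in ARCH_TAGS:
        print(arch, split_tags(arch))
```

Keeping the two groups separate makes it harder to misread an INT4 tag as a claim about native hardware precision; it only signals that practical kernels exist for that architecture.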

| GPU | Release date | Memory | Bandwidth | Memory type | Architecture / compute capability | Throughput summary (TFLOPS; INT8 and AI in TOPS) | Quantization capability | Product line |
|---|---|---|---|---|---|---|---|---|
| GeForce RTX 2060 6GB | 2019-01-15 | 6GB | 336 GB/s | GDDR6 | Turing / SM 7.5 | FP16 52 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2060 SUPER 8GB | 2019-07-09 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 57 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2070 8GB | 2018-10-17 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 59.7 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2070 SUPER 8GB | 2019-07-09 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 73 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2080 8GB | 2018-09-20 | 8GB | 448 GB/s | GDDR6 | Turing / SM 7.5 | FP16 80.5 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2080 SUPER 8GB | 2019-07-23 | 8GB | 496 GB/s | GDDR6 | Turing / SM 7.5 | FP16 89 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 2080 Ti 11GB | 2018-08-20 | 11GB | 616 GB/s | GDDR6 | Turing / SM 7.5 | FP16 107.6 | FP16, INT4, INT8 | Consumer |
| GeForce RTX 3060 12GB | 2021-01-12 | 12GB | 360 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | FP16 101 | FP16, BF16, INT4, INT8 | Consumer |
| GeForce RTX 3090 24GB | 2020-09-01 | 24GB | 936 GB/s | GDDR6X | Ampere / SM 8.0-8.6 | FP16 142 | FP16, BF16, INT4, INT8 | Consumer |
| GeForce RTX 4060 Ti 16GB | 2023-05-18 | 16GB | 288 GB/s | GDDR6 | Ada / SM 8.9 | AI 353 | FP16, BF16, INT4, INT8, FP8 | Consumer |
| GeForce RTX 4080 16GB | 2022-09-20 | 16GB | 717 GB/s | GDDR6X | Ada / SM 8.9 | FP8 389.6 / BF16 194.8 / FP16 194.8 | FP16, BF16, INT4, INT8, FP8 | Consumer |
| GeForce RTX 4090 24GB | 2022-09-20 | 24GB | 1008 GB/s | GDDR6X | Ada / SM 8.9 | FP8 660.6 / BF16 330.3 / FP16 330.3 | FP16, BF16, INT4, INT8, FP8 | Consumer |
| GeForce RTX 5050 8GB | 2025-06-24 | 8GB | 320 GB/s | GDDR6 | Blackwell / CC 10.0+ | AI 421 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5060 8GB | 2025-04-15 | 8GB | 448 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 614 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5060 Ti 16GB | 2025-04-15 | 16GB | 448 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 759 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5070 12GB | 2025-01-06 | 12GB | 672 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 988 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5070 Ti 16GB | 2025-01-06 | 16GB | 896 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 1406 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5080 16GB | 2025-01-06 | 16GB | 960 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 1801 | FP16, BF16, INT4, FP8 | Consumer |
| GeForce RTX 5090 32GB | 2025-01-06 | 32GB | 1792 GB/s | GDDR7 | Blackwell / CC 10.0+ | AI 3352 | FP16, BF16, INT4, FP8 | Consumer |
| NVIDIA A10 24GB | 2021-04-12 | 24GB | 600 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 125 / FP16 125 / INT8 250 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A100 40GB | 2020-05-14 | 40GB | 1555 GB/s | HBM2e | Ampere / SM 8.0-8.6 | BF16 312 / FP16 312 / INT8 624 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A100 80GB | 2020-11-16 | 80GB | 2039 GB/s | HBM2e | Ampere / SM 8.0-8.6 | BF16 312 / FP16 312 / INT8 624 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A2 16GB | 2021-11-09 | 16GB | 200 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 18 / FP16 18 / INT8 72 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A30 24GB | 2021-04-12 | 24GB | 933 GB/s | HBM2 | Ampere / SM 8.0-8.6 | BF16 165 / FP16 165 / INT8 330 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA A40 48GB | 2020-10-16 | 48GB | 696 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 149.7 / FP16 149.7 / INT8 299.3 | FP16, BF16, INT4, INT8 | Datacenter |
| NVIDIA H100 80GB | 2022-03-22 | 80GB | 2000 GB/s | HBM3 | Hopper / SM 9.0 | FP8 1513 / BF16 756 / FP16 756 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA H200 141GB | 2023-11-13 | 141GB | 4800 GB/s | HBM3e | Hopper / SM 9.0 | FP8 1979 / BF16 989 / FP16 989 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L20 48GB | 2023-11-16 | 48GB | 864 GB/s | GDDR6 | Ada / SM 8.9 | FP8 239 / BF16 119.5 / FP16 119.5 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L4 24GB | 2023-03-21 | 24GB | 300 GB/s | GDDR6 | Ada / SM 8.9 | FP8 242 / BF16 121 / FP16 121 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L40 48GB | 2022-09-20 | 48GB | 864 GB/s | GDDR6 | Ada / SM 8.9 | FP8 362 / BF16 181 / FP16 181 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA L40S 48GB | 2023-08-08 | 48GB | 864 GB/s | GDDR6 | Ada / SM 8.9 | FP8 733 / BF16 366 / FP16 366 | FP16, BF16, INT4, INT8, FP8 | Datacenter |
| NVIDIA RTX 6000 Ada 48GB | 2022-09-20 | 48GB | 960 GB/s | GDDR6 | Ada / SM 8.9 | FP8 421.2 / BF16 210.6 / FP16 210.6 | FP16, BF16, INT4, INT8, FP8 | Workstation |
| NVIDIA RTX A6000 48GB | 2020-10-05 | 48GB | 768 GB/s | GDDR6 | Ampere / SM 8.0-8.6 | BF16 154.8 / FP16 154.8 / INT8 309.7 | FP16, BF16, INT4, INT8 | Workstation |
| NVIDIA T4 16GB | 2018-09-12 | 16GB | 320 GB/s | GDDR6 | Turing / SM 7.5 | FP16 65.1 / INT8 130 | FP16, INT4, INT8 | Datacenter |
| NVIDIA TITAN RTX 24GB | 2018-12-03 | 24GB | 672 GB/s | GDDR6 | Turing / SM 7.5 | FP16 130.5 / INT8 261 | FP16, INT4, INT8 | Consumer |
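
For hardware selection, the entries above can be filtered by memory tier, quantization tag, and product line. The following is a minimal sketch that assumes a handful of entries copied from this reference; the `Gpu` class and the `select` function are hypothetical names, and ranking by memory bandwidth is only one possible ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    memory_gb: int
    bandwidth_gb_s: int  # memory bandwidth in GB/s
    tags: frozenset      # quantization capability tags
    product_line: str


# A few entries copied from this reference (not the full list).
GPUS = [
    Gpu("GeForce RTX 3090 24GB", 24, 936,
        frozenset({"FP16", "BF16", "INT4", "INT8"}), "Consumer"),
    Gpu("GeForce RTX 4090 24GB", 24, 1008,
        frozenset({"FP16", "BF16", "INT4", "INT8", "FP8"}), "Consumer"),
    Gpu("NVIDIA L4 24GB", 24, 300,
        frozenset({"FP16", "BF16", "INT4", "INT8", "FP8"}), "Datacenter"),
    Gpu("NVIDIA A100 80GB", 80, 2039,
        frozenset({"FP16", "BF16", "INT4", "INT8"}), "Datacenter"),
    Gpu("NVIDIA T4 16GB", 16, 320,
        frozenset({"FP16", "INT4", "INT8"}), "Datacenter"),
]


def select(gpus, min_memory_gb, required_tags=(), product_line=None):
    """Return matching GPUs, highest memory bandwidth first."""
    required = set(required_tags)
    hits = [
        g for g in gpus
        if g.memory_gb >= min_memory_gb
        and required <= g.tags
        and (product_line is None or g.product_line == product_line)
    ]
    return sorted(hits, key=lambda g: g.bandwidth_gb_s, reverse=True)


if __name__ == "__main__":
    # GPUs with at least 24GB of memory and an FP8 tag.
    for gpu in select(GPUS, 24, {"FP8"}):
        print(gpu.name, gpu.bandwidth_gb_s, "GB/s")
```

Sorting by bandwidth is a design choice for this sketch; a real selection would likely also weigh the throughput summary and cost, which are not modeled here.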