CDNA_3

AMD CDNA 2
Release date	November 8, 2021
Fabrication process	TSMC N6
History
Predecessor	CDNA 1
Successor	CDNA 3

AMD CDNA 1
Release date	November 16, 2020
Fabrication process	TSMC N7 (FinFET)
History
Predecessor	AMD FirePro
Successor	CDNA 2

AMD CDNA
Release date	November 16, 2020
Designed by	AMD
Fabrication process	TSMC N7, TSMC N6, TSMC N5^[1]
History
Predecessor	AMD FirePro
Variant	RDNA (consumer, professional)

CDNA (microarchitecture)

AMD compute-focused GPU microarchitecture

CDNA (Compute DNA) is a compute-centered graphics processing unit (GPU) microarchitecture designed by AMD for data centers. Used mostly in the AMD Instinct line of data center graphics cards, CDNA is one of two successors to the Graphics Core Next (GCN) microarchitecture; the other is RDNA (Radeon DNA), a microarchitecture focused on consumer graphics.

The first generation of CDNA was announced on March 5, 2020,^[2] and was featured in the AMD Instinct MI100, launched on November 16, 2020.^[3] The MI100 is the only CDNA 1 product and is manufactured on TSMC's N7 FinFET process.

The second iteration of the CDNA line implemented a multi-chip module (MCM) approach, differing from its predecessor's monolithic approach. Featured in the AMD Instinct MI250X and MI250, this MCM design used an elevated fanout bridge (EFB)^[4] to connect the dies. These two products were announced November 8th, 2021, and launched November 11th. The CDNA 2 line includes an additional latecomer using a monolithic design, the MI210.^[5] The MI250X and MI250 were the first AMD products to use the Open Compute Project (OCP)'s OCP Accelerator Module (OAM) socket form factor. Lower wattage PCIe versions are available.

The third iteration of CDNA uses an MCM design built from different chiplets manufactured on multiple nodes. Currently consisting of the MI300X and MI300A, these products contain 15 unique dies connected with advanced 3D packaging techniques. The MI300 series was announced on January 5, 2023, and launched in H2 2023.

CDNA 1

The first-generation CDNA family consists of one die, named Arcturus. The die measures 750 mm², contains 25.6 billion transistors and is manufactured on TSMC's N7 node.^[6] The Arcturus die has 120 compute units and a 4096-bit memory bus connected to four HBM2 stacks, giving the die 32 GB of memory and just over 1200 GB/s of memory bandwidth. Compared to its predecessor, CDNA removes all hardware related to graphics acceleration, including but not limited to graphics caches, tessellation hardware, render output units (ROPs), and the display engine. CDNA retains the VCN media engine for HEVC, H.264, and VP9 decoding.^[7] CDNA also adds dedicated matrix compute hardware, similar to that introduced in Nvidia's Volta architecture.
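
The "just over 1200 GB/s" figure follows from the bus width and the memory transfer rate. The following is a minimal sketch of that arithmetic, assuming the 2400 MT/s HBM2 transfer rate listed for the MI100 in the product comparison table; the function name is illustrative and not part of any AMD tool.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each transfer moves bus_width_bits / 8 bytes; multiplying by the
    transfer rate in MT/s gives MB/s, and dividing by 1000 gives GB/s.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# MI100 (Arcturus): 4096-bit bus; 2400 MT/s is taken from the comparison table.
mi100_bandwidth = peak_bandwidth_gb_s(4096, 2400)
print(f"MI100 peak bandwidth: {mi100_bandwidth:.1f} GB/s")
```

This is a theoretical peak only; sustained bandwidth in real workloads is lower.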

CDNA 2

Like CDNA, CDNA 2 consists of one die, named Aldebaran. This die is estimated to measure 790 mm² and contains 28 billion transistors; it is manufactured on TSMC's N6 node.^[12] The Aldebaran die contains 112 compute units, a 6.67% decrease from the 120 of Arcturus. Like the previous generation, the die has a 4096-bit memory bus, now using HBM2e with double the capacity, up to 64 GB. The largest change in CDNA 2 is the ability to place two dies on the same package. The MI250X consists of two Aldebaran dies, with 220 CUs (110 enabled per die) and 128 GB of HBM2e. The dies are connected by four Infinity Fabric links and are addressed as independent GPUs by the host system.^[13]
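
The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below is illustrative only; the 64 stream processors per CU is an assumption inferred from the MI250X core configuration in the comparison table (14080 shaders across 220 CUs), and the variable names are hypothetical.

```python
# Compute-unit counts per die, as stated in the text.
ARCTURUS_CUS = 120
ALDEBARAN_CUS = 112

# Relative decrease in compute units from Arcturus to Aldebaran.
cu_decrease_pct = (ARCTURUS_CUS - ALDEBARAN_CUS) / ARCTURUS_CUS * 100

# MI250X package: two Aldebaran dies, 110 CUs enabled and 64 GB HBM2e per die.
DIES_PER_PACKAGE = 2
ENABLED_CUS_PER_DIE = 110
HBM_GB_PER_DIE = 64
SHADERS_PER_CU = 64  # assumption: inferred from 14080 shaders / 220 CUs

mi250x_cus = DIES_PER_PACKAGE * ENABLED_CUS_PER_DIE
mi250x_hbm_gb = DIES_PER_PACKAGE * HBM_GB_PER_DIE
mi250x_shaders = mi250x_cus * SHADERS_PER_CU

print(f"CU decrease: {cu_decrease_pct:.2f}%")
print(f"MI250X: {mi250x_cus} CUs, {mi250x_shaders} shaders, {mi250x_hbm_gb} GB HBM2e")
```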

Product Comparisons

Model (Code name)	Release date	Architecture & fab	Transistors & die size	Core		Fillrate^{[lower-alpha 1]}		Vector Processing power^{[lower-alpha 1]}^{[lower-alpha 2]} (TFLOPS)			Matrix Processing power^{[lower-alpha 1]}^{[lower-alpha 2]} (TFLOPS)					Memory				TBP	Software Interface	Physical Interface
Model (Code name)	Release date	Architecture & fab	Transistors & die size	Config^{[lower-alpha 3]}	Clock^{[lower-alpha 1]} (MHz)	Texture^{[lower-alpha 4]} (GT/s)	Pixel^{[lower-alpha 5]} (GP/s)	Half (FP16)	Single (FP32)	Double (FP64)	INT8	BF16	FP16	FP32	FP64	Bus type & width	Size (GB)	Clock (MT/s)	Bandwidth (GB/s)	TBP	Software Interface	Physical Interface
Tesla V100 (PCIE) (GV100)^[19]^[20]	May 10, 2017	Volta TSMC 12 nm	12.1×10⁹ 815 mm²	5120:320:128:640 80 SM	1370	438.4	175.36	28.06	14.03	7.01	N/A	N/A	N/A	112.23	N/A	HBM2 4096 bit	16 32	1750	900	250 W	PCIe 3.0 ×16	PCIe ×16
Tesla V100 (SXM) (GV100)^[21]^[22]	May 10, 2017	Volta TSMC 12 nm	12.1×10⁹ 815 mm²	5120:320:128:640 80 SM	1455	465.6	186.24	29.80	14.90	7.46	N/A	N/A	N/A	119.19	N/A	HBM2 4096 bit	16 32	1750	900	300 W	NVLINK	SXM2
Radeon Instinct MI50 (Vega 20)^[23]^[24]^[25]^[26]^[27]^[28]	Nov 18, 2018	GCN 5 TSMC 7 nm	13.2×10⁹ 331 mm²	3840:240:64 60 CU	1450 1725	348.0 414.0	92.80 110.4	22.27 26.50	11.14 13.25	5.568 6.624	N/A	N/A	26.5	13.3	?	HBM2 4096-bit	16 32	2000	1024	300 W	PCIe 4.0 ×16	PCIe ×16
Radeon Instinct MI60 (Vega 20)^[24]^[29]^[30]^[31]	Nov 18, 2018	GCN 5 TSMC 7 nm	13.2×10⁹ 331 mm²	4096:256:64 64 CU	1500 1800	384.0 460.8	96.00 115.2	24.58 29.49	12.29 14.75	6.144 7.373	N/A	N/A	32	16	?	HBM2 4096-bit		2000	1024	300 W	PCIe 4.0 ×16	PCIe ×16
Tesla A100 (PCIE) (GA100)^[32] ^[33]	May 14, 2020	Ampere TSMC 7 nm	54.2×10⁹ 826 mm²	6912:432:-:432 108 SM	1065 1410	460.08 609.12	-	58.89 77.97	14.72 19.49	7.36 9.75	942.24 1247.47	235.56 311.87	235.56 311.87	117.78 155.93	14.72 19.49	HBM2 5120 bit	40 80	3186	2039	250 W	PCIe 4.0 ×16	PCIe ×16
Tesla A100 (SXM) (GA100)^[34] ^[35]	May 14, 2020	Ampere TSMC 7 nm	54.2×10⁹ 826 mm²	6912:432:-:432 108 SM	1275 1410	550.80 609.12	-	70.50 77.97	17.63 19.49	8.81 9.75	1128.04 1247.47	282.01 311.87	282.01 311.87	141.00 155.93	17.63 19.49	HBM2 5120 bit	40 80	3186	2039	400 W	NVLINK	SXM4
AMD Instinct MI100 (Arcturus)^[36]^[37]	Nov 16, 2020	CDNA TSMC 7 nm	25.6×10⁹ 750 mm²	7680:480:-:480 120 CU	1000 1502	480 720.96	-	?	15.72 23.10	7.86 11.5	122.88 184.57	61.44 92.28	122.88 184.57	30.72 46.14	15.36 23.07	HBM2 4096-bit	32	2400	1228	300 W	PCIe 4.0 ×16	PCIe ×16
AMD Instinct MI250X (PCIE) (Aldebaran)	Nov 8, 2021	CDNA 2 TSMC 6 nm	58×10⁹ 1540 mm²	14080:880:-:880 220 CU
AMD Instinct MI250X (OAM) (Aldebaran)	Nov 8, 2021	CDNA 2 TSMC 6 nm	58×10⁹ 1540 mm²	14080:880:-:880 220 CU
Tesla H100 (PCIE) (GH100)	Mar 22, 2022	Hopper TSMC 4 nm	80×10⁹ 814 mm²
Tesla H100 (SXM) (GH100)	Mar 22, 2022	Hopper TSMC 4 nm	80×10⁹ 814 mm²

[lower-alpha 1] Boost values (if available) are stated after the base value.
[lower-alpha 2] Precision performance is calculated from the base (or boost) core clock speed, based on a fused multiply-add (FMA) operation.
[lower-alpha 3] Unified shaders : Texture mapping units : Render output units : AI accelerators, followed by the number of compute units (CU) or streaming multiprocessors (SM).
[lower-alpha 4] Texture fillrate is calculated as the number of texture mapping units multiplied by the base (or boost) core clock speed.
[lower-alpha 5] Pixel fillrate is calculated as the number of render output units multiplied by the base (or boost) core clock speed.
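
The calculations described in these notes can be sketched as follows. This is an illustrative example, not an official formula from either vendor: it assumes an FMA is counted as two floating-point operations per shader per clock, and it uses the Tesla V100 (PCIE) base values from the table (5120 shaders, 320 texture mapping units, 128 render output units, 1370 MHz). The function names are hypothetical.

```python
def vector_tflops(shaders: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak vector throughput in TFLOPS.

    Assumes each shader retires one FMA per clock, counted as
    two floating-point operations (flops_per_clock=2).
    """
    return shaders * flops_per_clock * clock_mhz / 1_000_000


def fillrate_giga_per_s(units: int, clock_mhz: float) -> float:
    """Fillrate in G(texels or pixels)/s: unit count times core clock."""
    return units * clock_mhz / 1000


# Tesla V100 (PCIE) base-clock values taken from the comparison table.
v100_fp32 = vector_tflops(shaders=5120, clock_mhz=1370)
v100_texture = fillrate_giga_per_s(units=320, clock_mhz=1370)
v100_pixel = fillrate_giga_per_s(units=128, clock_mhz=1370)

print(f"FP32: {v100_fp32:.2f} TFLOPS")
print(f"Texture fillrate: {v100_texture:.2f} GT/s")
print(f"Pixel fillrate: {v100_pixel:.2f} GP/s")
```

With these inputs the results round to the 14.03 TFLOPS, 438.4 GT/s and 175.36 GP/s listed in the V100 (PCIE) row.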

This article uses material from the Wikipedia article CDNA_3, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

References

[1] Smith, Ryan (June 9, 2022). "AMD: Combining CDNA 3 and Zen 4 for MI300 Data Center APU in 2023". AnandTech. Retrieved December 20, 2022.
[2] Smith, Ryan. "AMD Unveils CDNA GPU Architecture: A Dedicated GPU Architecture for Data Centers". AnandTech. Retrieved September 20, 2022.
[3] "GPU Database: AMD Radeon Instinct MI100". TechPowerUp. Retrieved September 20, 2022.
[4] Smith, Ryan. "AMD Announces Instinct MI200 Accelerator Family: Taking Servers to Exascale and Beyond". AnandTech. Retrieved September 21, 2022.
[5] Smith, Ryan. "AMD Releases Instinct MI210 Accelerator: CDNA 2 On a PCIe Card". AnandTech. Retrieved September 21, 2022.
[6] Kennedy, Patrick (November 16, 2020). "AMD Instinct MI100 32GB CDNA GPU Launched". ServeTheHome. Retrieved September 22, 2022.
[7] "AMD CDNA Whitepaper" (PDF). AMD. March 5, 2020. Retrieved September 22, 2022.
[8] "'AMD Instinct MI100' Instruction Set Architecture, Reference Guide" (PDF). AMD. December 14, 2020. Retrieved September 22, 2022.
[9] Klotz, Aaron (December 14, 2022). "Samsung Soups Up 96 AMD MI100 GPUs With Radical Computational Memory". Tom's Hardware. Retrieved December 23, 2022.
[10] "AMD Instinct MI100 Brochure" (PDF). AMD. Retrieved December 25, 2022.
[11] "AMD CDNA Whitepaper" (PDF). AMD. Retrieved December 25, 2022.
[12] Shilov, Anton (November 17, 2021). "AMD's Instinct MI250X OAM Card Pictured: Aldebaran's Massive Die Revealed". Tom's Hardware. Retrieved November 20, 2022.
[13] "Hot Chips 34 – AMD's Instinct MI200 Architecture". Chips and Cheese. September 18, 2022. Retrieved November 10, 2022.
[14] "Introducing AMD CDNA™ 2 Architecture" (PDF). AMD. Retrieved November 20, 2022.
[15] "'AMD Instinct MI200' Instruction Set Architecture" (PDF). AMD. February 4, 2022. Retrieved October 11, 2022.
[16] Smith, Ryan. "CES 2023: AMD Instinct MI300 Data Center APU Silicon In Hand - 146B Transistors, Shipping H2'23". AnandTech. Retrieved January 22, 2023.
[17] Alcorn, Paul (January 5, 2023). "AMD Instinct MI300 Data Center APU Pictured Up Close: 13 Chiplets, 146 Billion Transistors". Tom's Hardware. Retrieved January 22, 2023.
[18] Kennedy, Patrick (December 6, 2023). "AMD Instinct MI300X GPU and MI300A APUs Launched for AI Era". ServeTheHome. Retrieved April 15, 2024.
[19] Oh, Nate (December 16, 2022). "Nvidia Formally Announced PCIe Tesla V100". AnandTech.
[20] "NVIDIA Tesla V100 PCIe 16GB". TechPowerUp.
[21] Smith, Ryan (December 19, 2022). "Nvidia Volta Unveiled". AnandTech.
[22] "NVIDIA Tesla V100 SXM3 32GB". TechPowerUp.
[23] Walton, Jarred (January 10, 2019). "Hands on with the AMD Radeon VII". PC Gamer.
[24] "Next Horizon – David Wang Presentation" (PDF). AMD.
[25] "AMD Radeon Instinct MI50 Accelerator (16GB)". AMD.
[26] "AMD Radeon Instinct MI50 Accelerator (32GB)". AMD.
[27] "AMD Radeon Instinct MI50 Datasheet" (PDF). AMD.
[28] "AMD Radeon Instinct MI50 Specs". TechPowerUp. Retrieved May 27, 2022.
[29] "Radeon Instinct MI60". AMD. Archived from the original on November 22, 2018. Retrieved May 27, 2022.
[30] "AMD Radeon Instinct MI60 Datasheet" (PDF). AMD.
[31] "AMD Radeon Instinct MI60 Specs". TechPowerUp. Retrieved May 27, 2022.
[32] "Nvidia A100 Tensor Core GPU Architecture" (PDF). Nvidia. Retrieved December 12, 2022.
[33] "Nvidia A100 PCIE 80 GB Specs". TechPowerUp. Retrieved December 12, 2022.
[34] "Nvidia A100 Tensor Core GPU Architecture" (PDF). Nvidia. Retrieved December 12, 2022.
[35] "Nvidia A100 SXM4 80 GB Specs". TechPowerUp. Retrieved December 12, 2022.
[36] "AMD Instinct MI100 Brochure" (PDF). AMD. Retrieved December 25, 2022.
[37] "AMD CDNA Whitepaper" (PDF). AMD. Retrieved December 25, 2022.

Model (Code name)	Released	Architecture & fab	Transistors & die size	Core		Fillrate^{[lower-alpha 1]}		Processing power (TFLOPS)								Memory				TBP	Software interface	Physical interface
				Core		Fillrate^{[lower-alpha 1]}		Vector^{[lower-alpha 1]}^{[lower-alpha 2]}			Matrix^{[lower-alpha 1]}^{[lower-alpha 2]}					Memory
				Config^{[lower-alpha 3]}	Clock^{[lower-alpha 1]} (MHz)	Texture^{[lower-alpha 4]} (GT/s)	Pixel^{[lower-alpha 5]} (GP/s)	Half (FP16)	Single (FP32)	Double (FP64)	INT8	BF16	FP16	FP32	FP64	Bus type & width	Size (GB)	Clock (MT/s)	Bandwidth (GB/s)
AMD Instinct MI100 (Arcturus)^[10]^[11]	Nov 16, 2020	CDNA TSMC N7	25.6×10⁹ 750 mm²	7680:480:- 120 CU	1000 1502	480 720.96	-		15.72 23.10	7.86 11.5	122.88 184.57	61.44 92.28	122.88 184.57	30.72 46.14	15.36 23.07	HBM2 4096-bit	32	2400	1228	300 W	PCIe 4.0 ×16	PCIe ×16

AMD CDNA 3
Release date	December 6, 2023
Fabrication process	TSMC N5 & N6
History
Predecessor	CDNA 2