AMD Instinct accelerators

AMD Instinct
Release date	June 20, 2017
Designed by	AMD
Marketed by	AMD
Architecture	GCN 3; GCN 4; GCN 5; CDNA; CDNA 2; CDNA 3
Models	MI Series
Cores	36–304 Compute Units (CUs)
Transistors	5.7B (Polaris10) 14 nm; 8.9B (Fiji) 28 nm; 12.5B (Vega10) 14 nm; 13.2B (Vega20) 7 nm; 25.6B (Arcturus) 7 nm; 58.2B (Aldebaran) 6 nm; 146B (Antares) 5 nm; 153B (Aqua Vanjaram) 5 nm
History
Predecessor	AMD FirePro; Radeon Sky series

AMD Instinct

AMD brand of data center GPUs for high-performance computing and machine learning

AMD Instinct is AMD's brand of data center GPUs.^[1]^[2] It replaced AMD's FirePro S brand in 2016. Compared to the Radeon brand of mainstream consumer/gamer products, the Instinct product line is intended to accelerate deep learning, artificial neural network, and high-performance computing/GPGPU applications.

The AMD Instinct product line competes directly with Nvidia's Tesla line and with Intel's Xeon Phi and Data Center GPU lines of machine learning and GPGPU cards.

The brand was originally known as AMD Radeon Instinct, but AMD dropped the Radeon brand from the name before AMD Instinct MI100 was introduced in November 2020.

In June 2022, supercomputers based on AMD's Epyc CPUs and Instinct GPUs took the lead on the Green500 list of the most power-efficient supercomputers, with a lead of over 50% over any other system, and held the top four spots.^[3] One of them, the AMD-based Frontier, has been the fastest supercomputer in the world on the TOP500 list since June 2022 and remained so as of 2023.^[4]^[5]

Products

The three initial Radeon Instinct products were announced on December 12, 2016, and released on June 20, 2017, with each based on a different architecture.^[6]^[7]

MI300 series

The MI300A and MI300X are data center accelerators that use the CDNA 3 architecture, which is optimized for high-performance computing (HPC) and generative artificial intelligence (AI) workloads. The CDNA 3 architecture features a scalable chiplet design that leverages TSMC’s advanced packaging technologies, such as CoWoS (chip-on-wafer-on-substrate) and InFO (integrated fan-out), to combine multiple chiplets on a single interposer. The chiplets are interconnected by AMD’s Infinity Fabric, which enables high-speed and low-latency data transfer between the chiplets and the host system.

The MI300A is an accelerated processing unit (APU) that integrates 24 Zen 4 CPU cores with six CDNA 3 GPU chiplets (XCDs), for a total of 228 CUs in the GPU section, and 128 GB of HBM3 memory. The Zen 4 CPU cores are built on a 5 nm process node and support the x86-64 instruction set, including the AVX-512 and BFloat16 extensions. They run general-purpose applications and provide host-side computation for the GPU. The MI300A has a peak performance of 61.3 TFLOPS of FP64 (122.6 TFLOPS FP64 matrix) and 980.6 TFLOPS of FP16 (1961.2 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300A supports PCIe 5.0 and CXL 2.0 interfaces, which allow it to communicate with other devices and accelerators in a heterogeneous system.
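The quoted FP64 peak can be roughly reproduced from the CU count. The sketch below is illustrative only: it assumes 64 lanes per CU (the figure this article gives for the MI300X), the 2100 MHz clock listed in the chipset table, and one fused multiply-add per lane per clock counted as two floating-point operations. The function names are hypothetical.

```python
# Rough peak-FP64 estimate for an MI300-class part from its CU count.
# Assumptions (not vendor formulas): 64 lanes per CU, one FMA per lane
# per clock, and an FMA counted as 2 floating-point operations.
LANES_PER_CU = 64
FLOPS_PER_FMA = 2


def stream_processors(compute_units: int) -> int:
    """Total number of shader lanes for a given CU count."""
    return compute_units * LANES_PER_CU


def peak_fp64_tflops(compute_units: int, clock_mhz: float) -> float:
    """Peak vector FP64 throughput in TFLOPS under the assumptions above."""
    lanes = stream_processors(compute_units)
    return lanes * FLOPS_PER_FMA * clock_mhz * 1e6 / 1e12


if __name__ == "__main__":
    # MI300A: 228 CUs at 2100 MHz
    print(stream_processors(228))                # 14592
    print(round(peak_fp64_tflops(228, 2100), 1)) # 61.3
```

Under the same assumptions, 304 CUs at 2100 MHz gives about 81.7 TFLOPS, which matches the MI300X double-precision entry in the chipset table.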

The MI300X is a dedicated generative AI accelerator that replaces the CPU cores with additional GPU compute units and HBM memory, resulting in a total of 304 CUs (64 cores per CU) and 192 GB of HBM3 memory. The MI300X is designed to accelerate generative AI applications, such as natural language processing, computer vision, and deep learning. The MI300X has a peak performance of 653.7 TFLOPS of TF32 (1307.4 TFLOPS with sparsity) and 1307.4 TFLOPS of FP16 (2614.9 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300X also supports PCIe 5.0 and CXL 2.0 interfaces, as well as AMD’s ROCm software stack, which provides a unified programming model and tools for developing and deploying generative AI applications on AMD hardware.^[11]^[12]^[13]
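The 5.3 TB/s bandwidth figure is consistent with the memory configuration listed in the chipset table (8192-bit HBM3 at 5200 MT/s). A minimal sketch of that arithmetic, assuming peak bandwidth is simply bus width in bytes multiplied by the transfer rate; the function name is hypothetical:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_mts: float) -> float:
    """Peak memory bandwidth in GB/s.

    Assumes bandwidth = (bus width in bytes) x (transfers per second),
    with the data rate given in mega-transfers per second (MT/s).
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mts * 1e6 / 1e9


if __name__ == "__main__":
    # MI300A / MI300X: 8192-bit HBM3 at 5200 MT/s, about 5.3 TB/s
    print(memory_bandwidth_gbs(8192, 5200))  # 5324.8
    # Radeon Instinct MI6: 256-bit GDDR5 at 7000 MT/s
    print(memory_bandwidth_gbs(256, 7000))   # 224.0
```

The same relation reproduces the bandwidth column for the older cards, for example 4096-bit HBM2 at 2000 MT/s gives 1024 GB/s for the MI50.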

Chipset table

Model (Code name)	Launch	Architecture & fab	Transistors & die size	Core config^{[lower-alpha 5]}	Core clock^{[lower-alpha 1]} (MHz)	Texture fillrate^{[lower-alpha 1]}^{[lower-alpha 2]} (GT/s)	Pixel fillrate^{[lower-alpha 1]}^{[lower-alpha 3]} (GP/s)	Half precision^{[lower-alpha 1]}^{[lower-alpha 4]} (TFLOPS)	Single precision^{[lower-alpha 1]}^{[lower-alpha 4]} (TFLOPS)	Double precision^{[lower-alpha 1]}^{[lower-alpha 4]} (TFLOPS)	Memory size (GB)	Memory bus type & width	Memory bandwidth (GB/s)	Memory clock (MT/s)	TBP	Bus interface
Radeon Instinct MI6 (Polaris 10)^[15]^[16]^[17]^[18]^[19]^[20]	Jun 20, 2017	GCN 4 GloFo 14LP	5.7×10⁹ 232 mm²	2304:144:32 36 CU	1120 1233	161.3 177.6	35.84 39.46	5.161 5.682	5.161 5.682	0.323 0.355	16	GDDR5 256-bit	224	7000	150 W	PCIe 3.0 ×16
Radeon Instinct MI8 (Fiji)^[15]^[16]^[17]^[21]^[22]^[23]		GCN 3 TSMC 28 nm	8.9×10⁹ 596 mm²	4096:256:64 64 CU	1000	256.0	64.00	8.192	8.192	0.512	4	HBM 4096-bit	512	1000	175 W
Radeon Instinct MI25 (Vega 10)^[15]^[16]^[17]^[24]^[25]^[26]^[27]		GCN 5 GloFo 14LP	12.5×10⁹ 510 mm²	4096:256:64 64 CU	1400 1500	358.4 384.0	89.60 96.00	22.94 24.58	11.47 12.29	0.717 0.768	16	HBM2 2048-bit	484	1890	300 W
Radeon Instinct MI50 (Vega 20)^[28]^[29]^[30]^[31]^[32]^[33]	Nov 18, 2018	GCN 5 TSMC N7	13.2×10⁹ 331 mm²	3840:240:64 60 CU	1450 1725	348.0 414.0	92.80 110.4	22.27 26.50	11.14 13.25	5.568 6.624	16 32	HBM2 4096-bit	1024	2000		PCIe 4.0 ×16
Radeon Instinct MI60 (Vega 20)^[29]^[34]^[35]^[36]	Nov 18, 2018	GCN 5 TSMC N7	13.2×10⁹ 331 mm²	4096:256:64 64 CU	1500 1800	384.0 460.8	96.00 115.2	24.58 29.49	12.29 14.75	6.144 7.373	32		1024	2000
AMD Instinct MI100 (Arcturus)^[37]^[38]^[39]	Nov 16, 2020	CDNA TSMC N7	25.6×10⁹ 750 mm²	7680:480:- 120 CU	1000 1502	480.0 721.0	—	122.9 184.6	15.36 23.07	7.680 11.54	32		1228.8	2400
AMD Instinct MI210 (Aldebaran)^[40]^[41]^[42]	Mar 22, 2022	CDNA 2 TSMC N6	28 x 10⁹ ~770 mm²	6656:416:- 104 CU (1 × GCD)^{[lower-alpha 6]}	1000 1700	416.0 707.2		106.5 181.0	13.31 22.63	13.31 22.63	64	HBM2E 4096-bit	1638.4	3200
AMD Instinct MI250 (Aldebaran)^[43]^[44]^[45]	Nov 8, 2021		58 x 10⁹ 1540 mm²	13312:832:- 208 CU (2 × GCD)		832.0 1414		213.0 362.1	26.62 45.26	26.62 45.26	2 × 64	HBM2E 2 × 4096-bit^{[lower-alpha 7]}	2 × 1638.4		500 W 560 W (Peak)
AMD Instinct MI250X (Aldebaran)^[46]^[44]^[47]	Nov 8, 2021		58 x 10⁹ 1540 mm²	14080:880:- 220 CU (2 × GCD)		880.0 1496		225.3 383.0	28.16 47.87	28.16 47.87	2 × 64	HBM2E 2 × 4096-bit^{[lower-alpha 7]}	2 × 1638.4		500 W 560 W (Peak)
AMD Instinct MI300A (Antares)^[48]^[49]^[50]^[51]	Dec 6, 2023	CDNA 3 TSMC N5 & N6	146 x 10⁹ 1017 mm²	14592:912:- 228 CU (6 × XCD) (24 AMD Zen 4 x86 CPU cores)	2100	912.0 1550.4		980.6 1961.2 (With Sparsity)	122.6	61.3 122.6 (FP64 Matrix)	128	HBM3 8192-bit	5300	5200	550 W 760 W (Liquid Cooling)	PCIe 5.0 ×16
AMD Instinct MI300X (Aqua Vanjaram)^[52]^[53]^[54]^[55]	Dec 6, 2023	CDNA 3 TSMC N5 & N6	153 x 10⁹ 1017 mm²	19456:1216:- 304 CU (8 × XCD)	2100	1216.0 2062.1		1307.4 2614.9 (With Sparsity)	163.4	81.7 163.4 (FP64 Matrix)	192	HBM3 8192-bit	5300	5200	750 W	PCIe 5.0 ×16

[lower-alpha 1] Boost values (if available) are listed after the base value.
[lower-alpha 2] Texture fillrate is calculated as the number of texture mapping units multiplied by the base (or boost) core clock speed.
[lower-alpha 3] Pixel fillrate is calculated as the number of render output units multiplied by the base (or boost) core clock speed.
[lower-alpha 4] Precision performance is calculated from the base (or boost) core clock speed based on an FMA operation.
[lower-alpha 5] Unified shaders : Texture mapping units : Render output units, followed by the number of compute units (CU).
[lower-alpha 6] GCD refers to a Graphics Compute Die. Each GCD is a separate piece of silicon.
[lower-alpha 7] CDNA 2-based cards use two dies on the same package. The dies are linked by a 400 GB/s bidirectional Infinity Fabric link and are addressed as individual GPUs by the host system.
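
The fillrate and precision notes above can be checked against the table with a short calculation. This is a sketch under stated assumptions rather than AMD's own method: it counts one FMA as two floating-point operations per shader per clock, and it only handles rows whose core configuration lists all three unit counts (the CDNA rows show "-" for render output units). The class and function names are hypothetical.

```python
from typing import NamedTuple


class CoreConfig(NamedTuple):
    shaders: int  # unified shaders
    tmus: int     # texture mapping units
    rops: int     # render output units


def parse_config(config: str) -> CoreConfig:
    """Parse a 'shaders:TMUs:ROPs' string such as '2304:144:32'."""
    shaders, tmus, rops = (int(part) for part in config.split(":"))
    return CoreConfig(shaders, tmus, rops)


def texture_fillrate_gts(cfg: CoreConfig, clock_mhz: float) -> float:
    """Texture fillrate in GT/s: TMUs x core clock."""
    return cfg.tmus * clock_mhz / 1000


def pixel_fillrate_gps(cfg: CoreConfig, clock_mhz: float) -> float:
    """Pixel fillrate in GP/s: ROPs x core clock."""
    return cfg.rops * clock_mhz / 1000


def fp32_tflops(cfg: CoreConfig, clock_mhz: float) -> float:
    """Single-precision TFLOPS, assuming one FMA (2 FLOPs) per shader per clock."""
    return cfg.shaders * 2 * clock_mhz / 1e6


if __name__ == "__main__":
    mi6 = parse_config("2304:144:32")  # Radeon Instinct MI6
    for clock in (1120, 1233):         # base and boost clocks in MHz
        print(
            clock,
            round(texture_fillrate_gts(mi6, clock), 1),
            round(pixel_fillrate_gps(mi6, clock), 2),
            round(fp32_tflops(mi6, clock), 3),
        )
```

For the MI6 this gives about 161.3 GT/s, 35.84 GP/s and 5.161 TFLOPS at the base clock, in line with the corresponding table row.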

This article uses material from the Wikipedia article AMD_Instinct_accelerators, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

References

[1] Smith, Ryan (December 12, 2016). "AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming in 2017". AnandTech. Retrieved December 12, 2016.
[2] Shrout, Ryan (December 12, 2016). "Radeon Instinct Machine Learning GPUs include Vega, Preview Performance". PC Perspective. Retrieved December 12, 2016.
[3] "Green500 Release June 2022". TOP500. Retrieved May 9, 2024.
[4] "Top500 Release June 2022". TOP500. Retrieved May 9, 2024.
[5] "Top500 Release November 2023". TOP500. Retrieved May 9, 2024.
[6] WhyCry (December 12, 2016). "AMD announces first VEGA accelerator:RADEON INSTINCT MI25 for deep-learning". VideoCardz. Retrieved June 6, 2022.
[7] Mujtaba, Hassan (June 21, 2017). "AMD Radeon Instinct MI25 Accelerator With 16 GB HBM2 Specifications Detailed – Launches Today Along With Instinct MI8 and Instinct MI6". Wccftech. Retrieved June 6, 2022.
[8] "Radeon Instinct MI6". Radeon Instinct. AMD. Retrieved June 22, 2017.^{[permanent dead link]}
[9] "Radeon Instinct MI8". Radeon Instinct. AMD. Retrieved June 22, 2017.^{[permanent dead link]}
[10] "Radeon Instinct MI25". Radeon Instinct. AMD. Retrieved June 22, 2017.^{[permanent dead link]}
[11] "AMD CDNA 3 Architecture" (PDF). AMD CDNA Architecture. AMD. Retrieved December 7, 2023.
[12] "AMD INSTINCT MI300A APU" (PDF). AMD Instinct Accelerators. AMD. Retrieved December 7, 2023.
[13] "AMD INSTINCT MI300X APU" (PDF). AMD Instinct Accelerators. AMD. Retrieved December 7, 2023.
[14] Kampman, Jeff (December 12, 2016). "AMD opens up machine learning with Radeon Instinct". Tech Report. Retrieved December 12, 2016.
[15] Smith, Ryan (December 12, 2016). "AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming in 2017". AnandTech. Retrieved December 12, 2016.
[16] Shrout, Ryan (December 12, 2016). "Radeon Instinct Machine Learning GPUs include Vega, Preview Performance". PC Perspective. Retrieved December 12, 2016.
[17] Kampman, Jeff (December 12, 2016). "AMD opens up machine learning with Radeon Instinct". Tech Report. Retrieved December 12, 2016.
[18] "Radeon Instinct MI6". AMD. Archived from the original on August 1, 2017. Retrieved May 27, 2022.
[19] "AMD Radeon Instinct MI6 Datasheet" (PDF). usermanual.wiki. Retrieved May 27, 2022.
[20] "AMD Radeon Instinct MI6 Specs". TechPowerUp. Retrieved May 27, 2022.
[21] "Radeon Instinct MI8". AMD. Archived from the original on August 1, 2017. Retrieved May 27, 2022.
[22] "AMD Radeon Instinct MI8 Datasheet" (PDF). usermanual.wiki. Retrieved May 27, 2022.
[23] "AMD Radeon Instinct MI8 Specs". TechPowerUp. Retrieved May 27, 2022.
[24] Smith, Ryan (January 5, 2017). "The AMD Vega Architecture Teaser: Higher IPC, Tiling, & More, coming in H1'2017". AnandTech. Retrieved January 10, 2017.
[25] "Radeon Instinct MI25". AMD. Archived from the original on August 1, 2017. Retrieved May 27, 2022.
[26] "AMD Radeon Instinct MI25 Datasheet" (PDF). AMD. Retrieved May 27, 2022.
[27] "AMD Radeon Instinct MI25 Specs". TechPowerUp. Retrieved May 27, 2022.
[28] Walton, Jarred (January 10, 2019). "Hands on with the AMD Radeon VII". PC Gamer.
[29] "Next Horizon – David Wang Presentation" (PDF). AMD.
[30] "AMD Radeon Instinct MI50 Accelerator (16GB)". AMD. Retrieved December 24, 2022.
[31] "AMD Radeon Instinct MI50 Accelerator (32GB)". AMD. Retrieved December 24, 2022.
[32] "AMD Radeon Instinct MI50 Datasheet" (PDF). AMD. Retrieved December 24, 2022.
[33] "AMD Radeon Instinct MI50 Specs". TechPowerUp. Retrieved May 27, 2022.
[34] "Radeon Instinct MI60". AMD. Archived from the original on November 22, 2018. Retrieved May 27, 2022.
[35] "AMD Radeon Instinct MI60 Datasheet" (PDF). AMD. Retrieved December 24, 2022.
[36] "AMD Radeon Instinct MI60 Specs". TechPowerUp. Retrieved May 27, 2022.
[37] "AMD Instinct MI100 Accelerator". AMD. Retrieved May 27, 2022.
[38] "AMD Instinct MI100 Accelerator Brochure" (PDF). AMD. Retrieved May 27, 2022.
[39] "AMD Radeon Instinct MI100 Specs". TechPowerUp. Retrieved May 26, 2022.
[40] "AMD Instinct MI210 Accelerator". AMD. Retrieved May 27, 2022.
[41] "AMD Instinct MI210 Accelerator Brochure" (PDF). AMD. Retrieved May 27, 2022.
[42] "AMD Radeon Instinct MI210 Specs". TechPowerUp. Retrieved May 27, 2022.
[43] "AMD Instinct MI250 Accelerator". AMD. Retrieved May 27, 2022.
[44] "AMD Instinct MI200 Series Accelerator Datasheet" (PDF). AMD. Retrieved December 24, 2022.
[45] "AMD Radeon Instinct MI250 Specs". TechPowerUp. Retrieved May 26, 2022.
[46] "AMD Instinct MI250X Accelerator". AMD. Retrieved May 27, 2022.
[47] "AMD Radeon Instinct MI250X Specs". TechPowerUp. Retrieved May 26, 2022.
[48] "AMD Instinct MI300A APU". AMD. Retrieved December 12, 2023.
[49] "AMD Instinct MI300A Series Accelerator Datasheet" (PDF). AMD. Retrieved December 12, 2023.
[50] "AMD Radeon Instinct MI300 Specs". TechPowerUp. Retrieved December 12, 2023.
[51] "AMD-CDNA3-white-paper" (PDF). AMD. Retrieved December 12, 2023.
[52] "AMD Instinct MI300X GPU". AMD. Retrieved December 12, 2023.
[53] "AMD Instinct MI300X Series Accelerator Datasheet" (PDF). AMD. Retrieved December 12, 2023.
[54] "AMD Radeon Instinct MI300 Specs". TechPowerUp. Retrieved December 12, 2023.
[55] "AMD-CDNA3-white-paper" (PDF). AMD. Retrieved December 12, 2023.

Product overview

Accelerator	Architecture	Lithography	Compute Units	Memory size	Memory type	PCIe support	Form factor	FP16	BF16	FP32	FP32 matrix	FP64	FP64 matrix	INT8	INT4	TBP
MI6	GCN 4	14 nm	36	16 GB	GDDR5	3.0	PCIe	5.7 TFLOPS	N/A	5.7 TFLOPS	N/A	358 GFLOPS	N/A	N/A	N/A	150 W
MI8	GCN 3	28 nm	64	4 GB	HBM	3.0	PCIe	8.2 TFLOPS		8.2 TFLOPS		512 GFLOPS				175 W
MI25	GCN 5	14 nm	64	16 GB	HBM2	3.0	PCIe	24.6 TFLOPS		12.3 TFLOPS		768 GFLOPS				300 W
MI50	GCN 5	7 nm	60	16 GB	HBM2	4.0	PCIe	26.5 TFLOPS		13.3 TFLOPS		6.6 TFLOPS		53 TOPS		300 W
MI60	GCN 5	7 nm	64	32 GB	HBM2	4.0	PCIe	29.5 TFLOPS		14.7 TFLOPS		7.4 TFLOPS		59 TOPS		300 W
MI100	CDNA	7 nm	120	32 GB	HBM2	4.0	PCIe	184.6 TFLOPS	92.3 TFLOPS	23.1 TFLOPS	46.1 TFLOPS	11.5 TFLOPS		184.6 TOPS		300 W
MI210	CDNA 2	6 nm	104	64 GB	HBM2e	4.0	PCIe	181 TFLOPS		22.6 TFLOPS	45.3 TFLOPS	22.6 TFLOPS	45.3 TFLOPS	181 TOPS		300 W
MI250	CDNA 2	6 nm	208	128 GB	HBM2e	4.0	OAM	362.1 TFLOPS		45.3 TFLOPS	90.5 TFLOPS	45.3 TFLOPS	90.5 TFLOPS	362.1 TOPS		560 W
MI250X	CDNA 2	6 nm	220	128 GB	HBM2e	4.0	OAM	383 TFLOPS		47.9 TFLOPS	95.7 TFLOPS	47.9 TFLOPS	95.7 TFLOPS	383 TOPS		560 W
MI300A	CDNA 3	6 & 5 nm	228	128 GB	HBM3	5.0	APU SH5 socket	980.6 TFLOPS 1961.2 TFLOPS (with sparsity)		122.6 TFLOPS		61.3 TFLOPS	122.6 TFLOPS	1961.2 TOPS 3922.3 TOPS (with sparsity)	N/A	550 W 760 W (with liquid cooling)
MI300X	CDNA 3	6 & 5 nm	304	192 GB	HBM3	5.0	OAM	1307.4 TFLOPS 2614.9 TFLOPS (with sparsity)		163.4 TFLOPS		81.7 TFLOPS	163.4 TFLOPS	2614.9 TOPS 5229.8 TOPS (with sparsity)	N/A	750 W