Hardware acceleration

Specialized computer hardware

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently than software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.

Figure: A cryptographic accelerator card allows cryptographic operations to be performed at a faster rate.

To perform computing tasks more efficiently, one can generally invest time and money in improving the software, improving the hardware, or both. Each approach has advantages and disadvantages in terms of latency, throughput, and energy consumption. Typical advantages of focusing on software include greater versatility, more rapid development, lower non-recurring engineering costs, greater portability, and ease of updating features or patching bugs, at the cost of the overhead of computing general operations. Advantages of focusing on hardware include speedup, reduced power consumption,^[1] lower latency, increased parallelism^[2] and bandwidth, and better utilization of the area and functional components available on an integrated circuit. These come at the cost of a reduced ability to update designs once they are etched onto silicon, higher functional verification costs, longer times to market, and the need for more parts.

In the hierarchy of digital computing systems, ranging from general-purpose processors to fully customized hardware, there is a tradeoff between flexibility and efficiency: efficiency increases by orders of magnitude when a given application is implemented higher up that hierarchy.^[3] The hierarchy includes general-purpose processors such as CPUs,^[4] more specialized processors such as programmable shaders in a GPU,^[5] fixed-function logic implemented on field-programmable gate arrays (FPGAs),^[6] and fixed-function logic implemented on application-specific integrated circuits (ASICs).^[7]

Hardware acceleration is advantageous for performance, and it is practical when the functions are fixed, so that updates are needed less often than in software solutions. With the advent of reprogrammable logic devices such as FPGAs, the restriction of hardware acceleration to fully fixed algorithms has eased since 2010, allowing hardware acceleration to be applied to problem domains that require modifications to algorithms and processing control flow.^[8]^[9] A disadvantage, however, is that hardware acceleration often depends on proprietary libraries that not all vendors are willing to distribute or expose, which makes it difficult to integrate into many open source projects.

Overview

Integrated circuits can be created to perform arbitrary operations on analog and digital signals. Most often in computing, signals are digital and can be interpreted as binary number data. Computer hardware and software operate on information in binary representation to perform computing; this is accomplished by calculating Boolean functions on the bits of input and outputting the result to some output device downstream for storage or further processing.
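As a minimal sketch of the last point, the Python below builds integer addition out of nothing but Boolean operations on individual bits, in the way a chain of one-bit full adders would in a circuit. The names `full_adder` and `ripple_carry_add` and the default width of 8 bits are assumptions chosen for illustration; they do not come from the article.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple:
    """One-bit full adder expressed only with XOR, AND, and OR."""
    partial = a ^ b                           # sum of a and b, ignoring the carry
    sum_bit = partial ^ carry_in              # final sum bit for this position
    carry_out = (a & b) | (partial & carry_in)
    return sum_bit, carry_out


def ripple_carry_add(x: int, y: int, width: int = 8) -> int:
    """Add two unsigned integers one bit at a time (illustrative helper)."""
    result = 0
    carry = 0
    for i in range(width):
        a = (x >> i) & 1                      # bit i of x
        b = (y >> i) & 1                      # bit i of y
        sum_bit, carry = full_adder(a, b, carry)
        result |= sum_bit << i
    # The carry out of the top bit is dropped, so the sum wraps modulo 2**width.
    return result


if __name__ == "__main__":
    print(ripple_carry_add(23, 42))    # 65
    print(ripple_carry_add(200, 100))  # 44, because 300 wraps modulo 256
```

In hardware, each call to `full_adder` would correspond to a small group of gates wired into a chain, rather than to one iteration of a software loop.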

This article uses material from the Wikipedia article Hardware_acceleration, and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.

References

[1] "Microsoft Supercharges Bing Search With Programmable Chips". WIRED. 16 June 2014.

[2] "Embedded". Archived from the original on 2007-10-08. Retrieved 2012-08-18. "FPGA Architectures from 'A' to 'Z'" by Clive Maxfield 2006.

[3] Sinan, Kufeoglu; Mahmut, Ozkuran (2019). "Figure 5. CPU, GPU, FPGA, and ASIC minimum energy consumption between difficulty recalculation.". Energy Consumption of Bitcoin Mining. doi:10.17863/CAM.41230.

[4] Kim, Yeongmin; Kong, Joonho; Munir, Arslan (2020). "CPU-Accelerator Co-Scheduling for CNN Acceleration at the Edge". IEEE Access. 8: 211422–211433. doi:10.1109/ACCESS.2020.3039278. ISSN 2169-3536.

[5] Lin, Yibo; Jiang, Zixuan; Gu, Jiaqi; Li, Wuxi; Dhar, Shounak; Ren, Haoxing; Khailany, Brucek; Pan, David Z. (April 2021). "DREAMPlace: Deep Learning Toolkit-Enabled GPU Acceleration for Modern VLSI Placement". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 40 (4): 748–761. doi:10.1109/TCAD.2020.3003843. ISSN 1937-4151. S2CID 225744481.

[6] Lyakhov, Pavel; Valueva, Maria; Valuev, Georgii; Nagornov, Nikolai (2020-12-18). "A Method of Increasing Digital Filter Performance Based on Truncated Multiply-Accumulate Units". Applied Sciences. 10 (24): 9052. doi:10.3390/app10249052. ISSN 2076-3417. Hardware simulation on FPGA increased the digital filter performance.

[7] Mohan, Prashanth; Wang, Wen; Jungk, Bernhard; Niederhagen, Ruben; Szefer, Jakub; Mai, Ken (October 2020). "ASIC Accelerator in 28 nm for the Post-Quantum Digital Signature Scheme XMSS". 2020 IEEE 38th International Conference on Computer Design (ICCD). Hartford, CT, USA: IEEE. pp. 656–662. doi:10.1109/ICCD50377.2020.00112. ISBN 978-1-7281-9710-4. S2CID 229330964.

[8] Morgan, Timothy Pricket (2014-09-03). "How Microsoft Is Using FPGAs To Speed Up Bing Search". Enterprise Tech. Retrieved 2018-09-18.

[9] "Project Catapult". Microsoft Research.

[10] "MicroBlaze Soft Processor: Frequently Asked Questions". Archived 2011-10-27 at the Wayback Machine.

[11] Vassányi, István (1998). "Implementing processor arrays on FPGAs". Field-Programmable Logic and Applications from FPGAs to Computing Paradigm. Lecture Notes in Computer Science. Vol. 1482. pp. 446–450. doi:10.1007/BFb0055278. ISBN 978-3-540-64948-9.

[12] Zhoukun WANG and Omar HAMMAMI. "A 24 Processors System on Chip FPGA Design with Network on Chip".

[13] John Kent. "Micro16 Array - A Simple CPU Array".

[14] Kit Eaton. "1,000 Core CPU Achieved: Your Future Desktop Will Be a Supercomputer". 2011.

[15] "Scientists Squeeze Over 1,000 Cores onto One Chip". 2011. Archived 2012-03-05 at the Wayback Machine.

[16] Kienle, Frank; Wehn, Norbert; Meyr, Heinrich (December 2011). "On Complexity, Energy- and Implementation-Efficiency of Channel Decoders". IEEE Transactions on Communications. 59 (12): 3301–3310. arXiv:1003.3792. doi:10.1109/tcomm.2011.092011.100157. ISSN 0090-6778. S2CID 13863870.

[17] "Regular Expressions in hardware". Retrieved 17 July 2014.

[18] "Compression Accelerators - Microsoft Research". Microsoft Research. Retrieved 2017-10-07.

[19] Farabet, Clément, et al. "Hardware accelerated convolutional neural networks for synthetic vision systems." Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on. IEEE, 2010.

Hardware acceleration units by application

Application	Hardware accelerator	Acronym
Computer graphics; General-purpose computing; GP computing, on Nvidia graphics cards; Ray tracing; Video codec	Graphics processing unit; General-purpose computing on GPU; CUDA architecture; Ray-tracing hardware; Various video acceleration hardware	GPU; GPGPU; CUDA; RTX; N/A
Digital signal processing	Digital signal processor	DSP
Analog signal processing	Field-programmable analog array; Field-programmable RF	FPAA; FPRF
Sound processing	Sound card and sound card mixer	N/A
Computer networking; on a chip; TCP; Input/output	Network processor and network interface controller; Network on a chip; TCP offload engine; I/O Acceleration Technology	NPU and NIC; NoC; TCPOE or TOE; I/OAT or IOAT
Cryptography; Encryption; ISA; SSL/TLS; Attack; Random number generation	Cryptographic accelerator and secure cryptoprocessor; Hardware-based encryption; AES instruction set; SSL acceleration; Custom hardware attack; Hardware random number generator	N/A
Artificial intelligence; Machine vision/computer vision; Neural networks; Brain simulation	AI accelerator; Vision processing unit; Physical neural network; Neuromorphic engineering	N/A; VPU; PNN; N/A
Multilinear algebra	Tensor processing unit	TPU
Physics simulation	Physics processing unit	PPU
Regular expressions^[17]	Regular expression coprocessor	N/A
Data compression^[18]	Data compression accelerator	N/A
In-memory processing	Network on a chip and Systolic array	NoC; N/A
Data processing	Data processing unit	DPU
Any computing task	Computer hardware; Field-programmable gate arrays^[19]; Application-specific integrated circuits^[19]; Complex programmable logic devices; Systems-on-Chip; Multi-processor system-on-chip; Programmable system-on-chip	HW (sometimes); FPGA; ASIC; CPLD; SoC; MPSoC; PSoC
