Deep Learning Hardware Accelerator Design in FPGA
Essay by Muhammad Ammad • August 23, 2018
FPGA-based neural network accelerators are a currently trending topic. The hardware is designed specifically for the FPGA so that it can surpass the GPU in energy and speed efficiency, making the FPGA a promising next solution. Various optimization techniques for both the software and the hardware of FPGA-based accelerators have been proposed. FPGAs make an excellent platform for energy-efficient neural network processing: high parallelism can be implemented with neural-network-oriented hardware design, and redundant logic can be removed by exploiting the computational properties of neural networks. [1]
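The parallelism mentioned above can be pictured with a small sketch. This is only an illustrative model (the `parallel_dot` function and the lane count are assumptions, not from the cited surveys): an FPGA design can instantiate several multiply-accumulate (MAC) lanes that all fire in the same clock cycle, so a dot product finishes in far fewer cycles than on a sequential processor.

```python
# Sketch of the parallelism an FPGA design exploits: with P MAC lanes,
# each "cycle" consumes P weight/activation pairs at once, instead of
# one pair per cycle as on a purely sequential processor.
def parallel_dot(weights, activations, lanes=4):
    """Model `lanes` parallel MAC units; returns (result, cycles used)."""
    acc = 0
    cycles = 0
    for i in range(0, len(weights), lanes):
        # all `lanes` multiplies here would occur in one hardware cycle
        acc += sum(w * a for w, a in zip(weights[i:i + lanes],
                                         activations[i:i + lanes]))
        cycles += 1
    return acc, cycles

# An 8-element dot product with 4 lanes finishes in 2 modeled cycles:
print(parallel_dot([1] * 8, [2] * 8, lanes=4))  # (16, 2)
```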
Challenges to FPGA based accelerator design
There are two major challenges in FPGA-based accelerator design: performance and flexibility.
- The working frequency of current FPGAs is usually 100 to 300 MHz, which is much lower than that of GPUs and CPUs. The logic overhead that FPGAs incur for reconfigurability also decreases system performance. A straightforward FPGA design can therefore hardly achieve both high performance and high energy efficiency.
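The frequency gap can be made concrete with a back-of-envelope peak-throughput estimate. The device parameters below (200 MHz FPGA, 1.5 GHz GPU, MAC counts) are hypothetical illustration values, not figures from the cited surveys: a low-frequency FPGA must instantiate many more parallel MAC units to match the raw peak of a high-frequency GPU.

```python
# Back-of-envelope peak throughput: clock rate x number of MAC units
# x 2 operations per MAC (one multiply plus one add), reported in GOPS.
def peak_gops(freq_mhz, num_macs):
    """Peak throughput in GOPS for `num_macs` MACs at `freq_mhz` MHz."""
    return freq_mhz * 1e6 * num_macs * 2 / 1e9

# Hypothetical devices (illustrative numbers only):
print(peak_gops(200, 1024))   # FPGA at 200 MHz, 1024 MACs -> 409.6 GOPS
print(peak_gops(1500, 2560))  # GPU at 1.5 GHz, 2560 cores -> 7680.0 GOPS
```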
- Implementing neural networks on FPGAs is hard, whereas on GPUs or CPUs it is much simpler. For CPUs and GPUs there are development frameworks such as TensorFlow and Caffe; for FPGAs, such frameworks are absent. [1]
FPGA-based accelerator
For algorithm acceleration, the FPGA offers a promising solution. On other platforms such as CPUs, DSPs, or GPUs, the hardware and software must be designed separately. With an FPGA, however, developers only need to focus on the target algorithm and implement the required logic for it. By removing the redundancy present in general-purpose hardware platforms, FPGAs can be highly efficient. [2]
FPGA chips contain some large on-chip storage units, such as SRAMs and registers. Models on other platforms (CPU, GPU) use 100 to 1000 MB of parameters, whereas even the largest FPGA chip cannot hold 50 MB on chip. To bridge this gap, external memory such as DDR SDRAM is required, and system performance is then limited by the bandwidth and power consumption of the DDR interface. [1]
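The bandwidth limit described above can be illustrated with a simple roofline-style estimate. The numbers here (400 GOPS peak, 12.8 GB/s DDR bandwidth, the arithmetic intensities) are assumed for illustration and do not come from the cited surveys: attainable throughput is the lesser of the compute peak and what the external memory can feed.

```python
# Roofline-style sketch: attainable throughput is capped either by the
# compute peak or by DDR bandwidth times arithmetic intensity
# (operations performed per byte fetched from external memory).
def attainable_gops(peak_gops, bandwidth_gbs, ops_per_byte):
    """Min of the compute roof and the memory roof, in GOPS."""
    return min(peak_gops, bandwidth_gbs * ops_per_byte)

# Hypothetical accelerator: 400 GOPS peak, 12.8 GB/s DDR bandwidth.
print(attainable_gops(400, 12.8, 4))    # 51.2 -> memory-bound by DDR
print(attainable_gops(400, 12.8, 100))  # 400  -> compute-bound
```

A layer that performs only a few operations per byte fetched stays far below the compute peak, which is why the essay notes that DDR bandwidth limits system performance.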
Researchers have proposed various optimization methods to cope with the challenges of FPGA-based hardware acceleration. With a co-design of software and hardware, FPGAs are expected to improve energy efficiency up to 13 times over the GPU while consuming just 30% of the power. The FPGA is thus a very good candidate for neural network acceleration. [2]
References
[1] K. Guo, S. Zeng, J. Yu, Y. Wang and H. Yang, "A Survey of FPGA-Based Neural Network Accelerator," ACM Transactions on Reconfigurable Technology and Systems, vol. 9, no. 4, pp. 11:1-11:22, 2017.
[2] K. Abdelouahab, M. Pelcat, J. Sérot and F. Berry, "Accelerating CNN inference on FPGAs: A Survey," HAL, France, 2018.