SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network

Yazdani, Reza; Ruwase, Olatunji; Zhang, Minjia; He, Yuxiong; Arnau, Jose-Maria; Gonzalez, Antonio

arXiv:1911.01258 (cs)

[Submitted on 4 Nov 2019 (v1), last revised 21 May 2023 (this version, v3)]

Abstract: The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures specifically tailored to the computation pattern of RNNs, achieving high computation efficiency for certain chosen model sizes. However, given that the dimensionality of RNNs varies widely across tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today's RNN accelerators. In particular, we first show the problem of low resource utilization and low adaptiveness for the state-of-the-art RNN implementations on GPU, FPGA, and ASIC architectures. To solve these issues, we propose an intelligent tile-based dispatching mechanism that increases the adaptiveness of RNN computation in order to efficiently handle the data dependencies. To do so, we propose Sharp, a hardware accelerator that pipelines RNN computation using an effective scheduling scheme to hide most of the dependent serialization. Furthermore, Sharp employs a dynamically reconfigurable architecture to adapt to the model's characteristics. Sharp achieves 2x, 2.8x, and 82x speedups on average, considering different RNN models and resource budgets, compared to the state-of-the-art ASIC, FPGA, and GPU implementations, respectively. Furthermore, we provide significant energy reduction with respect to the previous solutions, due to the low power dissipation of Sharp (321 GFLOPS/Watt).

Subjects: Machine Learning (cs.LG); Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE); Performance (cs.PF)
Cite as: arXiv:1911.01258 [cs.LG] (or arXiv:1911.01258v3 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1911.01258

Submission history

From: Reza Yazdani Aminabadi
[v1] Mon, 4 Nov 2019 14:51:27 UTC (1,224 KB)
[v2] Sun, 20 Mar 2022 18:03:32 UTC (3,995 KB)
[v3] Sun, 21 May 2023 04:12:46 UTC (2,051 KB)
