Designing an Energy-Efficient Fully-Asynchronous Deep Learning Convolution Engine

Mattia Vezzoli, Lukas Nel, Kshitij Bhardwaj, Rajit Manohar, Maya Gokhale

In the face of exponential growth in semiconductor energy usage, there is a significant push towards highly energy-efficient microelectronics design. Traditional circuit designs typically employ clocks to synchronize computing operations, but these circuits incur significant performance and energy overheads due to their data-independent worst-case operation and complex clock tree networks. In this paper, we explore asynchronous, or clockless, techniques in which clocks are replaced by request/acknowledge handshaking signals. To quantify the potential energy and performance gains of asynchronous logic, we design a highly energy-efficient asynchronous deep learning (DL) convolution engine, a component that accounts for 87% of total DL accelerator energy. Our asynchronous design shows 5.06x lower energy and 5.09x lower delay than its synchronous counterpart.
 
  
Yale