Stanford EE Computer Systems Colloquium

4:30 PM, Wednesday, January 6, 2016
NEC Auditorium, Gates Computer Science Building Room B3
Stanford University
http://ee380.stanford.edu

Deep Compression and EIE: Deep Neural Network Model Compression and Hardware Acceleration

Song Han
Stanford University
About the talk:

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we first introduce "deep compression," which reduces the storage requirement of neural networks without affecting their accuracy. On the ImageNet dataset, our method reduced the storage required by AlexNet by 35x, from 240MB to 6.9MB, and that required by VGG-16 by 49x, from 552MB to 11.3MB, both with no loss of accuracy. This compression makes it practical to ship complex neural networks in mobile applications, where application size and download bandwidth are constrained, and allows the model to fit in on-chip SRAM rather than off-chip DRAM.
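The two core stages behind this kind of compression are magnitude pruning and weight sharing (codebook quantization). The sketch below illustrates both on a toy layer; the threshold, cluster count, layer size, and the per-weight bit accounting are illustrative assumptions, not the settings reported in the talk.

```python
import numpy as np

# Illustrative sketch of pruning + weight sharing on a small random layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)

# Stage 1: pruning -- zero out weights with small magnitude.
# (The 0.5 threshold is an arbitrary choice for this toy example.)
threshold = 0.5
mask = np.abs(W) > threshold
nonzero = W[mask]

# Stage 2: weight sharing -- cluster surviving weights into a small
# codebook; each weight is then stored as a short index (4 bits for
# 16 clusters) plus one shared fp32 value per cluster.
n_clusters = 16
centroids = np.linspace(nonzero.min(), nonzero.max(), n_clusters)
for _ in range(10):  # simple k-means refinement
    assign = np.abs(nonzero[:, None] - centroids[None, :]).argmin(axis=1)
    for k in range(n_clusters):
        if np.any(assign == k):
            centroids[k] = nonzero[assign == k].mean()

# Rough storage estimate: 4-bit codebook index plus a ~4-bit relative
# position per surviving weight, plus the fp32 codebook itself.
dense_bits = W.size * 32
sparse_bits = mask.sum() * (4 + 4)
codebook_bits = n_clusters * 32
ratio = dense_bits / (sparse_bits + codebook_bits)
print(f"kept {mask.mean():.0%} of weights, ~{ratio:.1f}x smaller")
```

Pruning and quantization compound: sparsity shrinks the number of stored weights, and the codebook shrinks the bits per stored weight.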

Next we propose an energy-efficient inference engine (EIE) that performs inference directly on this compressed model, accelerating the resulting sparse matrix-vector multiplication with weight sharing. Evaluated on nine DNN benchmarks, EIE is 189x and 13x faster than CPU and GPU implementations of the same DNN without compression. With a processing power of 102 GOPS at only 600mW, EIE is also 24,000x and 3,000x more energy efficient than a CPU and a GPU, respectively.
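The computation EIE accelerates can be sketched as a sparse matrix-vector product in which each stored weight is a small index into a shared codebook rather than a full fp32 value, and zero activations are skipped dynamically. The data layout and sizes below are a simplified illustration, not EIE's exact on-chip encoding.

```python
import numpy as np

# Shared codebook of weight values (weight sharing): stored weights are
# 2-bit indices into this table in the toy example below.
codebook = np.array([-0.75, -0.25, 0.25, 0.75], dtype=np.float32)

# Compressed column-major weights: for each column j of W, the row
# indices of its nonzeros and their codebook indices.
rows = [np.array([0, 2]), np.array([1]), np.array([0, 1, 2])]
codes = [np.array([3, 0]), np.array([2]), np.array([1, 3, 2])]

def sparse_matvec(x, rows, codes, codebook, n_out):
    """y = W @ x, skipping both zero weights and zero activations."""
    y = np.zeros(n_out, dtype=np.float32)
    for j, xj in enumerate(x):
        if xj == 0.0:  # dynamic activation sparsity: skip this column
            continue
        y[rows[j]] += codebook[codes[j]] * xj
    return y

x = np.array([1.0, 0.0, 2.0], dtype=np.float32)
y = sparse_matvec(x, rows, codes, codebook, n_out=3)
print(y)  # columns with x == 0 contribute nothing
```

Because work is done only for nonzero weights meeting nonzero activations, both the static sparsity from pruning and the dynamic sparsity from ReLU activations translate directly into fewer operations.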

Slides:

Download the slides for this presentation in PDF format.

About the speaker:

Song Han is a fourth-year PhD student working with Prof. Bill Dally at Stanford University. His research interests are computer architecture and high-performance computing for deep learning; his current work improves the energy efficiency of neural networks for mobile and embedded systems. He has worked on model compression and on a hardware accelerator for compressed models that fits state-of-the-art DNNs fully on-chip, work that has been covered by TheNextPlatform. Before joining Stanford, Song Han graduated from the Institute of Microelectronics at Tsinghua University in 2012.

Contact information:

Song Han, Stanford University