October 17, 2016
Fused-Layer CNN Accelerators
A new paper, "Fused-Layer CNN Accelerators" by Manoj Alwani et al., was presented at the 49th IEEE/ACM International Symposium on Microarchitecture on Oct. 17, 2016. The proposed technique fuses the processing of multiple CNN layers by changing the order in which input data are brought on chip, so that intermediate data can be cached on chip between the evaluation of adjacent layers. As a result, a fused-layer accelerator can significantly reduce total data transfer, e.g. by 95% for the first five convolutional layers of the VGGNet-E network, using 362 KB of on-chip storage.
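To make the idea concrete, here is a minimal Python sketch of the bookkeeping behind layer fusion. It is an illustration under stated assumptions, not the paper's implementation: the `Layer` type and the helper names `input_tile_size` and `feature_map_traffic` are hypothetical, padding and weight transfers are ignored, and the layer stack in the usage lines is only a toy example. The first helper walks backwards through a stack of layers to find how large an input tile is needed to produce a given output tile; the second compares feature-map traffic when every intermediate result is written out and read back against the case where only the first input and the last output leave the chip.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    """Hypothetical description of a convolution or pooling layer."""
    kernel: int  # square window size
    stride: int  # window stride


def input_tile_size(layers: List[Layer], out_tile: int) -> int:
    """Width of the input tile needed to compute an out_tile-wide output tile.

    Works backwards from the last layer to the first; padding is ignored.
    """
    size = out_tile
    for layer in reversed(layers):
        size = (size - 1) * layer.stride + layer.kernel
    return size


def feature_map_traffic(in_elems: int, out_elems: List[int]) -> Tuple[int, int]:
    """Return (layer_by_layer, fused) feature-map element transfers.

    Layer by layer: each intermediate map is written once and read once.
    Fused: only the first input is read and the last output is written.
    Weight traffic is deliberately left out of this sketch.
    """
    intermediates = out_elems[:-1]
    layer_by_layer = in_elems + 2 * sum(intermediates) + out_elems[-1]
    fused = in_elems + out_elems[-1]
    return layer_by_layer, fused


# Toy stack (an assumption for illustration): two 3x3 convolutions
# followed by a 2x2 pooling layer with stride 2.
stack = [Layer(kernel=3, stride=1), Layer(kernel=3, stride=1), Layer(kernel=2, stride=2)]
tile = input_tile_size(stack, out_tile=1)
unfused, fused = feature_map_traffic(in_elems=10, out_elems=[100, 100, 20])
print(tile, unfused, fused)
```

The sketch is not meant to reproduce the 95% figure, which depends on the actual layer shapes and on choices this model leaves out. It only shows why avoiding round trips for large intermediate feature maps can dominate the savings.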