Bin-NET: PiM-Enabled Binary Neural Network Inferencing at the Edge

Project: Bin-NET: PiM-Enabled Binary Neural Network Inferencing at the Edge

Acronym: Bin-NET

Main Objective:
Neural-network-based artificial intelligence enables many aspects of modern life that we have come to take for granted. In recent years, neural networks have grown increasingly complex in order to tackle increasingly complex tasks. Ensuring low latency and high energy efficiency for such networks is challenging in modern systems, in part because these workloads are often fundamentally memory-intensive and exhibit poor data locality. In modern computing systems, a memory access may require up to 5 orders of magnitude more energy than a computation [HOR14]. As the volume of digital data increases, this problem will only become worse: the International Data Corporation (IDC) estimates that the volume of digital data generated and consumed will reach 175 ZB per year by 2025 [IDC20], and a significant portion of this data will represent information used by artificial intelligence systems. New solutions are required to meet users’ expectations of low-latency and energy-efficient neural networks.
A promising way of reducing the computational burden of complex neural networks is to use lower precision to represent the inputs and weights of the network. This can often be done with minimal accuracy loss [YAZ18, MAR17]. In this domain, binary neural networks (BNNs) are of particular interest: they enable computational speedups of up to 58x and memory savings of up to 32x, at the expense of some accuracy [RAS16]. Because BNNs rely on very simple operations, they are a prime target for processing-in-memory (PiM).
In this exploratory project, we propose to evaluate the feasibility of developing DRAM-based PiM solutions for binary neural network acceleration, targeted at mobile and edge devices. Performing computation close to where data resides greatly reduces overall data movement, which has the potential to improve latency, throughput, and energy efficiency.
Recent prior work describes how to execute bitwise logic operations entirely in DRAM [SES17]. We propose to also implement lookup-table-based operations [A1] in memory. These may be used in conjunction with bitwise logic to implement BNNs, which rely on XNOR and bitcount operations. We will also evaluate various design trade-offs, which may enable the extension of our proposed approach to support the execution of general-purpose operations in memory, thus increasing the impact of the proposal.
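To illustrate why XNOR and bitcount are sufficient for a binary dot product, the following is a minimal software sketch in Python. It is not the project's in-DRAM implementation: the function names (`pack_bits`, `popcount_lut`, `binary_dot`), the +1 -> 1 / -1 -> 0 bit encoding, and the 8-bit lookup table are all assumptions chosen for illustration. The bitcount is deliberately done with a small lookup table to mirror, loosely, the idea of combining bitwise logic with table lookups.

```python
# Illustrative sketch only: a binary dot product via XNOR + bitcount.
# All names and the bit encoding below are assumptions, not the project's design.

def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return word

# Hypothetical 8-bit lookup table: POPCOUNT_LUT[b] is the number of set bits in byte b.
POPCOUNT_LUT = [bin(b).count("1") for b in range(256)]

def popcount_lut(word):
    """Count set bits of a non-negative int, one byte at a time, via the table."""
    count = 0
    while word:
        count += POPCOUNT_LUT[word & 0xFF]
        word >>= 8
    return count

def binary_dot(a_bits, w_bits, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask   # bit is 1 where the two signs agree
    matches = popcount_lut(xnor)       # number of agreeing positions
    return 2 * matches - n             # agreements minus disagreements

if __name__ == "__main__":
    a = [1, -1, 1, 1, -1, -1, 1, -1]
    w = [1, 1, -1, 1, -1, 1, 1, -1]
    fast = binary_dot(pack_bits(a), pack_bits(w), len(a))
    slow = sum(x * y for x, y in zip(a, w))  # ordinary multiply-accumulate
    print(fast, slow)
```

Each agreeing position contributes +1 and each disagreeing position contributes -1, so the result equals twice the number of matches minus the vector length; the multiply-accumulate loop is replaced entirely by one XNOR and one bitcount.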

Reference: EXPL/EEI-HAC/1511/2021

Funding: FCT

Start Date: 01-12-2021

End Date: 31-05-2023

Team: Gabriel Falcao Paiva Fernandes, Óscar Almeida Ferraz, Ana Beatriz Simões Fernandes

Groups: Multimedia Signal Processing – Co

Partners: INESC-ID -- IST/UL, SAFARI GROUP -- ETHZ

Local Coordinator: Gabriel Falcao Paiva Fernandes

United Nations Sustainable Development Goal - Industry, Innovation and Infrastructure

Associated Publications

9 Papers in Journals
J. Vieira, N. Roma, G. Falcão, P. Tomás, gem5-accel: A Pre-RTL Simulation Toolchain for Accelerator Architecture Validation, IEEE Computer Architecture Letters, Vol. 23, No. 1, pp. 1 - 4, January, 2024
J. Vieira, N. Roma, G. Falcão, P. Tomás, NDPmulator: Enabling Full-System Simulation for Near-Data Accelerators From Caches to DRAM, IEEE Access, Vol. 12, pp. 10349 - 10365, 2024
P. Carrinho, G. Falcão, Highly accurate and fast YOLOv4-based polyp detection, Expert Systems with Applications, Vol. 232, No. 1, pp. 120834 - 120854, December, 2023
G. Falcão, J. Ferreira, To PiM or Not to PiM, Communications of the ACM, Vol. 66, No. 6, pp. 48 - 55, June, 2023
L. Esteves, D. Portugal, P. Peixoto, G. Falcão, Towards Mobile Federated Learning with Unreliable Participants and Selective Aggregation, Applied Sciences, Vol. 13, No. 5, pp. 3135 - 3154, February, 2023
G. Falcão, J. Ferreira, To PiM or Not to PiM: The case for in-memory inferencing of quantized CNNs at the edge, Queue, Vol. 20, No. 6, pp. 9 - 34, December, 2022
G. Falcão, J. R. C. Cavallaro, Special Issue on Artificial Intelligence at the Edge, IEEE Micro, Vol. 42, No. 6, pp. 6 - 8, November, 2022
N. Neves, J. Domingos, N. Roma, P. Tomás, G. Falcão, Compiling for Vector Extensions With Stream-Based Specialization, IEEE Micro, Vol. 42, No. 5, pp. 49 - 58, September, 2022
G. Falcão, J. R. C. Cavallaro, L. Sousa, Guest Editorial: Special Issue on Advances in Signal Processing Systems, Journal of Signal Processing Systems, Vol. 94, No. 10, pp. 913 - 915, August, 2022

4 Papers in Conferences
A. Fernandes, N. Neves, L. Crespo, N. Roma, P. Tomás, G. Falcão, A functional validation framework for the Unlimited Vector Extension, IEEE/ACM International Symposium on Microarchitecture MICRO, Toronto, Canada, October, 2023
A. Durao, J. Arrais, B. Ribeiro Ribeiro, G. Falcão, On the Quantization of Recurrent Neural Networks for SMILES Generation, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June, 2023
J. Ferreira, G. Falcão, A. Fernandes, pLUTo: Enabling Massively Parallel Computation In DRAM via Lookup Tables, IEEE/ACM International Symposium on Microarchitecture MICRO, Chicago, United States, October, 2022
P. Póvoa, G. Falcão, Large Field CdTe Monitor for Astrophysics and TGF Science on board the Space Rider, IEEE Nuclear Science Symposium and Medical Imaging Conference IEEE NSS MIC, Milan, Italy, October, 2022
