Neural-network-based artificial intelligence enables many aspects of modern life that we have come to take for granted. In recent years, neural networks have grown increasingly complex in order to tackle increasingly complex tasks. Ensuring low latency and high energy efficiency for such networks is challenging in modern systems, in part because these workloads are often fundamentally memory-intensive and exhibit poor data locality. In modern computing systems, a memory access may require up to 5 orders of magnitude more energy than computation [HOR14]. This problem will only become worse as the volume of digital data grows: the International Data Corporation (IDC) estimates that the data generated and consumed will reach 175 ZB per year by 2025 [IDC20], and a significant portion of this data will represent information used by artificial intelligence systems. New solutions are required to meet users’ expectations of low-latency and energy-efficient neural networks.
A promising way of reducing the computational burden of complex neural networks is to represent the inputs and weights of the network with lower precision, which can often be done with minimal accuracy loss [YAZ18, MAR17]. In this domain, binary neural networks (BNNs) are of particular interest: they enable computational speedups of up to 58x, with up to 32x memory savings, at the expense of some accuracy [RAS16]. Because BNNs rely on very simple operations, they are a prime target for processing-in-memory (PiM).
In this exploratory project, we propose to evaluate the feasibility of developing DRAM-based PiM solutions for binary neural network acceleration, targeted at mobile and edge devices. Performing computation close to where data resides greatly reduces overall data movement, which has the potential to improve latency, throughput, and energy efficiency.
Recent prior work describes how to execute bitwise logic operations entirely in DRAM [SES17]. We propose to also implement lookup-table-based operations [A1] in-memory. Combined with bitwise logic, these may be used to implement BNNs, which rely on XNOR and bitcount operations. We will also evaluate various design trade-offs, which may enable extending our proposed approach to support general-purpose in-memory operations, thus increasing the impact of the proposal.
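To make the XNOR and bitcount formulation concrete, the following is a minimal software sketch of a binarized dot product. It is an illustration only and not the in-DRAM design proposed here: it assumes the common BNN convention of encoding +1 as bit 1 and -1 as bit 0, and all names (`pack_bits`, `POPCOUNT_LUT`, `popcount`, `binary_dot`) are hypothetical. The bitcount is deliberately done through a small lookup table to hint at how a table-based operation could complement bitwise logic.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (assumed encoding: +1 -> 1, -1 -> 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


# Hypothetical 8-bit lookup table: POPCOUNT_LUT[b] is the number of set bits in byte b.
POPCOUNT_LUT = [bin(b).count("1") for b in range(256)]


def popcount(word):
    """Count set bits of a non-negative int, one byte at a time via the lookup table."""
    count = 0
    while word:
        count += POPCOUNT_LUT[word & 0xFF]
        word >>= 8
    return count


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n using XNOR + bitcount."""
    mask = (1 << n) - 1                    # keep only the n valid bit positions
    xnor = ~(a_bits ^ b_bits) & mask       # bit is 1 where the two signs agree
    matches = popcount(xnor)
    return 2 * matches - n                 # agreements minus disagreements


def reference_dot(a, b):
    """Plain integer dot product, used only to check the bitwise version."""
    return sum(x * y for x, y in zip(a, b))
```

Each agreeing position contributes +1 and each disagreeing position contributes -1, so the result is `2 * matches - n`. Packing one value per bit instead of one per 32-bit word is also where the memory savings of up to 32x cited above come from.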
|Start Date: 01-12-2021|
|End Date: 31-05-2023|
|Team: Gabriel Falcao Paiva Fernandes|
|Groups: Multimedia Signal Processing – Co|
|Partners: INESC-ID -- IST/UL, SAFARI GROUP -- ETHZ|
|Local Coordinator: Gabriel Falcao Paiva Fernandes|