185 research outputs found

    Z-TCAM: An SRAM-based Architecture for TCAM

    Get PDF

    Axp: A hw-sw co-design pipeline for energy-efficient approximated convnets via associative matching

    Get PDF
    The reduction in energy consumption is key for deep neural networks (DNNs) to ensure usability and reliability, whether they are deployed on low-power end-nodes with limited resources or on high-performance platforms that serve large pools of users. Leveraging the over-parametrization shown by many DNN models, convolutional neural networks (ConvNets) in particular, energy efficiency can be improved substantially while preserving the model accuracy. The solution proposed in this work exploits the intrinsic redundancy of ConvNets to maximize the reuse of partial arithmetic results during the inference stage. Specifically, the weight-set of a given ConvNet is discretized through a clustering procedure such that the largest possible number of inner multiplications falls into predefined bins; this allows an off-line computation of the most frequent results, which in turn can be stored locally and retrieved when needed during the forward pass. Such a reuse mechanism leads to remarkable energy savings with the aid of a custom processing element (PE) that integrates an associative memory with a standard floating-point unit (FPU). Moreover, the adoption of an approximate associative rule based on a partial bit-match increases the hit rate over the pre-computed results, maximizing the energy reduction even further. Results collected on a set of ConvNets trained for computer vision and speech processing tasks reveal that the proposed associative-based hw-sw co-design achieves up to 77% energy savings with less than 1% accuracy loss.
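    As a rough illustration of the reuse mechanism described in this abstract, the Python sketch below clusters a weight set into a few centroids, pre-computes products for representative activation values, and serves multiplications from that table using a partial bit-match key, falling back to a normal multiply on a miss. All names (cluster_weights, ProductCache, keep_bits) are hypothetical; this is a minimal sketch of the general idea, not the Axp implementation.

        import numpy as np

        def cluster_weights(weights, n_bins=16):
            """Discretize the weight set: snap every weight to the nearest of
            n_bins centroids (a simple percentile-based stand-in for the
            clustering procedure described in the abstract)."""
            centroids = np.quantile(weights, np.linspace(0.0, 1.0, n_bins))
            nearest = np.abs(weights[:, None] - centroids[None, :]).argmin(axis=1)
            return centroids, centroids[nearest]

        class ProductCache:
            """Associative table of pre-computed products. Keys use a partial
            bit-match: an activation is reduced to its top keep_bits, so nearby
            values hit the same pre-computed entry (the approximate rule)."""
            def __init__(self, centroids, sample_activations, keep_bits=8):
                self.keep_bits = keep_bits
                self.table = {}
                for a in sample_activations:          # computed off-line
                    for c in centroids:
                        self.table[(self._key(a), float(c))] = float(a) * float(c)

            def _key(self, x):
                # Drop the low-order bits of the float32 encoding of x.
                bits = int(np.float32(x).view(np.uint32))
                return bits >> (32 - self.keep_bits)

            def multiply(self, activation, centroid):
                # Hit: reuse the stored product. Miss: fall back to the FPU.
                return self.table.get((self._key(activation), float(centroid)),
                                      activation * centroid)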

    Cuckoo Directory: A Scalable Directory for Many-Core Systems

    Get PDF
    Growing core counts have highlighted the need for scalable on-chip coherence mechanisms. The increase in the number of on-chip cores exposes the energy and area costs of scaling the directories. Duplicate-tag based directories require highly associative structures that grow with core count, precluding scalability due to prohibitive power consumption. Sparse directories overcome the power barrier by reducing directory associativity, but require storage area over-provisioning to avoid high invalidation rates. We propose the Cuckoo directory, a power- and area-efficient scalable distributed directory. The Cuckoo directory scales to high core counts without the energy costs of wide associative lookup and without gross capacity over-provisioning. Simulation of a 16-core CMP with commercial server and scientific workloads shows that the Cuckoo directory eliminates invalidations while being up to four times more power-efficient than the Duplicate-tag directory, and 24% more power-efficient and up to seven times more area-efficient than the Sparse directory organization. Analytical projections indicate that the Cuckoo directory retains its energy and area benefits with increasing core count, efficiently scaling to at least 1024 cores.
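    The data structure behind this design is cuckoo hashing: each entry has a small number of candidate locations given by independent hash functions, and a conflicting insertion displaces an existing entry to one of its alternate locations instead of over-provisioning capacity. The Python sketch below illustrates only that insertion/displacement idea, with hypothetical names (CuckooTable, max_displacements); it does not model the actual hardware directory organization or its coherence metadata.

        import random

        class CuckooTable:
            """Toy cuckoo hash table: `ways` independent hash functions, one
            bucket array per function, and bounded displacement on conflicts."""
            def __init__(self, ways=2, sets=1024, max_displacements=32):
                self.ways = ways
                self.sets = sets
                self.max_displacements = max_displacements
                self.buckets = [dict() for _ in range(ways)]

            def _index(self, way, key):
                # Stand-in for the per-way hash function of a real design.
                return hash((way, key)) % self.sets

            def insert(self, key, value):
                for _ in range(self.max_displacements):
                    for way in range(self.ways):
                        slot = self._index(way, key)
                        if slot not in self.buckets[way]:
                            self.buckets[way][slot] = (key, value)
                            return True
                    # Every candidate slot is full: evict a victim and try to
                    # re-insert it at one of its alternate locations (the
                    # "cuckoo" move that avoids gross over-provisioning).
                    way = random.randrange(self.ways)
                    slot = self._index(way, key)
                    victim = self.buckets[way][slot]
                    self.buckets[way][slot] = (key, value)
                    key, value = victim
                return False  # displacement chain too long; a real design would rehash

            def lookup(self, key):
                for way in range(self.ways):
                    entry = self.buckets[way].get(self._index(way, key))
                    if entry is not None and entry[0] == key:
                        return entry[1]
                return None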

    NASA JSC neural network survey results

    Get PDF
    A survey of Artificial Neural Systems in support of NASA Johnson Space Center's Automatic Perception for Mission Planning and Flight Control Research Program was conducted. Several of the world's leading researchers contributed papers containing their most recent results on artificial neural systems. These papers were grouped into categories, and descriptive accounts of the results make up a large part of this report. Also included is material on sources of information on artificial neural systems, such as books, technical reports, software tools, etc.

    Area-efficient near-associative memories on FPGAs

    Full text link

    Feature Study on a Programmable Network Traffic Classifier

    Get PDF

    Fast Packet Processing on High Performance Architectures

    Get PDF
    The rapid growth of the Internet and the fast emergence of new network applications have brought great challenges and complex issues to deploying high-speed, QoS-guaranteed IP networks. For this reason, packet classification and network intrusion detection have assumed a key role in modern communication networks in order to provide QoS and security. In this thesis we describe a number of the most advanced solutions to these tasks. We introduce NetFPGA and Network Processors as reference platforms both for the design and the implementation of the solutions and algorithms described in this thesis. The rise in link capacity reduces the time available to network devices for packet processing. For this reason, we show different solutions which, either by heuristics and randomization or by smart construction of state machines, allow IP lookup, packet classification and deep packet inspection to be performed fast in real devices based on high-speed platforms such as NetFPGA or Network Processors.
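    One of the three tasks named in this abstract, IP lookup, amounts to longest-prefix matching of a destination address against a routing table. The Python sketch below (a plain binary trie, with hypothetical names such as PrefixTrie) shows the problem being solved; the thesis itself targets hardware-friendly variants on NetFPGA and Network Processors, which this sketch does not attempt to model.

        import ipaddress

        class PrefixTrie:
            """Binary trie keyed on IPv4 address bits; a node carries a 'hop'
            entry when it marks the end of a stored prefix."""
            def __init__(self):
                self.root = {}

            def add_route(self, prefix, next_hop):
                net = ipaddress.ip_network(prefix)
                bits = format(int(net.network_address), '032b')[:net.prefixlen]
                node = self.root
                for b in bits:
                    node = node.setdefault(b, {})
                node['hop'] = next_hop

            def lookup(self, address):
                bits = format(int(ipaddress.ip_address(address)), '032b')
                node, best = self.root, None
                for b in bits:
                    if 'hop' in node:
                        best = node['hop']   # longest match seen so far
                    node = node.get(b)
                    if node is None:
                        return best
                return node.get('hop', best)

        # More specific prefixes win over shorter ones.
        trie = PrefixTrie()
        trie.add_route('10.0.0.0/8', 'hop-A')
        trie.add_route('10.1.0.0/16', 'hop-B')
        assert trie.lookup('10.1.2.3') == 'hop-B'
        assert trie.lookup('10.2.3.4') == 'hop-A'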