Search CORE

911 research outputs found

A Non-blocking Interconnection Network-Shared Cache Organization for Multi-core Processors

Author: Allam Rebhi Mohammad AbuMwais
علام ربحي محمد ابومويس
Publication venue: جامعة القدس
Publication date: 28/09/2013
Field of study

A scalable multi-core architecture with heterogeneous memory structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs)

Author: Indiveri Giacomo
Moradi Saber
Qiao Ning
Stefanini Fabio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Neuromorphic computing systems comprise networks of neurons that use asynchronous events for both computation and communication. This type of representation offers several advantages in terms of bandwidth and power consumption in neuromorphic electronic systems. However, managing the traffic of asynchronous events in large scale systems is a daunting task, both in terms of circuit complexity and memory requirements. Here we present a novel routing methodology that employs both hierarchical and mesh routing strategies and combines heterogeneous memory structures for minimizing both memory requirements and latency, while maximizing programming flexibility to support a wide range of event-based neural network architectures, through parameter configuration. We validated the proposed scheme in a prototype multi-core neuromorphic processor chip that employs hybrid analog/digital circuits for emulating synapse and neuron dynamics together with asynchronous digital circuits for managing the address-event traffic. We present a theoretical analysis of the proposed connectivity scheme, describe the methods and circuits used to implement such scheme, and characterize the prototype chip. Finally, we demonstrate the use of the neuromorphic processor with a convolutional neural network for the real-time classification of visual symbols being flashed to a dynamic vision sensor (DVS) at high speed.Comment: 17 pages, 14 figure

arXiv.org e-Print Archive

ZORA

An Implementation of a Predictable Cache-coherent Multi-core System

Author: Tegegn Paulos
Publication venue: 'University of Waterloo'
Publication date: 13/05/2019
Field of study

Multi-core platforms have entered the realm of the embedded systems to meet the ever growing performance requirements of the real-time embedded applications. Real-time applications leverage the hardware parallelism from multi-cores while keeping the hardware cost minimum. However, when the real-time tasks are deployed on the multi-core platforms, they experience interference due to sharing of hardware resources such as shared bus, last level cache, and main memory. As a result, it complicates computing the worst-case execution time of the real-time tasks. In this thesis, I present a hardware prototype that implements a predictable cache-coherent real-time multi-core system. The designed hardware follows the design guidelines outlined in the predictable cache coherence protocol. The hardware uses a latency insensitive interfaces to integrate the multi-core components such as the processor, cache controller, and interconnecting bus. The prototyped multi-core hardware is synthesized and implemented in a low-cost and high-performing FPGA board. The hardware is validated and verified on a tethered system that enables the design to run multi-threaded pthread applications

University of Waterloo's Institutional Repository

Infrastructures and Algorithms for Testable and Dependable Systems-on-a-Chip

Author: Di Carlo S.
Publication venue: Politecnico di Torino
Publication date
Field of study

Every new node of semiconductor technologies provides further miniaturization and higher performances, increasing the number of advanced functions that electronic products can offer. Silicon area is now so cheap that industries can integrate in a single chip usually referred to as System-on-Chip (SoC), all the components and functions that historically were placed on a hardware board. Although adding such advanced functionality can benefit users, the manufacturing process is becoming finer and denser, making chips more susceptible to defects. Today’s very deep-submicron semiconductor technologies (0.13 micron and below) have reached susceptibility levels that put conventional semiconductor manufacturing at an impasse. Being able to rapidly develop, manufacture, test, diagnose and verify such complex new chips and products is crucial for the continued success of our economy at-large. This trend is expected to continue at least for the next ten years making possible the design and production of 100 million transistor chips. To speed up the research, the National Technology Roadmap for Semiconductors identified in 1997 a number of major hurdles to be overcome. Some of these hurdles are related to test and dependability. Test is one of the most critical tasks in the semiconductor production process where Integrated Circuits (ICs) are tested several times starting from the wafer probing to the end of production test. Test is not only necessary to assure fault free devices but it also plays a key role in analyzing defects in the manufacturing process. This last point has high relevance since increasing time-to-market pressure on semiconductor fabrication often forces foundries to start volume production on a given semiconductor technology node before reaching the defect densities, and hence yield levels, traditionally obtained at that stage. The feedback derived from test is the only way to analyze and isolate many of the defects in today’s processes and to increase process’s yield. With the increasing need of high quality electronic products, at each new physical assembly level, such as board and system assembly, test is used for debugging, diagnosing and repairing the sub-assemblies in their new environment. Similarly, the increasing reliability, availability and serviceability requirements, lead the users of high-end products performing periodic tests in the field throughout the full life cycle. To allow advancements in each one of the above scaling trends, fundamental changes are expected to emerge in different Integrated Circuits (ICs) realization disciplines such as IC design, packaging and silicon process. These changes have a direct impact on test methods, tools and equipment. Conventional test equipment and methodologies will be inadequate to assure high quality levels. On chip specialized block dedicated to test, usually referred to as Infrastructure IP (Intellectual Property), need to be developed and included in the new complex designs to assure that new chips will be adequately tested, diagnosed, measured, debugged and even sometimes repaired. In this thesis, some of the scaling trends in designing new complex SoCs will be analyzed one at a time, observing their implications on test and identifying the key hurdles/challenges to be addressed. The goal of the remaining of the thesis is the presentation of possible solutions. It is not sufficient to address just one of the challenges; all must be met at the same time to fulfill the market requirements

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Recommended from our members

Internet Infrastructures for Large Scale Emulation with Efficient HW/SW Co-design

Author: Gula Aiden K
Publication venue: ScholarWorks@UMass Amherst
Publication date: 20/10/2021
Field of study

Connected systems are becoming more ingrained in our daily lives with the advent of cloud computing, the Internet of Things (IoT), and artificial intelligence. As technology progresses, we expect the number of networked systems to rise along with their complexity. As these systems become abstruse, it becomes paramount to understand their interactions and nuances. In particular, Mobile Ad hoc Networks (MANET) and swarm communication systems exhibit added complexity due to a multitude of environmental and physical conditions. Testing these types of systems is challenging and incurs high engineering and deployment costs. In this work, we propose a scalable MANET emulation framework using virtualized internet infrastructures that generalizes an assortment of application spaces with diverse attributes. We then quantify the architecture using various evaluation techniques to determine both feasibility and scalability. Finally, we developed a hardware offload engine for virtualized network systems that builds upon recent work in the field

ScholarWorks@UMass Amherst

Programmable built-in self-testing of embedded RAM clusters in system-on-chip architectures

Author: Benso Alfredo
DI CARLO Stefano
DI NATALE Giorgio
Lobetti Bodoni M.
Prinetto Paolo Ernesto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Multiport memories are widely used as embedded cores in all communication system-on-chip devices. Due to their high complexity and very low accessibility, built-in self-test (BIST) is the most common solution implemented to test the different memories embedded in the system. This article presents a programmable BIST architecture based on a single microprogrammable BIST processor and a set of memory wrappers designed to simplify the test of a system containing a large number of distributed multiport memories of different sizes (number of bits, number of words), access protocols (asynchronous, synchronous), and timing

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PRODUCTIVELY SCALING HARDWARE DESIGNS OVER INCREASING RESOURCES USING A SYSTEMATIC DESIGN ANALYSIS APPROACH

Author: NC DOCKS at The University of North Carolina at Charlotte
Schmidt Andrew Gregory
Publication venue
Publication date: 01/01/2011
Field of study

As processor development shifts from strict single core frequency scaling to het- erogeneous resource scaling two important considerations require evaluation. First, how to design systems with an increasing amount of heterogeneous resources, and second, how to maintain a designer’s productivity as the number of possible con- figurations grows. Therefore, it is necessary to determine what useful information can be gathered from existing designs to help predict or identify a design’s potential scalability, as well as, identifying which routine tasks can be automated to improve a designer’s productivity. Moreover, once this information is collected, how can this information be conveyed to the designer such that it can be used to increase overall productivity when implementing the design over increasing amounts of resources? This research looks at various approaches to analyze designs and attempts to distribute an application efficiently across a heterogeneous cluster of computing re- sources through the use of a Systematic Design Analysis flow and an assortment of productivity tools. These tools provide the designer with projections on the amount of resources needed to scale an existing design to a specified performance, as well as, projecting the performance based on a specified amount of resources. This is accomplished through the combination of static HDL profiling, component synthesis resource utilization, and runtime performance monitoring. For evaluation, four case studies are presented to demonstrate the proposed flow’s scalability on a small scale cluster of FPGAs. The results are highly favorable, providing orders of magnitude speedup with minimal intervention from the designer

The University of North Carolina at Greensboro

Recommended from our members

A Verilog 8051 Soft Core for FPGA Applications

Author: Rangoonwala Sakina
Publication venue: 'University of North Texas Libraries'
Publication date: 01/08/2009
Field of study

The objective of this thesis was to develop an 8051 microcontroller soft core in the Verilog hardware description language (HDL). Each functional unit of the 8051 microcontroller was developed as a separate module, and tested for functionality using the open-source VHDL Dalton model as benchmark. These modules were then integrated to operate as concurrent processes in the 8051 soft core. The Verilog 8051 soft core was then synthesized in Quartus® II simulation and synthesis environment (Altera Corp., San Jose, CA, www.altera.com) and yielded the expected behavioral response to test programs written in 8051 assembler residing in the v8051 ROM. The design can operate at speeds up to 41 MHz and used only 16% of the FPGA fabric, thus allowing complex systems to be designed on a single chip. Further research and development can be performed on v8051 to enhance performance and functionality

UNT Digital Library

Silicon firewall prototype

Author: Cheng Jin
Publication venue: 'University of Saskatchewan Library'
Publication date
Field of study

The Internet is a technological advance that provides access to information, and the ability to publish information, in revolutionary ways. There is also a major danger that provides the ability to corrupt and destroy information as well. When a computer is connected to the Internet, three things are put at risk: the data storage, the computing resources and the user’s reputation. In order to balance the advantages and risks, the contact between a computer and the Internet or the contact between different networks should be controlled carefully. A firewall is a form of protection that allows a network to connect to the Internet or to another network while maintaining a degree of security. The firewall is an effective type of network security, and in most situations, it is the most effective tool for doing that. With the availability of larger bandwidth, it is becoming more and more difficult for traditional software firewalls to function over a high-speed connection. In addition, the advances in network hardware technology, such as routers, and new applications of firewalls have caused the software firewall to be an impediment to high throughput. This network bottleneck leads to the requirement for new solutions to balance performance and security. Replacing software with hardware could lead to improved performance, enabling the firewalls to handle significantly larger amounts of data. The goal of this project is to investigate if and how existing desktop computer firewall technology could be improved by replacing software functionality with hardware (i.e., silicon). A hardware-based Silicon Firewall system has been designed by choosing the appropriate architecture and implemented using Altera FPGA (Field Programmable Gate Array) on a SOPC (System On a Programmable Chip) Board. The performance of the Silicon Firewall is tested and compared with the software firewall

eCommons@USASK

University of Saskatchewan Research Archive