1,590 research outputs found
Digital implementation of the cellular sensor-computers
Two different kinds of cellular sensor-processor architectures are used nowadays in various
applications. The first is the traditional sensor-processor architecture, where the sensor and the
processor arrays are mapped into each other. The second is the foveal architecture, in which a
small active fovea is navigating in a large sensor array. This second architecture is introduced
and compared here. Both of these architectures can be implemented with analog and digital
processor arrays. The efficiency of the different implementation types, depending on the used
CMOS technology, is analyzed. It turned out, that the finer the technology is, the better to use
digital implementation rather than analog
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability
Adaptive Multiclient Network-on-Chip Memory Core: Hardware Architecture, Software Abstraction Layer, and Application Exploration
This paper presents the hardware architecture and the software abstraction layer of an adaptive multiclient Network-on-Chip (NoC) memory core. The memory core supports the flexibility of a heterogeneous FPGA-based runtime adaptive multiprocessor system called RAMPSoC. The processing elements, also called clients, can access the memory core via the Network-on-Chip (NoC). The memory core supports a dynamic mapping of an address space for the different clients as well as different data transfer modes, such as variable burst sizes. Therefore, two main limitations of FPGA-based multiprocessor systems, the restricted on-chip memory resources and that usually only one physical channel to an off-chip memory exists, are leveraged. Furthermore, a software abstraction layer is introduced, which hides the complexity of the memory core architecture and which provides an easy to use interface for the application programmer. Finally, the advantages of the novel memory core in terms of performance, flexibility, and user friendliness are shown using a real-world image processing application
Adaptive Multiclient Network-on-Chip Memory Core : Hardware Architecture, Software Abstraction Layer, and Application Exploration
This paper presents the hardware architecture and the software abstraction layer of an adaptive multiclient Network-on-Chip (NoC) memory core. The memory core supports the flexibility of a heterogeneous FPGA-based runtime adaptive multiprocessor system called RAMPSoC. The processing elements, also called clients, can access the memory core via the Network-on-Chip (NoC). The memory core supports a dynamic mapping of an address space for the different clients as well as different data transfer modes, such as variable burst sizes. Therefore, two main limitations of FPGA-based multiprocessor systems, the restricted on-chip memory resources and that usually only one physical channel to an off-chip memory exists, are leveraged. Furthermore, a software abstraction layer is introduced, which hides the complexity of the memory core architecture and which provides an easy to use interface for the application programmer. Finally, the advantages of the novel memory core in terms of performance, flexibility, and user friendliness are shown using a real-world image processing application
ACE16K: The Third Generation of Mixed-Signal SIMD-CNN ACE Chips Toward VSoCs
Today, with 0.18-ÎŒm technologies mature and stable enough for mixed-signal design with a large variety of CMOS compatible optical sensors available and with 0.09-ÎŒm technologies knocking at the door of designers, we can face the design of integrated systems, instead of just integrated circuits. In fact, significant progress has been made in the last few years toward the realization of vision systems on chips (VSoCs). Such VSoCs are eventually targeted to integrate within a semiconductor substrate the functions of optical sensing, image processing in space and time, high-level processing, and the control of actuators. The consecutive generations of ACE chips define a roadmap toward flexible VSoCs. These chips consist of arrays of mixed-signal processing elements (PEs) which operate in accordance with single instruction multiple data (SIMD) computing architectures and exhibit the functional features of CNN Universal Machines. They have been conceived to cover the early stages of the visual processing path in a fully-parallel manner, and hence more efficiently than DSP-based systems. Across the different generations, different improvements and modifications have been made looking to converge with the newest discoveries of neurobiologists regarding the behavior of natural retinas. This paper presents considerations pertaining to the design of a member of the third generation of ACE chips, namely to the so-called ACE16k chip. This chip, designed in a 0.35-ÎŒm standard CMOS technology, contains about 3.75 million transistors and exhibits peak computing figures of 330 GOPS, 3.6 GOPS/mm2 and 82.5 GOPS/W. Each PE in the array contains a reconfigurable computing kernel capable of calculating linear convolutions on 3Ă3 neighborhoods in less than 1.5 ÎŒs, imagewise Boolean combinations in less than 200 ns, imagewise arithmetic operations in about 5 ÎŒs, and CNN-like temporal evolutions with a time constant of about 0.5 ÎŒs. Unfortunately, the many ideas underlying the design of this chip cannot be covered in a single paper; hence, this paper is focused on, first, placing the ACE16k in the ACE chip roadmap and, then, discussing the most significant modifications of ACE16K versus its predecessors in the family.LOCUST IST2001â38 097VISTA TIC2003â09 817 - C02â01Office of Naval Research N000 140 210 88
A Dynamically Reconfigurable Parallel Processing Framework with Application to High-Performance Video Processing
Digital video processing demands have and will continue to grow at unprecedented rates. Growth comes from ever increasing volume of data, demand for higher resolution, higher frame rates, and the need for high capacity communications. Moreover, economic realities force continued reductions in size, weight and power requirements. The ever-changing needs and complexities associated with effective video processing systems leads to the consideration of dynamically reconfigurable systems. The goal of this dissertation research was to develop and demonstrate the viability of integrated parallel processing system that effectively and efficiently apply pre-optimized hardware cores for processing video streamed data. Digital video is decomposed into packets which are then distributed over a group of parallel video processing cores. Real time processing requires an effective task scheduler that distributes video packets efficiently to any of the reconfigurable distributed processing nodes across the framework, with the nodes running on FPGA reconfigurable logic in an inherently Virtual\u27 mode. The developed framework, coupled with the use of hardware techniques for dynamic processing optimization achieves an optimal cost/power/performance realization for video processing applications. The system is evaluated by testing processor utilization relative to I/O bandwidth and algorithm latency using a separable 2-D FIR filtering system, and a dynamic pixel processor. For these applications, the system can achieve performance of hundreds of 640x480 video frames per second across an eight lane Gen I PCIe bus. Overall, optimal performance is achieved in the sense that video data is processed at the maximum possible rate that can be streamed through the processing cores. This performance, coupled with inherent ability to dynamically add new algorithms to the described dynamically reconfigurable distributed processing framework, creates new opportunities for realizable and economic hardware virtualization.\u2
Characterisation of a reconfigurable free space optical interconnect system for parallel computing applications and experimental validation using rapid prototyping technology
Free-space optical interconnects (FSOIs) are widely seen as a potential solution to
present and future bandwidth bottlenecks for parallel processing applications.
This thesis will be focused on the study of a particular FSOI system called Optical
Highway (OH). The OH is a polarised beam routing system which uses Polarising
Beam Splitters and Liquid Crystals (PBS/LC) assemblies to perform reconfigurable
interconnection networks. The properties of the OH make it suitable for implementing
different passive static networks.
A technology known as Rapid Prototyping (RP) will be employed for the first time in
order to create optomechanical structures at low cost and low production times. Off-theshelf
optical components will also be characterised in order to implement the OH.
Additionally, properties such as reconfigurability, scalability, tolerance to misalignment
and polarisation losses will be analysed. The OH will be modelled at three levels: node,
optical stage and architecture. Different designs will be proposed and a particular
architecture, Optimised Cut-Through Ring (OCTR), will be experimentally
implemented. Finally, based on this architecture, a new set of properties will be defined
in order to optimise the efficiency of the optical channels
- âŠ