80,544 research outputs found
An On-line BIST RAM Architecture with Self Repair Capabilities
The emerging field of self-repair computing is expected to have a major impact on deployable systems for space missions and defense applications, where high reliability, availability, and serviceability are needed. In this context, RAM (random access memories) are among the most critical components. This paper proposes a built-in self-repair (BISR) approach for RAM cores. The proposed design, introducing minimal and technology-dependent overheads, can detect and repair a wide range of memory faults including: stuck-at, coupling, and address faults. The test and repair capabilities are used on-line, and are completely transparent to the external user, who can use the memory without any change in the memory-access protocol. Using a fault-injection environment that can emulate the occurrence of faults inside the module, the effectiveness of the proposed architecture in terms of both fault detection and repairing capability was verified. Memories of various sizes have been considered to evaluate the area-overhead introduced by this proposed architectur
The Chameleon Architecture for Streaming DSP Applications
We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool
Template Generation - A Graph Profiling Algorithm
The availability of high-level design entry tooling is crucial for the viability of any reconfigurable SoC architecture. This paper presents a template generation algorithm. The objective of template generation step is to extract functional equivalent structures, i.e. templates, from a control data flow graph. By profiling the graph, the algorithm generates all the possible templates and the corresponding matches. Using unique serial numbers and circle numbers, the algorithm can find all distinct templates with multiple outputs. A new type of graph (hydragraph) that can cope with multiple outputs is introduced. The generated templates pepresented by the hydragraph are not limited in shapes, i.e., we can find templates with multiple outputs or multiple sinks
Recommended from our members
A survey of behavioral-level partitioning systems
Many approaches have been developed to partition a system's behavioral description before a structural implementation is synthesized. We highlight the foundations and motivations for behavioral partitioning. We survey behavioral partitioning approaches, discussing abstraction levels, goals, major steps, and key assumptions in each
Recommended from our members
Timing models for high-level synthesis
In this paper, we describe a timing model for clock estimation during high-level synthesis. In order to obtain realistic timing estimates, the proposed model considers all delay elements, including datapath, control and wire delays, and several technology factors, such as layout architecture, technology mapping, buffers insertion and loading effects. The experimental results show that this model can provide much better estimates than previous models. This model is well suited for automatic and interactive synthesis as well as feedback-driven synthesis where performance matrices must be rapidly and incrementally calculated
Decoding information in the human hippocampus: a user's guide
Multi-voxel pattern analysis (MVPA), or 'decoding', of fMRI activity has gained popularity in the neuroimaging community in recent years. MVPA differs from standard fMRI analyses by focusing on whether information relating to specific stimuli is encoded in patterns of activity across multiple voxels. If a stimulus can be predicted, or decoded, solely from the pattern of fMRI activity, it must mean there is information about that stimulus represented in the brain region where the pattern across voxels was identified. This ability to examine the representation of information relating to specific stimuli (e.g., memories) in particular brain areas makes MVPA an especially suitable method for investigating memory representations in brain structures such as the hippocampus. This approach could open up new opportunities to examine hippocampal representations in terms of their content, and how they might change over time, with aging, and pathology. Here we consider published MVPA studies that specifically focused on the hippocampus, and use them to illustrate the kinds of novel questions that can be addressed using MVPA. We then discuss some of the conceptual and methodological challenges that can arise when implementing MVPA in this context. Overall, we hope to highlight the potential utility of MVPA, when appropriately deployed, and provide some initial guidance to those considering MVPA as a means to investigate the hippocampus
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
Computers continue to diversify with respect to system designs, emerging
memory technologies, and application memory demands. Unfortunately, continually
adapting the conventional virtual memory framework to each possible system
configuration is challenging, and often results in performance loss or requires
non-trivial workarounds. To address these challenges, we propose a new virtual
memory framework, the Virtual Block Interface (VBI). We design VBI based on the
key idea that delegating memory management duties to hardware can reduce the
overheads and software complexity associated with virtual memory. VBI
introduces a set of variable-sized virtual blocks (VBs) to applications. Each
VB is a contiguous region of the globally-visible VBI address space, and an
application can allocate each semantically meaningful unit of information
(e.g., a data structure) in a separate VB. VBI decouples access protection from
memory allocation and address translation. While the OS controls which programs
have access to which VBs, dedicated hardware in the memory controller manages
the physical memory allocation and address translation of the VBs. This
approach enables several architectural optimizations to (1) efficiently and
flexibly cater to different and increasingly diverse system configurations, and
(2) eliminate key inefficiencies of conventional virtual memory. We demonstrate
the benefits of VBI with two important use cases: (1) reducing the overheads of
address translation (for both native execution and virtual machine
environments), as VBI reduces the number of translation requests and associated
memory accesses; and (2) two heterogeneous main memory architectures, where VBI
increases the effectiveness of managing fast memory regions. For both cases,
VBI significanttly improves performance over conventional virtual memory
Technology Mapping for Circuit Optimization Using Content-Addressable Memory
The growing complexity of Field Programmable Gate Arrays (FPGA's) is leading to architectures with high input cardinality look-up tables (LUT's). This thesis describes a methodology for area-minimizing technology mapping for combinational logic, specifically designed for such FPGA architectures. This methodology, called LURU, leverages the parallel search capabilities of Content-Addressable Memories (CAM's) to outperform traditional mapping algorithms in both execution time and quality of results. The LURU algorithm is fundamentally different from other techniques for technology mapping in that LURU uses textual string representations of circuit topology in order to efficiently store and search for circuit patterns in a CAM. A circuit is mapped to the target LUT technology using both exact and inexact string matching techniques. Common subcircuit expressions (CSE's) are also identified and used for architectural optimization---a small set of CSE's is shown to effectively cover an average of 96% of the test circuits. LURU was tested with the ISCAS'85 suite of combinational benchmark circuits and compared with the mapping algorithms FlowMap and CutMap. The area reduction shown by LURU is, on average, 20% better compared to FlowMap and CutMap. The asymptotic runtime complexity of LURU is shown to be better than that of both FlowMap and CutMap
- …