
    Memory and information processing in neuromorphic systems

    A striking difference between brain-inspired neuromorphic processors and current von Neumann processor architectures is the way in which memory and processing are organized. As Information and Communication Technologies continue to address the need for increased computational power by increasing the number of cores within a digital processor, neuromorphic engineers and scientists can complement this effort by building processor architectures in which memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones, and from purely digital systems to mixed analog/digital systems that implement more biologically realistic models of neurons and synapses together with a suite of adaptation and learning mechanisms analogous to those found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed to build artificial neural processing systems that can display the richness of behaviors seen in biological systems. Comment: Submitted to Proceedings of IEEE, review of recently proposed neuromorphic computing platforms and systems

    Codes for Asymmetric Limited-Magnitude Errors With Application to Multilevel Flash Memories

    Several physical effects that limit the reliability and performance of multilevel flash memories induce errors that have low magnitudes and are dominantly asymmetric. This paper studies block codes for asymmetric limited-magnitude errors over q-ary channels. We propose code constructions and bounds for such channels when the number of errors is bounded by t and the error magnitudes are bounded by ℓ. The constructions utilize known codes for symmetric errors, over small alphabets, to protect large-alphabet symbols from asymmetric limited-magnitude errors. The encoding and decoding of these codes are performed over the small alphabet whose size depends only on the maximum error magnitude and is independent of the alphabet size of the outer code. Moreover, the size of the codes is shown to exceed the sizes of known codes (for related error models), and asymptotic rate-optimality results are proved. Extensions of the construction are proposed to accommodate variations on the error model and to include systematic codes as a benefit to practical implementation
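The idea of decoding over a small alphabet whose size depends only on the maximum error magnitude can be illustrated with a minimal sketch. It assumes a modular-reduction construction: a word belongs to the code when its symbols, reduced modulo ℓ+1, form a codeword of a small-alphabet code. The length-3 repetition code used as that small code, the 8-level alphabet, and all function names are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

ELL = 2            # maximum error magnitude (assumed value)
Q_SMALL = ELL + 1  # small alphabet size depends only on ELL
Q = 8              # large alphabet size, e.g. 8-level cells (assumed value)


def is_codeword(x):
    """x is in the code iff its residues mod (ELL + 1) form a repetition codeword."""
    residues = [s % Q_SMALL for s in x]
    return all(0 <= s < Q for s in x) and len(set(residues)) == 1


def decode(y):
    """Correct at most one upward error of magnitude <= ELL in a length-3 word."""
    psi = [s % Q_SMALL for s in y]                 # work over the small alphabet
    majority = Counter(psi).most_common(1)[0][0]   # repetition-code decoding
    # Per-position error estimate: how far each residue sits above the decoded one.
    eps = [(p - majority) % Q_SMALL for p in psi]
    return [s - e for s, e in zip(y, eps)]


sent = [1, 4, 7]        # all residues mod 3 equal 1, so this is a codeword
received = [1, 6, 7]    # the middle cell drifted upward by 2
recovered = decode(received)
```

Because the error is known to be upward and at most ℓ, the residue difference modulo ℓ+1 identifies the error magnitude exactly, which is why the small code never needs to see the full 8-level symbols.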

    Low Power Decoding Circuits for Ultra Portable Devices

    Many existing and emerging battery-driven wireless devices do not demand high data rates. Rather, ultra-low power, portability, and low cost are the most desired characteristics. Examples of such applications are wireless sensor networks (WSN), body area networks (BAN), and a variety of medical implants and health-care aids. Because the individual transceiver nodes are small, cheap, and low power, they can be deployed in abundance in remote places where access for maintenance or battery recharging is limited. In such scenarios, the lifetime of the battery in most cases determines the lifetime of the individual node. Therefore, energy consumption has to be low enough that the nodes remain operational for an extended period of time, even up to a few years. It is known that using error correcting codes (ECC) in a wireless link can help to reduce the transmit power considerably. However, the power consumption of the coding-decoding hardware itself is critical in an ultra-low-power transceiver node. The power and silicon area overhead of the coding-decoding circuitry needs to be kept to a minimum within the total energy and cost budget of the transceiver node. In this thesis, low-power approaches to decoding circuits are investigated in the framework of the applications and use cases mentioned above. The presented work is based on 65 nm CMOS technology and is structured in four parts as follows. The first part introduces the goals and objectives, background theory, and fundamentals of the presented work. It also presents the ECC block together with its surrounding environment, a low-power receiver chain. Designing and implementing an ultra-low-power, low-cost wireless transceiver node introduces challenges that require special consideration at various levels of abstraction. Similarly, a competitive solution often emerges only after a thorough design space exploration. 
The decoder circuits proposed in the following parts are designed to be embedded in the low-power receiver chain introduced in the first part. The second part explores the analog decoding method and its suitability for a compact, low-power transceiver node. Analog decoding was introduced theoretically over a decade ago, followed by early proof-of-concept circuits that promised a feasible low-power solution. Still, with the increased popularity of low-power sensor networks, it has not been clear how an analog decoding approach performs in terms of power, silicon area, data rate, and integrity of calculations in recent technologies and at low data rates. The ultra-low power budget, small size requirement, and relaxed demands on data rate suggest a decoding circuit of limited complexity. Therefore, the four-state (7,5) codes are considered for hardware implementation. Simulations used to choose the critical design factors are presented. To evaluate the critical specifications of the decoding circuit, three versions of the analog decoding circuit with different transistor dimensions were fabricated. The measurement results reveal different trade-off possibilities as well as the potential and limitations of the analog decoding approach for the target applications. Measurements appear to be crucial, since the available computer-aided design (CAD) tools provide limited assistance and precision, given the amount of calculation and the number of parameters that have to be included in the simulations. The largest analog decoding core (AD1) occupies 0.104 mm² of silicon, and the other two (AD2 and AD3) occupy 0.035 mm² and 0.015 mm², respectively. Coding gain is then presented in trade-off with silicon area and throughput. The analog decoders operate from a 0.8 V supply. The achieved coding gain is 2.3 dB at a bit error rate (BER) of 0.001, and an energy efficiency of 10 picojoules per bit (pJ/b) is reached at 2 Mbps. 
The third part of this thesis proposes an alternative low-power digital decoding approach for the same codes. The compact, low-power goal is pursued by designing an equivalent digital decoding circuit that is fabricated in 65 nm CMOS technology and operates in the low-voltage (near-threshold) region. The architecture of the design is optimized at the system and circuit levels to offer a competitive digital alternative. As for the analog decoders, the critical specifications of the decoder in terms of power, area, data rate (speed), and integrity are reported from measurements. The digital implementation, with an area of 0.11 mm², consumes minimum energy at a 0.32 V supply, which gives an energy efficiency of 9 pJ/b at 125 kb/s and a coding gain of 2.9 dB at BER = 0.001. The fourth and last part compares the proposed design alternatives, based on the fabricated chips and the measurement results, to identify the most suitable solution for the target applications. Advantages and disadvantages of both approaches are discussed. Possible extensions of this work are introduced as future work
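The energy-per-bit figures quoted in this abstract can be turned into average decoder power by multiplying by the bit rate. The short sketch below does that arithmetic, assuming each pJ/b figure is the decoder's total energy per decoded bit at the stated rate; the function name is a hypothetical helper.

```python
def average_power_watts(energy_per_bit_joules, bit_rate_bps):
    """Average power = energy per bit x bits per second."""
    return energy_per_bit_joules * bit_rate_bps


analog_power = average_power_watts(10e-12, 2e6)     # 10 pJ/b at 2 Mbps
digital_power = average_power_watts(9e-12, 125e3)   # 9 pJ/b at 125 kb/s

print(f"analog:  {analog_power * 1e6:.3f} uW")
print(f"digital: {digital_power * 1e6:.3f} uW")
```

Under that assumption the analog decoder draws about 20 µW and the digital one about 1.1 µW, but the two operating points are at different data rates, so the power figures are not a like-for-like comparison.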

    Design Solutions For Modular Satellite Architectures

    The cost-effective access to space envisaged by ESA would open a wide range of new opportunities and markets, but is still many years away. There is still a lack of devices, circuits, and systems that make it possible to develop satellites, ground stations, and related services at costs compatible with the budgets of academic institutions and small and medium enterprises (SMEs). As soon as the development time and cost of small satellites fall below a certain threshold (e.g. 100,000 to 500,000 €), appropriate business models will likely develop to ensure cost-effective and pervasive access to space and to related infrastructures and services. These considerations spurred the activity described in this paper, which is aimed at: - proving the feasibility of low-cost satellites using COTS (Commercial Off The Shelf) devices. This is a new trend in the space industry, which is not yet fully exploited due to the belief that COTS devices are not reliable enough for this kind of application; - developing a flight model of a flexible and reliable nano-satellite for less than 25,000 €; - training students in the field of avionics space systems: the design described here is developed by a team including undergraduate students working towards their graduation work. The educational aspects include the development of specific new university courses; - developing expertise in the field of low-cost avionic systems, both internally (university staff) and externally (graduated students will bring their expertise to their future work); - gathering and clustering expertise and resources available inside the university around a common high-tech project; - creating a working group composed of both the university and SMEs devoted to the application of commercially available technology to the space environment. 
The first step in this direction was the development of a small, low-cost nano-satellite, started in 2004: the project was named PiCPoT (Piccolo Cubo del Politecnico di Torino, Small Cube of Politecnico di Torino). The project was carried out by several departments of the Politecnico, in particular Electronics and Aerospace. The main goal of the project was to evaluate the feasibility of using COTS components in a space project in order to greatly reduce costs; the design exploited the modularity of internal subsystems to allow reuse and further cost reduction for future missions. Starting from the PiCPoT experience, in 2006 we began a new project called ARaMiS (Speretta et al., 2007), which is the Italian acronym for Modular Architecture for Satellites. This work describes how the architecture of the ARaMiS satellite was derived from the lessons learned in our earlier experience. Moreover, we describe satellite operations, giving some details of the major subsystems. This work is composed of two parts. The first describes the design methodology, solutions, and techniques that we used to develop the PiCPoT satellite; it gives an overview of its operations, with some details of the major subsystems. Details on the specifications can also be found in (Del Corso et al., 2007; Passerone et al., 2008). The second part builds on the experience gained during the PiCPoT development and describes a proposal for a low-cost modular architecture for satellites

    System-on-chip Computing and Interconnection Architectures for Telecommunications and Signal Processing

    This dissertation proposes novel architectures and design techniques targeting SoC building blocks for telecommunications and signal processing applications. Hardware implementation of Low-Density Parity-Check decoders is approached at both the algorithmic and the architecture level. Low-Density Parity-Check codes are a promising coding scheme for future communication standards due to their outstanding error correction performance. This work proposes a methodology for analyzing the effects of finite precision arithmetic on error correction performance and hardware complexity. The methodology is employed throughout for co-designing the decoder. First, a low-complexity check node based on the P-output decoding principle is designed and characterized on a CMOS standard-cell library. Results demonstrate an implementation loss below 0.2 dB down to a BER of 10^{-8} and a saving in complexity of up to 59% with respect to other works in recent literature. High-throughput and low-latency issues are addressed with modified single-phase decoding schedules. A new "memory-aware" schedule is proposed, requiring down to 20% of the memory of traditional two-phase flooding decoding. Additionally, throughput is doubled and logic complexity is reduced by 12%. These advantages are traded off against error correction performance, making the solution attractive only for long codes, such as those adopted in the DVB-S2 standard. The "layered decoding" principle is extended to codes not specifically conceived for this technique. The proposed architectures exhibit complexity savings on the order of 40% in both area and power consumption, while the implementation loss is smaller than 0.05 dB. Most modern communication standards employ Orthogonal Frequency Division Multiplexing as part of their physical layer. The core of OFDM is the Fast Fourier Transform and its inverse, which are in charge of symbol (de)modulation. 
Requirements on throughput and energy efficiency call for hardware implementation of the FFT, while the ubiquity of the FFT suggests the design of parametric, reconfigurable, and reusable IP hardware macrocells. In this context, this thesis describes an FFT/IFFT core compiler particularly suited for the implementation of OFDM communication systems. The tool employs an accuracy-driven configuration engine that automatically profiles the internal arithmetic and generates a core with minimum operand bit-width and thus minimum circuit complexity. The engine performs a closed-loop optimization over three different internal arithmetic models (fixed-point, block floating-point, and convergent block floating-point), using the numerical accuracy budget given by the user as a reference point. The flexibility and reusability of the proposed macrocell are illustrated through several case studies that encompass all current state-of-the-art OFDM communication standards (WLAN, WMAN, xDSL, DVB-T/H, DAB, and UWB). Implementation results are presented for two deep sub-micron standard-cell libraries (65 and 90 nm) and commercially available FPGA devices. Compared with other FFT core compilers, the proposed environment produces macrocells with lower circuit complexity and the same system-level performance (throughput, transform size, and numerical accuracy). The final part of this dissertation focuses on the Network-on-Chip design paradigm, whose goal is to build scalable communication infrastructures connecting hundreds of cores. A low-complexity link architecture for mesochronous on-chip communication is discussed. The link enables looser skew constraints in clock tree synthesis, frequency speed-up, reduced power consumption, and faster back-end turnarounds. The proposed architecture reaches a maximum clock frequency of 1 GHz on a 65 nm low-leakage CMOS standard-cell library. 
In a complex test case with a full-blown NoC infrastructure, the link overhead is only 3% of chip area and 0.5% of leakage power consumption. Finally, a new methodology, named metacoding, is proposed. Metacoding generates correct-by-construction, technology-independent RTL codebases for NoC building blocks. The RTL coding phase is abstracted and modeled with an object-oriented framework integrated within a commercial tool for IP packaging (Synopsys CoreTools suite). Compared with traditional coding styles based on pre-processor directives, metacoding produces 65% smaller codebases and reduces the number of configurations to verify by up to three orders of magnitude

    Design Trade‐Offs for FPGA Implementation of LDPC Decoders

    Low density parity check (LDPC) decoders represent important throughput bottlenecks, as well as major cost and power-consuming components, in today's digital circuits for wireless communication and storage. They present a wide range of architectural choices, with different throughput, cost, and error correction capability trade-offs. In this book chapter, we will present an overview of the main design options in the architecture and implementation of these circuits on field programmable gate array (FPGA) devices. We will present the mapping of the main units within LDPC decoders onto the specific embedded components of FPGA devices. We will review architectural trade-offs for both flooded and layered scheduling strategies in their FPGA implementation

    Baseband Processing for 5G and Beyond: Algorithms, VLSI Architectures, and Co-design

    In recent years the number of connected devices and the demand for high data rates have increased significantly. This enormous growth is made more pronounced by the introduction of the Internet of Things (IoT), in which several devices are interconnected to exchange data for applications such as smart homes and smart cities. Moreover, new applications such as eHealth, autonomous vehicles, and connected ambulances set new demands on the reliability, latency, and data rate of wireless communication systems, pushing technology development forward. Massive multiple-input multiple-output (MIMO) is a technology employed in the 5G standard that can help fulfill these requirements. In massive MIMO systems, the base station (BS) is equipped with a very large number of antennas, serving several user equipments (UEs) simultaneously in the same time and frequency resource. The high spatial multiplexing in massive MIMO systems improves the data rate, the energy and spectral efficiencies, and the link reliability of wireless communication systems. The link reliability can be further improved by employing channel coding. Spatially coupled serially concatenated codes (SC-SCCs) are promising channel coding schemes that can meet the high-reliability demands of wireless communication systems beyond 5G (B5G). Given their close-to-capacity error correction performance and the potential to implement a high-throughput decoder, this class of codes can be a good candidate for B5G wireless systems. To achieve the above-mentioned advantages, sophisticated algorithms are required, which impose challenges on baseband signal processing. In the case of massive MIMO systems, the processing is much more computationally intensive and the memory required to store channel data is significantly larger than in conventional MIMO systems, owing to the large size of the channel state information (CSI) matrix. 
In addition to the high computational complexity, meeting latency requirements is also crucial. Similarly, the decoding-performance gain of SC-SCCs comes at the expense of increased implementation complexity. Moreover, selecting the design parameters, decoding algorithm, and architecture is challenging, since spatial coupling provides new degrees of freedom in code design and the design space therefore becomes huge. The focus of this thesis is to perform co-optimization across different design levels to address the aforementioned challenges and requirements. To this end, we employ system-level characteristics to develop efficient algorithms and architectures for the following functional blocks of digital baseband processing. First, we present a fast Fourier transform (FFT), an inverse FFT (IFFT), and a corresponding reordering scheme, which can significantly reduce the latency of orthogonal frequency-division multiplexing (OFDM) demodulation and modulation as well as the size of the reordering memory. The corresponding VLSI architectures are introduced, along with application-specific integrated circuit (ASIC) implementation results in a 28 nm CMOS technology. For a 2048-point FFT/IFFT, the proposed design leads to a 42% reduction in latency and in the size of the reordering memory. Second, we propose a low-complexity massive MIMO detection scheme. The key idea is to exploit channel sparsity to reduce the size of the CSI matrix and then perform linear detection followed by non-linear post-processing in the angular domain using the compressed CSI matrix. The VLSI architecture for a massive MIMO system with 128 BS antennas and 16 UEs is presented, along with synthesis results in a 28 nm technology. The proposed scheme reduces the complexity and required memory by 35%–73% compared to traditional detectors while achieving better detection performance. 
Finally, we perform a comprehensive design space exploration for SC-SCCs to investigate the effect of different design parameters on decoding performance, latency, complexity, and hardware cost. We then develop different decoding algorithms for SC-SCCs and discuss the associated decoding performance and complexity. Several high-level VLSI architectures are also presented, along with the corresponding synthesis results in a 12 nm process, and various design trade-offs are provided for these decoding schemes

    VLSI Implementation of Multi-Bit Error Detection and Correction Codes for Space Communications

    Data transmission in advanced space communications suffers from different types of noise, which cause burst errors in the data. Error correction codes (ECC) therefore play a major role in detecting and correcting these errors. However, conventional Hamming encoders and decoders detect and correct only single-bit errors. This work therefore implements Multi-Bit Error Detection and Correction Codes (MBE-DCC) for detecting and correcting multiple-bit errors. First, the MBE-DCC encoding operation is implemented using a generator matrix that contains both identity bits and parity bits. The encoded codeword is then transmitted over the space communication channel, where it is corrupted by different types of noise and errors. The MBE-DCC decoding operation is performed at the receiver side, which corrects all the errors using syndrome detection, error location detection, and error correction modules. The simulations show that the proposed MBE-DCC outperforms the conventional ECC method
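The encode, syndrome, locate, and correct pipeline described above can be illustrated on the single-error baseline that the abstract contrasts against. The sketch below is a textbook systematic Hamming(7,4) code with generator matrix G = [I | P]; it is not the MBE-DCC scheme, whose matrices are not given here, and the particular parity matrix and function names are illustrative assumptions.

```python
# Parity part P of a systematic generator matrix G = [I | P] for Hamming(7,4).
P = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]

# Columns of the parity-check matrix H = [P^T | I]. The syndrome of a
# single-bit error equals the column at the position of the flipped bit.
H_COLUMNS = [row[:] for row in P] + [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def encode(data):
    """Return the 7-bit codeword: 4 identity (data) bits followed by 3 parity bits."""
    parity = [sum(d * P[i][j] for i, d in enumerate(data)) % 2 for j in range(3)]
    return list(data) + parity


def syndrome(word):
    """Recompute parity from the data bits and XOR it with the received parity bits."""
    expected = encode(word[:4])[4:]
    return [e ^ r for e, r in zip(expected, word[4:])]


def correct(word):
    """Syndrome detection, error location, and correction of one flipped bit."""
    s = syndrome(word)
    fixed = list(word)
    if any(s):                              # non-zero syndrome: an error was detected
        fixed[H_COLUMNS.index(s)] ^= 1      # locate the bit and flip it back
    return fixed


codeword = encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[2] ^= 1                           # flip one data bit in the channel
repaired = correct(corrupted)
```

A code of this kind miscorrects when two or more bits are flipped, which is the limitation that motivates multi-bit schemes such as the one the abstract proposes.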