9 research outputs found

    Skyrmion Gas Manipulation for Probabilistic Computing

    Full text link
    The topologically protected magnetic spin configurations known as skyrmions offer promising applications due to their stability, mobility and localization. In this work, we emphasize how to leverage the thermally driven dynamics of an ensemble of such particles to perform computing tasks. We propose a device employing a skyrmion gas to reshuffle a random signal into an uncorrelated copy of itself. This is demonstrated by modelling the ensemble dynamics in a collective coordinate approach where skyrmion-skyrmion and skyrmion-boundary interactions are accounted for phenomenologically. Our numerical results are used to develop a proof-of-concept for an energy efficient (∼μW\sim\mu\mathrm{W}) device with a low area imprint (∼μm2\sim\mu\mathrm{m}^2). Whereas its immediate application to stochastic computing circuit designs will be made apparent, we argue that its basic functionality, reminiscent of an integrate-and-fire neuron, qualifies it as a novel bio-inspired building block.Comment: 41 pages, 20 figure

    Energy-Performance Scalability Analysis of a Novel Quasi-Stochastic Computing Approach

    Get PDF
    Stochastic computing (SC) is an emerging low-cost computation paradigm for efficient approximation. It processes data in forms of probabilities and offers excellent progressive accuracy. Since SC\u27s accuracy heavily depends on the stochastic bitstream length, generating acceptable approximate results while minimizing the bitstream length is one of the major challenges in SC, as energy consumption tends to linearly increase with bitstream length. To address this issue, a novel energy-performance scalable approach based on quasi-stochastic number generators is proposed and validated in this work. Compared to conventional approaches, the proposed methodology utilizes a novel algorithm to estimate the computation time based on the accuracy. The proposed methodology is tested and verified on a stochastic edge detection circuit to showcase its viability. Results prove that the proposed approach offers a 12—60% reduction in execution time and a 12—78% decrease in the energy consumption relative to the conventional counterpart. This excellent scalability between energy and performance could be potentially beneficial to certain application domains such as image processing and machine learning, where power and time-efficient approximation is desired

    Energy-Efficient FPGA-Based Parallel Quasi-Stochastic Computing

    Get PDF
    The high performance of FPGA (Field Programmable Gate Array) in image processing applications is justified by its flexible reconfigurability, its inherent parallel nature and the availability of a large amount of internal memories. Lately, the Stochastic Computing (SC) paradigm has been found to be significantly advantageous in certain application domains including image processing because of its lower hardware complexity and power consumption. However, its viability is deemed to be limited due to its serial bitstream processing and excessive run-time requirement for convergence. To address these issues, a novel approach is proposed in this work where an energy-efficient implementation of SC is accomplished by introducing fast-converging Quasi-Stochastic Number Generators (QSNGs) and parallel stochastic bitstream processing, which are well suited to leverage FPGA\u27s reconfigurability and abundant internal memory resources. The proposed approach has been tested on the Virtex-4 FPGA, and results have been compared with the serial and parallel implementations of conventional stochastic computation using the well-known SC edge detection and multiplication circuits. Results prove that by using this approach, execution time, as well as the power consumption are decreased by a factor of 3.5 and 4.5 for the edge detection circuit and multiplication circuit, respectively

    Novel approaches for efficient stochastic computing

    Get PDF
    This thesis is comprised of two papers, where the first paper presents a novel approach for parallel implementation of SC using FPGA (Field Programmable Gate Array). This paper makes use of the distributed memory elements of FPGAs (i.e., look-up-tables -LUTs) to achieve this. An attempt has been made to build the stochastic number generators (SNGs) by using the proposed LUT approach. The construction of these SNGs has been influenced by the Quasi-random number sequences, which provide the advantage of reducing the random fluctuations present in the pseudo-random number generators such as LFSR (Linear Feedback Shift Register) as well as the execution time by faster convergence. The results prove that the throughput of the system increases and the execution time is reduced by adopting the proposed technique. The second paper of the thesis proposes a novel technique referred to as the approximate stochastic computing (ASC) approach focusing on image processing applications to reduce the lengthy computation time of SC with a trade-off in accuracy. The proposed technique is to truncate low-order bits of the image pixel values for SC for faster operation, which also causes an error in the binary to stochastic converted value. Attempts have been made to reduce this error by linearly increasing the clock cycles rather than exponentially. Experimental results from the well-known SC edge detection circuit indicate that the proposed technique is a promising approach for efficient approximate stochastic image processing --Abstract, page iv

    The Logic of Random Pulses: Stochastic Computing.

    Full text link
    Recent developments in the field of electronics have produced nano-scale devices whose operation can only be described in probabilistic terms. In contrast with the conventional deterministic computing that has dominated the digital world for decades, we investigate a fundamentally different technique that is probabilistic by nature, namely, stochastic computing (SC). In SC, numbers are represented by bit-streams of 0's and 1's, in which the probability of seeing a 1 denotes the value of the number. The main benefit of SC is that complicated arithmetic computation can be performed by simple logic circuits. For example, a single (logic) AND gate performs multiplication. The dissertation begins with a comprehensive survey of SC and its applications. We highlight its main challenges, which include long computation time and low accuracy, as well as the lack of general design methods. We then address some of the more important challenges. We introduce a new SC design method, called STRAUSS, that generates efficient SC circuits for arbitrary target functions. We then address the problems arising from correlation among stochastic numbers (SNs). In particular, we show that, contrary to general belief, correlation can sometimes serve as a resource in SC design. We also show that unlike conventional circuits, SC circuits can tolerate high error rates and are hence useful in some new applications that involve nondeterministic behavior in the underlying circuitry. Finally, we show how SC's properties can be exploited in the design of an efficient vision chip that is suitable for retinal implants. In particular, we show that SC circuits can directly operate on signals with neural encoding, which eliminates the need for data conversion.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113561/1/alaghi_1.pd

    Heterogeneous Chip Multiprocessor: Data Representation, Mixed-Signal Processing Tiles, and System Design

    Get PDF
    With the emergence of big data, the need for more computationally intensive processors that can handle the increased processing demand has risen. Conventional computing paradigms based on the Von Neumann model that separates computational and memory structures have become outdated and less efficient for this increased demand. As the speed and memory density of processors have increased significantly over the years, these models of computing, which rely on a constant stream of data between the processor and memory, see less gains due to finite bandwidth and latency. Moreover, in the presence of extreme scaling, these conventional systems, implemented in submicron integrated circuits, have become even more susceptible to process variability, static leakage current, and more. In this work, alternative paradigms, predicated on distributive processing with robust data representation and mixed-signal processing tiles, are explored for constructing more efficient and scalable computing systems in application specific integrated circuits (ASICs). The focus of this dissertation work has been on heterogeneous chip multi-processor (CMP) design and optimization across different levels of abstraction. On the level of data representation, a different modality of representation based on random pulse density modulation (RPDM) coding is explored for more efficient processing using stochastic computation. On the level of circuit description, mixed-signal integrated circuits that exploit charge-based computing for energy efficient fixed point arithmetic are designed. Consequently, 8 different chips that test and showcase these circuits were fabricated in submicron CMOS processes. Finally, on the architectural level of description, a compact instruction-set processor and controller that facilitates distributive computing on System-On-Chips (SoCs) is designed. In addition to this, a robust bufferless network architecture is designed with a network simulator, and I/O cells are designed for SoCs. The culmination of this thesis work has led to the design and fabrication of a heterogeneous chip multi- processor prototype comprised of over 12,000 VVM cores, warp/dewarp processors, cache, and additional processors, which can be applied towards energy efficient large-scale data processing

    Design and optimization of approximate multipliers and dividers for integer and floating-point arithmetic

    Full text link
    The dawn of the twenty-first century has witnessed an explosion in the number of digital devices and data. While the emerging deep learning algorithms to extract information from this vast sea of data are becoming increasingly compute-intensive, traditional means of improving computing power are no longer yielding gains at the same rate due to the diminishing returns from traditional technology scaling. To minimize the increasing gap between computational demands and the available resources, the paradigm of approximate computing is emerging as one of the potential solutions. Specifically, the resource-efficient approximate arithmetic units promise overall system efficiency, since most of the compute-intensive applications are dominated by arithmetic operations. This thesis primarily presents design techniques for approximate hardware multipliers and dividers. The thesis presents the design of two approximate integer multipliers and an approximate integer divider. These are: an error-configurable minimally-biased approximate integer multiplier (MBM), an error-configurable reduced-error approximate log based multiplier (REALM), and error-configurable integer divider INZeD. The two multiplier designs and the divider designs are based on the coupling of novel mathematically formulated error-reduction mechanisms in the classical approximate log based multiplier and dividers, respectively. They exhibit very low error bias and offer Pareto-optimal error vs. resource-efficiency trade-offs when compared with the state-of-the-art approximate integer multipliers/dividers. Further, the thesis also presents design of approximate floating-point multipliers and dividers. These designs utilize the optimized versions of the proposed MBM and REALM multipliers for mantissa multiplications and the proposed INZeD divider for mantissa division, and offer better design trade-offs than traditional precision scaling. The existing approximate integer dividers as well as the proposed INZeD suffer from unreasonably high worst-case error. This thesis presents WEID, which is a novel light-weight method for reducing worst-case error in approximate dividers. Finally, the thesis presents a methodology for selection of approximate arithmetic units for a given application. The methodology is based on a novel selection algorithm and utilizes the subrange error characterization of approximate arithmetic units, which performs error characterization independently in different segments of the input range
    corecore