11 research outputs found

    Energy-Performance Scalability Analysis of a Novel Quasi-Stochastic Computing Approach

    Stochastic computing (SC) is an emerging low-cost computation paradigm for efficient approximation. It processes data in the form of probabilities and offers excellent progressive accuracy. Since SC's accuracy depends heavily on the stochastic bitstream length, generating acceptable approximate results while minimizing the bitstream length is one of the major challenges in SC, as energy consumption tends to increase linearly with bitstream length. To address this issue, a novel energy-performance scalable approach based on quasi-stochastic number generators is proposed and validated in this work. Compared to conventional approaches, the proposed methodology utilizes a novel algorithm to estimate the computation time based on the accuracy. The proposed methodology is tested and verified on a stochastic edge detection circuit to showcase its viability. Results show that the proposed approach offers a 12-60% reduction in execution time and a 12-78% decrease in energy consumption relative to the conventional counterpart. This scalability between energy and performance could benefit application domains such as image processing and machine learning, where power- and time-efficient approximation is desired.
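
    The dependence of accuracy on bitstream length described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes the common unipolar encoding, in which a value in [0, 1] is the probability of a 1 bit, and uses the textbook AND-gate multiplier. The names `to_bitstream` and `sc_multiply` are hypothetical, and Python's `random.Random` stands in for a hardware stochastic number generator.

```python
import random


def to_bitstream(p, length, rng):
    """Encode probability p in [0, 1] as a unipolar stochastic bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def sc_multiply(a, b, length, seed=0):
    """Estimate a * b by ANDing two independently generated bitstreams.

    Assumption: two differently seeded software generators stand in for
    two independent hardware number sources.
    """
    stream_a = to_bitstream(a, length, random.Random(seed))
    stream_b = to_bitstream(b, length, random.Random(seed + 1))
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / length


if __name__ == "__main__":
    # Longer streams cost more cycles (and energy) but tend to give
    # a smaller error: the progressive-accuracy trade-off.
    for n in (16, 256, 4096):
        estimate = sc_multiply(0.5, 0.75, n)
        print(n, estimate, abs(estimate - 0.375))
```

    With a random source, the error shrinks only roughly with the square root of the stream length, which is why shorter streams that still give acceptable results are attractive.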

    Energy-Efficient FPGA-Based Parallel Quasi-Stochastic Computing

    The high performance of FPGAs (Field Programmable Gate Arrays) in image processing applications is explained by their flexible reconfigurability, their inherently parallel nature, and the availability of a large amount of internal memory. Lately, the Stochastic Computing (SC) paradigm has been found to be significantly advantageous in certain application domains, including image processing, because of its lower hardware complexity and power consumption. However, its viability is considered limited because of its serial bitstream processing and the excessive run time it requires for convergence. To address these issues, a novel approach is proposed in this work in which an energy-efficient implementation of SC is accomplished by introducing fast-converging Quasi-Stochastic Number Generators (QSNGs) and parallel stochastic bitstream processing, both of which are well suited to leverage the FPGA's reconfigurability and abundant internal memory resources. The proposed approach has been tested on the Virtex-4 FPGA, and results have been compared with the serial and parallel implementations of conventional stochastic computation using the well-known SC edge detection and multiplication circuits. Results show that with this approach, both execution time and power consumption decrease by a factor of 3.5 for the edge detection circuit and 4.5 for the multiplication circuit.

    Novel approaches for efficient stochastic computing

    This thesis comprises two papers. The first paper presents a novel approach for parallel implementation of SC using an FPGA (Field Programmable Gate Array). It uses the distributed memory elements of FPGAs, i.e., look-up tables (LUTs), to build the stochastic number generators (SNGs). The construction of these SNGs is influenced by quasi-random number sequences, which reduce the random fluctuations present in pseudo-random number generators such as the LFSR (Linear Feedback Shift Register) and shorten the execution time through faster convergence. The results show that the proposed technique increases system throughput and reduces execution time. The second paper proposes a novel technique, referred to as the approximate stochastic computing (ASC) approach, that focuses on image processing applications and reduces the lengthy computation time of SC at the cost of some accuracy. The technique truncates the low-order bits of the image pixel values so that SC runs faster, which also introduces an error in the binary-to-stochastic converted value. This error is reduced by increasing the number of clock cycles linearly rather than exponentially. Experimental results from the well-known SC edge detection circuit indicate that the proposed technique is a promising approach for efficient approximate stochastic image processing. (Abstract, page iv)
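
    The faster convergence attributed to quasi-random sequences in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis's LUT-based design: it uses the base-2 van der Corput sequence (bit reversal of a counter) as the quasi-random source, and Python's `random.Random` as a stand-in for an LFSR. All function names are hypothetical.

```python
import random


def van_der_corput(i, bits):
    """Return the i-th base-2 van der Corput point as a bits-wide integer.

    This is the bit reversal of the counter value i.
    """
    reversed_bits = 0
    for _ in range(bits):
        reversed_bits = (reversed_bits << 1) | (i & 1)
        i >>= 1
    return reversed_bits


def quasi_stream(value, bits):
    """Comparator SNG: emit 1 when the quasi-random number is below value."""
    return [1 if van_der_corput(i, bits) < value else 0
            for i in range(1 << bits)]


def pseudo_stream(value, bits, seed=0):
    """Same comparator SNG driven by a pseudo-random source.

    Assumption: random.Random stands in for a hardware LFSR.
    """
    rng = random.Random(seed)
    return [1 if rng.randrange(1 << bits) < value else 0
            for _ in range(1 << bits)]


def prefix_estimate(stream, length):
    """Value represented by the first `length` bits of the stream."""
    return sum(stream[:length]) / length


if __name__ == "__main__":
    bits, value = 8, 77            # encodes 77/256
    target = value / (1 << bits)
    quasi = quasi_stream(value, bits)
    pseudo = pseudo_stream(value, bits)
    for n in (16, 64, 256):
        print(n,
              abs(prefix_estimate(quasi, n) - target),
              abs(prefix_estimate(pseudo, n) - target))
```

    For this particular sequence, the first 2^m numbers are evenly spaced, so the estimate after 2^m cycles is within 1/2^m of the target. A pseudo-random source offers no such guarantee at short lengths, which is the fluctuation the abstract refers to.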

    Compact and accurate stochastic circuits with shared random number sources


    Methods for Machine Learning System Design (기계학습 시스템 설계를 위한 방법)

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2016. Kiyoung Choi (최기영). Machine learning has attracted attention because capabilities such as recognition, decision making, and recommendation are useful in industrial, medical, transportation, entertainment, and other systems that humans need to interact with. As machine learning techniques are applied extensively to various areas, the need for more robust algorithms and more efficient hardware has increased. To develop an efficient machine learning system, we have conducted research from high-level algorithms down to low-level hardware logic. The main focus of our work is on ensemble machine learning and stochastic computing (SC). The first work combines multiple components, i.e., multiple feature extractors (FE) and multiple classifiers, for pattern recognition. An ensemble of multiple components is a challenging approach to constructing a more accurate classifier. It can handle difficult problems where a single classifier easily makes a wrong decision due to a lack of training or parameter optimization. Combining the decisions of the participating classifiers statistically reduces the risk of a wrong decision. We suggest a hierarchical ensemble framework of multiple feature extractors and multiple classifiers (MFMC). The second work constructs efficient hardware building blocks for machine learning in order to reduce system complexity and generate highly area- and energy-efficient logic, exploiting the fact that machine learning systems do not require exact computations. We select stochastic computing (SC), an alternative paradigm to conventional binary arithmetic computing. SC can improve efficiency in terms of area, power, and error tolerance while relaxing the accuracy of computation. The third work combines machine learning and stochastic computing, with deep learning as the target. This work presents an efficient DNN design with stochastic computing.
    Observing that directly applying stochastic computing to DNNs poses challenges including random error fluctuation, range limitation, and accumulation overhead, we address these problems by removing near-zero weights, applying weight scaling, and integrating the activation function with the accumulator. The approach allows early decision termination to be implemented easily with a fixed hardware design by exploiting the progressive precision of stochastic computing, which was not easy with existing approaches. Experimental results show that our approach outperforms conventional binary logic in terms of gate area, latency, and power consumption.

    Table of contents:
    1. Introduction
        1.1 Hierarchical Ensemble Learning Framework
        1.2 Hardware Building Block for Machine Learning By Using Stochastic Computing
            1.2.1 Dynamic energy-accuracy trade-off using stochastic computing in deep neural networks
    2. A Design Framework for Hierarchical Ensemble of Multiple Feature Extractors and Multiple Classifiers
        2.1 Introduction
        2.2 Related work
        2.3 Proposed hierarchical ensemble system
            2.3.1 Local Mapping Block and Global Mapping Block
            2.3.2 Complexity comparison according to composition of LMB
            2.3.3 Motivation for differentiating local and global mappings
            2.3.4 Reinforcement learning for LMB
            2.3.5 Construction of Bayesian network from GMB
        2.4 Experimental results
            2.4.1 Measure of effectiveness for WMV and RL
            2.4.2 Pedestrian detection dataset
            2.4.3 Comparison between GMB and AdaBoost
            2.4.4 UCI Multiple Features dataset
            2.4.5 LMB selection
            2.4.6 Discussion
        2.5 Conclusion
    3. Synthesis of Efficient Stochastic Logic for Many-Variable Expressions
        3.1 Introduction
        3.2 Related Work
        3.3 SC Logic Synthesis for Multivariate Expressions
            3.3.1 Probabilistic Logic
            3.3.2 Definitions
            3.3.3 Overview of the Proposed Method
            3.3.4 Direct Synthesis vs. Kernel-based Synthesis
            3.3.5 SC Kernel
            3.3.6 Prime SC Kernel
            3.3.7 iSC Kernel
            3.3.8 Relationship Between iSC Kernels
            3.3.9 Hybrid Scheme
            3.3.10 Cost Function
            3.3.11 SC Synthesis Algorithm
        3.4 Experimental Results
            3.4.1 Performance of SC Logic Synthesis Algorithm
            3.4.2 Quality of Synthesis Results
            3.4.3 Comparison of Accuracy
        3.5 Conclusion
    4. An Energy-Efficient Random Number Generator for Stochastic Circuits
        4.1 Introduction
        4.2 Background
            4.2.1 Preliminaries
            4.2.2 Shortcomings of Conventional Approaches
        4.3 Proposed Stochastic Number Generator
            4.3.1 Overview of the Proposed SNG
            4.3.2 Even-distribution Encoding
            4.3.3 Inter-group Randomization
            4.3.4 Proposed Building Block for Bit Shuffling
            4.3.5 Intra-group Randomization
        4.4 Experimental Results
            4.4.1 Accuracy of Generated Stochastic Bit Stream
            4.4.2 Area, Delay, Power, Energy and SCC Average
            4.4.3 Energy Efficiency When Operated under Maximal Precision
        4.5 Conclusion
    5. Approximate De-randomizer for Stochastic Circuits
        5.1 Introduction
        5.2 Proposed Approximate Parallel Counter
            5.2.1 Analysis for Gate Count in 1-layer Approximate PC
            5.2.2 Analysis for Error in 1-layer Approximate PC
        5.3 Experimental Results
        5.4 Conclusion
    6. Dynamic Energy-Accuracy Trade-off Using Stochastic Computing in Deep Neural Networks
        6.1 Introduction
        6.2 Background
        6.4 DNN Using Stochastic Circuit
            6.4.1 Overview of the Proposed DNN using SC
            6.4.2 Removing Near-Zero Weights
            6.4.3 Applying Weight Scaling
            6.4.4 Activation Function with Accumulation
        6.5 Early Decision Termination
            6.5.1 Moving Average Tracking Output Trends
        6.6 Experimental Results
            6.6.1 Accuracy of DNN Using SC
            6.6.2 Effectiveness of Early Decision Termination
            6.6.3 Comparison of Synthesis Results
        6.7 Conclusion
    7. Conclusion
    Bibliography
    Abstract in Korean (요약, 국문초록)
    Docto