
    Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

    State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution. Such networks strain the computational capabilities and energy available to embedded and mobile processing platforms, restricting their use in many important applications. In this paper, we push the boundaries of hardware-effective CNN design by proposing BCNN with Separable Filters (BCNNw/SF), which applies Singular Value Decomposition (SVD) on BCNN kernels to further reduce computational and storage complexity. To enable its implementation, we provide a closed form of the gradient over SVD to calculate the exact gradient with respect to every binarized weight in backward propagation. We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware. Our BCNNw/SF accelerator realizes memory savings of 17% and execution time reduction of 31.3% compared to BCNN with only minor accuracy sacrifices. Comment: 9 pages, 6 figures, accepted for Embedded Vision Workshop (CVPRW)
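
    The separable-filter idea above can be illustrated with a minimal NumPy sketch: take the SVD of a 2-D kernel, keep the leading singular vector pair, and replace one 2-D convolution with two 1-D convolutions. This only shows the rank-1 approximation the abstract refers to; the paper's binarization of the separable filters and its closed-form gradient over the SVD are not reproduced here, and all names below are illustrative.

        import numpy as np
        from scipy.signal import convolve2d

        def separable_approx(kernel):
            """Rank-1 (separable) approximation of a 2-D kernel via SVD."""
            u, s, vt = np.linalg.svd(kernel)
            col = u[:, 0] * np.sqrt(s[0])      # vertical 1-D filter
            row = vt[0, :] * np.sqrt(s[0])     # horizontal 1-D filter
            return col, row

        rng = np.random.default_rng(0)
        k = np.sign(rng.standard_normal((3, 3)))       # toy binarized 3x3 kernel
        col, row = separable_approx(k)

        x = rng.standard_normal((8, 8))                # toy input feature map
        full = convolve2d(x, k, mode="valid")          # one 2-D convolution
        sep = convolve2d(convolve2d(x, col[:, None], mode="valid"),
                         row[None, :], mode="valid")   # two 1-D convolutions
        print(np.max(np.abs(full - sep)))              # small only if k is close to rank-1

    Storing the two 1-D filters takes 2k values per k-by-k kernel instead of k^2, which is where the storage and compute reduction comes from; the accuracy cost depends on how far each binarized kernel is from rank 1.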

    Probabilistic Computability and Choice

    We study the computational power of randomized computations on infinite objects, such as real numbers. In particular, we introduce the concept of a Las Vegas computable multi-valued function, which is a function that can be computed on a probabilistic Turing machine that receives a random binary sequence as auxiliary input. The machine can take advantage of this random sequence, but it always has to produce a correct result or to stop the computation after finite time if the random advice is not successful, and the random advice has to be successful with positive probability. We characterize the class of Las Vegas computable functions in the Weihrauch lattice with the help of probabilistic choice principles and Weak Weak Kőnig's Lemma. Among other things, we prove an Independent Choice Theorem that implies that Las Vegas computable functions are closed under composition. In a case study we show that Nash equilibria are Las Vegas computable, while zeros of continuous functions with sign changes cannot be computed on Las Vegas machines. However, we show that the latter problem admits randomized algorithms with weaker failure-recognition mechanisms. These last results can be interpreted as saying that the Intermediate Value Theorem is reducible to the jump of Weak Weak Kőnig's Lemma, but not to Weak Weak Kőnig's Lemma itself. These examples also demonstrate that Las Vegas computable functions form a proper superclass of the class of computable functions and a proper subclass of the class of non-deterministically computable functions. We also study the impact of specific lower bounds on the success probabilities, which leads to a strict hierarchy of classes. In particular, the classical technique of probability amplification fails for computations on infinite objects. We also investigate the dependency on the underlying probability space. Comment: Information and Computation (accepted for publication)
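
    In finite terms, the Las Vegas discipline described above means: the machine may consult random advice, but it either outputs a verified, correct answer or explicitly reports failure, and it succeeds with positive probability. The toy Python sketch below illustrates only that finite pattern, not the infinite-object setting or the Weihrauch-lattice machinery of the paper; all names in it are made up.

        import random

        class AdviceFailed(Exception):
            """Raised when the random advice did not lead to a result (recognizable failure)."""

        def las_vegas_find_one(bits, advice, budget=32):
            """Return an index i with bits[i] == 1, or fail recognizably.

            Any answer returned has been verified, so it is always correct;
            only the decision to give up depends on the random advice.
            """
            for _ in range(budget):
                i = advice.randrange(len(bits))
                if bits[i] == 1:          # verification step: never output an unverified answer
                    return i
            raise AdviceFailed

        bits = [0, 0, 1, 0] * 1000        # at least a quarter of the entries are 1
        advice = random.Random()           # stands in for the random auxiliary input
        try:
            print(las_vegas_find_one(bits, advice))
        except AdviceFailed:
            print("no answer (failure is recognized, never a wrong output)")

    Here the success probability is at least 1 - (3/4)^32, so the positive-probability requirement is met while wrong outputs are impossible.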

    The Theory of Relational Cohesion: Review of a Research Program

    In this paper we analyze and review the theory of relational cohesion and its attendant program of research. Since the early 1990s, the theory has evolved to answer a number of basic questions regarding cohesion and commitment in social exchange relations. Drawing from the sociology of emotion and modern theories of social identity, the theory asserts that joint activity in the form of frequent exchange unleashes positive emotions and perceptions of relational cohesion. In turn, relational cohesion is predicted to be the primary cause of commitment behavior in a range of situations. Here we outline the theory of relational cohesion, tracing its development through the present day, and summarize the corpus of empirical evidence for the theory’s claims. We conclude by looking ahead to future projects and discussing some of the more general issues informed by our work.

    Quantum and spin-based tunneling devices for memory systems

    Rapid developments in information technology, such as the internet, portable computing, and wireless communication, create a huge demand for fast and reliable ways to store and process information. Thus far, this demand has been met by the revolution in solid-state memory technologies. Memory devices, such as SRAM, DRAM, and flash, have been widely used in most electronic products. The primary strategy for keeping up with this trend is miniaturization. CMOS devices have been scaled down below 45 nm, the size of only a few atomic layers. Scaling, however, will soon reach the physical limitations of the material and cease to yield the desired enhancement in device performance. In this thesis, an alternative to scaling is proposed and successfully realized. The proposed scheme integrates quantum devices, Si/SiGe resonant interband tunnel diodes (RITD), with classical CMOS devices, forming a microsystem of disparate devices that achieves higher performance as well as higher density. The device/circuit designs, layouts, and masks, involving 12 levels, were fabricated using a process of nearly a hundred steps. Utilizing the unique characteristics of each component, a low-power tunneling-based static random access memory (TSRAM) has been demonstrated. The TSRAM cells exhibit bistable operation with a power supply voltage as low as 0.37 V. Various TSRAM cells were also constructed, and their latching mechanisms have been extensively investigated. In addition, the operating margins of TSRAM cells are evaluated for different device structures and for temperature variation from room temperature up to 200 °C. The versatility of TSRAM is extended beyond the binary system. Using multi-peak Si/SiGe RITD, various multi-valued TSRAM (MV-TSRAM) configurations that can store more than two logic levels per cell are demonstrated, substantially increasing memory density. Using two novel methods, ambipolar operation and the use of enable/disable transistors, a six-valued MV-TSRAM cell is demonstrated. A novel concept of integrating Si/SiGe RITD with spin tunneling devices, magnetic tunnel junctions (MTJ), has been developed. This hybrid approach adds non-volatility and multi-valued memory potential, as demonstrated by theoretical predictions and simulations. The challenges of physically fabricating these devices, including process compatibility and device design, have been identified. A test-bed approach to fabricating RITD-MTJ structures has been developed. In conclusion, this body of work has created a sound foundation for new research frontiers in four major areas: integrated TSRAM system, MV-TSRAM system, MTJ/RITD-based nonvolatile MRAM, and RITD/CMOS logic circuits.
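
    As a rough, back-of-the-envelope reading of the density claim (not a result from the thesis): a cell that distinguishes L logic levels stores log2(L) bits, so a six-valued cell holds about 2.58 bits where a binary cell holds one.

        from math import log2

        for levels in (2, 4, 6):
            bits = log2(levels)
            print(f"{levels}-valued cell: {bits:.2f} bits per cell "
                  f"({bits:.2f}x a binary cell)")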

    The "MIND" Scalable PIM Architecture

    MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore, with multiple memory/processor nodes on each chip, and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real-time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture.
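
    The execution model hinted at above, in which messages arriving at a memory/processor node drive processing on that node's local data and replies travel as further messages rather than blocking the requester, can be caricatured in a few lines. The Python sketch below is a generic illustration of message-driven, split-transaction processing under those assumptions, not the MIND parcel mechanism, and every name in it is invented.

        import queue, threading, time

        class PimNode:
            """Toy memory/processor node: one thread per node, driven entirely by
            incoming messages that operate on the node's local memory."""
            def __init__(self, name):
                self.name = name
                self.memory = {}                    # stands in for the on-die DRAM
                self.inbox = queue.Queue()
                threading.Thread(target=self._run, daemon=True).start()

            def send(self, msg):
                self.inbox.put(msg)

            def _run(self):
                while True:
                    op, addr, payload, reply_to = self.inbox.get()   # wait for a message
                    if op == "write":
                        self.memory[addr] = payload
                    elif op == "read":
                        # split transaction: the response is just another message; nobody blocks
                        reply_to.send(("write", addr, self.memory.get(addr), None))

        node_a, node_b = PimNode("a"), PimNode("b")
        node_a.send(("write", 0x10, 42, None))
        node_a.send(("read", 0x10, None, node_b))   # ask node_a to forward the value to node_b
        time.sleep(0.1)
        print(node_b.memory)                         # {16: 42} once both messages are handled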

    Noise-based information processing: Noise-based logic and computing: what do we have so far?

    We briefly introduce noise-based logic. After describing the main motivations, we outline classical, instantaneous (squeezed and non-squeezed), continuum, spike, and random-telegraph-signal based schemes, with applications such as circuits that emulate brain functioning and string verification via a slow communication channel. Comment: Invited talk at the 21st International Conference on Noise and Fluctuations, Toronto, Canada, June 12-16, 2011
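
    A common ingredient of the noise-based schemes listed above is that logic values ride on statistically independent (hence nearly orthogonal) noise carriers and are read out by correlating against reference noises. The NumPy sketch below illustrates only that single idea with random telegraph signals; it is a minimal guess at a demonstration, not material from the talk, and its parameters are arbitrary.

        import numpy as np

        rng = np.random.default_rng(1)
        N = 100_000                          # samples per noise carrier

        def rts(n, p_switch=0.1):
            """Random telegraph signal: +/-1 levels that flip with probability p_switch per step."""
            flips = rng.random(n) < p_switch
            return np.where(np.cumsum(flips) % 2 == 0, 1.0, -1.0)

        # independent reference noises standing for logic LOW and HIGH
        ref_low, ref_high = rts(N), rts(N)

        def encode(bit):
            return ref_high if bit else ref_low          # the carrier itself is the logic signal

        def decode(signal):
            # correlate against both references; the stronger correlation wins
            return np.dot(signal, ref_high) > np.dot(signal, ref_low)

        print(decode(encode(1)), decode(encode(0)))      # True False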