146 research outputs found

    Design and Code Optimization for Systems with Next-generation Racetrack Memories

    With the rise of computationally expensive application domains such as machine learning, genomics, and fluid simulation, the quest for high-performance, energy-efficient computing has gained unprecedented momentum. The significant increase in computing and memory devices in modern systems has resulted in an unsustainable surge in energy consumption, a substantial portion of which is attributed to the memory system. The scaling of conventional memory technologies and their suitability for next-generation systems are also questionable. This has led to the emergence and rise of nonvolatile memory (NVM) technologies. Today, several NVM technologies in different stages of development are competing for rapid access to the market. Racetrack memory (RTM) is one such technology; it promises SRAM-comparable latency, reduced energy consumption, and unprecedented density compared to other technologies. However, RTM is sequential in nature, i.e., data in an RTM cell needs to be shifted to an access port before it can be accessed. These shift operations incur performance and energy penalties. An ideal RTM, requiring at most one shift per access, can easily outperform SRAM. However, in the worst-case shifting scenario, RTM can be an order of magnitude slower than SRAM. This thesis presents an overview of the RTM device physics, its evolution, strengths and challenges, and its application in the memory subsystem. We develop tools that allow the programmability and modeling of RTM-based systems. To minimize shifts, we propose a set of techniques including optimal, near-optimal, and evolutionary algorithms for efficient scalar and instruction placement in RTMs. For array accesses, we explore schedule and layout transformations that eliminate the longer overhead shifts in RTMs. 
We present an automatic compilation framework that analyzes static control flow programs and transforms the loop traversal order and memory layout to maximize accesses to consecutive RTM locations and minimize shifts. We develop a simulation framework called RTSim that models various RTM parameters and enables accurate architectural-level simulation. Finally, to demonstrate the potential of RTM in non-von-Neumann in-memory computing paradigms, we exploit its device attributes to implement logic and arithmetic operations. As a concrete use case, we implement an entire hyperdimensional computing framework in RTM to accelerate the language recognition problem. Our evaluation shows considerable performance and energy improvements compared to conventional von Neumann models and state-of-the-art accelerators.
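
The shift penalty described in this abstract can be illustrated with a toy cost model. The sketch below is not taken from the thesis or from RTSim; it assumes a single track with one access port, where each access costs the distance between the port's current position and the target cell. The names `count_shifts`, `naive`, and `better` are hypothetical.

```python
def count_shifts(placement, accesses, start=0):
    """Total shifts for an access sequence on one track with one port.

    placement maps each variable name to its cell index on the track.
    Each access costs the distance from the current port position.
    (Toy model; real RTMs may have several ports and tracks.)
    """
    position = start
    total = 0
    for var in accesses:
        target = placement[var]
        total += abs(target - position)
        position = target
    return total


# A short access trace in which "a" and "b" alternate frequently.
accesses = ["a", "b", "a", "b", "c", "a", "b"]

# Two candidate placements of the same three scalars.
naive = {"a": 0, "c": 1, "b": 2}   # "a" and "b" are two cells apart
better = {"a": 0, "b": 1, "c": 2}  # "a" and "b" are adjacent

naive_cost = count_shifts(naive, accesses)
better_cost = count_shifts(better, accesses)
print(naive_cost, better_cost)
```

Under this simplified model, placing frequently alternating variables in adjacent cells lowers the total shift count, which is the intuition behind placement-based shift minimization.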

    The number of solutions for random regular NAE-SAT

    Recent work has made substantial progress in understanding the transitions of random constraint satisfaction problems. In particular, for several of these models, the exact satisfiability threshold has been rigorously determined, confirming predictions of statistical physics. Here we revisit one of these models, random regular k-NAE-SAT: knowing the satisfiability threshold, it is natural to study, in the satisfiable regime, the number of solutions in a typical instance. We prove here that these solutions have a well-defined free energy (limiting exponential growth rate), with explicit value matching the one-step replica symmetry breaking prediction. The proof develops new techniques for analyzing a certain "survey propagation model" associated with this problem. We believe that these methods may be applicable to a wide class of related problems.

    Multipartite entanglement and quantum algorithms

    [eng] Quantum information science has grown from a very small subfield in the 1970s into one of the most dynamic fields in physics, both in fundamentals and applications. On the theoretical side, perhaps the feature that has attracted the most interest is the notion of entanglement, the ghostly relation between particles that dazzled Einstein and has provided fabulous challenges for building a coherent interpretation of quantum mechanics. While this problem is not completely solved, we have today learned enough to feel less uneasy with it, and the focus has shifted towards its potentially powerful applications. Entanglement is now being studied from different perspectives as a resource for performing information processing tasks. While bipartite entanglement is largely understood nowadays, many questions remain unanswered in the multipartite case. The first part of this thesis deals with multipartite entanglement in different contexts. In the first chapters it is studied within the whole corresponding Hilbert space, and we investigate several entanglement measures, searching for states that maximize them, including violations of Bell inequalities. Later, the focus shifts towards Hamiltonians that have entangled ground states: we investigate entanglement as a way to establish a distance between theories, and we study frustration and methods to efficiently solve Hamiltonians that exhibit it. On the practical side, the most promising upcoming technological advance is the advent of quantum computers. In the 1990s, quantum algorithms that outperform all known classical algorithms for certain problems started to appear, while in the 2000s the first universal computers of a few atoms began to be built, allowing those algorithms to be implemented at small scales. The D-Wave machine already performs quantum annealing on thousands of qubits, although some controversy surrounds the true quantumness of its internal workings. 
Many countries around the world are devoting large amounts of money to this field, with the recent European flagship and the involvement of the largest US technology companies giving reasons for optimism. The second part of this thesis deals with some aspects of quantum computation, starting with the creation of the field of cloud quantum computation with the appearance of the first computer available to the general public over the internet, which we have used and analysed extensively. This second part also includes brief forays into quantum adiabatic computation and quantum thermodynamics.[cat] Quantum information has grown from a small subfield in the 1970s into one of the most dynamic fields of physics today, both in its fundamental aspects and in its applications. On the theoretical side, perhaps the property that has attracted the most interest is the notion of entanglement, the ghostly relation between particles that astonished Einstein and has posed an enormous challenge to building a coherent interpretation of quantum mechanics. Although this fundamental problem is not fully solved, we have learned enough to feel less uneasy with it, and the focus has shifted to its potential applications. Entanglement is studied today from different perspectives as a resource for carrying out information processing tasks. Bipartite entanglement is already very well understood, but many open questions remain in the multipartite case. The first part of this thesis deals with multipartite entanglement in different contexts. We study the hyperdeterminant as an entanglement measure in the case of 4 qubits, analyse the existence and mathematical properties of absolutely maximally entangled states, find new Bell inequalities, study the entanglement spectrum as a measure of distance between theories, and study tensor networks for treating frustrated systems efficiently. 
On the practical side, the most promising technological advance in the field is the advent of quantum computers. The second part of the thesis deals with some aspects of quantum computation, starting with the creation of the field of cloud quantum computation, with the appearance of the first computer available to the general public, which we have used extensively. We also make brief forays into quantum adiabatic computation and quantum thermodynamics in this second part.

    Acoustic event detection and localization using distributed microphone arrays

    Automatic acoustic scene analysis is a complex task that involves several functionalities: detection (time), localization (space), separation, recognition, etc. This thesis focuses on both acoustic event detection (AED) and acoustic source localization (ASL) when several sources may be simultaneously present in a room. In particular, the experimental work is carried out in a meeting-room scenario. Unlike previous works that either employed models of all possible sound combinations or additionally used video signals, this thesis tackles the problem of time-overlapping sounds by exploiting the signal diversity that results from using multiple microphone array beamformers. The core of this thesis is a computationally efficient approach that consists of three processing stages. In the first stage, a set of (null) steering beamformers is used to carry out diverse partial signal separations, using multiple arbitrarily located linear microphone arrays, each composed of a small number of microphones. In the second stage, each beamformer output goes through a classification step, which uses models for all the targeted sound classes (HMM-GMM, in the experiments). Then, in the third stage, the classifier scores, whether intra- or inter-array, are combined using a probabilistic criterion (like MAP) or a machine learning fusion technique (fuzzy integral (FI), in the experiments). This processing scheme is applied to a set of problems of increasing complexity, which are defined by the assumptions made regarding the identities (plus time endpoints) and/or positions of sounds. The thesis starts with the problem of unambiguously mapping the identities to the positions, continues with AED (positions assumed) and ASL (identities assumed), and ends with the integration of AED and ASL in a single system, which does not need any assumption about identities or positions. 
The evaluation experiments are carried out in a meeting-room scenario, where two sources are temporally overlapped; one of them is always speech and the other is an acoustic event from a pre-defined set. Two different databases are used: one is produced by merging signals actually recorded in the UPC's department smart-room, and the other consists of overlapping sound signals directly recorded in the same room in a rather spontaneous way. The experimental results with a single array show that the proposed detection system performs better than either the model-based system or a system based on blind source separation. Moreover, the product-rule-based combination and the FI-based fusion of the scores resulting from the multiple arrays improve the accuracies further. In addition, the posterior position assignment is performed with a very small error rate. Regarding ASL, and assuming an accurate AED system output, the 1-source localization performance of the proposed system is slightly better than that of the widely used SRP-PHAT system working in an event-based mode, and it performs significantly better than the latter in the more complex 2-source scenario. Finally, though the joint system suffers a slight degradation in classification accuracy with respect to the case where the source positions are known, it shows the advantage of carrying out the two tasks, recognition and localization, with a single system, and it allows the inclusion of information about the prior probabilities of the source positions. It is also worth noting that, although the acoustic scenario used for experimentation is rather limited, the approach and its formalism were developed for a general case, where the number and identities of sources are not constrained.
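
The product-rule combination of classifier scores mentioned in this abstract can be sketched as follows. This is only an illustration of the generic product rule, not the thesis's implementation: it assumes each array yields strictly positive posterior probabilities over the same set of classes, and the class names, the numbers, and the function name `product_rule` are hypothetical.

```python
import math


def product_rule(scores_per_array):
    """Fuse per-array class posteriors by multiplying and renormalizing.

    scores_per_array: list of dicts, one per microphone array, each
    mapping a class label to a strictly positive probability.
    """
    classes = list(scores_per_array[0].keys())
    # A sum of logs equals the log of the product and avoids underflow.
    log_joint = {
        c: sum(math.log(scores[c]) for scores in scores_per_array)
        for c in classes
    }
    peak = max(log_joint.values())
    unnormalized = {c: math.exp(v - peak) for c, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical posteriors from three arrays for one overlapped segment.
arrays = [
    {"door_slam": 0.5, "keyboard": 0.3, "cough": 0.2},
    {"door_slam": 0.4, "keyboard": 0.5, "cough": 0.1},
    {"door_slam": 0.6, "keyboard": 0.2, "cough": 0.2},
]

fused = product_rule(arrays)
best = max(fused, key=fused.get)
print(best, fused)
```

In this made-up example the second array alone would favor "keyboard", but the fused scores favor "door_slam", which shows how combining arrays can overrule a single array's decision.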

    Subject Index Volumes 1–200


    Low Density Graph Codes And Novel Optimization Strategies For Information Transfer Over Impaired Medium

    Effective methods for information transfer over an imperfect medium are of great interest. This thesis addresses the following four topics involving low density graph codes and novel optimization strategies. Firstly, we study the performance of a promising coding technique: low density generator matrix (LDGM) codes. LDGM codes provide satisfactory performance while maintaining low encoding and decoding complexities. In the thesis, the performance of LDGM codes is evaluated for both majority-rule-based and sum-product iterative decoding algorithms. The ultimate performance of the coding scheme is revealed through distance spectrum analysis. We derive the distance spectra for both LDGM codes and concatenated LDGM codes. The results show that serially concatenated LDGM codes deliver extremely low error floors. This work provides valuable information for selecting the parameters of LDGM codes. Secondly, we investigate network coding on relay-assisted wireless multiple access (WMA) networks. Network coding is an effective way to increase the robustness and traffic capacity of networks. Following the framework of network coding, we introduce new network codes for WMA networks. The codes are constructed based on sparse graphs and can exploit the diversity available in both the time and space domains. The integrity of data from relays could be compromised when the relays are deployed in open areas. To address this, we propose a simple but robust security mechanism to verify data integrity. Thirdly, we study the problem of bandwidth allocation for the transmission of multiple sources of data over a single communication medium. We aim to maximize the overall user satisfaction and formulate an optimization problem. Using either the logarithmic or exponential form of the satisfaction function, we derive closed-form optimal solutions and show that the optimal bandwidth allocation for each type of data is piecewise linear with respect to the total available bandwidth. 
Fourthly, we consider an optimization strategy for recovering the target spectrum in filter-array-based spectrometers. We model the spectrophotometric system as a communication system, in which the information content of the target spectrum is passed through distortive filters. By exploiting the non-negative nature of spectral content, a non-negative least-squares criterion is found to be particularly effective. The concept is verified in a hardware implementation

    Angles and devices for quantum approximate optimization

    A potential application of emerging Noisy Intermediate-Scale Quantum (NISQ) devices is that of approximately solving combinatorial optimization problems. This thesis investigates a gate-based algorithm for this purpose, the Quantum Approximate Optimization Algorithm (QAOA), in two major themes. First, we examine how the QAOA solves the problems it is designed to solve. We take a statistical view of the algorithm applied to ensembles of problems, first considering a highly symmetric version of the algorithm that uses Grover drivers. In this highly symmetric context, we find a simple dependence of the QAOA state's expected value on how values of the cost function are distributed. Furthering this theme, we demonstrate that, generally, QAOA performance depends on problem statistics with respect to a metric induced by a chosen driver Hamiltonian. We obtain a method for evaluating QAOA performance on worst-case problems, those of random costs, for differing driver choices. Second, we investigate a QAOA context with device control occurring only via single-qubit gates, rather than using individually programmable one- and two-qubit gates. In this scheme with reduced control overhead (the digital-analog scheme), the complexity of devices running QAOA circuits is decreased at the cost of errors, which are shown to be non-harmful in certain regimes. We then explore hypothetical device designs one could use for this purpose. A possible application of Noisy Intermediate-Scale Quantum (NISQ) devices is the approximate solution of combinatorial optimization problems. This thesis investigates a gate-based algorithm, the Quantum Approximate Optimization Algorithm (QAOA), along two main themes. First, we examine how the QAOA solves the problems for which it was developed. We consider the algorithm in combination with highly symmetric Grover drivers for statistical ensembles of problem instances. 
In this context we find a simple dependence on the distribution of the cost function values. Going further, we show that QAOA performance generally depends on the problem statistics with respect to a metric induced by the chosen driver Hamiltonian. We obtain a method for evaluating QAOA performance on the hardest problems (those of random costs) for different driver choices. Second, we investigate a QAOA variant in which hardware control extends only to single-qubit gates rather than individually programmable one- and two-qubit gates. In this scheme with reduced control overhead (the digital-analog scheme), the complexity of the hardware running the QAOA circuits decreases, at the cost of errors that are shown to be harmless in certain regimes. We then explore hypothetical hardware designs that could be used for this purpose.

    Handbook of Computer Vision Algorithms in Image Algebra
