8 research outputs found

    New algorithms for hardware-efficient implementation of sign detection and magnitude comparison in residue number systems

    Residue Number System (RNS), being a non-positional number system, is emerging as a promising data representation to substitute the accustomed two’s complement number system for low-power and high-speed digital signal processing. Due to its carry-free property at sub-word level, arithmetic operations such as addition, subtraction and multiplication can be performed at higher speed than in any weighted number system. Nonetheless, arithmetic operations such as sign detection and signed integer magnitude comparison are more difficult to implement in RNS. This thesis aims to develop new algorithms for hardware-efficient implementation of these difficult operations in RNS. Specifically, the following new findings and results are reported in this thesis. First and foremost, a new fast and area-efficient adder-based sign detector for the RNS moduli set {2^n − 1, 2^n, 2^n + 1} has been proposed. The circuit is greatly simplified by shrinking the dynamic range to eliminate large modulo operations with the help of the new Chinese remainder theorem (CRT)-I. Synthesis results based on a 65 nm CMOS standard cell library show that the proposed design outperforms all existing adder-based sign detectors reported for this moduli set in area and speed for n ranging from 5 to 25 in steps of 5. Next, a radically different quantization approach for comparing signed integers in the four-moduli supersets, S1 ≡ {2^(n+k), 2^n − 1, 2^n + 1, 2^(n+1) − 1} and S2 ≡ {2^(n+k), 2^n − 1, 2^n + 1, 2^(n−1) − 1}, is proposed. The dynamic range of the target moduli set is quantized into equal divisions, and the ranks of the divisions in which the residue representations of the integers under comparison reside are identified and compared. This approach allows the sign of a residue representation to be directly extracted from the most significant bit of its rank, or with some simple additional logic function if it resides in the middle rank. 
Compared with the best existing signed magnitude comparator applicable to these two moduli sets, the synthesis results based on 65 nm CMOS technology show that the proposed design is at least 22.48% smaller, 19.08% faster and 27.66% more energy-efficient for S1 and 18.75% smaller, 16.41% faster and 23.30% more energy-efficient for S2. Last but not least, a scaling-assisted signed integer comparator for the balanced five-moduli set {2^n − 1, 2^n, 2^n + 1, 2^(n+1) − 1, 2^(n−1) − 1} has been proposed. The signs of the operands in comparison, as well as the sign of their difference, are detected after scaling them by a factor of (2^(2n) − 1)(2^(n−1) − 1). The resulting finite series in the composite modulus channel is further factored into parallel carry-saved additions in the existing mod 2^n and mod 2^(n+1) − 1 modulus channels to reduce the sizes of the modulo adders from 5n bits to n and n+1 bits. Upon detecting the signs of the operands and their difference, the relation is inferred with a small number of logic gates. Synthesis results in 65 nm CMOS technology show that the proposed design is 36.9% smaller, 7.6% faster and 45.5% more energy-efficient than the best CRT-II based magnitude comparator and at least 12.9% smaller, 7.3% faster and 20.8% more energy-efficient than the best reverse-conversion based implementation of a signed integer comparator for the same five-moduli set. Doctor of Philosophy (EEE

    Residue Number System Based Building Blocks for Applications in Digital Signal Processing

    This doctoral thesis deals with designing residue number system based building blocks to enhance the performance of digital signal processing applications. The residue number system (RNS) is a non-weighted number system that provides carry-free, parallel, high-speed, secure and fault-tolerant arithmetic operations. These features make it very attractive for high-performance and fault-tolerant digital signal processing (DSP) applications. A typical RNS system consists of three main components. The first is the binary-to-residue converter, which computes the RNS equivalent of the inputs represented in the binary number system. The second is a set of parallel residue arithmetic units that perform arithmetic operations on the operands already represented in RNS. The last is the residue-to-binary converter, which converts the outputs back into their binary representation. The main aim of this thesis was to propose novel structures of the basic components of this system so that they can later be used as fundamental units in DSP applications. The thesis covers improving and designing novel structures of these components, and simulating and verifying their efficiency via FPGA implementation. In addition to suggesting novel structures of basic RNS components, a detailed study of different moduli sets that compares them and determines the most efficient one for different dynamic range requirements is also presented. One of the main outcomes of this thesis is identifying and verifying the main condition that should be met when choosing a moduli set in order to improve the timing performance of a DSP application. An RNS-based image processing application is also proposed. Its efficiency, in terms of timing performance and power consumption, is demonstrated by comparing it with a binary-based one. 
Finally, the main considerations that should be taken into account when choosing between the binary number system and RNS are also discussed in detail.
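
The three-component structure described in this abstract (binary-to-residue converter, parallel residue arithmetic units, residue-to-binary converter) can be illustrated with a small software model. This is only a sketch under stated assumptions: the moduli set (31, 32, 33) and every function name are hypothetical and are not taken from the thesis, and the reverse converter uses the textbook Chinese remainder theorem, not any of the hardware structures the thesis proposes.

```python
from math import prod

# Hypothetical moduli set of the form {2^n - 1, 2^n, 2^n + 1} with n = 5.
MODULI = (31, 32, 33)

def binary_to_residue(x, moduli=MODULI):
    """Component 1: forward converter, one residue per channel."""
    return tuple(x % m for m in moduli)

def residue_mac(acc, a, b, moduli=MODULI):
    """Component 2: multiply-accumulate, done independently in each channel.

    No carries propagate between channels, which is what lets the
    channels run in parallel in hardware.
    """
    return tuple((s + x * y) % m for s, x, y, m in zip(acc, a, b, moduli))

def residue_to_binary(residues, moduli=MODULI):
    """Component 3: reverse converter via the textbook CRT."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

def rns_dot(xs, hs, moduli=MODULI):
    """Dot product (for example, one FIR filter output) computed in RNS."""
    acc = (0,) * len(moduli)
    for x, h in zip(xs, hs):
        acc = residue_mac(acc,
                          binary_to_residue(x, moduli),
                          binary_to_residue(h, moduli),
                          moduli)
    return residue_to_binary(acc, moduli)

if __name__ == "__main__":
    print(rns_dot([12, 7, 30], [5, 9, 11]))  # 12*5 + 7*9 + 30*11 = 453
```

The result is exact only while it stays inside the dynamic range M = 31 · 32 · 33 = 32736; beyond that it wraps modulo M, which is why the choice of moduli set for a given dynamic range matters.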

    Scalable Energy-efficient Microarchitectures with Computational Error Tolerance

    Dennard scaling of conventional semiconductor technology has reached its limit, resulting in issues pertaining to leakage current and threshold voltage. Energy savings found at the transistor level by simply lowering supply voltage are no longer available for these devices (e.g., MOSFETs) and have reached the Landauer-Shannon limit. Recent proposals of millivolt switch technologies aim to extend the technology scaling roadmap by maintaining a high on/off ratio of drain current with a much lower supply voltage. However, high intermittent error probabilities in millivolt switches constrain their Vdd reduction for traditional architectures. Thus, there is an urgent need for scalable and energy-efficient micro-architectures with computational error tolerance. This thesis systematically leverages the error detection and correction properties of the Redundant Residue Number System (RRNS) by varying the number of non-redundant (n) and redundant (r) components (residues), and selects and discusses trade-offs among configuration points from a two-dimensional (n, r)-RRNS design plane that meet certain capabilities of error detection and/or correction. Being able to efficiently handle resilience in this (n, r)-RRNS plane significantly improves reliability, allowing further Vdd reduction and energy savings. First, the necessary implementation details of RRNS cores are discussed. Second, scalable RRNS micro-architectures that simultaneously support both error correction and checkpointing with restart capabilities for uncorrectable errors are proposed. Third, novel RRNS-based adaptive checkpointing-and-restart mechanisms are designed that automatically guarantee reliability while minimizing the energy-delay product (EDP). Finally, the RRNS design space is explored to find the optimal (n, r) configuration points. 
For similar reliability when compared to a conventional binary core (running at high Vdd) without computational error tolerance, the proposed RRNS scalable micro-architecture reduces EDP by 53% on average for memory-intensive workloads and by 67% on average for non-memory-intensive workloads. This thesis's second topic is to alleviate the fault rate and power consumption issues of exascale computing. Faults in High-Performance Computing (HPC) have become an urgent challenge, with the estimated Mean Time Between Failures (MTBF) of an exascale system projected to be only several minutes with contemporary methodologies. Unfortunately, existing error-tolerance technologies in the context of HPC systems have serious deficiencies such as insufficient error-tolerance coverage, high power consumption, and difficult integration with existing workloads. Considering Department of Energy (DOE) guidelines that limit exascale power consumption to 20 MW, this thesis highlights the issue of energy usage and proposes a thread-level fault tolerance mechanism compatible with current state-of-the-art exascale programming models while simultaneously meeting the requirements of full system error protection. Additionally, an efficient micro-architecture and corresponding mechanisms that can support thread-level RRNS are discussed. Experimental results show that this strategy reduces energy consumption by 62.25% and the energy-delay product by 58.67% on average when compared with state-of-the-art black box resilience techniques. Ph.D
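
To make the (n, r)-RRNS idea in this abstract concrete, here is a minimal sketch of textbook RRNS error detection and single-error correction. It is not the micro-architecture from the thesis. The configuration (n = 3 non-redundant moduli 3, 5, 7 and r = 2 redundant moduli 11, 13) and all names are hypothetical, chosen so that each redundant modulus is larger than every non-redundant one. A word is legitimate only if decoding over all moduli lands inside the range spanned by the non-redundant moduli.

```python
from math import prod

INFO_MODULI = (3, 5, 7)        # n = 3 non-redundant moduli (hypothetical)
REDUNDANT_MODULI = (11, 13)    # r = 2 redundant moduli (hypothetical)
MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGIT_RANGE = prod(INFO_MODULI)  # legitimate values are 0 .. 104

def encode(x):
    """Represent x by its residues in all n + r channels."""
    return [x % m for m in MODULI]

def crt(residues, moduli):
    """Textbook CRT decode to an integer in [0, prod(moduli))."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

def has_error(residues):
    """A corrupted residue pushes the decoded value out of the legitimate range."""
    return crt(residues, MODULI) >= LEGIT_RANGE

def correct_single_error(residues):
    """Leave-one-out decoding: drop each channel in turn and keep the
    first candidate that falls back inside the legitimate range."""
    if not has_error(residues):
        return crt(residues, MODULI)
    for skip in range(len(MODULI)):
        rest_r = [r for i, r in enumerate(residues) if i != skip]
        rest_m = [m for i, m in enumerate(MODULI) if i != skip]
        candidate = crt(rest_r, rest_m)
        if candidate < LEGIT_RANGE:
            return candidate
    return None  # uncorrectable; a real system would restart from a checkpoint

if __name__ == "__main__":
    word = encode(52)
    word[1] = (word[1] + 1) % MODULI[1]  # inject a single residue error
    print(has_error(word), correct_single_error(word))
```

With r = 2 this configuration detects up to two corrupted residues and corrects one; moving along the (n, r) plane trades extra channels for stronger detection and correction.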

    The Fifth NASA Symposium on VLSI Design

    The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design

    Cloud-based homomorphic encryption for privacy-preserving machine learning in clinical decision support

    While privacy and security concerns dominate public cloud services, Homomorphic Encryption (HE) is seen as an emerging solution that ensures secure processing of sensitive data via untrusted networks in the public cloud or by third-party cloud vendors. It relies on the fact that some encryption algorithms display the property of homomorphism, which allows them to manipulate data meaningfully while still in encrypted form, although there are major stumbling blocks to overcome before the technology is considered mature for production cloud environments. Such a framework would find particular relevance in Clinical Decision Support (CDS) applications deployed in the public cloud. CDS applications have an important computational and analytical role over confidential healthcare information with the aim of supporting decision-making in clinical practice. Machine Learning (ML) is employed in CDS applications that typically learn and can personalise actions based on individual behaviour. A relatively simple-to-implement, common and consistent framework is sought that can overcome most limitations of Fully Homomorphic Encryption (FHE) in order to offer an expanded and flexible set of HE capabilities. In the absence of a significant breakthrough in FHE efficiency and practical use, it would appear that a solution relying on client interactions is the best known entity for meeting the requirements of private CDS-based computation, so long as security is not significantly compromised. A hybrid solution is introduced that intersperses limited two-party interactions amongst the main homomorphic computations, allowing exchange of both numerical and logical cryptographic contexts in addition to resolving other major FHE limitations. Interactions involve the use of client-based ciphertext decryptions blinded by data obfuscation techniques to maintain privacy. 
This thesis explores the middle ground whereby HE schemes can provide improved and efficient arbitrary computational functionality over a significantly reduced two-party network interaction model involving data obfuscation techniques. This compromise allows for the powerful capabilities of HE to be leveraged, providing a more uniform, flexible and general approach to privacy-preserving system integration, which is suitable for cloud deployment. The proposed platform is uniquely designed to make HE more practical for mainstream clinical application use, equipped with a rich set of capabilities and potentially very complex depth of HE operations. Such a solution would be suitable for the long-term privacy-preserving processing requirements of a cloud-based CDS system, which would typically require complex combinatorial logic, workflow and ML capabilities.
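
The "property of homomorphism" mentioned in this abstract can be shown with a toy example. The sketch below uses unpadded textbook RSA, which is multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. This is an illustration only. It is not the scheme or the hybrid platform described in the thesis, the tiny key is insecure by design, and all names and parameters are assumptions chosen for readability.

```python
# Toy textbook-RSA parameters (insecure, illustration only).
P, Q = 61, 53
N = P * Q                           # public modulus, 3233
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, Python 3.8+

def encrypt(m):
    """Unpadded RSA encryption of an integer 0 <= m < N."""
    return pow(m, E, N)

def decrypt(c):
    """Unpadded RSA decryption."""
    return pow(c, D, N)

def multiply_encrypted(c1, c2):
    """Done by the untrusted party: it never sees the plaintexts."""
    return (c1 * c2) % N

if __name__ == "__main__":
    c = multiply_encrypted(encrypt(7), encrypt(6))
    print(decrypt(c))  # the key holder recovers 7 * 6 mod N
```

A scheme like this supports only one operation (multiplication modulo N). Fully homomorphic schemes support both addition and multiplication, at the efficiency cost the abstract refers to.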

    Review of Particle Physics (2016)


    Review of particle physics

    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 3,062 new measurements from 721 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. All the particle properties and search limits are listed in Summary Tables. We also give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 117 reviews are many that are new or heavily revised, including those on Pentaquarks and Inflation. The complete Review is published online in a journal and on the website of the Particle Data Group (http://pdg.lbl.gov). The printed PDG Book contains the Summary Tables and all review articles but no longer includes the detailed tables from the Particle Listings. A Booklet with the Summary Tables and abbreviated versions of some of the review articles is also available

    Review of Particle Physics

    The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 3,062 new measurements from 721 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as supersymmetric particles, heavy bosons, axions, dark photons, etc. All the particle properties and search limits are listed in Summary Tables. We also give numerous tables, figures, formulae, and reviews of topics such as Higgs Boson Physics, Supersymmetry, Grand Unified Theories, Neutrino Mixing, Dark Energy, Dark Matter, Cosmology, Particle Detectors, Colliders, Probability and Statistics. Among the 117 reviews are many that are new or heavily revised, including new reviews on Pentaquarks and Inflation. The complete Review is published online in a journal and on the website of the Particle Data Group (http://pdg.lbl.gov). The printed PDG Book contains the Summary Tables and all review articles but no longer includes the detailed tables from the Particle Listings. A Booklet with the Summary Tables and abbreviated versions of some of the review articles is also available. The publication of the Review of Particle Physics is supported by the Director, Office of Science, Office of High Energy Physics of the U.S. Department of Energy under Contract No. DE–AC02–05CH11231; by the European Laboratory for Particle Physics (CERN); by an implementing arrangement between the governments of Japan (MEXT: Ministry of Education, Culture, Sports, Science and Technology) and the United States (DOE) on cooperative research and development; by the Institute of High Energy Physics, Chinese Academy of Sciences; and by the Italian National Institute of Nuclear Physics (INFN). The authors are grateful to Vincent Vennin for his careful reading of this manuscript and preparing Fig. 23.3 for this review. The work of J.E. was supported in part by the London Centre for Terauniverse Studies (LCTS), using funding from the European Research Council via the Advanced Investigator Grant 267352 and from the UK STFC via the research grant ST/L000326/1. The work of D.W. was supported in part by the UK STFC research grant ST/K00090X/1.