461 research outputs found

    HW-FlowQ: A Multi-Abstraction Level HW-CNN Co-design Quantization Methodology

    Model compression through quantization is commonly applied to convolutional neural networks (CNNs) deployed on compute- and memory-constrained embedded platforms. Different layers of the CNN can have varying degrees of numerical precision for both weights and activations, resulting in a large search space. Together with the hardware (HW) design space, the challenge of finding the globally optimal HW-CNN combination for a given application becomes daunting. To this end, we propose HW-FlowQ, a systematic approach that enables the co-design of the target hardware platform and the compressed CNN model through quantization. The search space is viewed at three levels of abstraction, allowing for an iterative approach for narrowing down the solution space before reaching a high-fidelity CNN hardware modeling tool, capable of capturing the effects of mixed-precision quantization strategies on different hardware architectures (processing unit counts, memory levels, cost models, dataflows) and two types of computation engines (bit-parallel vectorized, bit-serial). To combine both worlds, a multi-objective non-dominated sorting genetic algorithm (NSGA-II) is leveraged to establish a Pareto-optimal set of quantization strategies for the target HW-metrics at each abstraction level. HW-FlowQ detects optima in a discrete search space and maximizes the task-related accuracy of the underlying CNN while minimizing hardware-related costs. The Pareto-front approach keeps the design space open to a range of non-dominated solutions before refining the design to a more detailed level of abstraction. With equivalent prediction accuracy, we improve the energy and latency by 20% and 45%, respectively, for ResNet56 compared to existing mixed-precision search methods.
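
    The idea of a Pareto-optimal set of non-dominated solutions can be illustrated with a minimal filter. This is only a sketch of the dominance test, not NSGA-II itself and not the HW-FlowQ implementation; the `Candidate` layout (accuracy, energy, latency) and the numbers are made-up assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate layout (an assumption, not from the paper):
# (accuracy to maximize, energy to minimize, latency to minimize)
Candidate = Tuple[float, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Illustrative, invented values.
candidates = [
    (0.93, 10.0, 5.0),
    (0.91, 8.0, 4.0),
    (0.90, 9.0, 6.0),   # dominated by (0.91, 8.0, 4.0)
    (0.93, 12.0, 5.0),  # dominated by (0.93, 10.0, 5.0)
]
front = pareto_front(candidates)
```

    The first two candidates survive because each trades accuracy against cost in a way the other cannot match; a genetic search such as NSGA-II would repeatedly apply a ranking of this kind to evolving populations.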

    GNU epsilon - an extensible programming language

    Reductionism is a viable strategy for designing and implementing practical programming languages, leading to solutions which are easier to extend, experiment with and formally analyze. We formally specify and implement an extensible programming language, based on a minimalistic first-order imperative core language plus strong abstraction mechanisms, reflection and self-modification features. The language can be extended to very high levels: by using Lisp-style macros and code-to-code transforms which automatically rewrite high-level expressions into core forms, we define closures and first-class continuations on top of the core. Non-self-modifying programs can be analyzed and formally reasoned about, thanks to the language's simple semantics. We formally develop a static analysis and prove a soundness property with respect to the dynamic semantics. We develop a parallel garbage collector suitable for multi-core machines to permit efficient execution of parallel programs. Comment: 172 pages, PhD thesis

    Type systems for programs respecting dimensions

    Type systems can be used for tracking dimensional consistency of numerical computations: we present an extension from dimensions of scalar quantities to dimensions of vectors and matrices, making use of dependent types from programming language theory. We show that our types are unique, and most general. We further show that we can give straightforward dimensioned types to many common matrix operations such as addition, multiplication, determinants, traces, and fundamental row operations.
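
    Dimensional consistency for scalar quantities can be sketched as follows. This is a runtime check, whereas the abstract describes a static, dependently typed system extended to vectors and matrices; the `Quantity` class and the (length, mass, time) exponent basis are assumptions made only for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed basis: exponents of (length, mass, time).
Dim = Tuple[int, int, int]


@dataclass(frozen=True)
class Quantity:
    """A scalar value tagged with its dimension (hypothetical helper)."""
    value: float
    dim: Dim

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition is only meaningful between equal dimensions.
        if self.dim != other.dim:
            raise TypeError(f"dimension mismatch: {self.dim} vs {other.dim}")
        return Quantity(self.value + other.value, self.dim)

    def __mul__(self, other: "Quantity") -> "Quantity":
        # Multiplication adds the dimension exponents.
        dim = tuple(a + b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, dim)


LENGTH: Dim = (1, 0, 0)
TIME: Dim = (0, 0, 1)

area = Quantity(2.0, LENGTH) * Quantity(3.0, LENGTH)

try:
    Quantity(1.0, LENGTH) + Quantity(1.0, TIME)
    mismatch_caught = False
except TypeError:
    mismatch_caught = True
```

    A static type system would reject the mismatched addition before the program runs; here the same rule is only enforced when the operation executes.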

    WHYPE: A Scale-Out Architecture with Wireless Over-the-Air Majority for Scalable In-memory Hyperdimensional Computing

    Hyperdimensional computing (HDC) is an emerging computing paradigm that represents, manipulates, and communicates data using long random vectors known as hypervectors. Among different hardware platforms capable of executing HDC algorithms, in-memory computing (IMC) has shown promise as it is very efficient in performing matrix-vector multiplications, which are common in the HDC algebra. Although HDC architectures based on IMC already exist, how to scale them remains a key challenge due to the collective communication patterns that these architectures require and that traditional chip-scale networks were not designed for. To cope with this difficulty, we propose a scale-out HDC architecture called WHYPE, which uses wireless in-package communication technology to interconnect a large number of physically distributed IMC cores that either encode hypervectors or perform multiple similarity searches in parallel. In this context, the key enabler of WHYPE is the opportunistic use of the wireless network as a medium for over-the-air computation. WHYPE implements an optimized source coding that allows receivers to calculate the bit-wise majority of multiple hypervectors (a useful operation in HDC) being transmitted concurrently over the wireless channel. By doing so, we achieve a joint broadcast distribution and computation with a performance and efficiency unattainable with wired interconnects, which in turn enables massive parallelization of the architecture. Through evaluations at the on-chip network and complete architecture levels, we demonstrate that WHYPE can bundle and distribute hypervectors faster and more efficiently than a hypothetical wired implementation, and that it scales well to tens of receivers. We show that the average error rate of the majority computation is low, such that it has negligible impact on the accuracy of HDC classification tasks. Comment: Accepted at IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS). arXiv admin note: text overlap with arXiv:2205.1088
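
    The bit-wise majority of several hypervectors, the operation the abstract says receivers compute over the air, can be sketched in software as below. This only shows the logical operation on short example vectors; it does not model WHYPE's source coding or the wireless channel, and the function name and the odd-count restriction are assumptions of this sketch.

```python
from typing import List


def bitwise_majority(hypervectors: List[List[int]]) -> List[int]:
    """Component-wise majority over an odd number of equal-length binary vectors."""
    n = len(hypervectors)
    if n % 2 == 0:
        # Assumption of this sketch: require an odd count so no ties occur.
        raise ValueError("use an odd number of vectors to avoid ties")
    # zip(*...) groups the i-th bit of every vector together.
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*hypervectors)]


# Real hypervectors have thousands of dimensions; 8 bits keeps the example readable.
a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 1, 0, 1, 0, 1, 0, 0]
c = [0, 1, 1, 1, 0, 0, 0, 1]
bundled = bitwise_majority([a, b, c])
```

    Each output bit is 1 exactly when at least two of the three inputs have a 1 in that position, which is the bundling step that a scale-out architecture would need to distribute to every receiver.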

    Self-adapting structuring and representation of space

    The objective of this report is to propose a syntactic formalism for space representation. Besides the well-known advantages of hierarchical data structures, the underlying approach has the additional strength of self-adapting to the spatial structure at hand. The formalism is called puzzletree because its generation results in a number of blocks which, in a certain order, like a puzzle, reconstruct the original space. The strength of the approach lies not only in providing a compact representation of space (e.g. high compression), but also in attaining an ideal basis for further knowledge-based modeling and recognition of objects. The approach may be applied to any higher-dimensional space (e.g. images, volumes). The report concentrates on the principles of puzzletrees by explaining the underlying heuristic for their generation with respect to 2D spaces, i.e. images, but also sketches their application to volume data. Furthermore, the report outlines the use of puzzletrees to facilitate higher-level operations like image segmentation or object recognition. Finally, results are shown and a comparison to conventional region quadtrees is made.
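
    The abstract does not spell out the puzzletree heuristic, so the sketch below shows only the conventional region quadtree that it names as the comparison baseline. The function names, the tuple-based node layout, and the restriction to square images with a power-of-two side are assumptions of this illustration.

```python
from typing import List, Tuple, Union

Image = List[List[int]]
# A leaf is a pixel value; an internal node is a 4-tuple (NW, NE, SW, SE).
Node = Union[int, Tuple["Node", "Node", "Node", "Node"]]


def build_quadtree(img: Image, row: int, col: int, size: int) -> Node:
    """Recursively split a size x size block until every block is uniform."""
    first = img[row][col]
    uniform = all(
        img[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first
    half = size // 2
    return (
        build_quadtree(img, row, col, half),                # NW
        build_quadtree(img, row, col + half, half),         # NE
        build_quadtree(img, row + half, col, half),         # SW
        build_quadtree(img, row + half, col + half, half),  # SE
    )


def count_leaves(node: Node) -> int:
    """Number of uniform blocks in the tree."""
    return 1 if isinstance(node, int) else sum(count_leaves(ch) for ch in node)


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
tree = build_quadtree(image, 0, 0, 4)
leaves = count_leaves(tree)
```

    The quadtree always cuts at the block midpoint, so the 16 pixels above need 7 leaves even though the image is mostly two vertical bands; a representation that adapts its cuts to the structure at hand, as the report proposes, aims to avoid that rigidity.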

    Data capture from engineering drawings

    Call number: LD2668 .T4 1985 S574. Master of Science

    Fuzzy Differential Evolution Algorithm

    The Differential Evolution (DE) algorithm is a powerful search technique for solving global optimization problems over continuous space. The search initialization for this algorithm does not adequately capture vague preliminary knowledge from the problem domain. This thesis proposes a novel Fuzzy Differential Evolution (FDE) algorithm, as an alternative approach, where the vague information of the search space can be represented and used to deliver a more efficient search. The proposed FDE algorithm utilizes fuzzy set theory concepts to modify the traditional DE algorithm search initialization and mutation components. FDE, alongside other key DE features, is implemented in a convenient decision support system software package. Four benchmark functions are used to demonstrate the performance of the new FDE and its practical utility. Additionally, the application of the algorithm is illustrated through a water management case study problem. The new algorithm shows faster convergence for most of the benchmark functions.
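
    For context, the traditional DE algorithm that the thesis modifies can be sketched as below. This is the classic DE/rand/1/bin scheme with uniform random initialization, not the fuzzy variant; the sphere benchmark, the parameter values, and the bound clipping are assumptions chosen for illustration and are not taken from the thesis.

```python
import random
from typing import Callable, List, Tuple


def sphere(x: List[float]) -> float:
    """Simple benchmark (assumed here): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def differential_evolution(
    f: Callable[[List[float]], float],
    bounds: List[Tuple[float, float]],
    pop_size: int = 20,
    F: float = 0.5,       # mutation scale factor
    CR: float = 0.9,      # crossover probability
    generations: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    dim = len(bounds)
    # Traditional initialization: uniform random points inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            f_trial = f(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

    The initialization and mutation steps marked in the comments are the two components the abstract says FDE replaces with fuzzy-set-based versions.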