Compressive Mining: Fast and Optimal Data Mining in the Compressed Domain
Real-world data typically contain repeated and periodic patterns. This
suggests that they can be effectively represented and compressed using only a
few coefficients of an appropriate basis (e.g., Fourier, Wavelets, etc.).
However, distance estimation when the data are represented using different sets
of coefficients is still a largely unexplored area. This work studies the
optimization problems related to obtaining the \emph{tightest} lower/upper
bound on Euclidean distances when each data object is potentially compressed
using a different set of orthonormal coefficients. Our technique leads to
tighter distance estimates, which translates into more accurate search,
learning and mining operations \textit{directly} in the compressed domain.
We formulate the problem of estimating lower/upper distance bounds as an
optimization problem. We establish the properties of optimal solutions, and
leverage the theoretical analysis to develop a fast algorithm to obtain an
\emph{exact} solution to the problem. The suggested solution provides the
tightest estimation of the $\ell_2$-norm or the correlation. We show that typical
data-analysis operations, such as k-NN search or k-Means clustering, can
operate more accurately using the proposed compression and distance
reconstruction technique. We compare it with many other prevalent compression
and reconstruction techniques, including random projections and PCA-based
techniques. We highlight a surprising result, namely that when the data are
highly sparse in some basis, our technique may even outperform PCA-based
compression.
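As a concrete illustration, the sketch below (Python, illustrative only)
computes lower/upper bounds for the simpler special case in which both objects
keep coefficients at the same positions and store one residual-energy value
each. Handling objects compressed with different coefficient sets optimally is
precisely the harder problem this work solves, which the sketch does not
attempt; the function name and the choice of a Fourier basis are our
assumptions, not the paper's code.

    import numpy as np

    def distance_bounds_same_support(x_kept, y_kept, ex, ey):
        # Bounds on ||x - y|| when both objects keep coefficients at the
        # SAME positions of an orthonormal basis. ex and ey are the
        # energies (sums of squares) of each object's discarded coefficients.
        known = np.sum(np.abs(x_kept - y_kept) ** 2)   # exact part, by Parseval
        # The discarded sub-vectors have norms sqrt(ex) and sqrt(ey); by the
        # triangle inequality their distance lies between |sqrt(ex) - sqrt(ey)|
        # and sqrt(ex) + sqrt(ey).
        lo = np.sqrt(known + (np.sqrt(ex) - np.sqrt(ey)) ** 2)
        hi = np.sqrt(known + (np.sqrt(ex) + np.sqrt(ey)) ** 2)
        return lo, hi

    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(256), rng.standard_normal(256)
    X, Y = np.fft.fft(x, norm="ortho"), np.fft.fft(y, norm="ortho")
    k = 16                                    # keep the first 16 coefficients
    lo, hi = distance_bounds_same_support(
        X[:k], Y[:k], np.sum(np.abs(X[k:]) ** 2), np.sum(np.abs(Y[k:]) ** 2))
    assert lo <= np.linalg.norm(x - y) <= hi

Any orthonormal transform works in place of the FFT here; tightening these
bounds when the retained positions differ per object is what the optimization
problem above addresses.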
The contributions of this work are generic as our methodology is applicable
to any sequential or high-dimensional data as well as to any orthogonal data
transformation used for the underlying data compression scheme.
Comment: 25 pages, 20 figures, accepted in VLDB
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.
Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU"
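Gunrock itself is CUDA C++; purely to illustrate the frontier-centric,
bulk-synchronous pattern described above, here is a small Python sketch of BFS
as repeated advance (expand the frontier along edges) and filter (drop already
visited vertices) steps. The function names are ours, not Gunrock's API.

    from collections import defaultdict

    def bfs_frontier(edges, source):
        # Level-synchronous BFS in the advance/filter style: each
        # iteration transforms the current vertex frontier in bulk.
        adj = defaultdict(list)
        for u, v in edges:
            adj[u].append(v)
        depth = {source: 0}
        frontier = [source]
        level = 0
        while frontier:
            level += 1
            expanded = [v for u in frontier for v in adj[u]]   # advance
            frontier = []
            for v in expanded:                                 # filter
                if v not in depth:
                    depth[v] = level
                    frontier.append(v)
        return depth

    print(bfs_frontier([(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)], source=0))

On a GPU, each advance/filter step maps to data-parallel kernels that must be
load-balanced across an irregular frontier, which is where the optimization
strategies evaluated above come in.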
AI/ML Algorithms and Applications in VLSI Design and Technology
An evident challenge ahead for the integrated circuit (IC) industry in the
nanometer regime is the investigation and development of methods that can
reduce the design complexity ensuing from growing process variations and
curtail the turnaround time of chip manufacturing. Conventional methodologies
employed for such tasks are largely manual; thus, time-consuming and
resource-intensive. In contrast, the unique learning strategies of artificial
intelligence (AI) provide numerous exciting automated approaches for handling
complex and data-intensive tasks in very-large-scale integration (VLSI) design
and testing. Employing AI and machine learning (ML) algorithms in VLSI design
and manufacturing reduces the time and effort for understanding and processing
the data within and across different abstraction levels via automated learning
algorithms. It, in turn, improves the IC yield and reduces the manufacturing
turnaround time. This paper thoroughly reviews the AI/ML automated approaches
introduced in the past for VLSI design and manufacturing. Moreover, we discuss
the scope of future AI/ML applications at various abstraction levels to
revolutionize the field of VLSI design, aiming for high-speed, highly
intelligent, and efficient implementations.
An implementation of dynamic fully compressed suffix trees
Dissertation presented at the Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, for the degree of Master in Computer Engineering.
This dissertation studies and implements a dynamic fully compressed suffix tree.
Suffix trees are important data structures in stringology and provide optimal
solutions for myriads of problems. In bioinformatics, suffix trees are used to
index large volumes of data. For most applications, suffix trees need to be
efficient in size and functionality. Until recently they were very large:
suffix trees for the 700-megabyte human genome occupy some 40 gigabytes of data.
The compressed suffix tree requires less space, and the recent static fully
compressed suffix tree requires even less space; in fact, it requires optimal
compressed space. However, since it is static, it is not suitable for dynamic
environments. Chan et al. [3] proposed the first dynamic compressed suffix
tree; however, the space used for a text of size n is O(n log σ) bits, which is
far from the new static solutions. Our goal is to implement a recent proposal
by Russo, Navarro and Oliveira [22] that defines a dynamic fully compressed
suffix tree and uses only nH0 + o(n log σ) bits of space.
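For context on the space issue: compressed suffix trees are built on top of
compressed suffix arrays rather than pointer-based nodes. The toy Python
sketch below builds a plain (uncompressed) suffix array naively; a
pointer-based suffix tree or plain suffix array stores O(n) words of
positions, which is why plain indexes blow up to tens of gigabytes while
compressed variants approach the entropy of the text. This illustrates the
underlying structure only, not the dissertation's implementation.

    def suffix_array(text):
        # Naive construction: sort suffix start positions by the suffix
        # they start. O(n^2 log n) worst case -- illustration only;
        # practical and compressed constructions are far more refined.
        return sorted(range(len(text)), key=lambda i: text[i:])

    s = "mississippi$"
    for i in suffix_array(s):
        print(i, s[i:])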
Testability considerations for implementing an embedded memory subsystem
There are a number of testability considerations for VLSI design,
but test coverage, test time, accuracy of test patterns and
correctness of design information for DFD (Design for debug) are
the most important ones in design with embedded memories. The goal
of DFT (Design-for-Test) is to achieve zero defects. When it comes
to the memory subsystem in SOCs (system on chips), many flavors of
memory BIST (built-in self test) are able to get high test
coverage in a memory, but often, no proper attention is given to
the memory interface logic (shadow logic). Functional testing and
BIST are the most prevalent tests for this logic, but functional
testing is impractical for complicated SOC designs. As a result,
industry has widely used at-speed scan testing to detect delay
induced defects. Compared with functional testing, scan-based
testing for delay faults reduces overall pattern generation
complexity and cost by enhancing both controllability and
observability of flip-flops. However, without proper modeling of
memory, Xs are generated from memories. Also, when the design has
on-chip compression logic, the number of ATPG patterns increases
significantly due to Xs from memories. In this dissertation, a
register based testing method and X prevention logic are presented
to tackle these problems.
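As a conceptual illustration of X prevention (not the specific register-based
scheme of this dissertation), the Python sketch below models the common
mux-bypass idea: during scan capture the memory's read port is forced to a
known value so ATPG never observes an unknown (X). The names and the
None-as-X encoding are our assumptions.

    def memory_read_port(q, test_mode, bypass_value=0):
        # Conceptual mux at a memory output: in test mode, force a known
        # value so scan capture never sees an unknown (X, modeled as None).
        return bypass_value if test_mode else q

    assert memory_read_port(None, test_mode=True) == 0   # X blocked in test
    assert memory_read_port(1, test_mode=False) == 1     # functional path kept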
An important design stage for scan-based testing with memory
subsystems is the step of creating a gate-level model and verifying
the design with this model. The flow needs to provide a robust ATPG netlist
model. Most industry standard CAD tools used to analyze fault
coverage and generate test vectors require gate level models.
However, custom embedded memories are typically designed using a
transistor-level flow, so there is a need for an abstraction step to
generate the gate models, which must be equivalent to the actual
design (transistor level). The contribution of the research is a
framework to verify that the gate level representation of custom
designs is equivalent to the transistor-level design.
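A minimal Python sketch of the underlying check, assuming small combinational
blocks where exhaustive simulation is feasible (industrial flows rely on
formal equivalence checking and switch-level simulation instead; this is not
the dissertation's framework):

    from itertools import product

    def equivalent(gate_model, reference_model, n_inputs):
        # Compare two combinational models on every input vector and
        # report the first mismatching vector, if any.
        for bits in product([0, 1], repeat=n_inputs):
            if gate_model(*bits) != reference_model(*bits):
                return False, bits
        return True, None

    gate_xor = lambda a, b: (a and not b) or (b and not a)  # gate-level view
    ref_xor = lambda a, b: a ^ b                            # behavioral view
    print(equivalent(gate_xor, ref_xor, 2))                 # (True, None)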
The number of patterns for at-speed testing is much larger than for
basic stuck-at fault testing, so reducing test data volume is
important. In this dissertation, a new scan reordering method is
introduced to reduce test data with an optimal routing solution. With
an in-depth understanding of embedded memories and of the flows
developed during the study of custom memory DFT, a custom embedded
memory bit-mapping method using a symbolic simulator is presented in
the last chapter to achieve high yield for memories.
Electrical and Computer Engineering
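To illustrate the routing side of scan reordering, here is a generic greedy
heuristic in Python (not the dissertation's method, and ignoring its coupling
with test-data reduction): order scan cells by nearest-neighbor placement
distance so the stitched chain is short. The cell names and coordinates are
hypothetical.

    import math

    def reorder_scan_chain(cells):
        # cells: dict of cell name -> (x, y) placement coordinate.
        # Greedy nearest-neighbor ordering to shorten total scan routing.
        remaining = dict(cells)
        name, pos = min(remaining.items())        # arbitrary start cell
        order = [name]
        del remaining[name]
        while remaining:
            name, pos = min(remaining.items(),
                            key=lambda kv: math.dist(pos, kv[1]))
            order.append(name)
            del remaining[name]
        return order

    cells = {"ff_a": (0, 0), "ff_b": (5, 1), "ff_c": (1, 1), "ff_d": (4, 4)}
    print(reorder_scan_chain(cells))   # ['ff_a', 'ff_c', 'ff_b', 'ff_d']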
Impacts of frequent itemset hiding algorithms on privacy preserving data mining
Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2010. Includes bibliographical references (leaves 54-58). Text in English; abstract in Turkish and English. x, 69 leaves.
The relentless growth of computer capabilities and the collection of large amounts of data in recent years have made data mining a popular analysis tool. Association rules (frequent itemsets), classification and clustering are the main methods used in data mining research. The first part of this thesis implements and compares two frequent itemset mining algorithms that work without candidate itemset generation: Matrix Apriori and FP-Growth. Comparison of these algorithms revealed that Matrix Apriori has higher performance with its faster data structure.
One of the great challenges of data mining is finding hidden patterns without violating data owners' privacy; privacy preserving data mining came into prominence as a solution. In the second study of the thesis, the Matrix Apriori algorithm is modified and a frequent itemset hiding framework is developed. Four frequent itemset hiding algorithms are proposed such that:
i) all versions work without pre-mining, so the privacy breach caused by the knowledge obtained by finding frequent itemsets is prevented in advance;
ii) efficiency is increased since no pre-mining is required;
iii) supports are found during the hiding process, and at the end the sanitized dataset and the frequent itemsets of this dataset are given as outputs, so no post-mining is required;
iv) the heuristics use pattern lengths rather than transaction lengths, eliminating the possibility of distorting more valuable data.
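As a sketch of the general setting behind point iv), hiding a sensitive
frequent itemset means editing transactions until its support drops below the
threshold. The Python toy below removes one item of the itemset from
supporting transactions; the thesis' four algorithms choose victims by
pattern length and avoid any pre- or post-mining, which this illustration
does not model.

    def hide_itemset(transactions, sensitive, min_support):
        # Remove one (arbitrary) item of the sensitive itemset from a
        # supporting transaction per round until support < min_support.
        sensitive = set(sensitive)
        victim = next(iter(sensitive))
        while True:
            supporting = [t for t in transactions if sensitive <= t]
            if len(supporting) < min_support:
                return transactions
            supporting[0].discard(victim)

    data = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "d"}]
    print(hide_itemset(data, {"a", "b"}, min_support=2))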