
    Methods to Improve Applicability and Efficiency of Distributed Data-Centric Compute Frameworks

    The success of modern applications depends on the insights they collect from their data repositories. Data repositories for such applications currently exceed exabytes and are rapidly increasing in size, as they collect data from varied sources: web applications, mobile phones, sensors and other connected devices. Distributed storage and data-centric compute frameworks have been invented to store and analyze these large datasets. This dissertation focuses on extending the applicability and improving the efficiency of distributed data-centric compute frameworks.

    Function-specific schemes for verifiable computation

    An integral component of modern computing is the ability to outsource data and computation to powerful remote servers, for instance, in the context of cloud computing or remote file storage. While participants can benefit from this interaction, a fundamental security issue that arises is that of integrity of computation: How can the end-user be certain that the result of a computation over the outsourced data has not been tampered with (not even by a compromised or adversarial server)? Cryptographic schemes for verifiable computation address this problem by accompanying each result with a proof that can be used to check the correctness of the performed computation. Recent advances in the field have led to the first implementations of schemes that can verify arbitrary computations. However, in practice the overhead of these general-purpose constructions remains prohibitive for most applications, with proof computation times (at the server) in the order of minutes or even hours for real-world problem instances. A different approach for designing such schemes targets specific types of computation and builds custom-made protocols, sacrificing generality for efficiency. An important representative of this function-specific approach is an authenticated data structure (ADS), where a specialized protocol is designed that supports query types associated with a particular outsourced dataset. This thesis presents three novel ADS constructions for the important query types of set operations, multi-dimensional range search, and pattern matching, and proves their security under cryptographic assumptions over bilinear groups. The scheme for set operations can support nested queries (e.g., two unions followed by an intersection of the results), extending previous works that only accommodate a single operation. The range search ADS provides an exponential (in the number of attributes in the dataset) asymptotic improvement from previous schemes for storage and computation costs. 
Finally, the pattern matching ADS supports text pattern and XML path queries with minimal cost, e.g., the overhead at the server is less than 4% compared to simply computing the result, for all our tested settings. The experimental evaluation of all three constructions shows significant improvements in proof-computation time over general-purpose schemes.
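
    To make the idea of an authenticated data structure concrete, here is a minimal sketch of a Merkle hash tree, the classic textbook ADS for membership queries. It is not one of the bilinear-group constructions of the thesis; the function names (`build_tree`, `prove`, `verify`) are hypothetical, and a non-empty list of leaves is assumed. The server holds the data and produces a short proof; the client holds only the root hash and checks the proof.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest; stands in for any collision-resistant hash."""
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all tree levels: level 0 holds leaf hashes, the last holds the root.

    Assumes `leaves` is a non-empty list of byte strings.
    """
    level = [_h(b"\x00" + leaf) for leaf in leaves]  # 0x00 prefix marks leaves
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pair an odd last node with itself
        level = [
            _h(b"\x01" + level[i] + level[i + 1])  # 0x01 prefix marks inner nodes
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


def prove(levels, index):
    """Server side: collect the sibling hashes on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))  # (hash, node_is_left)
        index //= 2
    return proof


def verify(root, leaf, proof):
    """Client side: recompute the root from the leaf and the proof."""
    node = _h(b"\x00" + leaf)
    for sibling, node_is_left in proof:
        if node_is_left:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
    return node == root
```

    The proof has one hash per tree level, so its size grows logarithmically with the dataset. Richer query types such as set operations or range search need more than this, which is where the specialized constructions described in the abstract come in.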

    Using reconfigurable computing technology to accelerate matrix decomposition and applications

    Matrix decomposition plays an increasingly significant role in many scientific and engineering applications. Among numerous techniques, Singular Value Decomposition (SVD) and Eigenvalue Decomposition (EVD) are widely used as factorization tools to perform Principal Component Analysis for dimensionality reduction and pattern recognition in image processing, text mining and wireless communications, while QR Decomposition (QRD) and sparse LU Decomposition (LUD) are employed to solve dense or sparse linear systems of equations in bioinformatics, power systems and computer vision. Matrix decompositions are computationally expensive and their sequential implementations often fail to meet the requirements of many time-sensitive applications. The emergence of reconfigurable computing has provided a flexible and low-cost opportunity to pursue high-performance parallel designs, and the use of FPGAs has shown promise in accelerating this class of computation. In this research, we have proposed and implemented several highly parallel FPGA-based architectures to accelerate matrix decompositions and their applications in data mining and signal processing. Specifically, in this dissertation we describe the following contributions:
    • We propose an efficient FPGA-based double-precision floating-point architecture for EVD, which can efficiently analyze large-scale matrices.
    • We implement a floating-point Hestenes-Jacobi architecture for SVD, which is capable of analyzing arbitrary sized matrices.
    • We introduce a novel deeply pipelined reconfigurable architecture for QRD, which can be dynamically configured to perform either Householder transformation or Givens rotation in a manner that takes advantage of the strengths of each.
    • We design a configurable architecture for sparse LUD that supports both symmetric and asymmetric sparse matrices with arbitrary sparsity patterns.
    • By further extending the proposed hardware solution for SVD, we parallelize a popular text mining tool, Latent Semantic Indexing, with an FPGA-based architecture.
    • We present a configurable architecture to accelerate Homotopy l1-minimization, in which the modification of the proposed FPGA architecture for sparse LUD is used at its core to parallelize both Cholesky decomposition and rank-1 update.
    Our experimental results using an FPGA-based acceleration system indicate the efficiency of our proposed novel architectures, with application and dimension-dependent speedups over an optimized software implementation that range from 1.5× to 43.6× in terms of computation time.
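
    For readers unfamiliar with the Hestenes-Jacobi method named above, the following is a minimal software sketch of the one-sided Jacobi iteration for singular values. It illustrates the general algorithm only, not the FPGA architecture of the dissertation; the function name, tolerance and sweep limit are assumptions chosen for illustration. Each step rotates one pair of columns until they are orthogonal, and the column norms at convergence are the singular values.

```python
import math


def hestenes_jacobi_singular_values(a, tol=1e-12, max_sweeps=60):
    """Singular values of a dense matrix (list of rows) via one-sided Jacobi.

    `tol` and `max_sweeps` are illustrative defaults, not tuned values.
    """
    rows, cols = len(a), len(a[0])
    m = [[float(x) for x in row] for row in a]  # work on a copy

    for _ in range(max_sweeps):
        rotated = False
        for i in range(cols - 1):
            for j in range(i + 1, cols):
                # Squared norms and inner product of columns i and j.
                alpha = sum(m[k][i] ** 2 for k in range(rows))
                beta = sum(m[k][j] ** 2 for k in range(rows))
                gamma = sum(m[k][i] * m[k][j] for k in range(rows))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns already (numerically) orthogonal
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for k in range(rows):
                    x, y = m[k][i], m[k][j]
                    m[k][i] = c * x - s * y
                    m[k][j] = s * x + c * y
        if not rotated:
            break  # a full sweep without rotations means convergence

    norms = [math.sqrt(sum(m[k][j] ** 2 for k in range(rows))) for j in range(cols)]
    return sorted(norms, reverse=True)
```

    Rotations on disjoint column pairs are independent of one another, which is one reason this method is a common choice for parallel hardware.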

    A Scale-Invariant Spatial Graph Model

    Information is called spatial if it contains references to space. The thesis aims at lifting the characterization of spatial information to a structural level. Tobler's first law of geography and scale invariance are widely used to characterize spatial information, but their formal description is based on explicit references to space, which prevents them from being used in the structural characterization of spatial information. To overcome this problem, the author proposes a graph model that exposes, when embedded in space, typical properties of spatial information, amongst others Tobler's law and scale invariance. The graph model, considered as an abstract graph, still exposes the effect of these typical properties on the structure of the graph and can thus be used for the discussion of these typical properties at a structural level. A comparison of the proposed model to several spatial and non-spatial data sets in this thesis suggests that spatial data sets can be characterized by a common structure, because the considered spatial data sets expose structural similarities to the proposed model but the non-spatial data sets do not. This proves the concept of a spatial structure to be meaningful, and the proposed model to be a model of spatial structure. 
The dimension of space has an impact on spatial information, and thus also on the spatial structure. The thesis examines how the properties of the proposed graph model, in particular in case of a uniform distribution of nodes in space, depend on the dimension of space and shows how to estimate the dimension from the structure of a data set. The results of the thesis, in particular the concept of a spatial structure and the proposed graph model, are a fundamental contribution to the discussion of spatial information at a structural level: algorithms that operate on spatial data can be improved by paying attention to the spatial structure; a statistical evaluation of considerations about spatial data is rendered possible, because the graph model can generate arbitrarily many test data sets with controlled properties; and the detection of spatial structures as well as the estimation of the dimension and other parameters can contribute to the long-term goal of using data with incomplete or missing semantics.
    By Franz-Benjamin Mocnik. Summary in German. Title differs from the original, translated by the author. Technische Universität Wien, Dissertation, 2016. OeBB (VLID)164200