978 research outputs found

    Dynamic Dictionary with Subconstant Wasted Bits per Key

    Full text link
    Dictionaries have been one of the central questions in data structures. A dictionary data structure maintains a set of key-value pairs under insertions and deletions such that given a query key, the data structure efficiently returns its value. The state-of-the-art dictionaries [Bender, Farach-Colton, Kuszmaul, Kuszmaul, Liu 2022] store nn key-value pairs with only O(nlog(k)n)O(n \log^{(k)} n) bits of redundancy, and support all operations in O(k)O(k) time, for klognk \leq \log^* n. It was recently shown to be optimal [Li, Liang, Yu, Zhou 2023b]. In this paper, we study the regime where the redundant bits is R=o(n)R=o(n), and show that when RR is at least n/polylognn/\text{poly}\log n, all operations can be supported in O(logn+log(n/R))O(\log^* n + \log (n/R)) time, matching the lower bound in this regime [Li, Liang, Yu, Zhou 2023b]. We present two data structures based on which range RR is in. The data structure for R<n/log0.1nR<n/\log^{0.1} n utilizes a generalization of adapters studied in [Berger, Kuszmaul, Polak, Tidor, Wein 2022] and [Li, Liang, Yu, Zhou 2023a]. The data structure for Rn/log0.1nR \geq n/\log^{0.1} n is based on recursively hashing into buckets with logarithmic sizes.Comment: 46 pages; SODA 202

    Managing Metadata in Data Warehouses: Pitfalls and Possibilities

    Get PDF
    This paper motivates a comprehensive academic study of metadata and the roles that metadata plays in organizational information systems. While the benefits of metadata and challenges in implementing metadata solutions are widely addressed in practitioner publications, explicit discussion of metadata in academic literature is rare. Metadata, when discussed, is perceived primarily as a technology solution. Integrated management of metadata and its business value are not well addressed. This paper discusses both the benefits offered by and the challenges associated with integrating metadata. It also describes solutions for addressing some of these challenges. The inherent complexity of an integrated metadata repository is demonstrated by reviewing the metadata functionality required in a data warehouse: a decision support environment where its importance is acknowledged. Comparing this required functionality with metadata management functionalities offered by data warehousing software products identifies crucial gaps. Based on these analyses, topics for further research on metadata are proposed

    Certificate Transparency with Enhancements and Short Proofs

    Full text link
    Browsers can detect malicious websites that are provisioned with forged or fake TLS/SSL certificates. However, they are not so good at detecting malicious websites if they are provisioned with mistakenly issued certificates or certificates that have been issued by a compromised certificate authority. Google proposed certificate transparency which is an open framework to monitor and audit certificates in real time. Thereafter, a few other certificate transparency schemes have been proposed which can even handle revocation. All currently known constructions use Merkle hash trees and have proof size logarithmic in the number of certificates/domain owners. We present a new certificate transparency scheme with short (constant size) proofs. Our construction makes use of dynamic bilinear-map accumulators. The scheme has many desirable properties like efficient revocation, low verification cost and update costs comparable to the existing schemes. We provide proofs of security and evaluate the performance of our scheme.Comment: A preliminary version of the paper was published in ACISP 201

    Certificate Transparency with Enhancements and Short Proofs

    Full text link
    Browsers can detect malicious websites that are provisioned with forged or fake TLS/SSL certificates. However, they are not so good at detecting malicious websites if they are provisioned with mistakenly issued certificates or certificates that have been issued by a compromised certificate authority. Google proposed certificate transparency which is an open framework to monitor and audit certificates in real time. Thereafter, a few other certificate transparency schemes have been proposed which can even handle revocation. All currently known constructions use Merkle hash trees and have proof size logarithmic in the number of certificates/domain owners. We present a new certificate transparency scheme with short (constant size) proofs. Our construction makes use of dynamic bilinear-map accumulators. The scheme has many desirable properties like efficient revocation, low verification cost and update costs comparable to the existing schemes. We provide proofs of security and evaluate the performance of our scheme.Comment: A preliminary version of the paper was published in ACISP 201

    Marshall Application Realignment System (MARS) Architecture

    Get PDF
    The Marshall Application Realignment System (MARS) Architecture project was established to meet the certification requirements of the Department of Defense Architecture Framework (DoDAF) V2.0 Federal Enterprise Architecture Certification (FEAC) Institute program and to provide added value to the Marshall Space Flight Center (MSFC) Application Portfolio Management process. The MARS Architecture aims to: (1) address the NASA MSFC Chief Information Officer (CIO) strategic initiative to improve Application Portfolio Management (APM) by optimizing investments and improving portfolio performance, and (2) develop a decision-aiding capability by which applications registered within the MSFC application portfolio can be analyzed and considered for retirement or decommission. The MARS Architecture describes a to-be target capability that supports application portfolio analysis against scoring measures (based on value) and overall portfolio performance objectives (based on enterprise needs and policies). This scoring and decision-aiding capability supports the process by which MSFC application investments are realigned or retired from the application portfolio. The MARS Architecture is a multi-phase effort to: (1) conduct strategic architecture planning and knowledge development based on the DoDAF V2.0 six-step methodology, (2) describe one architecture through multiple viewpoints, (3) conduct portfolio analyses based on a defined operational concept, and (4) enable a new capability to support the MSFC enterprise IT management mission, vision, and goals. This report documents Phase 1 (Strategy and Design), which includes discovery, planning, and development of initial architecture viewpoints. Phase 2 will move forward the process of building the architecture, widening the scope to include application realignment (in addition to application retirement), and validating the underlying architecture logic before moving into Phase 3. The MARS Architecture key stakeholders are most interested in Phase 3 because this is where the data analysis, scoring, and recommendation capability is realized. Stakeholders want to see the benefits derived from reducing the steady-state application base and identify opportunities for portfolio performance improvement and application realignment

    Representing Dataset Quality Metadata using Multi-Dimensional Views

    Full text link
    Data quality is commonly defined as fitness for use. The problem of identifying quality of data is faced by many data consumers. Data publishers often do not have the means to identify quality problems in their data. To make the task for both stakeholders easier, we have developed the Dataset Quality Ontology (daQ). daQ is a core vocabulary for representing the results of quality benchmarking of a linked dataset. It represents quality metadata as multi-dimensional and statistical observations using the Data Cube vocabulary. Quality metadata are organised as a self-contained graph, which can, e.g., be embedded into linked open datasets. We discuss the design considerations, give examples for extending daQ by custom quality metrics, and present use cases such as analysing data versions, browsing datasets by quality, and link identification. We finally discuss how data cube visualisation tools enable data publishers and consumers to analyse better the quality of their data.Comment: Preprint of a paper submitted to the forthcoming SEMANTiCS 2014, 4-5 September 2014, Leipzig, German
    corecore