
    On optimally partitioning a text to improve its compression

    In this paper we investigate the problem of partitioning an input string T in such a way that compressing its parts individually via a base compressor C yields a compressed output that is shorter than applying C over the entire T at once. This problem was introduced in the context of table compression, and then further elaborated and extended to strings and trees. Unfortunately, the literature offers poor solutions: namely, we know either a cubic-time algorithm for computing the optimal partition based on dynamic programming, or a few heuristics that do not guarantee any bounds on the efficacy of their computed partition, or algorithms that are efficient but work only in specific scenarios (such as the Burrows-Wheeler Transform) and achieve compression performance that might be worse than the optimal partitioning by an Ω(√(log n)) factor. Therefore, computing the optimal solution efficiently is still open. In this paper we provide the first algorithm which is guaranteed to compute in O(n log_{1+ε} n) time a partition of T whose compressed output is guaranteed to be no more than (1 + ε)-worse than the optimal one, where ε may be any positive constant.
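    The cubic-time baseline the abstract refers to is a simple dynamic program over prefixes. The sketch below illustrates that baseline, not the paper's O(n log_{1+ε} n) algorithm; zlib is an assumed stand-in for the base compressor C.

    ```python
    import zlib

    def optimal_partition(T: bytes):
        """Baseline cubic-time dynamic program: opt[i] is the minimum total
        size, in bytes, of compressing each part of some partition of the
        prefix T[:i] individually with the base compressor C (here zlib,
        purely as an example; any compressor with this interface works)."""
        n = len(T)
        opt = [0] + [float("inf")] * n
        cut = [0] * (n + 1)               # cut[i]: start of the last part in T[:i]
        for i in range(1, n + 1):
            for j in range(i):            # O(n^2) pairs, each compressed in O(n): cubic overall
                cost = opt[j] + len(zlib.compress(T[j:i]))
                if cost < opt[i]:
                    opt[i], cut[i] = cost, j
        parts, i = [], n                  # walk the cut[] links back to recover the parts
        while i > 0:
            parts.append(T[cut[i]:i])
            i = cut[i]
        return opt[n], parts[::-1]
    ```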

    La giurisdizione volontaria nel Diritto Processuale Civile Internazionale

    Publication of the tables of contents of works recently added to the collection of the Biblioteca Ministro Oscar Saraiva of the STJ. In compliance with the Copyright Law, the full text of the work is not made available. STJ0008192

    Bicriteria data compression

    The advent of massive datasets (and the consequent design of high-performing distributed storage systems) has reignited the interest of the scientific and engineering community in the design of lossless data compressors which achieve an effective compression ratio and very efficient decompression speed. Lempel-Ziv's LZ77 algorithm is the de facto choice in this scenario because of its decompression speed and its flexibility in trading decompression speed against compressed-space efficiency. Each of the existing implementations offers a trade-off between space occupancy and decompression speed, so software engineers have to content themselves with picking the one which comes closest to the requirements of the application at hand. Starting from these premises, and for the first time in the literature, we address in this paper the problem of trading the consumption of these two resources optimally and in a principled way, by introducing the Bicriteria LZ77-Parsing problem, which formalizes what data compressors have traditionally approached by means of heuristics. The goal is to determine an LZ77 parsing which minimizes the space occupancy in bits of the compressed file, provided that the decompression time is bounded by a fixed amount (or vice versa). This way, the software engineer can set their space (or time) requirements and then derive the LZ77 parsing which optimizes the decompression speed (or the space occupancy, respectively). We solve this problem efficiently, in O(n log² n) time and optimal linear space within a small additive approximation, by proving and deploying some specific structural properties of the weighted graph derived from the possible LZ77 parsings of the input file. A preliminary set of experiments shows that our novel proposal dominates all the highly engineered competitors, hence offering a win-win situation in theory and practice.
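    The weighted-graph formulation can be illustrated concretely: node i is text position i, each candidate LZ77 phrase covering T[i:j] is an edge (i, j) carrying a space weight (bits) and a decompression-time weight, and a parsing is a path from 0 to n. The pseudo-polynomial dynamic program below is only a sketch of that formulation under an integer time budget; the paper's O(n log² n) algorithm is far more sophisticated.

    ```python
    from collections import defaultdict

    def min_space_under_time_budget(n, phrases, time_budget):
        """phrases: iterable of (i, j, space_bits, time_cost) candidate LZ77
        phrases covering T[i:j], with 0 <= i < j <= n. Returns the minimum
        space of a parsing (a path 0 -> n in the DAG) whose total decompression
        time is <= time_budget, or None if no parsing fits the budget."""
        by_source = defaultdict(list)
        for i, j, space, time in phrases:
            by_source[i].append((j, space, time))
        INF = float("inf")
        # best[i][t]: minimum space to parse T[:i] spending exactly t time units
        best = [[INF] * (time_budget + 1) for _ in range(n + 1)]
        best[0][0] = 0
        for i in range(n):                   # edges only go forward: process positions in order
            for t in range(time_budget + 1):
                if best[i][t] == INF:
                    continue
                for j, space, time in by_source[i]:
                    if t + time <= time_budget and best[i][t] + space < best[j][t + time]:
                        best[j][t + time] = best[i][t] + space
        answer = min(best[n])
        return None if answer == INF else answer
    ```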

    Compressed Text Indexes: From Theory to Practice!

    A compressed full-text self-index represents a text in compressed form and still answers queries efficiently. This technology represents a breakthrough over the text-indexing techniques of the previous decade, whose indexes required several times the size of the text. Although it is relatively new, this technology has matured to the point where theoretical research is giving way to practical developments. Nonetheless, this requires significant programming skills, a deep engineering effort, and a strong algorithmic background to dig into the research results. To date, only isolated implementations and focused comparisons of compressed indexes have been reported, and they lacked a common API, which prevented their re-use or deployment within other applications. The goal of this paper is to fill this gap. First, we present the existing implementations of compressed indexes from a practitioner's point of view. Second, we introduce the Pizza&Chili site, which offers tuned implementations and a standardized API for the most successful compressed full-text self-indexes, together with effective testbeds and scripts for their automatic validation and test. Third, we show the results of our extensive experiments on these codes, with the aim of demonstrating the practical relevance of this novel and exciting technology.
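    At the heart of most compressed self-indexes of this kind is backward search over the Burrows-Wheeler transform. The toy sketch below is my own illustration of that count query, not Pizza&Chili code: it stores the BWT explicitly and answers rank queries by linear scans, where a real self-index would use compressed rank structures.

    ```python
    from collections import Counter

    def bwt(s):
        # s must end with a unique sentinel smaller than every other symbol
        sa = sorted(range(len(s)), key=lambda i: s[i:])
        return "".join(s[i - 1] for i in sa)   # s[-1] handles the wraparound row

    def fm_count(text, pattern):
        """Count the occurrences of pattern in text by backward search on the
        BWT. Toy version: the O(n) scans below stand in for the compressed
        rank/select structures that give real self-indexes small space."""
        L = bwt(text + "\0")
        freq = Counter(L)
        C, total = {}, 0                       # C[c]: #symbols strictly smaller than c
        for c in sorted(freq):
            C[c], total = total, total + freq[c]
        lo, hi = 0, len(L)                     # current suffix-array interval [lo, hi)
        for c in reversed(pattern):
            if c not in C:
                return 0
            lo = C[c] + L[:lo].count(c)        # rank of c in L[:lo]
            hi = C[c] + L[:hi].count(c)
            if lo >= hi:
                return 0
        return hi - lo

    # e.g. fm_count("mississippi", "ssi") returns 2
    ```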

    Cache-Oblivious Peeling of Random Hypergraphs

    The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear-time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(n log n) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds an MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.
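    The "straightforward linear time algorithm" the abstract contrasts against is the classic degree-1 peeling loop, sketched below for illustration; its random accesses into the incidence lists are precisely what ruins I/O performance once the hypergraph outgrows internal memory.

    ```python
    from collections import defaultdict

    def peel(edges):
        """In-memory peeling of a hypergraph. edges: list of vertex tuples.
        Repeatedly remove an edge incident to a degree-1 vertex. Returns the
        peeling order as (edge_id, vertex) pairs, or None if a non-empty
        2-core remains (i.e., peeling fails)."""
        incident = defaultdict(set)            # vertex -> ids of incident edges
        for e, verts in enumerate(edges):
            for v in verts:
                incident[v].add(e)
        stack = [v for v in incident if len(incident[v]) == 1]
        order, alive = [], [True] * len(edges)
        while stack:
            v = stack.pop()
            if len(incident[v]) != 1:          # degree changed since v was pushed
                continue
            (e,) = incident[v]
            order.append((e, v))               # e is peeled via its degree-1 vertex v
            alive[e] = False
            for u in edges[e]:                 # removing e may create new degree-1 vertices
                incident[u].discard(e)
                if len(incident[u]) == 1:
                    stack.append(u)
        return order if not any(alive) else None
    ```

    In MPHF constructions, processing this order in reverse assigns each peeled edge a value at its degree-1 vertex without conflicts.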

    Exhaust Energy Recovery with Variable Geometry Turbine to Reduce Fuel Consumption for Microcars

    The objective proposed by the EU of reducing the CO2 emissions of internal combustion engines by about 4% per year up to 2030 requires increasing engine efficiency and, accordingly, improving the technology. In this framework, hybrid powertrains may achieve deep market penetration, since they can recover energy during braking, allow the engine to operate at better efficiency and with fewer transients, and recover a large amount of the energy lost through the exhaust, using it to reduce fuel consumption. This paper concerns the modification of a conventional two-cylinder in-line Diesel engine (440 cm³) by adding a variable geometry turbine (VGT) coupled with a generator. The turbine is used to recover exhaust-gas energy that would otherwise be lost. The generator, connected to the turbo shaft, converts mechanical energy into electrical energy and is used to charge the vehicle battery or power the auxiliaries. The aim of this work is to reduce fuel consumption by replacing the alternator with a kind of electric turbo-compounding system that drives the vehicle auxiliaries. If the selected turbine recovers enough energy to power the auxiliaries, the alternator, which usually has low efficiency, can be removed; along these lines, fuel consumption savings can be achieved. A microcar was then tested on the WLTC (Class 1) driving cycle. The results show a fuel consumption reduction of 6 to 9%, depending on VGT size. Indeed, four different VGT sizes have been analyzed to choose the optimal configuration, as a compromise between energy recovery and fuel consumption reduction.
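    As a rough illustration of why the recovered exhaust energy can replace an alternator, here is a back-of-envelope isentropic-turbine estimate; every number below is an assumption chosen for illustration, not a value from the paper.

    ```python
    # Back-of-envelope estimate of recoverable turbine power, assuming an
    # ideal-gas exhaust expanded isentropically. All parameter values are
    # illustrative assumptions, not data from the study.
    cp = 1100.0        # J/(kg K), exhaust specific heat (assumed)
    m_dot = 0.02       # kg/s, exhaust mass flow of a small Diesel (assumed)
    T_in = 900.0       # K, turbine inlet temperature (assumed)
    pr = 1.8           # turbine expansion ratio (assumed)
    gamma = 1.33       # exhaust specific-heat ratio (assumed)
    eta = 0.60         # combined turbine + generator efficiency (assumed)

    # Isentropic temperature drop across the turbine, then electrical power
    T_out_is = T_in * pr ** (-(gamma - 1.0) / gamma)
    power_w = eta * m_dot * cp * (T_in - T_out_is)
    print(f"recovered electrical power ~ {power_w:.0f} W")
    ```

    With these assumed values the recovered power is on the order of a kilowatt, i.e., comparable with typical alternator loads, which is the intuition behind removing the alternator.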

    Unsteady CFD analysis of erosion mechanism in the coolant channels of a rotating gas turbine blade

    The two-phase flow in a rotating wedge mimicking the final portion of a turbine blade internal cooling channel is here presented and discussed, focusing on unsteady motion and erosion mechanisms. The rotation axis is placed so as to properly reproduce a configuration with a very strong deviation (90°). The flow field was modelled using the well-known k-ε-ζ-f unsteady-RANS model based on the elliptic-relaxation concept. The model was modified by some of the authors to take into account the influence of turbulence anisotropy as well as rotation, and was applied within the well-established and fully validated T-FlowS code. A systematic comparison of the rotating and non-rotating cases was carried out to show the influence of the Coriolis force on flow and erosion mechanisms. The rotational effects strongly change the flow behaviour within the channel, affecting both the unsteady flow and the particle trajectories. In the rotating case, there is no recirculation in the tip region; moreover, the positions of the small recirculation regions above each pedestal change. These, and other minor effects, alter the particle motion, thus resulting in a different erosion pattern.
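    The way rotation reshapes particle trajectories can be sketched with the standard rotating-frame equation of motion. The snippet below (assumed Stokes drag and explicit Euler integration, not the paper's URANS setup) shows the Coriolis and centrifugal terms responsible for the different erosion pattern.

    ```python
    import numpy as np

    def step_particle(x, v, u_fluid, omega, tau_p, dt):
        """One explicit-Euler step of a Lagrangian particle in a rotating frame:
        Stokes drag toward the local fluid velocity plus Coriolis and centrifugal
        accelerations. x, v, u_fluid, omega are 3-vectors; tau_p is the particle
        relaxation time. A minimal sketch under assumed drag and integration."""
        drag = (u_fluid - v) / tau_p                    # Stokes drag
        coriolis = -2.0 * np.cross(omega, v)            # Coriolis acceleration
        centrifugal = -np.cross(omega, np.cross(omega, x))
        a = drag + coriolis + centrifugal
        return x + v * dt, v + a * dt
    ```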

    BAC: A bagged associative classifier for big data frameworks

    Big Data frameworks allow powerful distributed computations, extending the results achievable on a single machine. In this work, we present a novel distributed associative classifier, named BAC, based on ensemble techniques. Ensembles are a popular approach that builds several models on different subsets of the original dataset, which eventually vote to provide a unique classification outcome. Experiments on Apache Spark and preliminary results show that the proposed ensemble classifier obtains a quality comparable with that of single-machine associative classifiers on popular real-world datasets, while overcoming their scalability limits on large synthetic datasets.
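    On a single machine, the ensemble scheme described above reduces to plain bagging with majority voting. The sketch below illustrates that core idea only; the function names and the base_learner interface are assumptions, and BAC itself trains associative classifiers distributed over Spark rather than in a loop.

    ```python
    import random
    from collections import Counter

    def bagged_predict(train, test_x, base_learner, n_models=10, seed=0):
        """Train n_models base classifiers on bootstrap samples of the training
        set, then classify test_x by majority vote. base_learner(samples) is
        assumed to return a callable predict(x) -> label."""
        rng = random.Random(seed)
        models = []
        for _ in range(n_models):
            # bootstrap sample: draw len(train) examples with replacement
            boot = [train[rng.randrange(len(train))] for _ in range(len(train))]
            models.append(base_learner(boot))
        votes = Counter(m(test_x) for m in models)      # one vote per model
        return votes.most_common(1)[0][0]
    ```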