382 research outputs found

    Cracking KD-Tree: The first multidimensional adaptive indexing

    Workload-aware physical data access structures are crucial for achieving short response times in (exploratory) data analysis tasks, as commonly required for Big Data and Data Science applications. Recently proposed techniques such as automatic index advisers (for a priori known static workloads) and query-driven adaptive incremental indexing (for a priori unknown dynamic workloads) form the state of the art for building single-dimensional indexes for single-attribute query predicates. However, similar techniques for more demanding multi-attribute query predicates, which are vital for any data analysis task, have not yet been proposed. In this paper, we present our ongoing work on a new set of workload-adaptive indexing techniques that focus on creating multidimensional indexes. We present our proof of concept, the Cracking KD-Tree, an adaptive indexing approach that generates a KD-Tree based on multidimensional range query predicates. It works by incrementally creating partial multidimensional indexes as a by-product of query processing. The indexes are produced only on those parts of the data that are accessed, and their creation cost is effectively distributed across a stream of queries. Experimental results show that the Cracking KD-Tree is three times faster than creating a full KD-Tree, one order of magnitude faster than executing full scans, and two orders of magnitude faster than using uni-dimensional full or adaptive indexes on multiple columns.
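The idea of building a partial KD-Tree as a by-product of range queries can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `CrackingKDTree`, the `min_piece` threshold, and the choice to try the query bounds in dimension order as pivots are all assumptions made for illustration. Each query partitions ("cracks") only the pieces it touches, so index construction cost is spread over the query stream.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    start: int                      # this piece covers points[start:end]
    end: int
    dim: Optional[int] = None       # split dimension; None means an uncracked leaf
    pivot: float = 0.0              # left child holds values < pivot on `dim`
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class CrackingKDTree:
    """Hypothetical sketch: a KD-Tree refined only where queries look."""

    def __init__(self, points: Sequence[Point], min_piece: int = 2):
        self.points: List[Point] = list(points)
        self.min_piece = min_piece  # assumed threshold: smaller pieces are just scanned
        self.root = Node(0, len(self.points))

    def _crack(self, node: Node, dim: int, pivot: float) -> bool:
        """Partition a leaf's piece around `pivot` on `dim`; False if nothing splits."""
        piece = self.points[node.start:node.end]
        low = [p for p in piece if p[dim] < pivot]
        high = [p for p in piece if p[dim] >= pivot]
        if not low or not high:
            return False
        self.points[node.start:node.end] = low + high
        mid = node.start + len(low)
        node.dim, node.pivot = dim, pivot
        node.left, node.right = Node(node.start, mid), Node(mid, node.end)
        return True

    def query(self, lo: Point, hi: Point) -> List[Point]:
        """Return points p with lo[d] <= p[d] < hi[d] for every dimension d."""
        bounds = [(d, b) for d in range(len(lo)) for b in (lo[d], hi[d])]
        out: List[Point] = []
        self._search(self.root, lo, hi, bounds, out)
        return out

    def _search(self, node, lo, hi, bounds, out) -> None:
        if node.dim is None and node.end - node.start > self.min_piece:
            # Use the first query bound that actually splits this piece.
            for dim, pivot in bounds:
                if self._crack(node, dim, pivot):
                    break
        if node.dim is None:
            # Still a leaf: scan the piece and filter.
            for p in self.points[node.start:node.end]:
                if all(l <= v < h for v, l, h in zip(p, lo, hi)):
                    out.append(p)
            return
        if lo[node.dim] < node.pivot:
            self._search(node.left, lo, hi, bounds, out)
        if hi[node.dim] > node.pivot:
            self._search(node.right, lo, hi, bounds, out)
```

Calling `tree.query((0.2, 0.2), (0.5, 0.5))` on a fresh tree both answers the query and leaves behind splits at the query bounds, so later queries over the same region scan smaller pieces. Regions that are never queried stay as one unsorted piece.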

    HIPE: HMC Instruction Predication Extension Applied on Database Processing

    The recent Hybrid Memory Cube (HMC) is a smart memory that includes functional units inside one logic layer of the 3D-stacked memory design. In order to execute instructions inside the HMC, the processor needs to send instructions to be executed near the data, keeping most of the pipeline complexity inside the processor. Thus, control-flow and data-flow dependencies are all managed inside the processor, in such a way that only update instructions are supported by the HMC. In order to solve data-flow dependencies inside the memory, previous work proposed HMC Instruction Vector Extensions (HIVE), which embed a large number of functional units with an interlock register bank. In this work we propose HMC Instruction Predication Extensions (HIPE), which support predicated execution inside the memory in order to transform control-flow dependencies into data-flow dependencies. Our mechanism focuses on removing the high-latency interaction between the processor and the smart memory during the execution of branches that depend on data processed inside the memory. In this paper we evaluate a balanced design of HIVE compared to x86 and HMC executions. We then show the results of the HIPE mechanism when executing a database workload, which is a strong candidate for using smart memories. We show interesting performance trade-offs when comparing our mechanism to previous work.
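The transformation from a control-flow dependency to a data-flow dependency that predication relies on can be sketched in software. The example below is only an analogy and does not model HMC, HIVE, or HIPE instructions; the function names are hypothetical. It shows a selection-and-sum, a typical database filter, written first with a data-dependent branch and then with the comparison result used as a value that masks the update.

```python
from typing import Iterable


def filter_sum_branching(values: Iterable[int], threshold: int) -> int:
    """Sum values below `threshold` using a data-dependent branch."""
    total = 0
    for v in values:
        if v < threshold:       # control-flow dependency: the branch outcome depends on data
            total += v
    return total


def filter_sum_predicated(values: Iterable[int], threshold: int) -> int:
    """Same result, but the comparison becomes a data value (the predicate)."""
    total = 0
    for v in values:
        pred = int(v < threshold)   # predicate: 1 if the row qualifies, else 0
        total += pred * v           # always executed; the predicate masks the effect
    return total
```

In the second version every iteration executes the same instruction sequence, so nothing has to wait on a branch decision. This is the property that lets a near-memory unit keep processing without a round trip to the processor for each comparison.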

    Multidimensional adaptive & progressive indexes

    Exploratory data analysis is the primary technique used by data scientists to extract knowledge from new data sets. This type of workload is composed of trial-and-error hypothesis-driven queries with a human in the loop. To keep up with the data scientist's productivity, the system must be capable of answering queries in interactive times. Given that these queries are highly selective multidimensional queries, multidimensional indexes are necessary to ensure low latency. However, creating the appropriate indexes is not a given due to the highly exploratory and interactive nature of such human-in-the-loop scenarios. In this paper, we identify four main objectives that are desirable for exploratory data analysis workloads: (1) low overhead over the initial queries, (2) low query variance (i.e., high robustness), (3) predictable index convergence, and (4) low total workload time. Given that not all of them can be achieved at the same time, we present three novel incremental multidimensional indexing techniques that represent three sample points on a Pareto front for this multi-objective optimization problem. (a) The Adaptive KD-Tree is designed to achieve the lowest total workload time at the expense of a higher indexing penalty for the initial queries, lack of robustness, and unpredictable convergence. (b) The Progressive KD-Tree has predictable convergence and a user-defined indexing cost for the initial queries. However, total workload time can be higher than with Adaptive KD-Trees, and per-query time still varies. (c) The Greedy Progressive KD-Tree aims at full robustness at the expense of only improving the per-query cost after full index convergence. Our extensive experimental evaluation using both synthetic and real-life data sets and workloads shows that (a) the Adaptive KD-Tree reduc

    Phyllosticta citricarpa and sister species of global importance to Citrus.

    Several Phyllosticta species are known as pathogens of Citrus spp., and are responsible for various disease symptoms including leaf and fruit spots. One of the most important species is P. citricarpa, which causes a foliar and fruit disease called citrus black spot. The Phyllosticta species occurring on citrus can most effectively be distinguished from P. citricarpa by means of multilocus DNA sequence data. Recent studies also demonstrated P. citricarpa to be heterothallic, and reported successful mating in the laboratory. Since the domestication of citrus, different clones of P. citricarpa have escaped Asia to other continents via trade routes, with obvious disease management consequences. This pathogen profile represents a comprehensive literature review of this pathogen and allied taxa associated with citrus, focusing on identification, distribution, genomics, epidemiology and disease management. This review also considers the knowledge emerging from seven genomes of Phyllosticta spp., demonstrating unknown aspects of these species, including their mating behaviour.
    Taxonomy: Phyllosticta citricarpa (McAlpine) Aa, 1973. Kingdom Fungi, Phylum Ascomycota, Class Dothideomycetes, Order Botryosphaeriales, Family Phyllostictaceae, Genus Phyllosticta, Species citricarpa.
    Host range: Confirmed on more than 12 Citrus species, Phyllosticta citricarpa has only been found on plant species in the Rutaceae.
    Disease symptoms: P. citricarpa causes diverse symptoms such as hard spot, virulent spot, false melanose and freckle spot on fruit, and necrotic lesions on leaves and twigs.
    Useful websites: DOE Joint Genome Institute MycoCosm portals for the Phyllosticta capitalensis (https://genome.jgi.doe.gov/Phycap1), P. citriasiana (https://genome.jgi.doe.gov/Phycit1), P. citribraziliensis (https://genome.jgi.doe.gov/Phcit1), P. citrichinaensis (https://genome.jgi.doe.gov/Phcitr1), P. citricarpa (https://genome.jgi.doe.gov/Phycitr1, https://genome.jgi.doe.gov/Phycpc1), P. paracitricarpa (https://genome.jgi.doe.gov/Phy27169) genomes. All available Phyllosticta genomes on MycoCosm can be viewed at https://genome.jgi.doe.gov/Phyllosticta

    Comprehensive clinical and molecular analysis of 12 families with type 1 recessive cutis laxa.

    Autosomal recessive cutis laxa type I (ARCL type I) is characterized by generalized cutis laxa with pulmonary emphysema and/or vascular complications. Rarely, mutations can be identified in FBLN4 or FBLN5. Recently, LTBP4 mutations have been implicated in a similar phenotype. Studying FBLN4, FBLN5, and LTBP4 in 12 families with ARCL type I, we found biallelic FBLN5 mutations in two probands, whereas nine probands harbored biallelic mutations in LTBP4. FBLN5 and LTBP4 mutations cause a very similar phenotype associated with severe pulmonary emphysema, in the absence of vascular tortuosity or aneurysms. Gastrointestinal and genitourinary tract involvement seems to be more severe in patients with LTBP4 mutations. Functional studies showed that most premature termination mutations in LTBP4 result in severely reduced mRNA and protein levels. This correlated with increased transforming growth factor-beta (TGFβ) activity. However, one mutation, c.4127dupC, escaped nonsense-mediated decay. The corresponding mutant protein (p.Arg1377Alafs*27) showed reduced colocalization with fibronectin, leading to an abnormal morphology of microfibrils in fibroblast cultures, while retaining normal TGFβ activity. We conclude that LTBP4 mutations cause disease through both loss-of-function and gain-of-function mechanisms.

    Transmission of mitochondrial DNA following assisted reproduction and nuclear transfer

    Review article. Mitochondria are the organelles responsible for producing the majority of a cell's ATP and also play an essential role in gamete maturation and embryo development. ATP production within the mitochondria is dependent on proteins encoded by both the nuclear and the mitochondrial genomes; therefore, co-ordination between the two genomes is vital for cell survival. To assist with this co-ordination, cells normally contain only one type of mitochondrial DNA (mtDNA), a state termed homoplasmy. Occasionally, however, two or more types of mtDNA are present, a state termed heteroplasmy. This can result from a combination of mutant and wild-type mtDNA molecules or from a combination of wild-type mtDNA variants. As heteroplasmy can result in mitochondrial disease, various mechanisms exist in the natural fertilization process to ensure the maternal-only transmission of mtDNA and the maintenance of homoplasmy in future generations. However, there is now an increasing use of invasive oocyte reconstruction protocols, which tend to bypass mechanisms for the maintenance of homoplasmy, potentially resulting in the transmission of either form of mtDNA heteroplasmy. Indeed, heteroplasmy caused by combinations of wild-type variants has been reported following cytoplasmic transfer (CT) in the human and following nuclear transfer (NT) in various animal species. Other techniques, such as germinal vesicle transfer and pronuclei transfer, have been proposed as methods of preventing transmission of mitochondrial diseases to future generations. However, resulting embryos and offspring may contain mtDNA heteroplasmy, which itself could result in mitochondrial disease. It is therefore essential that uniparental transmission of mtDNA is ensured before these techniques are used therapeutically.