27 research outputs found

    FACHBEITRAG Unleashing XQuery for Data-Independent Programming

    an SQL equivalent for XML data, but its roots in functional programming also make it a perfect choice for processing almost any kind of structured and semi-structured data. Apart from standard XML processing, however, advanced language features make it hard to efficiently implement the complete language for large data volumes. This work proposes a novel compilation strategy that provides both flexibility and efficiency to unleash XQuery’s potential as a data programming language. It combines the simplicity and versatility of a storage-independent data abstraction with the scalability advantages of set-oriented processing. Expensive iterative sections in a query are unrolled to a pipeline of relational-style operators, which is open for optimized join processing, index use, and parallelization. The remaining aspects of the language are processed in a standard fashion, yet can be compiled anytime to more efficient native operations of the actual runtime environment. This hybrid compilation mechanism yields an efficient and highly flexible query engine that is able to drive any computation from simple XML transformation to complex data analysis, even on non-XML data. Experiments with our prototype and state-of-the-art competitors in classic XML query processing and business analytics over relational data attest to the generality and efficiency of the design.

    WATER QUALITY MONITORING AND ASSESSMENT OF THE SELF-PURIFICATION CAPACITY OF THE RIO LIGEIRO IN THE MUNICIPALITY OF PATO BRANCO - PR

    This study aimed to monitor water quality in order to assess the self-purification capacity of the Rio Ligeiro, from one of its springs to its confluence with the Rio Chopim. Six monitoring points (PM01 to PM06) were established along the course. Water samples were collected in the different seasons of 2017 for analysis of dissolved oxygen (DO) and biochemical oxygen demand (BOD) and for temperature measurement. In addition, hydrological measurements were taken at all monitoring points. Based on the laboratory results, the QUAL-UFMG model was calibrated to assess the self-purification capacity. The water quality modeling for the winter campaign showed the greatest degree of water quality deterioration, with low DO levels and a high BOD concentration, indicating that the capacity to dilute effluent discharges is reduced during the dry season. The study concludes that the Rio Ligeiro is degraded from its source onward and that its self-purification capacity is in critical condition during the period of lowest flow.

    SPA: Economical and workload-driven indexing for data analytics in the cloud

    Selective queries are not uncommon in large-scale data analytics, for example, when drilling down into a specific customer in a dashboard. Traditionally, selective queries are accelerated by creating secondary indexes. However, because of their large size, expensive maintenance, and the difficulty of tuning and automating them, indexes are typically not used in modern cloud data warehouses or data lakes. Instead, such systems rely mostly on full table scans and lightweight optimizations like min/max filtering, whose effectiveness depends heavily on the data layout and value distributions. We propose SPA as the vision for automatically optimizing selective queries for immutable copy-on-write data formats. SPA adaptively indexes subsets of the data in an incremental and workload-driven manner. It makes fine-grained decisions and continuously monitors their benefit, dynamically allocating an optimization budget in a way that bounds the additional cost of indexing. Furthermore, it guarantees a performance improvement in the cases where indexes (potentially partial ones) prove to be beneficial. When indexes lose their benefit due to a shifting workload, they are gradually deconstructed in favor of optimizations that accommodate recent trends. As SPA does not require information about updates performed on the data, it can also be employed as an accelerator for systems that do not control the data, e.g., in cloud data lakes.