
    PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development

    This paper describes PlinyCompute, a system for development of high-performance, data-intensive, distributed computing tools and libraries. In the large, PlinyCompute presents the programmer with a very high-level, declarative interface, relying on automatic, relational-database style optimization to figure out how to stage distributed computations. However, in the small, PlinyCompute presents the capable systems programmer with a persistent object data model and API (the "PC object model") and associated memory management system that has been designed from the ground up for high-performance, distributed, data-intensive computing. This contrasts with most other Big Data systems, which are constructed on top of the Java Virtual Machine (JVM), and hence must at least partially cede performance-critical concerns such as memory management (including layout and de/allocation) and virtual method/function dispatch to the JVM. This hybrid approach (declarative in the large, trusting the programmer's ability to utilize the PC object model efficiently in the small) results in a system that is ideal for the development of reusable, data-intensive tools and libraries. Through extensive benchmarking, we show that implementing complex object manipulation and non-trivial, library-style computations on top of PlinyCompute can result in a speedup of 2x to more than 50x compared to equivalent implementations on Spark. Comment: 48 pages, including references and Appendi

    06472 Abstracts Collection - XQuery Implementation Paradigms

    From 19.11.2006 to 22.11.2006, the Dagstuhl Seminar 06472 "XQuery Implementation Paradigms" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Benchmarking database systems for Genomic Selection implementation

    Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction and marker production facilities that can efficiently deploy the desired set of markers across samples, with a turnaround time rapid enough to allow selection before crosses need to be made. In reality, breeders often have only a short window of time to make decisions once they have collected all their phenotyping data and received the corresponding genotyping data. This makes it challenging to organize the information and use it in downstream analyses that support breeders' decisions. To implement genomic selection routinely as part of breeding programs, one would need an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. Results: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix.

    Software como um Serviço: uma plataforma eficaz para oferta de sistemas holísticos de gestão da performance

    The main objective of this study was to assess the viability of developing a Performance Management (PM) system, delivered in the form of Software as a Service (SaaS), specific to the hospitality industry, and to evaluate the benefits of its use. Software deployed in the cloud, delivered and licensed as a service, is becoming increasingly common and accepted in a business context. Although Business Intelligence (BI) solutions are not usually distributed in the SaaS model, there are some examples showing that this is changing. To achieve the study objective, design science research methodology was employed in the development of a prototype. This prototype was deployed in four hotels and its results were evaluated. Evaluation of the prototype focused both on the system's technical characteristics and on its business benefits. Results showed that hotels were very satisfied with the system and that building a prototype and making it available in the form of SaaS is a good way to assess the contribution of BI systems to improving management performance.

    XWeB: the XML Warehouse Benchmark

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse, which is based on a unified reference model for XML warehouses and features XML-specific structures, and its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

    Managing contextual information in semantically-driven temporal information systems

    Context-aware (CA) systems have proven to be a robust solution for personalized information delivery in the current content-rich and dynamic information age. They allow software agents to interact autonomously with users by modeling the user's environment (e.g. profile, location, relevant public information) as dynamically evolving and interoperable contexts. There is a flurry of research activity across a wide spectrum of context-aware research areas, such as managing the user's profile, context acquisition from external environments, context storage, context representation and interpretation, context service delivery, and matching of context attributes to users' queries. We propose SDCAS, a Semantic-Driven Context Aware System that facilitates the recommendation of public services to users at a temporal location. This paper focuses on information management and service recommendation using semantic technologies, taking into account the challenges of relationship complexity in temporal and contextual information.