10,576 research outputs found

    Benchmarking database systems for Genomic Selection implementation

    Get PDF
    Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. To make use of this information effectively requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a rapid turnaround time that allows for selection before crosses needed to be made. In reality, breeders often have a short window of time to make decisions by the time they are able to collect all their phenotyping data and receive corresponding genotyping data. This presents a challenge to organize information and utilize it in downstream analyses to support decisions made by breeders. In order to implement genomic selection routinely as part of breeding programs, one would need an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. Results: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix

    Notes on Cloud computing principles

    Get PDF
    This letter provides a review of fundamental distributed systems and economic Cloud computing principles. These principles are frequently deployed in their respective fields, but their inter-dependencies are often neglected. Given that Cloud Computing first and foremost is a new business model, a new model to sell computational resources, the understanding of these concepts is facilitated by treating them in unison. Here, we review some of the most important concepts and how they relate to each other

    Review of trends and targets of complex systems for power system optimization

    Get PDF
    Optimization systems (OSs) allow operators of electrical power systems (PS) to optimally operate PSs and to also create optimal PS development plans. The inclusion of OSs in the PS is a big trend nowadays, and the demand for PS optimization tools and PS-OSs experts is growing. The aim of this review is to define the current dynamics and trends in PS optimization research and to present several papers that clearly and comprehensively describe PS OSs with characteristics corresponding to the identified current main trends in this research area. The current dynamics and trends of the research area were defined on the basis of the results of an analysis of the database of 255 PS-OS-presenting papers published from December 2015 to July 2019. Eleven main characteristics of the current PS OSs were identified. The results of the statistical analyses give four characteristics of PS OSs which are currently the most frequently presented in research papers: OSs for minimizing the price of electricity/OSs reducing PS operation costs, OSs for optimizing the operation of renewable energy sources, OSs for regulating the power consumption during the optimization process, and OSs for regulating the energy storage systems operation during the optimization process. Finally, individual identified characteristics of the current PS OSs are briefly described. In the analysis, all PS OSs presented in the observed time period were analyzed regardless of the part of the PS for which the operation was optimized by the PS OS, the voltage level of the optimized PS part, or the optimization goal of the PS OS.Web of Science135art. no. 107

    SOA enabled ELTA: approach in designing business intelligence solutions in Era of Big Data

    Get PDF
    The current work presents a new approach for designing business intelligence solutions. In the Era of Big Data, former and robust analytical concepts and utilities need to adapt themselves to the changed market circumstances. The main focus of this work is to address the acceleration of building process of a “data-centric” Business Intelligence (BI) solution besides preparing BI solutions for Big Data utilization. This research addresses the following goals: reducing the time spent during business intelligence solution’s design phase; achieving flexibility of BI solution by adding new data sources; and preparing BI solution for utilizing Big Data concepts. This research proposes an extension of the existing Extract, Load and Transform (ELT) approach to the new one Extract, Load, Transform and Analyze (ELTA) supported by service-orientation concept. Additionally, the proposed model incorporates Service-Oriented Architecture concept as a mediator for the transformation phase. On one side, such incorporation brings flexibility to the BI solution and on the other side; it reduces the complexity of the whole system by moving some responsibilities to external authorities

    Building Efficient Query Engines in a High-Level Language

    Get PDF
    Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance, instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend. In this article, we realize this vision in the domain of analytical query processing. We present LegoBase, a query engine written in the high-level language Scala. The key technique to regain efficiency is to apply generative programming: LegoBase performs source-to-source compilation and optimizes the entire query engine by converting the high-level Scala code to specialized, low-level C code. We show how generative programming allows to easily implement a wide spectrum of optimizations, such as introducing data partitioning or switching from a row to a column data layout, which are difficult to achieve with existing low-level query compilers that handle only queries. We demonstrate that sufficiently powerful abstractions are essential for dealing with the complexity of the optimization effort, shielding developers from compiler internals and decoupling individual optimizations from each other. We evaluate our approach with the TPC-H benchmark and show that: (a) With all optimizations enabled, LegoBase significantly outperforms a commercial database and an existing query compiler. (b) Programmers need to provide just a few hundred lines of high-level code for implementing the optimizations, instead of complicated low-level code that is required by existing query compilation approaches. (c) The compilation overhead is low compared to the overall execution time, thus making our approach usable in practice for compiling query engines
    corecore