10,576 research outputs found
Benchmarking database systems for Genomic Selection implementation
Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. To make use of this information effectively requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a rapid turnaround time that allows for selection before crosses needed to be made. In reality, breeders often have a short window of time to make decisions by the time they are able to collect all their phenotyping data and receive corresponding genotyping data. This presents a challenge to organize information and utilize it in downstream analyses to support decisions made by breeders. In order to implement genomic selection routinely as part of breeding programs, one would need an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. Results: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix
Notes on Cloud computing principles
This letter provides a review of fundamental distributed systems and economic
Cloud computing principles. These principles are frequently deployed in their
respective fields, but their inter-dependencies are often neglected. Given that
Cloud Computing first and foremost is a new business model, a new model to sell
computational resources, the understanding of these concepts is facilitated by
treating them in unison. Here, we review some of the most important concepts
and how they relate to each other
Review of trends and targets of complex systems for power system optimization
Optimization systems (OSs) allow operators of electrical power systems (PS) to optimally operate PSs and to also create optimal PS development plans. The inclusion of OSs in the PS is a big trend nowadays, and the demand for PS optimization tools and PS-OSs experts is growing. The aim of this review is to define the current dynamics and trends in PS optimization research and to present several papers that clearly and comprehensively describe PS OSs with characteristics corresponding to the identified current main trends in this research area. The current dynamics and trends of the research area were defined on the basis of the results of an analysis of the database of 255 PS-OS-presenting papers published from December 2015 to July 2019. Eleven main characteristics of the current PS OSs were identified. The results of the statistical analyses give four characteristics of PS OSs which are currently the most frequently presented in research papers: OSs for minimizing the price of electricity/OSs reducing PS operation costs, OSs for optimizing the operation of renewable energy sources, OSs for regulating the power consumption during the optimization process, and OSs for regulating the energy storage systems operation during the optimization process. Finally, individual identified characteristics of the current PS OSs are briefly described. In the analysis, all PS OSs presented in the observed time period were analyzed regardless of the part of the PS for which the operation was optimized by the PS OS, the voltage level of the optimized PS part, or the optimization goal of the PS OS.Web of Science135art. no. 107
SOA enabled ELTA: approach in designing business intelligence solutions in Era of Big Data
The current work presents a new approach for designing business intelligence solutions. In the Era of Big Data, former and robust analytical concepts and utilities need to adapt themselves to the changed market circumstances. The main focus of this work is to address the acceleration of building process of a “data-centric” Business Intelligence (BI) solution besides preparing BI solutions for Big Data utilization. This research addresses the following goals: reducing the time spent during business intelligence solution’s design phase; achieving flexibility of BI solution by adding new data sources; and preparing BI solution for utilizing Big Data concepts. This research proposes an extension of the existing Extract, Load and Transform (ELT) approach to the new one Extract, Load, Transform and Analyze (ELTA) supported by service-orientation concept. Additionally, the proposed model incorporates Service-Oriented Architecture concept as a mediator for the transformation phase. On one side, such incorporation brings flexibility to the BI solution and on the other side; it reduces the complexity of the whole system by moving some responsibilities to external authorities
Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming allows to easily implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines
- …