66 research outputs found

    Database management system performance comparisons: A systematic literature review

    Efficiency has been a pivotal aspect of the software industry since its inception, because a system that serves the end user quickly and the service provider cost-efficiently benefits all parties. A database management system (DBMS) is an integral part of effectively all software systems, so it is logical that many studies have compared the performance of different DBMSs in the hope of finding the most efficient one. This study systematically synthesizes the results and approaches of studies that compare DBMS performance and provides recommendations for industry and research. The results show that performance is usually tested in a way that does not reflect real-world use cases, and that tests are typically reported in insufficient detail for replication or for drawing conclusions from the stated results. Comment: 36 pages

    GenBase: A Complex Analytics Genomics Benchmark

    This paper introduces a new benchmark designed to test database management system (DBMS) performance on a mix of data management tasks (joins, filters, etc.) and complex analytics (regression, singular value decomposition, etc.). Such mixed workloads are prevalent in a number of application areas, including most science workloads and web analytics. As a specific use case, we have chosen genomics data for our benchmark and have constructed a collection of typical tasks in this area. In addition to being representative of a mixed data management and analytics workload, this benchmark is also meant to scale to large dataset sizes and multiple nodes across a cluster. Besides presenting this benchmark, we have run it on a variety of storage systems including traditional row stores, newer column stores, Hadoop, and an array DBMS. We present performance numbers on all systems on single and multiple nodes, and show that performance differs by orders of magnitude between the various solutions. In addition, we demonstrate that most platforms have scalability issues. We also test offloading the analytics onto a coprocessor. The intent of this benchmark is to focus research interest in this area; to this end, all of our data, data generators, and scripts are available on our web site.

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Mechanism-driven hypothesis generation support for a predictive adverse effect in colorectal cancer treatment

    This bioinformatics thesis describes work and results from a study on a use case in the context of colorectal cancer. The background of the study is an observation from clinical practice. Various authors report that upon treatment with inhibitors of the Epidermal Growth Factor Receptor (EGFR), in particular the therapeutic antibody Cetuximab, a minority of patients show the common adverse effect of skin toxicity not at all or only in a clearly reduced form. For these patients, a reduced efficacy of the therapy is described at the same time. The absence of the adverse effect is therefore used as a phenotypic biomarker to prompt a switch of therapy. However, preventive skin care during treatment, which counteracts the biomarker signal, and the need to start the therapy first in order to gain the information appear unfavorable. As the underlying molecular mechanisms remain elusive, predictions ahead of treatment, e.g. by a clinical test, are not yet possible. In the presented work, the aim was to generate hypotheses about which proteins and cellular signaling pathways might be causal for the differing response of the patient groups. Starting from the assumption that naturally occurring germline variations functionally discriminate individuals in the context of the treatment, the thesis builds on a small dataset of 23 exomes of patients from a clinical study context. These sequencing data were processed into genomic variants and analyzed for their potential influence at the mechanistic level. Targeted restrictions were introduced by modeling the biomedical context of the use case in order to enrich the sparse individual data with further information. The resulting candidate genes, which still need to be validated in practical studies, are described and evaluated in detail. Methodologically, the result of the thesis is the "Molecular Systems Map", a network data structure modeled in Cytoscape that interactively visualizes the functional interactions of proteins and simultaneously filters the called variants according to the biological context. The aim is to enable biomedical domain experts to review their experimental data in the functional context, rather than scrolling through tabular information on called variants, and to support them in the hypothesis generation process. Additionally, this provides the opportunity to apply graph algorithms and to integrate further data, e.g. from complementary 'omics experiments.

    HoneyIo4: the construction of a virtual, low-interaction IoT Honeypot

    Outgoin

    Numerical and Experimental Analysis of Injection and Mixture Formation in High-Performance CNG Engines

    The abstract is in the attachment.

    Effective Use of SSDs in Database Systems

    With the advent of solid state drives (SSDs), the storage industry has experienced a revolutionary improvement in I/O performance. Compared to traditional hard disk drives (HDDs), SSDs benefit from shorter I/O latency, better power efficiency, and cheaper random I/Os. Because of these superior properties, SSDs are gradually replacing HDDs. For decades, database management systems have been designed, architected, and optimized based on the performance characteristics of HDDs. To utilize the superior performance of SSDs, new methods should be developed, some database components should be redesigned, and architectural decisions should be revisited. In this thesis, novel methods are proposed to exploit the new capabilities of modern SSDs to improve the performance of database systems. The first is a new method for using SSDs as a fully persistent second-level memory buffer pool. This method uses SSDs as a supplementary storage device to improve transactional throughput and to reduce the checkpoint and recovery times. A prototype of the proposed method is compared with its closest existing competitor. The second considers the impact of the parallel I/O capability of modern SSDs on the database query optimizer. It is shown that a query optimizer that is unaware of the parallel I/O capability of SSDs can make significantly sub-optimal decisions. In addition, a practical method for making the query optimizer parallel-I/O-aware is introduced and evaluated empirically. The third technique is an SSD-friendly external merge sort. This sorting technique has better performance than other common external sorting techniques. It also improves the SSD's lifespan by reducing the number of write operations required during sorting.

    Can we accelerate medicinal chemistry by augmenting the chemist with Big Data and artificial intelligence?

    It is both the best of times and the worst of times to be a medicinal chemist. Massive amounts of data, combined with machine-learning and/or artificial intelligence (AI) tools to analyze them, can increase our capabilities. However, drug discovery faces severe economic pressure and a high level of societal need set against challenging targets. Here, we show how improving medicinal chemistry by better curating and exchanging knowledge can contribute to improving drug hunting in all disease areas. Although securing intellectual property (IP) is a critical task for medicinal chemists, it impedes the sharing of generic medicinal chemistry knowledge. Recent developments enable the sharing of knowledge both within and between organizations while securing IP. We also explore the effects of the structure of the corporate ecosystem within drug discovery on knowledge sharing.