518 research outputs found

Minimum Clinical Recommendations for diagnosis, treatment and follow-up of malignant pleural mesothelioma

Author: Manegold C.
Stahel R. A.
Publication date: 02/08/2017

Pay One, Get Hundreds for Free: Reducing Cloud Costs through Shared Query Execution

Author: Chen Chung-Min
Graefe Goetz
Harizopoulos Stavros
Lang Christian A.
Manegold Stefan
Transaction Processing Performance Council
Zukowski Marcin
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/09/2018

Cloud-based data analysis is nowadays common practice because of the lower system management overhead as well as the pay-as-you-go pricing model. The pricing model, however, is not always suitable for query processing as heavy use results in high costs. For example, in query-as-a-service systems, where users are charged per processed byte, collections of queries accessing the same data frequently can become expensive. The problem is compounded by the limited options for the user to optimize query execution when using declarative interfaces such as SQL. In this paper, we show how, without modifying existing systems and without the involvement of the cloud provider, it is possible to significantly reduce the overhead, and hence the cost, of query-as-a-service systems. Our approach is based on query rewriting so that multiple concurrent queries are combined into a single query. Our experiments show the aggregated amount of work done by the shared execution is smaller than in a query-at-a-time approach. Since queries are charged per byte processed, the cost of executing a group of queries is often the same as executing a single one of them. As an example, we demonstrate how the shared execution of the TPC-H benchmark is up to 100x and 16x cheaper in Amazon Athena and Google BigQuery, respectively, than using a query-at-a-time approach while achieving a higher throughput.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Report on the Second International Workshop on Data Management on Modern Hardware (DaMoN'06)

Author: Ailamaki A. (Anastasia)
Boncz P.A. (Peter)
Manegold S. (Stefan)
Publication venue: A.C.M.
Publication date: 01/12/2006

This report summarizes the presentations and discussions that occurred during the Second International Workshop on Data Management on Modern Hardware (DaMoN). DaMoN was held in Chicago on June 25th, 2006, and was collocated with ACM SIGMOD 2006. The aim of this one-day workshop is to bring together researchers interested in optimizing database performance on modern computing infrastructure by designing new data management techniques and tools

CWI's Institutional Repository

Welcome to Sigmod 2019 - The 2019 ACM SIGMOD International Conference on the Management of Data!

Author: Ailamaki A. (Anastasia)
Boncz P.A. (Peter)
Manegold S. (Stefan)
Publication date: 30/06/2019

CWI's Institutional Repository

Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)

Author: Ailamaki A.
Boncz P. A.
Chen M.-S.
DeWitt D. J.
Lang H.
Li Y.
Liu B.
Lohman G. M.
Lu H.
Manegold S.
Ono K.
Schneider D. A.
Shatdal A.
Stillger M.
Zhang N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 21/07/2015

Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes memory-resident. Even after careful tuning for an in-memory environment, a linear disk I/O model such as the one implemented in PostgreSQL may make query response time predictions that are up to 2X slower than the optimal multi-join query plan over memory-resident data. This paper introduces a memory I/O cost model to identify good evaluation strategies for complex query plans with multiple hash-based equi-joins over memory-resident data. The proposed cost model is carefully validated for accuracy using three different systems, including an Amazon EC2 instance, to control for hardware-specific differences. Prior work in parallel query evaluation has advocated right-deep and bushy trees for multi-join queries due to their greater parallelization and pipelining potential. A surprising finding is that the conventional wisdom from shared-nothing disk-based systems does not directly apply to the modern shared-everything memory hierarchy. As corroborated by our model, the performance gap between the optimal left-deep and right-deep query plan can grow to about 10X as the number of joins in the query increases. Comment: 15 pages, 8 figures, extended version of the paper to appear in SoCC'1

arXiv.org e-Print Archive

Crossref

Quality predictors of abdominal fetal electrocardiography recording in antenatal ambulatory and bedside settings

Author: Hoesli I.
Holzgreve W.
Huhn E.
Manegold-Brauer G.
Meyer A. H.
Müller M.
Wilhelm F. H.
Publication venue: 'S. Karger AG'
Publication date: 04/11/2016

Background: Fetal electrocardiography using an abdominal monitor (Monica AN24™) could increase the diagnostic use of fetal heart rate (fHR) variability measurements. However, signal quality may depend on factors such as maternal physical activity, posture, and bedside versus ambulatory setting. Methods: Sixty-three healthy women wore the monitor at home and 42 women during a hospital stay. All women underwent a posture experiment, and all home and 13 hospital participants wore the monitor during daytime and nighttime. The success rate (SR) of fHR detection was analyzed in relation to maternal physical activity, posture, daytime versus nighttime, and other maternal and fetal predictors. Results: Ambulatorily, the SR was 86.8% for nighttime and 40.2% for daytime. The low daytime SR was largely due to effects of maternal physical activity and posture. The in-hospital SR was lower during nighttime (71.1%) and similar during daytime (43.3%). SR was related to gestational age, but not affected by pre-pregnancy and current body mass index or fetal growth restriction. Conclusions: The success of beat-to-beat fHR detection strongly depends on the home/hospital setting and predictors such as time of recording, activity levels, and maternal posture. Its clinical utility may be limited in periods of unsupervised recording with physical activity or posture shifts

Crossref

edoc

Replacing the electric motor of the feedwater pump (PEN) with a turbine drive at the Kemerovo CHP plant

Author: Baba H. A.
Handt Stefan
Klosterhalfen Bernd
Lopez Cotarelo C.
Luka J.
Manegold E.
Mittermayer Christian
Sellhaus B.
Tietze Lothar
Publication date: 01/01/2000

This paper considers the possibility of replacing the electric motor of the feedwater pump (PEN) with a turbine drive at the Kemerovo CHP plant, with the turbine drive installed on the existing foundation. The aim of the work is to assess whether the plant's electricity supply can be increased by reducing auxiliary (in-house) power consumption, and whether the maneuverability of the CHP plant can be improved.

Electronic archive of Tomsk Polytechnic University

Docetaxel-based induction therapy prior to radiotherapy with or without docetaxel for non-small-cell lung cancer.

Author: Cardenal F.
Debus J.
Lebeau B.
Manegold C.
Mattson K.
Price A.
Ramlau R.
Scagliotti Giorgio Vittorio
Szczesna A.
Van Zandwijk N.
Publication date: 01/01/2006

Institutional Research Information System University of Turin

Genome sequence analysis with MonetDB: a case study on Ebola virus diversity

Author: Cijvat C.P. (Robin)
Kersten M.L. (Martin)
Klau G.W. (Gunnar)
Manegold S. (Stefan)
Marschall T. (Tobias)
Schönhuth A. (Alexander)
Zhang Y. (Ying)
Publication date: 03/03/2015

Next-generation sequencing (NGS) technology has led the life sciences into the big data era. Today, sequencing genomes takes little time and cost, but results in terabytes of data to be stored and analysed. Biologists are often exposed to excessively time consuming and error-prone data management and analysis hurdles. In this paper, we propose a database management system (DBMS) based approach to accelerate and substantially simplify genome sequence analysis. We have extended MonetDB, an open-source column-based DBMS, with a BAM module, which enables easy, flexible, and rapid management and analysis of sequence alignment data stored as Sequence Alignment/Map (SAM/BAM) files. We describe the main features of MonetDB/BAM using a case study on Ebola virus genomes

CWI's Institutional Repository

GeoTriples: Transforming geospatial data into RDF graphs using R2RML and RML mappings

Author: Karalis N. (Nikolaos)
Koubarakis M. (Manolis)
Kyzirakos K. (Konstantinos)
Manegold S. (Stefan)
Savva D. (Dimitrianos)
Vasileiou A. (Alexandros)
Vlachopoulos I. (Ioannis)
Publication venue: 'Elsevier BV'
Publication date: 11/09/2018

A lot of geospatial data has become available at no charge in many countries recently. Geospatial data that is currently made available by government agencies usually does not follow the linked data paradigm. In the few cases where government agencies do follow the linked data paradigm (e.g., Ordnance Survey in the United Kingdom), specialized scripts have been used for transforming geospatial data into RDF. In this paper we present the open source tool GeoTriples, which generates and processes extended R2RML and RML mappings that transform geospatial data from many input formats into RDF. GeoTriples allows the transformation of geospatial data stored in raw files (shapefiles, CSV, KML, XML, GML and GeoJSON) and spatially-enabled RDBMS (PostGIS and MonetDB) into RDF graphs using well-known vocabularies like GeoSPARQL and stSPARQL, but without being tightly coupled to a specific vocabulary. GeoTriples has been developed in European projects LEO and Melodies and has been used to transform many geospatial data sources into linked data. We study the performance of GeoTriples experimentally using large publicly available geospatial datasets, and show that GeoTriples is very efficient and scalable, especially when its mapping processor is implemented using Apache Hadoop.

CWI's Institutional Repository

Leiden University Scholarly Publications