903 research outputs found

The data cyclotron query processing scheme

Author: Kersten M.L. (Martin)
Pereira Goncalves R.A. (Romulo Antonio)
Publication venue: 'MIT Press - Journals'
Publication date: 01/03/2010

Distributed database systems exploit static workload characteristics to steer data fragmentation and data allocation schemes. However, the grand challenge of distributed query processing is to come up with a self-organizing architecture that exploits all resources to manage the hot data set, minimize query response time, and maximize throughput without global coordination. In this paper, we introduce the Data Cyclotron architecture, which addresses these challenges using turbulent data movement through a storage ring built from distributed main memory, capitalizing on modern remote-DMA facilities. Queries assigned to individual nodes interact with the Data Cyclotron by picking up data fragments that continuously flow around the ring, i.e., the hot set. Each data fragment carries a level of interest (LOI) metric, which represents the cumulative query interest as the fragment passes around the ring multiple times. A fragment with an LOI below a given threshold, inversely proportional to the ring load, is pulled o

CWI's Institutional Repository

The data cyclotron : juggling data and queries for a data warehouse audience

Author: Pereira Goncalves R.A. (Romulo Antonio)
Publication date: 22/03/2013

CWI's Institutional Repository

The Database Architectures Research Group at CWI

Author: Kersten M.L. (Martin)
Manegold S. (Stefan)
Mullender K.S. (Sjoerd)
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/12/2011

The Database research group at CWI was established in 1985. It has steadily grown from two PhD students to a group of 17 people at the end of 2011. The group is supported by a scientific programmer and a system engineer who keep our machines running. In this short note, we look back at our past and highlight the multitude of topics being addressed.

CWI's Institutional Repository

Solving the Optimal Trading Trajectory Problem Using a Quantum Annealer

Author: Carr Peter
de Prado Marcos López
Goddard Phil
Haghnegahdar Poya
Rosenberg Gili
Wu Kesheng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/08/2016

We solve a multi-period portfolio optimization problem using D-Wave Systems' quantum annealer. We derive a formulation of the problem, discuss several possible integer encoding schemes, and present numerical examples that show high success rates. The formulation incorporates transaction costs (including permanent and temporary market impact), and, significantly, the solution does not require the inversion of a covariance matrix. The discrete multi-period portfolio optimization problem we solve is significantly harder than the continuous variable problem. We present insight into how results may be improved using suitable software enhancements, and why current quantum annealing technology limits the size of problem that can be successfully solved today. The formulation presented is specifically designed to be scalable, with the expectation that as quantum annealing technology improves, larger problems will be solvable using the same techniques. Comment: 7 pages; expanded and update

arXiv.org e-Print Archive

eScholarship - University of California

MonetDB: Two Decades of Research in Column-oriented Database Architectures

Author: Groffen F.E. (Fabian)
Idreos S. (Stratos)
Kersten M.L. (Martin)
Manegold S. (Stefan)
Mullender K.S. (Sjoerd)
Nes N.J. (Niels)
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2012

MonetDB is a state-of-the-art open-source column-store database management system targeting applications in need of analytics over large collections of data. MonetDB is actively used nowadays in health care, in telecommunications, as well as in scientific databases and in data management research, accumulating on average more than 10,000 downloads on a monthly basis. This paper gives a brief overview of the MonetDB technology as it developed over the past two decades and the main research highlights which drive the current MonetDB design and form the basis for its future evolution.

CWI's Institutional Repository

Just-in-time Data Distribution for Analytical Query Processing

Author: Groffen F.E. (Fabian)
Ivanova M.G. (Milena)
Kersten M.L. (Martin)
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/09/2012

Distributed processing commonly requires data to be spread across machines using a priori static or hash-based data allocation. In this paper, we explore an alternative approach that starts from a master node in control of the complete database, and a variable number of worker nodes for delegated query processing. Data is shipped just-in-time to the worker nodes using a need-to-know policy, and is reused, if possible, in subsequent queries. A bidding mechanism among the workers yields a schedule with the most efficient reuse of previously shipped data, minimizing the data transfer costs. Just-in-time data shipment allows our system to benefit from locally available idle resources to boost overall performance. The system is maintenance-free and allocation is fully transparent to users. Our experiments show that the proposed adaptive distributed architecture is a viable and flexible alternative for small-scale MapReduce-type settings.
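
The bidding idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Worker` class, the `schedule` function, the fragment names, and the cost model (a bid equals the total size of the fragments a worker is still missing) are all assumptions made for illustration only.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical worker node that remembers which fragments it already holds."""
    name: str
    cached: set[str] = field(default_factory=set)

    def bid(self, needed: set[str], sizes: dict[str, int]) -> int:
        # Assumed cost model: the bid is the amount of data that would
        # still have to be shipped from the master to this worker.
        return sum(sizes[f] for f in needed - self.cached)


def schedule(needed: set[str], workers: list[Worker], sizes: dict[str, int]):
    """Assign the query to the lowest bidder and ship only the missing fragments."""
    winner = min(workers, key=lambda w: w.bid(needed, sizes))
    shipped = needed - winner.cached
    winner.cached |= needed  # shipped data stays available for later queries
    return winner, shipped


# Illustrative data: fragment sizes in arbitrary units (hypothetical names).
sizes = {"orders": 400, "lineitem": 900, "customer": 100}
workers = [Worker("w1"), Worker("w2", {"lineitem"})]
winner, shipped = schedule({"orders", "lineitem"}, workers, sizes)
print(winner.name, sorted(shipped))
```

Under this assumed cost model, the worker that already holds the largest share of the required data wins the query, so only the remainder is transferred and later queries over the same fragments become cheaper for that worker.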

CWI's Institutional Repository

Assessment of Metabolome Annotation Quality: A Method for Evaluating the False Discovery Rate of Elemental Composition Searches

Author: A Kaufmann
A Koulman
A Oikawa
AD Hegeman
AH Grange
Akira Oikawa
C Abate-Shen
C Bottcher
DL Tabb
E Werner
EW Sayers
F Matsuda
Fumio Matsuda
G Madalinski
H Choi
H Suzuki
H Takahashi
Hany A. El-Shemy
J Schmidt
JE Elias
K Dettmer
Kazuki Saito
L Kall
LW Sumner
M Kanehisa
M Watanabe
Masami Yokota Hirai
MY Hirai
Oliver Fiehn
P Giavalisco
P Kiefer
R Taguchi
RA Dixon
RA Scheltema
RC De Vos
RJ Bino
S Bocker
S Moco
S Ojanpera
S Suzuki
Shigehiko Kanaya
T Kind
T Soga
W Schwab
WB Dunn
Y Iijima
Y Shinbo
Yoko Shinbo
Publication venue: Public Library of Science
Publication date: 16/10/2009

BACKGROUND: In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rate (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching, are discussed. METHODOLOGY/PRINCIPAL FINDINGS: The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using a compound database with fewer entries but higher completeness. CONCLUSIONS/SIGNIFICANCE: High-accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central