XWeB: the XML Warehouse Benchmark

Author: A. Schmidt
A. Simitsis
C. Kit
J. Darmont
J. Gray
K. Runapongsa
L. Afanasiev
L. Wyatt
P. O’Neil
R. Kimball
R. Torlone
S. Bressan
S. Rizzi
T. Böhme
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/09/2010

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure the feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse, which is based on a unified reference model for XML warehouses and features XML-specific structures, and its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

arXiv.org e-Print Archive

Crossref

HAL Descartes

HAL

Resource constrained meta-data storage and retrieval

Author: van den Broek K.H.M.
Publication date: 01/01/2005

Repository TU/e

Pure OAI Repository

Leveraging legacy codes to distributed problem solving environments: A web service approach

Author: Huang Y
Li M
Rana O
Walker D
Ward R
Williams P
Publication venue: 'Elsevier BV'
Publication date: 01/01/2003

This paper describes techniques used to leverage high-performance legacy codes as CORBA components in a distributed problem solving environment. It first briefly introduces the software architecture adopted by the environment. It then presents a CORBA-oriented wrapper generator (COWG), which can be used to automatically wrap high-performance legacy codes as CORBA components. Two legacy codes have been wrapped with COWG. One is an MPI-based molecular dynamic simulation (MDS) code; the other is a finite element based computational fluid dynamics (CFD) code for simulating incompressible Navier-Stokes flows. Performance comparisons between runs of the MDS CORBA component and the original MDS legacy code on a cluster of workstations and on a parallel computer are also presented. Wrapped as CORBA components, these legacy codes can be reused in a distributed computing environment. The first case shows that high performance can be maintained with the wrapped MDS component. The second case shows that a Web user can submit a task to the wrapped CFD component through a Web page without knowing the exact implementation of the component. In this way, a user’s desktop computing environment can be extended to a high-performance computing environment using a cluster of workstations or a parallel computer.

Brunel University Research Archive

Post-genomic structural analysis of single amino acid polymorphisms

Author: McMillan L.E.M.
Publication venue: UCL (University College London)
Publication date: 01/11/2009

Inherited genetic variation is critical in defining disease susceptibility. PDs, or pathogenic deviations, are mutations reported to be disease-causing, while SNPs, or single nucleotide polymorphisms, are understood to have a negligible effect on phenotype. With recent developments in biotechnology, most notably the increased reliability and speed of sequencing, a wealth of information regarding SNPs and PDs has been acquired. Quite apart from the challenge of analysing this information with a view to identifying novel therapies and targets for disease, the challenge of simply storing, mapping and processing these data is significant in itself. This thesis describes the development of a large-scale, automated pipeline that provides hypotheses as to what the structural effects of these genomic variations might be. This includes the development of nine new analyses. Eight of these new methods are structural, identifying mutations that disrupt various aspects of protein structure, including the interface, binding sites, folding mechanics and stability. The final new analysis is a novel method of identifying highly conserved residues from sequence. Here, the distribution of conservation scores from a multiple sequence alignment (MSA) is analysed to generate an MSA-specific threshold for high conservation. In order to construct MSAs for the sequence analysis, a novel method for identifying functionally equivalent proteins has been developed. Further, PDs and SNPs are characterised with respect to these structural analyses, and with respect to basic sequence and structural features. The findings support trends reported elsewhere in the literature: PDs are more often found in the core of proteins and at highly conserved sites; they most often affect the stability of protein structures; and they more often involve substitutions between very different amino acids.
In addition to the implications for disease therapies, these findings are informative in the more general context of protein structure.

UCL Discovery

Data DNA: The Next Generation of Statistical Metadata

Author: Cynthia M. Taeuber
Daniel W. Gillman
Laura Smith
Publication venue: 'Brookings Institution Press'
Publication date: 03/03/2007

Describes the components of a complete statistical metadata system and suggests ways to create and structure metadata so that diverse users can better access and understand data sets.

IssueLab

Interoperable Information Exchange, Resource Discovery, and Service Quality Monitoring Across Virtual Organizations in Distributed Research Infrastructures

Author: Kálmán Tibor
Publication date: 08/11/2016

Georg-August-University Göttingen

MPG.PuRe

Why and How to Benchmark XML Databases

Author: Busse R.
Carey M.J.
Florescu D.
Kersten M.L. (Martin)
Manolescu I.
Schmidt A.R.
Waas F.
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/01/2001

Benchmarks belong to the standard repertoire of tools deployed in database development. Assessing the capabilities of a system, analyzing actual and potential bottlenecks, and, naturally, comparing the pros and cons of different system architectures have become indispensable tasks as database management systems grow in complexity and capacity. In the course of the development of XML databases, the need for a benchmark framework has become more and more evident: a great many different ways to store XML data have been suggested in the past, each with its genuine advantages, disadvantages and consequences that propagate through the layers of a complex database system and need to be carefully considered. The different storage schemes cause the query characteristics of the data to vary. However, no conclusive methodology for assessing these differences is available to date. In this paper, we outline desiderata for a benchmark for XML databases, drawing on our own experience of developing an XML repository, our involvement in the definition of the standard query language, and our experience with standard benchmarks for relational databases.

CWI's Institutional Repository

Fraunhofer-ePrints

International Migration, Integration and Social Cohesion online publications

Improving the resolution of interaction maps: A middleground between high-resolution complexes and genome-wide interactomes

Author: Mora Joan Segura
Publication venue: University of Leeds
Publication date: 01/03/2013

Protein-protein interactions are ubiquitous in biology and therefore central to understanding living organisms. In recent years, large-scale studies have been undertaken to describe, at least partially, protein-protein interaction maps, or interactomes, for a number of relevant organisms including human. Although the analysis of interaction networks is proving useful, current interactomes provide a blurry and granular picture of the molecular machinery: unless the structure of the protein complex is known, the molecular details of the interaction are missing, and sometimes it is not even possible to know whether the interaction between the proteins is direct (i.e. a physical interaction) or part of a functional, not necessarily direct, association. Unfortunately, the determination of the structure of protein complexes cannot keep pace with the discovery of new protein-protein interactions, resulting in a large and increasing gap between the number of complexes that are thought to exist and the number for which 3D structures are available. The aim of the thesis was to tackle this problem by implementing computational approaches to derive structural models of protein complexes and thus reduce this gap. Over the course of the thesis, a novel modelling algorithm to predict the structure of protein complexes, V-D2OCK, was implemented. This new algorithm combines structure-based prediction of protein binding sites, by means of novel algorithms developed over the course of the thesis (VORFFIP and M-VORFFIP), with data-driven docking and energy minimization. This algorithm was used to improve the coverage and structural content of the human interactome, which was compiled from different sources of interactomic data to make it as comprehensive as possible.
Finally, the human interactome and structural models were compiled in a database, V-D2OCK DB, that offers easy, user-friendly access to the human interactome, including a bespoke graphical molecular viewer to facilitate the analysis of the structural models of protein complexes. Furthermore, organisms other than human were included, providing a useful resource for the study of all known interactomes.

White Rose E-theses Online

Why and How to Benchmark XML Databases

Author: Busse R.
Carey M.J.
Florescu D.
Kersten M.L. (Martin)
Manolescu I.
Schmidt A.R.
Waas F.
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/09/2001

CWI's Institutional Repository