1,314 research outputs found
WIDE - A Distributed Architecture for Workflow Management
This paper presents the distributed architecture of the WIDE workflow management system. We show how distribution and scalability are obtained by the use of a distributed object model, a client/server architecture, and a distributed workflow server architecture. Specific attention is paid to the extended transaction support and active rule support subarchitectures
A survey of parallel execution strategies for transitive closure and logic programs
An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. 
Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods
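As a rough illustration of the iterative methods and hash-based fragmentation the abstract mentions, the following minimal sketch computes a transitive closure semi-naively and then repeats the computation over fragments of the input. The function names, the `seed` parameter, and the choice to fragment by source node are assumptions made for this example, not the survey's own algorithms, and the fragments are processed sequentially rather than on separate processors.

```python
def transitive_closure(edges, seed=None):
    """Semi-naive iteration: only newly derived pairs are joined in each round."""
    base = set(edges)
    # Index the base relation by source node so each join step is a lookup.
    successors = {}
    for a, b in base:
        successors.setdefault(a, set()).add(b)
    # Start from the whole relation, or from one fragment of it (assumption).
    delta = set(base if seed is None else seed)
    closure = set(delta)
    while delta:
        # Join the pairs found in the previous round with the base relation.
        derived = {(a, c) for (a, b) in delta for c in successors.get(b, ())}
        delta = derived - closure
        closure |= delta
    return closure


def fragmented_closure(edges, n_fragments):
    """Hash-fragment the start pairs by source node; each fragment is independent."""
    fragments = [set() for _ in range(n_fragments)]
    for a, b in edges:
        fragments[hash(a) % n_fragments].add((a, b))
    result = set()
    for fragment in fragments:
        # In a parallel setting each call could run on a different processor.
        result |= transitive_closure(edges, seed=fragment)
    return result
```

Because the join preserves the source node of every derived pair, each fragment produces exactly the closure pairs whose source hashes to it, so the union of the fragment results equals the full closure.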
Optimization of systems of algebraic equations for evaluating datalog queries
A Datalog program can be translated into a
system of fixpoint equations of relational
algebra; this paper studies how such a system
can be solved and optimized for a particular
query. The paper presents a structured approach
to optimization, by identifying several
optimization steps and by studying solution
methods for each step
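A small worked example may help show what such a translation looks like. The ancestor program and the symbols P (parent relation) and A (ancestor relation) below are illustrative assumptions, not taken from the paper. The rules anc(X,Y) :- par(X,Y) and anc(X,Y) :- par(X,Z), anc(Z,Y) correspond to a single fixpoint equation, whose least solution can be reached by iterating from the empty relation:

```latex
\begin{align*}
A &= P \,\cup\, \pi_{1,4}\bigl(P \bowtie_{P.2 = A.1} A\bigr) \\
A^{(0)} &= \emptyset \\
A^{(k+1)} &= P \,\cup\, \pi_{1,4}\bigl(P \bowtie_{P.2 = A.1} A^{(k)}\bigr)
\end{align*}
```

The iteration stops at the first k for which A^{(k+1)} = A^{(k)}; on a finite database this must happen because each step can only add tuples.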
Relation Liftings on Preorders and Posets
The category Rel(Set) of sets and relations can be described as a category of
spans and as the Kleisli category for the powerset monad. A set-functor can be
lifted to a functor on Rel(Set) iff it preserves weak pullbacks. We show that
these results extend to the enriched setting, if we replace sets by posets or
preorders. Preservation of weak pullbacks becomes preservation of exact lax
squares. As an application we present Moss's coalgebraic logic over posets
Identifying collateral and synthetic lethal vulnerabilities within the DNA-damage response.
Background: A pair of genes is defined as synthetically lethal if defects in both cause the death of the cell but a defect in only one of the two is compatible with cell viability. Ideally, if A and B are two synthetic lethal genes, inhibiting B should kill cancer cells with a defect in A, and should have no effect on normal cells. Thus, synthetic lethality can be exploited for highly selective cancer therapies, which need to exploit differences between normal and cancer cells.
Results: In this paper, we present a new method for predicting synthetic lethal (SL) gene pairs. As neighbouring genes in the genome have highly correlated profiles of copy number variations (CNAs), our method clusters proximal genes with a similar CNA profile, then predicts mutually exclusive group pairs, and finally identifies the SL gene pairs within each group pair. For mutual-exclusion testing we use a graph-based method which takes into account the mutation frequencies of different subjects and genes. We use two different methods for selecting the pair of SL genes: the first is based on the gene essentiality measured in various conditions by means of the "Gene Activity Ranking Profile" (GARP) score; the second leverages the annotations of genes to biological pathways.
Conclusions: This method is unique among current SL prediction approaches: it reduces false-positive SL predictions compared to previous methods, and it allows establishing explicit collateral lethality relationships of gene pairs within mutually exclusive group pairs
Optimization of multi-domain queries on the Web
Where can I attend an interesting database workshop close
to a sunny beach? Who are the strongest experts on service
computing based upon their recent publication record and
accepted European projects? Can I spend an April weekend
in a city served by a low-cost direct flight from Milano
offering a Mahler symphony? We regard the above queries
as multi-domain queries, i.e., queries that can be answered
by combining knowledge from two or more domains (such
as seaside locations, flights, publications, accepted projects,
conference offerings, and so on). This information is available
on the Web, but no general-purpose software system
can accept the above queries nor compute the answer. At
the most, dedicated systems support specific multi-domain
compositions (e.g., Google-local locates information such as
restaurants and hotels upon geographic maps).
This paper presents an overall framework for multi-domain
queries on the Web. We address the following problems: (a)
expressing multi-domain queries with an abstract formalism,
(b) separating the treatment of "search" services within the
model, by highlighting their differences from "exact" Web
services, (c) explaining how the same query can be mapped
to multiple "query plans", i.e., a well-defined scheduling of
service invocations, possibly in parallel, which complies with
their access limitations and preserves the ranking order in
which search services return results; (d) introducing
cross-domain joins as first-class operations within plans; (e)
evaluating the query plans against several cost metrics so as to
choose the most promising one for execution. This framework
adapts to a variety of application contexts, ranging
from end-user-oriented mash-up scenarios up to complex
application integration scenarios
Visual exploration and retrieval of XML document collections with the generic system X2
This article reports on the XML retrieval system X2 which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically.
After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2 which distinguishes it from other visual query systems for XML is that it supports various degrees of detailedness in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed
Ontology-Based Data Access and Integration
An ontology-based data integration (OBDI) system is an information management system consisting of three components: an ontology, a set of data sources, and the mapping between the two. The ontology is a conceptual, formal description of the domain of interest to a given organization (or a community of users), expressed in terms of relevant concepts, attributes of concepts, relationships between concepts, and logical assertions characterizing the domain knowledge. The data sources are the repositories accessible by the organization where data concerning the domain are stored. In the general case, such repositories are numerous, heterogeneous, each one managed and maintained independently from the others. The mapping is a precise specification of the correspondence between the data contained in the data sources and the elements of the ontology. The main purpose of an OBDI system is to allow information consumers to query the data using the elements in the ontology as predicates.
In the special case where the organization manages a single data source, the term ontology-based data access (OBDA) system is used
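The three components described above (ontology, data sources, mapping) can be sketched in a few lines. Everything in this example is hypothetical: the two sources, the predicate names `Employee`, `hasName`, and `worksIn`, and the helper functions are invented for illustration and do not correspond to any particular OBDI system or mapping language.

```python
# Two hypothetical, independently maintained data sources with different shapes.
hr_rows = [
    {"emp_id": 1, "full_name": "Ada"},
    {"emp_id": 2, "full_name": "Alan"},
]
payroll_records = [("1", "Research"), ("2", "Engineering")]

# Hypothetical mapping: each ontology predicate is defined by a function
# that derives its tuples from the underlying sources.
MAPPING = {
    "Employee": lambda: {(f"emp/{r['emp_id']}",) for r in hr_rows},
    "hasName": lambda: {(f"emp/{r['emp_id']}", r["full_name"]) for r in hr_rows},
    "worksIn": lambda: {(f"emp/{eid}", dept) for eid, dept in payroll_records},
}


def facts(predicate):
    """Return the tuples of an ontology predicate by evaluating its mapping."""
    return MAPPING[predicate]()


def names_in_department(dept):
    """A query phrased only in ontology terms: worksIn joined with hasName."""
    members = {e for e, d in facts("worksIn") if d == dept}
    return sorted(name for e, name in facts("hasName") if e in members)
```

The query function never touches `hr_rows` or `payroll_records` directly; it sees only the ontology predicates, which is the point of the mapping layer. With a single source behind the mapping, the same sketch would illustrate the ontology-based data access case.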
- …