
    Workshop on Database Programming Languages

    These are the revised proceedings of the Workshop on Database Programming Languages held in Roscoff, Finistère, France, in September 1987. The last few years have seen enormous activity in the development of new programming languages and new programming environments for databases. The purpose of the workshop was to bring together researchers from both databases and programming languages to discuss recent developments in the two areas, in the hope of overcoming some of the obstacles that appear to prevent the construction of a uniform database programming environment. The workshop, which followed a previous workshop held in Appin, Scotland in 1985, was extremely successful. The organizers were delighted with both the quality and the volume of the submissions for this meeting, and it was regrettable that more papers could not be accepted. Both the stimulating discussions and the excellent food and scenery of the Brittany coast made the meeting thoroughly enjoyable. There were three main foci for this workshop: type systems suitable for databases (especially object-oriented and complex-object databases), the representation and manipulation of persistent structures, and extensions to deductive databases that allow for more general and flexible programming. Many of the papers describe recent results or work in progress and are indicative of the latest research trends in database programming languages. The organizers are extremely grateful for the financial support given by CRAI (Italy), Altaïr (France) and AT&T (USA). We would also like to acknowledge the organizational help provided by Florence Deshors, Hélène Gans and Pauline Turcaud of Altaïr, and by Karen Carter of the University of Pennsylvania.

    On the complexity of queries with intersection joins

    This thesis studies the complexity of join processing on interval data. It defines a class of queries called Conjunctive Queries with Intersection Joins (IJQs). An IJQ is a query in which the variables range over both scalars and intervals with real-valued endpoints. The joins are expressed through intersection predicates; an intersection predicate over a multi-set consisting of both scalars and intervals is true if the elements of the multi-set intersect, and false otherwise. The class of IJQs includes queries that are often asked in practice. This thesis introduces techniques for obtaining reductions from the problem of evaluating IJQs to the problem of evaluating Conjunctive Queries with Equality Joins (CQs). The key idea is the rewriting of an intersection predicate over a set of intervals into an equivalent predicate with equality conditions. This rewriting is achieved by building a segment tree whose nodes hierarchically encode intervals using bit-strings. Given a multi-set of intervals, their intersection is captured by certain equality conditions on the encodings of the nodes. It follows that the problem of evaluating an IJQ on an input database containing intervals can be reduced to the problem of evaluating a union of CQs on a database containing scalars, and vice versa. Such reductions lead to upper and lower bounds for the data complexity of Boolean IJQs, given upper and lower bounds for the data complexity of Boolean CQs. The upper bounds are obtained using a reduction called the forward reduction, which reduces any Boolean IJQ to a disjunction of Boolean CQs. The lower bounds are obtained by a reduction called the backward reduction, in which any Boolean CQ from the aforementioned disjunction is reduced to the input Boolean IJQ. Overall, the two findings suggest that a Boolean IJQ is as difficult as the most difficult Boolean CQ in the forward disjunction. Finally, this thesis identifies an interesting subclass of Boolean IJQs that admit quasi-linear time computation in data complexity; they are referred to as ι-acyclic IJQs.
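    The segment-tree encoding at the heart of these reductions can be made concrete. Below is a minimal Python sketch, not the thesis' implementation: intervals over a small integer domain are covered by canonical segment-tree nodes encoded as bit-strings, and two intervals intersect exactly when one canonical code is a prefix of the other, which becomes an equality condition after truncating the longer code. The domain size and all names are illustrative assumptions.

```python
import random

def canonical_nodes(lo, hi, node_lo=0, node_hi=16, code=""):
    """Bit-string codes of the segment-tree nodes that exactly cover the
    half-open integer interval [lo, hi) within the node [node_lo, node_hi)."""
    if lo <= node_lo and node_hi <= hi:
        return [code]                       # node lies fully inside the interval
    if hi <= node_lo or node_hi <= lo:
        return []                           # node is disjoint from the interval
    mid = (node_lo + node_hi) // 2
    return (canonical_nodes(lo, hi, node_lo, mid, code + "0")
            + canonical_nodes(lo, hi, mid, node_hi, code + "1"))

def intersect_by_encoding(a, b):
    """Two intervals intersect iff a canonical node of one is an
    ancestor-or-self of a canonical node of the other, i.e. one bit-string
    equals a prefix (truncation) of the other -- an equality condition."""
    ca, cb = canonical_nodes(*a), canonical_nodes(*b)
    return any(x == y[:len(x)] or y == x[:len(y)] for x in ca for y in cb)

# Sanity check against the direct definition of interval intersection.
for _ in range(1000):
    a = tuple(sorted(random.sample(range(17), 2)))
    b = tuple(sorted(random.sample(range(17), 2)))
    assert intersect_by_encoding(a, b) == (a[0] < b[1] and b[0] < a[1])
```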

    The Complexity of Boolean Conjunctive Queries with Intersection Joins

    Intersection joins over interval data are relevant in spatial and temporal data settings. A set of intervals join if their intersection is non-empty. In the case of point intervals, the intersection join becomes the standard equality join. We establish the complexity of Boolean conjunctive queries with intersection joins by a many-one equivalence to disjunctions of Boolean conjunctive queries with equality joins. The complexity of any query with intersection joins is that of the hardest query with equality joins in the disjunction exhibited by our equivalence. This is captured by a new width measure called the IJ-width. We also introduce a new syntactic notion of acyclicity called iota-acyclicity to characterise the class of Boolean queries with intersection joins that admit linear-time computation modulo a poly-logarithmic factor in the data size. Iota-acyclicity is for intersection joins what alpha-acyclicity is for equality joins. It sits strictly between gamma-acyclicity and Berge-acyclicity. The intersection-join queries that are not iota-acyclic are at least as hard as the Boolean triangle query with equality joins, which is widely considered not computable in linear time.
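    For concreteness, the join predicate itself is easy to state: a set of closed intervals joins exactly when the largest left endpoint does not exceed the smallest right endpoint, and point intervals recover the standard equality join. A minimal Python sketch with illustrative names:

```python
def intervals_join(intervals):
    """intervals: iterable of (left, right) closed intervals.
    Their common intersection is non-empty exactly when the largest left
    endpoint does not exceed the smallest right endpoint."""
    intervals = list(intervals)
    return max(lo for lo, _ in intervals) <= min(hi for _, hi in intervals)

assert intervals_join([(1, 5), (4, 9), (3, 6)])   # all three share [4, 5]
assert not intervals_join([(1, 2), (3, 4)])       # disjoint intervals
assert intervals_join([(2, 2), (2, 2)])           # point intervals: equality join
```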

    Statistical modelling of species distributions using presence-only data: A semantic and graphical approach using the tree of life

    Understanding the mechanisms that determine and differentiate the establishment of organisms in space is an old and fundamental question in ecology. The emergence of life’s spatial patterns is guided by the confluence of three forces: environmental filtering, which unbalances the probability of establishment for organisms given their evolutionary adaptations to local environmental conditions; biological interactions, which restrict establishment according to the presence (or absence) of other organisms; and the diversification of organisms’ strategies (traits) for migrating and adapting to changing environments. The main hypothesis in this research is that the accumulated knowledge of biodiversity occurrences, the species taxonomic classification and geospatial environmental data can be integrated into a unified modelling framework to characterise the joint effect of these three forces and thus contribute more general, accurate and statistically sound species distribution models (SDMs). The first part of this thesis describes the design and implementation of a knowledge engine capable of synthesising and integrating environmental geospatial data, taxonomic relationships and species occurrences. It uses semantic queries to instantiate complex data structures, represented as networks of concepts (knowledge graphs). Local taxonomic trees, distributed over a hierarchical spatial system of regular lattices, are used as knowledge graphs to perform data synthesis, geoprocessing and transformations. The implementation uses efficient call-by-need evaluations that facilitate spatial and scale analysis on large datasets. The second part of the thesis corresponds to the statistical specification and implementation of two modelling frameworks for species distribution models, one for single species and another for multiple species. These models are designed for presence-only observations obtained from the knowledge engine. Their common specification is that presence-only observations are the joint effect of two latent processes: one that defines the species’ presence (ecological suitability) and another that defines the probability of being sampled (sampling effort). The single-species framework uses an informative sample, chosen by the modeller, to account for the sampling effort. Three modelling strategies are proposed for accounting for the joint effect of the ecological and sampling processes (independent processes, a common spatial random effect, and correlated processes). The three models were compared to the maximum entropy model (MaxEnt), a popular algorithm used in SDMs. In all cases, at least one model showed better predictive performance than MaxEnt. The multi-species modelling framework generalises the single-species framework to a joint species distribution model for presence-only data. The specification is a multilevel hierarchical logistic model with a single spatial random effect common to all species of interest. The sampling effort is modelled with a complementary sample, obtained from observations complementary to the taxa of interest within a regional taxonomic tree. The model was tested against simulated data; all simulated parameters were covered by the credible intervals of the posterior sampling. A case study in eastern Mexico is presented as an application of the model, and the results obtained were consistent with macroecological theory. The model proved effective in removing the bias and noise introduced by the sampling effort; this effect was particularly pronounced in urban areas, where sampling intensity is greatest. The research presented here provides an interdisciplinary approach for modelling joint species distributions, aided by the automated selection of biological, spatial and environmental context.
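    Schematically, the shared two-latent-process idea can be written as follows. This is an illustrative sketch with invented symbols, not the thesis’ exact specification:

```latex
% Illustrative only: the symbols below are invented, not the thesis' notation.
% A presence record Y(s) = 1 at site s requires the site to be both
% ecologically suitable and actually sampled:
\[
  Y(s) \;=\; Z_{\mathrm{eco}}(s)\, Z_{\mathrm{samp}}(s),
\]
\[
  \Pr\big(Z_{\mathrm{eco}}(s) = 1\big)
    = \operatorname{logit}^{-1}\!\big(x(s)^{\top}\beta + w(s)\big),
  \qquad
  \Pr\big(Z_{\mathrm{samp}}(s) = 1\big)
    = \operatorname{logit}^{-1}\!\big(g(s)^{\top}\gamma + \tilde{w}(s)\big),
\]
% where x(s) holds environmental covariates, g(s) sampling-effort covariates,
% and the spatial random effects w and \tilde{w} are taken as independent,
% identical (shared), or correlated -- the three strategies compared above.
```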

    An overview of decision table literature 1982-1995.

    This report gives an overview of the literature on decision tables over the past 15 years. As much as possible, for each reference an author-supplied abstract, a number of keywords and a classification are provided. In some cases our own comments are added. The purpose of these comments is to show where, how and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication) and the language of the document. After a description of the scope of the review, classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification and comments.
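    For readers unfamiliar with the subject of the surveyed literature: a decision table maps combinations of condition outcomes to prescribed actions. The following small Python sketch is an invented illustration, not an example taken from the report:

```python
# Conditions: (customer_is_member, order_over_100); each entry is one rule
# mapping a combination of condition outcomes to a list of actions.
DECISION_TABLE = {
    (True,  True):  ["apply_discount", "free_shipping"],
    (True,  False): ["apply_discount"],
    (False, True):  ["free_shipping"],
    (False, False): [],
}

def actions(is_member, over_100):
    """Look up the actions the table prescribes for one case."""
    return DECISION_TABLE[(is_member, over_100)]

assert actions(True, True) == ["apply_discount", "free_shipping"]
assert actions(False, False) == []
```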

    An Adaptive Integration Architecture for Software Reuse

    The problem of building large, reliable software systems in a controlled, cost-effective way, the so-called software crisis problem, is one of computer science's great challenges. From the very outset of computing as a science, software reuse has been touted as a means to overcome the software crisis. Over three decades later, the software community is still grappling with the problem of building large, reliable software systems in a controlled, cost-effective way; the software crisis problem is alive and well. Today, many computer scientists still regard software reuse as a very powerful vehicle for improving the practice of software engineering. The advantage of amortizing software development cost through reuse continues to be a major objective in the art of building software, even though the tools, methods, languages, and overall understanding of software engineering have changed significantly over the years. Our work is primarily focused on the development of an Adaptive Application Integration Architecture Framework. Without good integration tools and techniques, reuse is difficult and will probably not happen to any significant degree. In the development of the adaptive integration architecture framework, the primary enabling concept is object-oriented design supported by the Unified Modeling Language. The concepts of software architecture, design patterns, and abstract data views are used in a structured and disciplined manner to establish a generic framework. This framework is applied to solve the Enterprise Application Integration (EAI) problem in the telecommunications operations support system (OSS) enterprise marketplace. The proposed adaptive application integration architecture framework facilitates application reusability and flexible business process re-engineering. The architecture addresses the need for modern businesses to continuously redefine themselves to address changing market conditions in an increasingly competitive environment. We have developed a number of Enterprise Application Integration design patterns to enable the implementation of an EAI framework in a definite and repeatable manner. The design patterns allow for the integration of commercial off-the-shelf applications into a unified enterprise framework, facilitating true application portfolio interoperability. The notion of treating application services as infrastructure services and using business processes to combine them arbitrarily provides a natural way of thinking about adaptable and reusable software systems. We present a mathematical formalism for the specification of design patterns. This specification constitutes an extension of the basic concepts from many-sorted algebra; in particular, the notion of signature is extended to that of a vector consisting of a set of linearly independent signatures. The approach can be used to reason about various properties, including effort for component reuse, and to facilitate complex large-scale software development by providing the developer with design alternatives and support for automatic program verification.
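    The core integration idea described above, wrapping off-the-shelf applications behind a uniform service interface so that business processes can compose them arbitrarily, can be sketched with a classic adapter design pattern. The following Python sketch is an illustration only; all class and method names are invented, and it is not the thesis' framework:

```python
from abc import ABC, abstractmethod

class ApplicationService(ABC):
    """Uniform interface every integrated application must expose."""
    @abstractmethod
    def invoke(self, request: dict) -> dict: ...

class BillingAdapter(ApplicationService):
    """Adapts a hypothetical off-the-shelf billing system to the interface."""
    def invoke(self, request: dict) -> dict:
        # A real adapter would translate to/from the legacy system's format.
        return {"step": "billing", "account": request["account"], "ok": True}

class ProvisioningAdapter(ApplicationService):
    """Adapts a hypothetical OSS provisioning component."""
    def invoke(self, request: dict) -> dict:
        return {"step": "provisioning", "account": request["account"], "ok": True}

def business_process(steps, request):
    """A business process is an ordered composition of services, so processes
    can be re-engineered without touching the application adapters."""
    return [service.invoke(request) for service in steps]

results = business_process([ProvisioningAdapter(), BillingAdapter()],
                           {"account": "42"})
assert all(r["ok"] for r in results)
```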

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates have led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug-like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding-site comparison tools were developed, competing with the state of the art over benchmark datasets. The models have the ability to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships; it has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.
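    As an illustration of the hierarchical scaffold relationships that ScaffoldGraph analyses, the following sketch extracts a Bemis-Murcko scaffold and its generic parent using RDKit. This is not the ScaffoldGraph or LigNFam API, and the example molecule is arbitrary:

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

mol = Chem.MolFromSmiles("Cc1ccc2nc(N)sc2c1")           # an arbitrary small molecule
scaffold = MurckoScaffold.GetScaffoldForMol(mol)        # strip side chains, keep rings
generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)  # abstract away atom/bond types

print(Chem.MolToSmiles(scaffold))  # the molecule's Murcko scaffold
print(Chem.MolToSmiles(generic))   # its more general parent in a scaffold hierarchy
```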