
    Checking integrity constraints - how it differs in centralized, distributed and parallel databases

    An important aim of a database system is to guarantee database consistency, which means that the data contained in a database is both accurate and valid. Integrity constraints represent knowledge about data with which a database must be consistent. The process of checking constraints to ensure that update operations or transactions which alter the database preserve its consistency has proved extremely difficult to implement, particularly in distributed and parallel databases. In distributed databases, the aim of constraint checking is to reduce the amount of data that needs to be accessed, the number of sites involved, and the amount of data transferred across the network. In parallel databases, the focus is on the total execution time taken to check the constraints. This paper highlights the differences between centralized, distributed, and parallel databases with respect to constraint checking.
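    To make the distributed trade-off concrete, here is a minimal, self-contained sketch of the idea that a constraint check should touch as little remote data as possible. The schema, site names, and helper functions are all hypothetical illustrations, not taken from the paper:

```python
# Illustrative only: a domain constraint can always be checked at the
# updating site, while a key constraint may need remote data -- the
# expensive case that distributed checking strategies try to avoid.

REMOTE_FRAGMENTS = {            # stand-in for data stored at other sites
    "site_B": {101, 102},
    "site_C": {201},
}

def fetch_ids(site):
    """Stands in for a network round trip to another site."""
    return REMOTE_FRAGMENTS[site]

def check_salary_domain(row):
    """Domain constraint: checkable locally, no network access needed."""
    return 0 <= row["salary"] <= 200_000

def check_unique_id(row, local_ids):
    """Key constraint: test the local fragment first; contact remote
    sites only when the local test cannot already refute the update."""
    if row["emp_id"] in local_ids:
        return False                    # violation found locally, for free
    return all(row["emp_id"] not in fetch_ids(s) for s in REMOTE_FRAGMENTS)

new_row = {"emp_id": 301, "salary": 55_000}
ok = check_salary_domain(new_row) and check_unique_id(new_row, {299, 300})
print("update accepted" if ok else "update rejected")
```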

    Efficient Multi-way Theta-Join Processing Using MapReduce

    Multi-way Theta-join queries are powerful in describing complex relations and are therefore widely employed in practice. However, existing solutions from traditional distributed and parallel databases for multi-way Theta-join queries cannot be easily extended to fit a shared-nothing distributed computing paradigm, which has proven able to support OLAP applications over immense data volumes. In this work, we study the problem of efficiently processing multi-way Theta-join queries using MapReduce from a cost-effective perspective. Although there have been some works using the (key, value) pair-based programming model to support join operations, efficient processing of multi-way Theta-join queries has never been fully explored. The substantial challenge lies in, given a number of processing units (that can run Map or Reduce tasks), mapping a multi-way Theta-join query to a number of MapReduce jobs and having them executed in a well-scheduled sequence, such that the total processing time span is minimized. Our solution mainly includes two parts: 1) cost metrics for both a single MapReduce job and a number of MapReduce jobs executed in a certain order; 2) the efficient execution of a chain-typed Theta-join with only one MapReduce job. Compared with the query evaluation strategy proposed in [23] and the widely adopted Pig Latin and Hive SQL solutions, our method achieves a significant improvement in join processing efficiency.
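    The difficulty with theta-joins in the (key, value) model is that a non-equi predicate gives no natural partitioning key. A simplified sketch of one common workaround, matrix-based reducer partitioning (in the spirit of the 1-Bucket-Theta scheme), is shown below as a pure-Python stand-in for MapReduce; the relations, data, and the 2x2 reducer grid are invented for illustration and this is not the paper's algorithm:

```python
# R tuples are replicated across one random row of a reducer grid and
# S tuples across one random column, so every (r, s) pair meets in
# exactly one reducer cell, where the theta predicate is evaluated.

import random
from collections import defaultdict
from itertools import product

R = [("r", x) for x in (1, 5, 9)]       # relation R, join attribute x
S = [("s", y) for y in (2, 5, 8)]       # relation S, join attribute y
ROWS, COLS = 2, 2                        # reducer grid: ROWS * COLS reducers

def map_tuple(tag, value):
    """Emit (cell, tuple) pairs covering one full row or column of the grid."""
    if tag == "r":
        i = random.randrange(ROWS)
        return [((i, j), (tag, value)) for j in range(COLS)]
    i = random.randrange(COLS)
    return [((j, i), (tag, value)) for j in range(ROWS)]

# shuffle: group mapped pairs by reducer cell
cells = defaultdict(list)
for tag, value in R + S:
    for key, kv in map_tuple(tag, value):
        cells[key].append(kv)

# reduce: apply the theta predicate (here: R.x < S.y) inside each cell
result = []
for pairs in cells.values():
    rs = [v for t, v in pairs if t == "r"]
    ss = [v for t, v in pairs if t == "s"]
    result.extend((x, y) for x, y in product(rs, ss) if x < y)

print(sorted(result))   # [(1, 2), (1, 5), (1, 8), (5, 8)]
```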

    Context modeling and constraints binding in web service business processes

    Context awareness is a principle used in pervasive service applications to enhance their flexibility and adaptability to changing conditions and dynamic environments. Ontologies provide a suitable framework for context modeling and reasoning. We develop a context model for executable business processes, captured as an ontology for the web services domain. A web service description is attached to a service context profile, which is bound to the context ontology. Context instances can be generated dynamically at service runtime and are bound to context constraint services. Constraint services facilitate both setting up constraint properties and constraint checkers, which determine the dynamic validity of context instances. Data collectors focus on capturing context instances. Runtime integration of both constraint services and data collectors permits the business process to achieve dynamic business goals.
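    As a hedged sketch of the constraint-service idea, the snippet below registers small checker functions against a service context profile and validates runtime context instances against them. All class, service, and property names are hypothetical; the paper's ontology-based model is considerably richer:

```python
# Illustrative sketch: a context profile binds a service description to
# constraint checkers; a data collector would supply context instances
# at runtime, and validation reports which constraints are violated.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ContextProfile:
    service: str
    checkers: list = field(default_factory=list)

    def add_constraint(self, name: str, check: Callable[[dict], bool]):
        self.checkers.append((name, check))

    def validate(self, context_instance: dict) -> list:
        """Return the names of constraints the instance violates."""
        return [name for name, check in self.checkers
                if not check(context_instance)]

profile = ContextProfile("ShippingQuoteService")
profile.add_constraint("response_time", lambda c: c["latency_ms"] <= 500)
profile.add_constraint("region_supported", lambda c: c["region"] in {"EU", "US"})

# a data collector would produce this instance at service runtime
instance = {"latency_ms": 640, "region": "EU"}
print(profile.validate(instance))   # ['response_time']
```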

    Dynamic integration of context model constraints in web service processes

    Autonomic Web service composition has been a challenging topic for some years. The context in which composition takes place determines essential aspects of the composition. A context model can provide meaningful composition information for service process composition. An ontology-based approach for context information integration is the basis of a constraint approach to dynamically integrate context validation into service processes. The dynamic integration of context constraints into an orchestrated service process is a necessary step towards autonomic service composition.

    Bioinformatics Databases: State of the Art and Research Perspectives

    Bioinformatics or computational biology, i.e. the application of mathematical and computer science methods to solving problems in molecular biology that require large-scale data, computation, and analysis, is a research area currently receiving considerable attention. Databases play an essential role in molecular biology and consequently in bioinformatics. Molecular biology data are often relatively cheap to produce, leading to a proliferation of databases: the number of bioinformatics databases accessible worldwide probably lies between 500 and 1,000. Not only molecular biology data, but also molecular biology literature and literature references are stored in databases. Bioinformatics databases are often very large (e.g. the sequence database GenBank contains more than 4 × 10^6 nucleotide sequences) and in general grow rapidly (e.g. about 8,000 abstracts are added every month to the literature database PubMed). Bioinformatics databases are heterogeneous in their data, in their data modeling paradigms, in their management systems, and in the data analysis tools they support. Furthermore, bioinformatics databases are often implemented, queried, updated, and managed using methods rarely applied to other databases. This presentation aims at introducing current bioinformatics databases, stressing the aspects in which they depart from conventional databases. A more detailed survey, upon which this presentation is based, can be found in [1].

    A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

    The overwhelmingly increasing amount of stored data has spurred researchers to seek methods for taking optimal advantage of it, most of which face a response-time problem caused by the enormous size of the data. Most solutions have suggested materialization as a favoured remedy. However, such a solution cannot attain real-time answers. In this paper we propose a framework illustrating the barriers, and suggested solutions, on the way to achieving real-time OLAP answers, which are widely used in decision support systems and data warehouses.
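    A small illustration, with invented data, of the tension the abstract points at: a materialized aggregate answers queries instantly but goes stale as new rows arrive, so real-time OLAP must fold updates in as they happen. The incremental maintenance shown here is one common remedy, not necessarily the framework the paper proposes:

```python
# Materialization precomputes an aggregate once; without incremental
# maintenance, every insert would silently invalidate the answer.

from collections import defaultdict

sales = [("EU", 100), ("US", 250), ("EU", 40)]

cube = defaultdict(int)          # materialized aggregate: sum per region
for region, amount in sales:
    cube[region] += amount
print(cube["EU"])                # 140 -- instant, but frozen at build time

def insert(row):
    sales.append(row)
    cube[row[0]] += row[1]       # incremental maintenance keeps it current

insert(("EU", 60))
print(cube["EU"])                # 200 -- still correct, no full rebuild
```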

    ADEPT2 - Next Generation Process Management Technology

    If current process management systems are to be applied to a broad spectrum of applications, they will have to be significantly improved with respect to their technological capabilities. In particular, in dynamic environments it must be possible to quickly implement and deploy new processes, to enable ad-hoc modifications of single process instances at runtime (e.g., to add, delete or shift process steps), and to support process schema evolution with instance migration, i.e., to propagate process schema changes to already running instances. These requirements must be met without affecting process consistency and while preserving the robustness of the process management system. In this paper we describe how these challenges have been addressed and solved in the ADEPT2 Process Management System. Our overall vision is to provide a next-generation process management technology that can be used in a variety of application domains.
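    A toy sketch of the instance-migration idea follows: a schema change may only be propagated to a running instance if that instance has not yet executed the steps the change touches, so that consistency is preserved. The API and the compliance test are simplified inventions; ADEPT2's actual correctness criteria are far more elaborate:

```python
# Illustrative only: migrate an instance to the new schema version only
# if none of the changed steps have already been executed.

from dataclasses import dataclass

@dataclass
class ProcessInstance:
    schema_version: int
    completed_steps: list            # step names already executed

def can_migrate(instance: ProcessInstance, changed_steps: set) -> bool:
    """Simplified compliance check for schema evolution."""
    return not changed_steps & set(instance.completed_steps)

change = {"credit_check"}                        # step modified in new schema
a = ProcessInstance(1, ["receive_order"])
b = ProcessInstance(1, ["receive_order", "credit_check"])

for inst in (a, b):
    if can_migrate(inst, change):
        inst.schema_version = 2                  # propagate schema change
print(a.schema_version, b.schema_version)        # 2 1 -- b keeps the old schema
```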