
    The CVS algorithm for view synchronization in evolvable large-scale information systems


    Using Complex Substitution Strategies for View Synchronization

    Large-scale information systems typically contain autonomous information sources (ISs) that dynamically modify their content interfaces as well as their query services, regardless of the data warehouses (views) built on top of them. Current view technology fails to provide adaptation techniques for such changes, supporting only static views in the sense that views become undefined when ISs undergo capability changes. We propose to address this new view evolution problem, which we call view synchronization, by allowing view definitions to be dynamically evolved when they become undefined. The foundations of our approach to view synchronization include the Evolvable-SQL view definition language (E-SQL), the model for information source description (MISD), and the concept of legal view rewritings. In this paper we introduce the concept of the strongest synch-equivalent view definition, which explicitly defines the evolution semantics associated with an E-SQL view definition. We also propose a strategy, with proofs of correctness, for transforming any user-specified E-SQL view definition into the strongest E-SQL query. Finally, we present the Complex View Synchronization (CVS) algorithm, which fully exploits the constraints defined in MISD by allowing relation substitution to be done by a sequence of joins among candidate relations. Examples illustrating this multi-step approach are given throughout the paper
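    The CVS algorithm itself is not reproduced in this listing. As a rough illustration of the core idea, substituting a lost relation by a sequence of joins among candidate relations, the following minimal sketch greedily covers the attributes the lost relation supplied; all relation and attribute names are invented for the example:

    ```python
    # Toy sketch of relation substitution (not the actual CVS algorithm):
    # when a base relation disappears, pick a join sequence of candidate
    # relations whose combined attributes cover the lost attributes.

    def substitute_relation(lost_attrs, candidates):
        """candidates: list of (relation_name, set_of_attributes).
        Returns a list of relation names to join, or None if no cover exists."""
        target = set(lost_attrs)
        chosen, covered = [], set()
        for name, attrs in candidates:
            gain = (attrs & target) - covered  # new lost attributes this relation adds
            if gain:
                chosen.append(name)
                covered |= gain
                if covered == target:
                    return chosen  # join these relations in sequence
        return None  # view cannot be synchronized from these candidates

    plan = substitute_relation(
        ["patient_id", "diagnosis"],
        [("Admissions", {"patient_id", "ward"}),
         ("Records", {"patient_id", "diagnosis"})],
    )
    ```

    A real synchronizer would additionally check the MISD constraints (join conditions, containment) before accepting a substitution; the greedy cover above only illustrates the multi-step join idea.
    
    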

    Web services synchronization health care application

    With the advance of Web Services technologies and the emergence of Web Services into the information space, tremendous opportunities for empowering users and organizations appear in various application domains, including electronic commerce, travel, intelligence information gathering and analysis, health care, and digital government. In fact, Web services appear to be a solution for integrating distributed, autonomous and heterogeneous information sources. However, as Web services evolve in a dynamic environment, the Internet, many changes can occur and affect them. A Web service is affected when one or more of its associated information sources is affected by schema changes. Changes can alter not only the contents of the information sources but also their schemas, which may render Web services partially or totally undefined. In this paper, we propose a solution for integrating information sources into Web services. We then tackle the Web service synchronization problem by substituting the affected information sources. Our work is illustrated with a healthcare case study. (18 pages, 12 figures)

    Synchronization of XPath Views


    Integration of Heterogeneous Databases: Discovery of Meta-Information and Maintenance of Schema-Restructuring Views

    In today's networked world, information is widely distributed across many independent databases in heterogeneous formats. Integrating such information is a difficult task and has been addressed by several projects. However, previous integration solutions, such as the EVE-Project, have several shortcomings. Database contents and structure change frequently, and users often have incomplete information about the data content and structure of the databases they use. When information from several such insufficiently described sources is to be extracted and integrated, two problems have to be solved: how can we discover the structure, contents, and interrelationships of unknown databases, and how can we provide durable integration views over several such databases? In this dissertation, we have developed solutions for these key problems in information integration. The first part of the dissertation addresses the fact that knowledge about the interrelationships between databases is essential for any attempt at solving the information integration problem. We present an algorithm called FIND2, based on the clique-finding problem in graphs and k-uniform hypergraphs, to discover redundancy relationships between two relations. Furthermore, the algorithm is enhanced by heuristics that significantly reduce the search space when necessary. Extensive experimental studies of the algorithm, both with and without heuristics, illustrate its effectiveness on a variety of real-world data sets. The second part of the dissertation addresses the durable view problem and presents the first algorithm for incremental view maintenance in schema-restructuring views. Such views are essential for the integration of heterogeneous databases. They are typically defined in schema-restructuring query languages like SchemaSQL, which can transform schema into data and vice versa, making traditional view maintenance based on differential queries impossible. Based on an existing algebra for SchemaSQL, we present an update propagation algorithm that propagates updates along the query algebra tree, and we prove its correctness. We also propose optimizations to our algorithm and present experimental results showing its benefits over view recomputation
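    FIND2 and its heuristics are not reproduced here, but the clique-finding core it builds on can be sketched minimally: treat candidate attribute correspondences between two relations as graph nodes, connect pairwise-consistent correspondences, and read maximal cliques as candidate redundancy relationships. The graph below is an invented toy example:

    ```python
    # Minimal Bron-Kerbosch maximal-clique enumeration (no pivoting),
    # illustrating the graph-theoretic core behind FIND2-style discovery.
    # graph: dict mapping each node to the set of its neighbors.

    def bron_kerbosch(graph, r=frozenset(), p=None, x=frozenset()):
        """Yield every maximal clique of `graph` as a frozenset of nodes."""
        if p is None:
            p = frozenset(graph)
        if not p and not x:
            yield r  # no way to extend: r is a maximal clique
        for v in list(p):
            yield from bron_kerbosch(graph, r | {v}, p & graph[v], x & graph[v])
            p = p - {v}   # v fully processed: exclude it from further search
            x = x | {v}

    # Correspondences a, b, c are pairwise consistent; d conflicts with all.
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b"},
        "d": set(),
    }
    found = set(bron_kerbosch(graph))
    ```

    On this toy graph the maximal cliques are {a, b, c} and the isolated {d}; FIND2's contribution lies in the hypergraph generalization and the search-space heuristics, which this sketch omits.
    
    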

    Handling source quality changes in data integration systems

    Data Integration Systems (DIS) integrate information from a set of heterogeneous and autonomous data sources and provide that information to a set of user views. We consider a system in which quality properties are taken into account: the sources hold the actual values of the quality properties, while the integrated system holds the required values of these properties. In such a system, given the potentially large number of sources and their autonomy, a new problem appears: changes in the quality of the sources. The actual values at the source elements can change frequently and unpredictably. We are interested in the consequences that changes in source quality may have on the overall quality of the system, and even on the DIS schema and the way its information is processed. We analyze these consequences based on the different existing approaches for handling changes in source schemas in systems of this kind. In addition, we study two properties in particular, freshness and accuracy, and define strategies for handling changes in these properties

    Consciousness in Cognitive Architectures. A Principled Analysis of RCS, Soar and ACT-R

    This report analyses the applicability of the principles of consciousness developed in the ASys project to three of the most relevant cognitive architectures. This is done in relation to their applicability to building integrated control systems and by studying their support for general mechanisms of real-time consciousness. To analyse these architectures, the ASys Framework is employed: a conceptual framework based on an extension of the General Systems Theory (GST) for cognitive autonomous systems. General qualitative evaluation criteria for cognitive architectures are established based upon: a) requirements for a cognitive architecture, b) the theoretical framework based on the GST, and c) core design principles for integrated cognitive conscious control systems

    Security architecture methodology for large net-centric systems

    This thesis describes an over-arching security architecture methodology for large network-enabled systems that can be scaled down for smaller network-centric operations such as those present at the University of Missouri-Rolla. By leveraging the five elements of the proposed security architecture, namely security policy & standards, security risk management, security auditing, security federation, and security management, and by addressing the specific needs of UMR, the methodology was used to identify areas of improvement for UMR --Abstract, page iii

    Efficient Incremental View Maintenance for Data Warehousing

    Data warehousing and on-line analytical processing (OLAP) are essential elements of decision support applications. Since most OLAP queries are complex and are often executed over huge volumes of data, the solution in practice is to employ materialized views to improve query performance. One important issue in utilizing materialized views is maintaining view consistency upon source changes. However, most prior work focused on simple SQL views with distributive aggregate functions, such as SUM and COUNT. This dissertation considers broader types of views than previous work. First, we study views with complex aggregate functions such as variance and regression; such statistical functions are of great importance in practice. We propose a workarea function model and design a generic framework to tackle incremental view maintenance and answering queries using views for such functions. We have implemented this approach in a prototype system of IBM DB2. An extensive performance study shows significant performance gains from our techniques. Second, we consider materialized views with PIVOT and UNPIVOT operators. Such operators are widely used in OLAP applications and for querying sparse datasets. We demonstrate that the efficient maintenance of views with PIVOT and UNPIVOT operators requires more generalized operators, called GPIVOT and GUNPIVOT. We formally define and prove the query rewriting rules and propagation rules for these operators. We also design a novel view maintenance framework for applying these rules to obtain an efficient maintenance plan. Extensive performance evaluations reveal the effectiveness of our techniques. Third, materialized views are often integrated from multiple data sources. Due to source autonomy and dynamicity, concurrent updates may occur during view maintenance. We propose a generic concurrency control framework to resolve such maintenance anomalies. This solution extends previous work in that it resolves the anomalies under both source data and schema changes, and thus achieves full source autonomy. We have implemented this technique in a data warehouse prototype developed at WPI. An extensive performance study shows that our techniques add little overhead to existing concurrent data update processing techniques while enabling this new functionality
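    The dissertation's exact workarea function model is not reproduced in this listing, but the underlying trick for non-distributive aggregates like variance can be sketched simply: keep a small set of sufficient statistics (count, sum, sum of squares) as the "workarea", so inserts and deletes update the view without rescanning the base data:

    ```python
    # Illustrative sketch, not the dissertation's implementation: maintain a
    # population-variance view incrementally from sufficient statistics.

    class VarianceView:
        def __init__(self):
            self.n = 0     # count
            self.s = 0.0   # running sum
            self.q = 0.0   # running sum of squares

        def insert(self, x):
            self.n += 1; self.s += x; self.q += x * x

        def delete(self, x):
            self.n -= 1; self.s -= x; self.q -= x * x

        def variance(self):
            """Population variance: E[x^2] - E[x]^2, or None if empty."""
            if self.n == 0:
                return None
            mean = self.s / self.n
            return self.q / self.n - mean * mean

    v = VarianceView()
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        v.insert(x)
    ```

    Each source change touches only three scalars, which is what makes incremental maintenance cheap compared with recomputation; a production system would also guard against the numerical cancellation this naive formula can suffer on large values.
    
    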

    Data quality maintenance in Data Integration Systems

    A Data Integration System (DIS) is an information system that integrates data from a set of heterogeneous and autonomous information sources and provides it to users. Quality in these systems consists of various factors that are measured in the data; among the usually considered ones are completeness, accuracy, accessibility, freshness, and availability. In a DIS, quality factors are associated with the sources, with the extracted and transformed information, and with the information provided by the DIS to the user. At the same time, the user can pose quality requirements associated with his data requirements. DIS quality is considered better the closer it is to the user quality requirements. DIS quality depends on source data quality, on data transformations, and on the quality required by users; therefore, DIS quality is a property that varies as a function of these three other properties. The general goal of this thesis is to provide mechanisms for maintaining DIS quality at a level that satisfies the user quality requirements, while minimizing the modifications to the system generated by quality changes. The proposal of this thesis allows constructing and maintaining a DIS that is tolerant to quality changes. This means that the DIS is constructed taking into account predictions of quality behavior, such that if changes occur according to these predictions the system is not affected by them at all. These predictions are provided by models of the quality behavior of DIS data, which must be kept up to date. With this strategy, the DIS is affected only when the quality behavior models change, instead of being affected each time there is a quality variation in the system. The thesis takes a probabilistic approach, which allows modeling the behavior of the quality factors at the sources and at the DIS, allows users to state flexible quality requirements (using probabilities), and provides tools, such as certainty and mathematical expectation, that help decide which quality changes are relevant to DIS quality. The probabilistic models are monitored in order to detect source quality changes, a strategy that detects changes in quality behavior and not only punctual quality changes. We also propose to monitor other DIS properties that affect its quality and, for each such change, to decide whether it affects the behavior of DIS quality, taking the DIS quality models into account. Finally, the probabilistic approach is also applied when determining actions to take in order to improve DIS quality. For the interpretation of the DIS situation we propose to use statistics, which include, in particular, the history of the quality models
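    The thesis's actual quality models are not reproduced here. As a hypothetical sketch of the probabilistic idea, suppose a source's data age is modeled as exponentially distributed with a measured mean, and a user states a flexible freshness requirement as "age at most some bound, with probability at least p"; the check then compares the model, not a single stale reading, against the requirement:

    ```python
    import math

    # Hypothetical probabilistic freshness check (model and names invented):
    # data age ~ Exponential with the given mean, so
    # P(age <= bound) = 1 - exp(-bound / mean_age).
    # A requirement is violated only when the *model* fails it, matching the
    # thesis's strategy of reacting to quality-behavior changes, not to
    # individual quality fluctuations.

    def meets_freshness(mean_age, bound, p_required):
        """True if P(age <= bound) under Exponential(mean_age) meets p_required."""
        p_actual = 1.0 - math.exp(-bound / mean_age)
        return p_actual >= p_required

    # Requirement: age <= 10 minutes with probability >= 0.9.
    ok = meets_freshness(mean_age=4.0, bound=10.0, p_required=0.9)
    ```

    With a mean age of 4 minutes the requirement holds (P ≈ 0.92); if monitoring detects the mean drifting to 8 minutes, the same check fails (P ≈ 0.71) and only then would the DIS be reconfigured.
    
    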