1,031 research outputs found

    On Resolving Semantic Heterogeneities and Deriving Constraints in Schema Integration

    Ph.D. (Doctor of Philosophy)

    Extending and inferring functional dependencies in schema transformation


    A Service Late Binding Enabled Solution for Data Integration from Autonomous and Evolving Databases

    Integrating data from autonomous, distributed and heterogeneous data sources to provide a unified vision is a common demand for many businesses. Since the data sources may evolve frequently to satisfy their own independent business needs, solutions which use hard-coded queries to integrate participating databases may incur high maintenance costs when evolution occurs. Thus a new solution which can handle database evolution with lower maintenance effort is required. This thesis presents such a solution: Service Late binding Enabled Data Integration (SLEDI), which is set into a framework modeling the essential processes of the data integration activity. It integrates schematically heterogeneous relational databases with decreased maintenance costs for handling database evolution. An algorithm named Information Provision Unit Describing (IPUD) is designed to describe each database as a set of Information Provision Units (IPUs). The IPUs are represented as Directed Acyclic Graph (DAG) structured data instead of hard-coded queries, and are further realized as data services. Hence data integration is achieved through service invocations. Furthermore, a set of processes is defined to handle database evolution by automatically identifying and modifying the IPUs affected by the evolution. An extensive evaluation based on a case study is presented. The results show that the schematic heterogeneities defined in this thesis can be resolved by IPUD, except for the relation isomorphism discrepancy. Ten out of thirteen types of schematic database evolution can be handled automatically by the evolution handling processes, as long as the evolution is represented in the designed data model. The computational cost of automatic evolution handling shows slow linear growth with the number of participating databases. Other characteristics addressed include SLEDI's scalability and its independence of application domain and database model.
The descriptive comparison with other data integration approaches shows that although the Data-as-a-Service approach may result in lower performance under some circumstances, it provides greater flexibility for integrating data from autonomous and evolving data sources.
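The core idea, representing each query as DAG-structured data rather than a hard-coded string so that evolution can be detected and handled automatically, can be sketched as follows. This is a minimal illustration only: the node structure, field names and the `affected_by` check are assumptions for demonstration, not the thesis's actual IPU data model.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One operator in an IPU's DAG (illustrative, not the SLEDI schema)."""
    op: str                                      # e.g. "scan", "project", "join"
    args: dict
    inputs: list = field(default_factory=list)   # child nodes (DAG edges)

def affected_by(node: Node, relation: str) -> bool:
    """Return True if any node in the DAG reads the given relation,
    i.e. the IPU must be revised when that relation evolves."""
    if node.op == "scan" and node.args.get("relation") == relation:
        return True
    return any(affected_by(child, relation) for child in node.inputs)

# A tiny IPU: project customer names out of a scan of the "customer" relation.
scan = Node("scan", {"relation": "customer"})
ipu = Node("project", {"columns": ["name"]}, inputs=[scan])

print(affected_by(ipu, "customer"))  # True: evolution of "customer" touches this IPU
print(affected_by(ipu, "order"))     # False: an unrelated relation
```

Because the query is data, an evolution-handling process can walk the DAG to find and rewrite affected units, which is exactly what a hard-coded query string does not allow.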

    Towards a management framework for data semantic conflicts : a financial applications perspective

    Includes bibliographical references (p. 23-25). By Raphael Yahalom and Stuart E. Madnick.

    MONIL Language, an Alternative for Data Integration

    Data integration is the process of retrieving, merging and storing data originating in heterogeneous data sources. The main problem facing data integration is the structural and semantic heterogeneity of the participating data. A central concern of computer science research communities is the development of semi-automatic tools that assist the user effectively in the data integration process. This paper introduces a programming language called MONIL as an alternative for integrating data through the design, storage and execution of integration programs. MONIL is based on the use of metadata, conversion functions, a meta-model of integration and a scheme of integration suggestions. MONIL offers the user a dedicated work environment with built-in semi-automatic tools that support a three-stage integration process. Keywords: data integration, integration language, databases, data warehouses, metadata.
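The combination of metadata-driven mappings and conversion functions that the abstract describes can be sketched in a few lines. The field names, the mapping table and the example conversions below are invented for illustration; they are not MONIL syntax or its actual meta-model.

```python
# Conversion functions keyed by (source field, target field) -- illustrative only.
conversions = {
    ("price_usd_cents", "price_usd"): lambda v: v / 100,
    ("fullname", "name"): lambda v: v.strip().title(),
}

# Metadata: how fields of a heterogeneous source map onto the target schema.
mapping = {
    "price_usd_cents": "price_usd",
    "fullname": "name",
}

def integrate(record: dict) -> dict:
    """Merge one source record into the target schema, applying the
    registered conversion function for each mapped field."""
    out = {}
    for src_field, value in record.items():
        tgt_field = mapping.get(src_field, src_field)
        convert = conversions.get((src_field, tgt_field), lambda v: v)
        out[tgt_field] = convert(value)
    return out

print(integrate({"price_usd_cents": 1999, "fullname": "  ada lovelace "}))
# → {'price_usd': 19.99, 'name': 'Ada Lovelace'}
```

The point of the design is that adding a new source means adding metadata and conversion entries, not rewriting the integration program.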

    Balance theory, unit relations, and attribution: The underlying integrity of Heiderian theory.

    Fritz Heider's (1958c) book The Psychology of Interpersonal Relations and the handful of articles preceding it (e.g., Heider, 1944; Heider, 1946; Heider & Simmel, 1944) provide the cornerstone, and a major part of the foundation, of research and theory in social perception. Two very influential theories in social psychology, the causal attribution and psychological balance theories, grew out of the ideas and analysis of this seminal work. Heider himself viewed these developments as one may view a mildly wayward child, with a mixture of pleasure and a sense of regret (Heider, 1983). Heider had considered his ideas to be all of a piece, a relatively unified and coherent theory of social perception. Subsequent researchers had taken smaller bites and developed midrange theories, slightly out of the context of Heider's other ideas. Part of this result may be laid at the feet of Heider himself: none of the articles, and not even the 1958 book, fully developed the ideas, their connections, or his larger vision. Before the publication of his influential book, Heider's best-known papers were two, one on causal attribution (Heider, 1944) and one on balance (Heider, 1954/1958b), both of which were available in the widely read Tagiuri and Petrullo (1958) volume on person perception.

    Semantic validation in spatio-temporal schema integration

    This thesis proposes to address the well-known database integration problem with a new method that combines functionality from database conceptual modeling techniques with functionality from logic-based reasoners. We elaborate a hybrid (modeling + validation) integration approach for spatio-temporal information integration at the schema level. The modeling part of our methodology is supported by the spatio-temporal conceptual model MADS, whereas the validation part of the integration process is delegated to description logic validation services. We thereby adhere to the principle that, rather than extending either formalism to cover all desirable functionality, a hybrid system in which the database component and the logic component cooperate, each performing the tasks for which it is best suited, is a viable solution for semantically rich information management. First, we develop a MADS-based flexible integration approach in which the integrated-schema designer has several viable ways to construct a final integrated schema. For different related schema elements we provide the designer with four general policies and with a set of structural solutions, or structural patterns, within each policy. To always guarantee an integrated solution, we provide a preservation policy with a multi-representation structural pattern. To state the inter-schema mappings, we elaborate a correspondence language with explicit spatial and temporal operators. Our correspondence language thus has three facets, structural, spatial, and temporal, allowing the designer to relate the thematic representation as well as the spatial and temporal features. With the inter-schema mappings, the designer can state correspondences between related populations and define the conditions that rule the matching at the instance level. These matching rules can then be used in query rewriting procedures or to match instances within the data integration process.
We associate a set of putative structural patterns with each type of population correspondence, providing the designer with a selection of patterns for flexible integrated-schema construction. Second, we enhance our integration method by employing the validation services of the description logic formalism. It is not guaranteed that the designer can state all the inter-schema mappings manually, or that they are all correct. We therefore add a validation phase to ensure the validity and completeness of the set of inter-schema mappings. Inter-schema mappings cannot be validated autonomously; they are validated against the data model and the schemas they link. Thus, to implement our validation approach, we translate the data model, the source schemas and the inter-schema mappings into a description logic formalism, preserving the spatial and temporal semantics of the MADS data model. Our modeling approach in description logic thereby ensures that the schema designer correctly defines spatial and temporal schema elements and inter-schema mappings. The added value of the complete translation (i.e., including the data model and the source schemas) is that we validate not only the inter-schema mappings but also the compliance of the source schemas with the data model, and infer implicit relationships within them. As a result of the validation procedure, the schema designer obtains a complete and valid set of inter-schema mappings and a set of valid (flexible) schematic patterns to apply in constructing an integrated schema that meets application requirements. To further our work, we model a framework in which a schema designer can follow our integration method and carry out the schema integration task in an assisted way. We design two models, a UML model and a SEAM model, of a system that provides the integration functionalities. The models describe a framework where several tools are employed together, each involved in the service it is best suited for.
We define the functionalities and the cooperation between the composing elements of the framework, and detail the logic of the integration process in a UML activity diagram and in a SEAM operation model.
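The idea of an inter-schema correspondence with structural and temporal facets that drives instance-level matching can be sketched as plain data plus a predicate. The population names, the `parcel_id` key and the interval-overlap operator below are hypothetical examples, not the MADS correspondence language itself.

```python
def intervals_overlap(a, b):
    """Temporal operator: do two (start, end) validity intervals overlap?"""
    return a[0] < b[1] and b[0] < a[1]

# One correspondence between related populations (names are illustrative).
correspondence = {
    "source_population": "LandParcel",
    "target_population": "Cadastre",
    "key": "parcel_id",                       # structural facet: matching attribute
    "temporal_condition": intervals_overlap,  # temporal facet
}

def instances_match(src, tgt, corr):
    """Instance-level matching rule: same key AND overlapping validity."""
    return (src[corr["key"]] == tgt[corr["key"]]
            and corr["temporal_condition"](src["valid"], tgt["valid"]))

src = {"parcel_id": 7, "valid": (2000, 2010)}
tgt = {"parcel_id": 7, "valid": (2005, 2015)}
print(instances_match(src, tgt, correspondence))  # True
```

In the thesis's setting such rules would feed query rewriting or instance matching; a spatial facet would add predicates like region overlap alongside the temporal one.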

    Developing ethnic identities in middle childhood

    This thesis reports an investigation into the development of ethnic identity during middle childhood. It commences with a literature review on ethnic identification, attitudes and interactions and their dominant theories. It is argued that ethnic identity development is simultaneously cognitive and social, relating to cognitive changes, schemas and social relationships. This research combines different methodologies to explore the multifaceted nature of its development. The report of empirical work begins with an ethnography of ethnic interactions. Two critical themes are that children tended to play more with same-ethnic (ingroup) peers and expected these others to play together. These themes are examined in two experiments. 84 white, Asian and black children, aged 5, 6-7, and 8-9 years, rated their own and white, Asian and black others' ('targets') liking for toys and foods. Ethnocentric inference (that ethnic ingroup members would like things similar to oneself) was found at 6-7 years. Verbal justifications from 8-9-year-olds indicate more sophisticated expectations about group members. A conceptual and methodological amalgamation of the last two phases was undertaken in three final studies. 220 7-year-old white and Asian children in same- or different-ethnic dyads discussed their preference for white and Asian targets. They also discussed targets' preferences for them and each other as pairs. Different-ethnic dyads had more difficulty resolving differences, since each partner preferred an ingroup target. Same-ethnic dyads were more likely to select an ingroup target, pair ingroup targets together, and share their choices from the outset. Asian dyads were more likely to reason by ethnicity. It is concluded that this investigation demonstrates that in middle childhood children prefer, identify with, and interact more with same-ethnic members. These processes are augmented by an emerging recognition that others sharing one's ethnicity also share deeper attributes.
However, the relationship between identity components remains unclear and could be illuminated by further research.

    Automatic grammar induction from free text using insights from cognitive grammar

    Automatic identification of the grammatical structure of a sentence is useful in many Natural Language Processing (NLP) applications, such as document summarisation, question answering systems and machine translation. With the availability of syntactic treebanks, supervised parsers have been developed successfully for many major languages. However, for low-resourced minority languages with fewer digital resources, this poses more of a challenge. Moreover, there are a number of syntactic annotation schemes, motivated by different linguistic theories and formalisms, which are sometimes language-specific and cannot always be adapted for developing syntactic parsers across different language families. This project aims to develop a linguistically motivated approach to the automatic induction of grammatical structures from raw sentences. Such an approach can be readily adapted to different languages, including low-resourced minority languages. We draw the basic approach to linguistic analysis from usage-based, functional theories of grammar such as Cognitive Grammar and Computational Paninian Grammar, and from insights from psycholinguistic studies. Our approach identifies the grammatical structure of a sentence by recognising domain-independent, general, cognitive patterns of conceptual organisation that occur in natural language. It also reflects some of the general psycholinguistic properties of parsing by humans, such as incrementality, connectedness and expectation. Our implementation has three components: Schema Definition, Schema Assembly and Schema Prediction. The Schema Definition and Schema Assembly components were implemented algorithmically as a dictionary and rules, while an Artificial Neural Network was trained for Schema Prediction.
By using Part-of-Speech (POS) tags to bootstrap the simplest case of token-level schema definitions, a sentence is passed through all three components incrementally until all the words are exhausted and the entire sentence is analysed as an instance of one final construction schema. The order in which all intermediate schemas are assembled to form the final schema can be viewed as the parse of the sentence. Parsers for English and Welsh (a low-resourced minority language) were developed using the same approach, with some changes to the Schema Definition component. We evaluated parser performance by (a) quantitative evaluation, comparing the parsed chunks against the constituents in a phrase structure tree; (b) manual evaluation, listing the range of linguistic constructions covered by the parser and performing error analysis on the parser outputs; (c) evaluation by identifying the number of edits required for a correct assembly; and (d) qualitative evaluation based on Likert scales in online surveys.
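The incremental pipeline described above can be sketched as follows. The tag set, the assembly rules and the schema labels are simplified assumptions for illustration; in particular, the thesis trains a neural network for Schema Prediction, whereas this sketch uses a fixed rule table.

```python
# Schema Assembly: which adjacent schemas combine, and into what (illustrative).
ASSEMBLY_RULES = {
    ("DET", "NOUN"): "NP",
    ("NP", "VERB"): "S",   # greedy left-to-right combination
    ("S", "NP"): "S",
}

def token_schema(pos_tag: str) -> str:
    """Schema Definition: the simplest case maps each POS tag to a schema."""
    return pos_tag

def parse(pos_tags):
    """Consume tags incrementally, combining with the open schemas on a stack
    until no rule applies; ideally one final construction schema remains."""
    stack = []
    for tag in pos_tags:
        stack.append(token_schema(tag))
        # reduce the top two schemas as long as an assembly rule applies
        while len(stack) >= 2 and (stack[-2], stack[-1]) in ASSEMBLY_RULES:
            right = stack.pop()
            left = stack.pop()
            stack.append(ASSEMBLY_RULES[(left, right)])
    return stack

print(parse(["DET", "NOUN", "VERB", "DET", "NOUN"]))  # ['S']
```

The order of reductions here plays the role the abstract assigns to the assembly order: it is the parse. Replacing the rule lookup with a trained predictor gives the Schema Prediction component its place in the loop.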