10 research outputs found

    Semantic Integration of heterogeneous data sources in the MOMIS Data Transformation System

    In the last twenty years, many data integration systems following the classical wrapper/mediator architecture and providing a Global Virtual Schema (a.k.a. Global Virtual View - GVV) have been proposed by the research community. The main issues faced by these approaches range from system-level heterogeneities, through structural and syntactic heterogeneities, to heterogeneities at the semantic level. Despite the research effort, all the proposed approaches require substantial user intervention for customizing and managing the data integration and reconciliation tasks. In some cases, the effort and complexity of the task are huge, since it requires the development of specific programming code. Unfortunately, due to the specificity of the problems addressed, application code and solutions are rarely reusable in other domains. For this reason, the Lowell Report 2005 provided the guideline for the definition of a public benchmark for the information integration problem. The proposal, called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches), focuses on how data integration systems manage syntactic and semantic heterogeneities, which are definitely the greatest technical challenges in the field. We developed a Data Transformation System (DTS) that supports data transformation functions and produces query translations in order to push query execution down to the sources. Our DTS is based on MOMIS, a mediator-based data integration system that our research group has been developing and supporting since 1999. In this paper, we show how the DTS is able to solve all twelve queries of the THALIA benchmark by using a simple combination of declarative translation functions already available in standard SQL. We consider this a remarkable result for two reasons: firstly, to the best of our knowledge no other system has provided a complete answer to the benchmark; secondly, our queries do not require any overhead of new code.
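    The idea of resolving heterogeneities with declarative SQL translation functions, rather than custom application code, can be illustrated with a minimal sketch. The tables, columns, and data below are invented for illustration (they are not the MOMIS or THALIA schemas); the point is that standard SQL string functions alone reconcile two differently structured sources into one global view, and the transformation executes inside the source engine.

```python
import sqlite3

# Hypothetical THALIA-style syntactic heterogeneity: two course catalogs
# store instructor names differently ('Last, First' vs. split columns).
# A view built only from standard SQL functions reconciles them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_a (course TEXT, instructor TEXT);       -- 'Last, First'
    CREATE TABLE src_b (course TEXT, first TEXT, last TEXT); -- split columns
    INSERT INTO src_a VALUES ('Databases', 'Codd, Edgar');
    INSERT INTO src_b VALUES ('Logic', 'Alonzo', 'Church');

    -- Global virtual view: declarative translation, pushed down to SQL.
    CREATE VIEW gvv_course AS
        SELECT course,
               TRIM(SUBSTR(instructor, INSTR(instructor, ',') + 1)) || ' ' ||
               SUBSTR(instructor, 1, INSTR(instructor, ',') - 1) AS instructor
        FROM src_a
        UNION ALL
        SELECT course, first || ' ' || last FROM src_b;
""")
rows = sorted(conn.execute("SELECT course, instructor FROM gvv_course"))
print(rows)  # [('Databases', 'Edgar Codd'), ('Logic', 'Alonzo Church')]
```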

    Query processing and optimization in deductive databases with certainty constraints

    Uncertainty reasoning has been identified as an important and challenging issue in database research in the Lowell report [ea05]. Many logic frameworks have been proposed to represent and reason about uncertainty in deductive databases. Based on the way in which uncertainties are associated with the facts and rules in programs, these frameworks have been classified into "annotation based" (AB) and "implication based" (IB). [Shi05] investigated the relative expressive powers of the AB and IB frameworks and introduced the notion of certainty constraints, which makes them equivalent in terms of expressive power. Building on this equivalence, we developed transformation algorithms operating between the AB and IB frameworks. With certainty constraints present in the rule bodies of logic programs, query processing and optimization become more complicated. The bottom-up query evaluation algorithms Naive, Semi-Naive, and Semi-Naive with Partition in the parametric framework [SZ04, SZ08] do not consider certainty constraints. We extend these algorithms by incorporating a new checker module and develop extended evaluation algorithms that deal with certainty constraints. We have implemented the proposed techniques and conducted many experiments to measure their efficiency. Our results and benchmarks indicate that the proposed techniques and strategies yield a useful and efficient evaluation engine for deductive databases with certainty constraints.
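    A toy sketch conveys the flavour of semi-naive bottom-up evaluation with a certainty-constraint checker. The rule, the combination functions (min for conjunction, max for disjunction), and the 0.5 threshold are illustrative assumptions, not the [SZ04, SZ08] algorithms themselves; the checker simply discards derivations that violate the constraint.

```python
# Semi-naive evaluation of the transitive-closure rule
#   path(X,Z) <- path(X,Y), edge(Y,Z)
# over facts carrying certainties, with a checker module enforcing the
# (hypothetical) certainty constraint c >= 0.5 on derived facts.
THRESHOLD = 0.5

def passes_constraint(certainty):
    # Checker module: reject derivations violating the constraint.
    return certainty >= THRESHOLD

def semi_naive(edges):
    # edges: dict mapping (x, y) -> certainty of edge(x, y)
    path = {k: c for k, c in edges.items() if passes_constraint(c)}
    delta = dict(path)                        # newly derived facts
    while delta:
        new_delta = {}
        for (x, y), c1 in delta.items():      # join delta with edge
            for (y2, z), c2 in edges.items():
                if y != y2:
                    continue
                c = min(c1, c2)               # conjunctive combination
                if not passes_constraint(c):  # certainty constraint check
                    continue
                best = max(path.get((x, z), 0.0), new_delta.get((x, z), 0.0))
                if c > best:                  # disjunction by max
                    new_delta[(x, z)] = c
        path.update(new_delta)
        delta = new_delta
    return path

facts = semi_naive({("a", "b"): 0.9, ("b", "c"): 0.7, ("c", "d"): 0.4})
print(facts)  # path(a,c) derived at 0.7; anything through (c,d) is pruned
```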

    The Origin of the Sacco-Vanzetti Case

    For the first time in the thirty-three years since Sacco and Vanzetti were executed, on August 23, 1927, there has appeared an apologia for the Commonwealth of Massachusetts. The work bears the title: Sacco-Vanzetti: The Murder and the Myth. The author is Robert H. Montgomery, a Harvard Law School graduate (1912) and a corporation lawyer in Boston for nearly fifty years. His clients include textile mills, such as the American Woolen Company (center of the famous Lawrence Strike of 1911), New England Telephone & Telegraph Co., and large electric power interests. The approach of Attorney Montgomery to the Sacco-Vanzetti case can be gauged from his political and social philosophy, suggested by his half-century of absorption in the affairs of large corporations. Since the author obviously considers himself the ultimate authority on the case, it is curious that the book should end on a note of despair: "The truth is mighty, but it will not prevail against a Great Lie, and the Sacco-Vanzetti Myth is the greatest lie of them all."

    CWI Self-evaluation 1999-2004


    Participatory design, time and continuity : the case of place.

    Thesis (M.C.P. and M.Arch.A.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1978. Microfiche copy available in Archives and Rotch. Includes bibliographical references.

    Techniques for improving efficiency and scalability for the integration of information retrieval and databases

    PhD thesis. This thesis is on the topic of the integration of Information Retrieval (IR) and Databases (DB), with a particular focus on improving the efficiency and scalability of integrated IR and DB technology (IR+DB). The main purpose of this study is to develop efficient and scalable techniques for supporting integrated IR and DB technology, which is a popular approach today for handling complex queries over text and structured data. Our specific interest in this thesis is how to efficiently handle queries over large-scale text and structured data. The work is based on a technology that integrates probability theory and relational algebra, where retrievals over text and data are expressed in probabilistic logical programs such as probabilistic relational algebra (PRA) or probabilistic Datalog. To support efficient processing of probabilistic logical programs, we propose three optimization techniques covering the logical and physical layers: scoring-driven query optimization using scoring expressions, query processing with a top-k incorporated pipeline, and indexing with a relational inverted index. Specifically, scoring expressions are proposed for expressing the scoring or probabilistic semantics of the scoring functions implied by PRA expressions, so that efficient query execution plans can be generated by a rule-based scoring-driven optimizer. Secondly, to balance efficiency and effectiveness and so improve query response time, we studied methods for incorporating top-k algorithms into the pipelined query execution engine of IR+DB systems. Thirdly, the proposed relational inverted index integrates an IR-style inverted index with a DB-style tuple-based index, and can be used to support efficient probability estimation and aggregation as well as conventional relational operations. Experiments were carried out to investigate the performance of the proposed techniques. Experimental results showed that the efficiency and scalability of an IR+DB prototype were improved, and that the system can handle queries efficiently on considerably large data sets for a number of IR tasks.
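    The relational-inverted-index idea can be sketched in a few lines: postings are kept as plain (term, doc, tf) tuples, so IR-style scoring becomes a relational selection-join-aggregation, and top-k selection runs over the aggregated scores with a heap. The tf-idf scoring function and the data below are generic illustrative choices, not the thesis's probabilistic relational algebra.

```python
import heapq
from collections import defaultdict
from math import log

# Postings as relational tuples (term, doc, tf): usable both by
# relational operators and by IR-style scoring/aggregation.
postings = [
    ("query", "d1", 3), ("query", "d2", 1),
    ("index", "d1", 1), ("index", "d3", 4),
]
n_docs = 3

def top_k(terms, k):
    df = defaultdict(set)                 # document frequency per term
    for term, doc, _ in postings:
        df[term].add(doc)
    scores = defaultdict(float)
    for term, doc, tf in postings:        # selection + join on query terms
        if term in terms:
            scores[doc] += tf * log(1 + n_docs / len(df[term]))  # aggregate
    # Heap-based top-k over the aggregated scores.
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

top2 = top_k({"query", "index"}, 2)
print([doc for doc, _ in top2])
```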

    Database support for large-scale multimedia retrieval

    With the increasing proliferation of recording devices and the resulting abundance of multimedia data available nowadays, searching and managing these ever-growing collections becomes more and more difficult. In order to support retrieval tasks within large multimedia collections, not only the sheer size but also the complexity of the data and their associated metadata pose great challenges, in particular from a data management perspective. Conventional approaches to this task have been shown to have only limited success, particularly due to the lack of support for the given data and the required query paradigms. In the area of multimedia research, the missing support for efficiently and effectively managing multimedia data and metadata has recently been recognised as a stumbling block that constrains further developments in the field. In this thesis, we bridge the gap between the database and the multimedia retrieval research areas. We approach the problem of providing a data management system geared towards large collections of multimedia data and the corresponding query paradigms. To this end, we identify the necessary building blocks of a multimedia data management system which adopts the relational data model and the vector-space model. In essence, we make the following main contributions towards a holistic model of a database system for multimedia data: We introduce an architectural model describing a data management system for multimedia data from a system architecture perspective. We further present a data model which supports the storage of multimedia data and the corresponding metadata, and provides similarity-based search operations. This thesis describes an extensive query model for a very broad range of different query paradigms, specifying both logical and executional aspects of a query. Moreover, we consider the efficiency and scalability of the system in a distribution and a storage model, and provide a large and diverse set of index structures for high-dimensional data coming from the vector-space model. The developed models crystallise into the scalable multimedia data management system ADAMpro, which has been implemented within the iMotion/vitrivr retrieval stack. We quantitatively evaluate our concepts on collections that exceed the current state of the art. The results underline the benefits of our approach and assist in understanding the role of the introduced concepts. Moreover, the findings provide important implications for future research in the field of multimedia data management.
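    The similarity-based search operation at the heart of such a system reduces, in its simplest form, to k-nearest-neighbour retrieval over vector-space features. The sketch below uses an exact linear scan with cosine similarity; a system like the one described would instead rely on index structures for high-dimensional data, and all names and vectors here are illustrative.

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query, collection, k):
    # collection: {object_id: feature_vector}; exact linear-scan k-NN.
    ranked = sorted(collection.items(),
                    key=lambda kv: cosine(query, kv[1]),
                    reverse=True)
    return [obj for obj, _ in ranked[:k]]

features = {"img1": [1.0, 0.0, 0.0],
            "img2": [0.9, 0.1, 0.0],
            "img3": [0.0, 1.0, 0.0]}
neighbours = knn([1.0, 0.05, 0.0], features, 2)
print(neighbours)
```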

    Uphill All the Way: The Fortunes of Progressivism, 1919-1929

    With very few exceptions, the conventional narrative of American history dates the end of the Progressive Era to the postwar turmoil of 1919 and 1920, culminating in the election of Warren G. Harding and a mandate for Normalcy. And yet, as this dissertation explores, progressives, while knocked back on their heels by these experiences, nonetheless continued to fight for change even during the unfavorable political climate of the Twenties. The Era of Normalcy itself was a much more chaotic and contested political period - marked by strikes, race riots, agrarian unrest, cultural conflict, government scandals, and economic depression - than the popular imagination often recalls. While examining the trajectory of progressives during the Harding and Coolidge years, this study also inquires into how civic progressivism - a philosophy rooted in preserving the public interest and producing change through elevated citizenship and educated public opinion - was tempered and transformed by the events of the post-war period and the New Era. With an eye to the many fruitful and flourishing fields that have come to enhance the study of political ideology in recent decades, this dissertation revisits the question of progressive persistence, and examines the rhetorical and ideological transformations it was forced to make to remain relevant in an age of consumerism, technological change, and cultural conflict. In so doing, this study aims to reevaluate progressivism's contributions to the New Era and help to define the ideological transformations that occurred between early twentieth century reform and the liberalism of the New Deal.