
    Parallel-Correctness and Transferability for Conjunctive Queries under Bag Semantics

    Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query. This property is referred to as parallel-correctness. Another key problem is to detect whether the data reshuffle step can be avoided when evaluating subsequent queries. The latter problem is referred to as transfer of parallel-correctness. This paper extends the study of parallel-correctness and transfer of parallel-correctness of conjunctive queries to incorporate bag semantics. We provide semantical characterizations for both problems, obtain complexity bounds and discuss the relationship with their set semantics counterparts. Finally, we revisit both problems under a modified distribution model that takes advantage of a linear order on compute nodes and obtain tight complexity bounds

    Parallel-Correctness and Containment for Conjunctive Queries with Union and Negation

    Single-round multiway join algorithms first reshuffle data over many servers and then evaluate the query at hand in a parallel and communication-free way. A key question is whether a given distribution policy for the reshuffle is adequate for computing a given query, also referred to as parallel-correctness. This paper extends the study of the complexity of parallel-correctness and its constituents, parallel-soundness and parallel-completeness, to unions of conjunctive queries with and without negation. As a by-product it is shown that the containment problem for conjunctive queries with negation is coNEXPTIME-complete
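    The notions of parallel-soundness and parallel-completeness named above can be illustrated on a single instance. The sketch below is not taken from the paper: it assumes set semantics, a fixed example query Q(x, z) :- R(x, y), S(y, z), and a distribution policy modelled as a hypothetical Python function from a fact to the set of nodes that receive it. It checks one instance only, whereas the decision problems studied in the paper quantify over all instances.

```python
def evaluate_join(r, s):
    """Evaluate Q(x, z) :- R(x, y), S(y, z) under set semantics."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def distribute(facts, policy, nodes):
    """Give each node the facts that the policy assigns to it."""
    return {n: {f for f in facts if n in policy(f)} for n in nodes}


def check_instance(r, s, policy_r, policy_s, nodes):
    """Return (sound, complete) for this one instance.

    sound:    every locally derived answer is a global answer.
    complete: every global answer is derived at some node.
    """
    global_result = evaluate_join(r, s)
    local_r = distribute(r, policy_r, nodes)
    local_s = distribute(s, policy_s, nodes)
    distributed = set()
    for n in nodes:
        # Communication-free step: each node sees only its own facts.
        distributed |= evaluate_join(local_r[n], local_s[n])
    return distributed <= global_result, global_result <= distributed


NODES = [0, 1]
R = {(1, 2), (4, 3)}
S = {(2, 6), (3, 5)}

# Policy A (assumed): both relations are placed by the join attribute y.
by_y = check_instance(R, S, lambda f: {f[1] % 2}, lambda f: {f[0] % 2}, NODES)

# Policy B (assumed): R is placed by x and S by z, ignoring the join attribute.
by_xz = check_instance(R, S, lambda f: {f[0] % 2}, lambda f: {f[1] % 2}, NODES)
```

    Under policy A, joining facts always meet at the same node, so the check reports both soundness and completeness on this instance. Under policy B the two joining pairs are split across nodes and completeness fails. Soundness holds in both cases because this query has no negation and is therefore monotone.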

    Locality-Aware Distribution Schemes

    One of the bottlenecks in parallel query processing is the cost of shuffling data across nodes in a cluster. Ideally, given a distribution of the data across the nodes and a query, we want to execute the query by performing only local computation and no communication: in this case, the query is called parallel-correct with respect to the data distribution. Previous work studied this problem for Conjunctive Queries in the case where the distribution scheme is oblivious, i.e., the location of each tuple depends only on the tuple and is independent of the instance. In this work, we show that oblivious schemes have a fundamental theoretical limitation, and initiate the formal study of distribution schemes that are locality-aware. In particular, we focus on a class of distribution schemes called co-hash distribution schemes, which are widely used in parallel systems. In co-hash partitioning, some tables are initially hashed, and the remaining tables are co-located so that a join condition is always satisfied. Given a co-hash distribution scheme, we formally study the complexity of deciding various desirable properties, including obliviousness and redundancy. Then, for a given Conjunctive Query and co-hash scheme, we determine the computational complexity of deciding whether the query is parallel-correct. We also explore a stronger notion of correctness, called parallel disjoint correctness, which guarantees that the query result will be disjointly partitioned across nodes, i.e., there is no duplication of results
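    The co-hash idea described above (hash some tables, then co-locate the rest so that a join condition holds locally) can be sketched as follows. This is an illustrative assumption, not the paper's formal model: the table names, the modulo function standing in for a hash, and the choice to drop orders with no matching customer are all hypothetical.

```python
NUM_NODES = 2


def node_of(key):
    # Stand-in for a real hash function (assumption for this sketch).
    return key % NUM_NODES


def hash_partition(rows, key_index):
    """Place each row on the node chosen by hashing one attribute."""
    parts = [[] for _ in range(NUM_NODES)]
    for row in rows:
        parts[node_of(row[key_index])].append(row)
    return parts


def co_locate(child_rows, child_key, parent_parts, parent_key):
    """Place each child row on every node holding a parent row it joins with."""
    parts = []
    for parent_rows in parent_parts:
        keys = {p[parent_key] for p in parent_rows}
        parts.append([c for c in child_rows if c[child_key] in keys])
    return parts


def join(orders, customers):
    """Orders(order_id, cust_id) joined with Customers(cust_id, name)."""
    return [(o[0], c[1]) for o in orders for c in customers if o[1] == c[0]]


customers = [(1, "Ada"), (2, "Bob"), (3, "Cy")]
orders = [(10, 1), (11, 2), (12, 1), (13, 3)]

customer_parts = hash_partition(customers, 0)            # hashed table
order_parts = co_locate(orders, 1, customer_parts, 0)    # co-located table

local_results = [join(o, c) for o, c in zip(order_parts, customer_parts)]
flat = [t for part in local_results for t in part]

# The local results together equal the global join result on this instance.
is_correct = set(flat) == set(join(orders, customers))
# No answer is produced on more than one node.
is_disjoint = len(flat) == len(set(flat))
```

    In this example each customer lives on exactly one node, so every order is placed once and the join results are disjoint across nodes. A scheme in which a child row joins with parents on several nodes would replicate that row, which is the kind of redundancy the abstract mentions.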

    An evaluation of the challenges of Multilingualism in Data Warehouse development

    In this paper we discuss Business Intelligence and define what is meant by support for Multilingualism in a Business Intelligence reporting context. We identify support for Multilingualism as a challenging issue which has implications for data warehouse design and reporting performance. Data warehouses are a core component of most Business Intelligence systems and the star schema is the approach most widely used to develop data warehouses and dimensional Data Marts. We discuss the way in which Multilingualism can be supported in the Star Schema and identify that current approaches have serious limitations which include data redundancy and data manipulation, performance and maintenance issues. We propose a new approach to enable the optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce satisfactory results when used in a proof-of-concept environment. Future work will include testing the approach in an enterprise environment

    Genre-based literacy pedagogy: the nature and value of genre knowledge in teaching and learning writing on a university first year media studies course

    A thesis submitted to the Faculty of Humanities, University of Luton, in partial fulfilment of the requirements for the degree of Doctor of Philosophy. In the teaching and learning of literacy, descriptions of text have a problematic status as a result of the growing understanding of literacy as both a cognitive process and a social practice. In the teaching of academic subjects at university, student text is not usually an object of study. The research in this thesis draws on a language-based theory of learning to place textual description at the centre of the teaching and learning of both literacy and academic subjects at university. Participant observation and practice-based research methods were used to implement a form of text-oriented literacy teaching and to explore its compatibility with processes and practices orientations to literacy. Over an eighteen-month period, systemic functional grammar was used to investigate and describe the texts of a film studies classroom and the descriptions were used in genre-based literacy pedagogy. The effects of the pedagogy are measured in terms of students' performance in an end-of-course assignment, students' accounts of their writing processes, and student and subject-tutor perception of the text description and the pedagogy. In the thesis, a linguistic description of a key curriculum genre, a Taxonomic Film Analysis, is presented. An account is given of the pedagogy by means of which this essay genre was represented in the film studies classroom as a realisation of choices from linguistic, conceptual and activity systems. Systemic functional grammar-based text description is seen to have provided a means whereby a literacy tutor could collaborate with a subject tutor to provide a subject-specific form of literacy teaching which was evaluated as relevant by students and tutors. The account and the evaluation help to clarify the role that description of text can play in relation to processes and practices of literacy use in the teaching and learning of literacy in a film studies classroom and have implications for the teaching and learning of literacy at university more generally

    Proceedings of the Eighth Italian Conference on Computational Linguistics CliC-it 2021

    The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at Università degli Studi di Milano-Bicocca from 26th to 28th January 2022. After the edition of 2020, which was held in fully virtual mode due to the health emergency related to Covid-19, CLiC-it 2021 represented the first moment for the Italian research community of Computational Linguistics to meet in person after more than one year of full/partial lockdown

    Geographic information extraction from texts

    A large volume of unstructured texts, containing valuable geographic information, is available online. This information – provided implicitly or explicitly – is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although considerable progress has been made in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data, to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and also to identify research gaps in geographic information extraction