
    Explain3D: Explaining Disagreements in Disjoint Datasets

    Data plays an important role in applications, analytic processes, and many aspects of human activity. As data grows in size and complexity, there is a pressing need for tools that promote understanding and explanation of data-related operations. Data management research on explanations has assumed that data resides in a single dataset, under one common schema. In reality, today's data is frequently un-integrated, coming from different sources with different schemas. When different datasets provide different answers to semantically similar questions, understanding the reasons for the discrepancies is challenging and cannot be handled by existing single-dataset solutions. In this paper, we propose Explain3D, a framework for explaining the disagreements across disjoint datasets (3D). Explain3D focuses on identifying the reasons for the differences in the results of two semantically similar queries operating on two datasets with potentially different schemas. Our framework leverages the queries to perform a semantic mapping across the relevant parts of their provenance; discrepancies in this mapping point to causes of the queries' differences. Exploiting the queries gives Explain3D an edge over traditional schema matching and record linkage techniques, which are query-agnostic. Our work makes the following contributions: (1) We formalize the problem of deriving optimal explanations for the differences in the results of semantically similar queries over disjoint datasets. (2) We design a 3-stage framework for solving the optimal explanation problem. (3) We develop a smart-partitioning optimizer that improves the efficiency of the framework by orders of magnitude. (4) We experiment with real-world and synthetic data to demonstrate that Explain3D can derive precise explanations efficiently.
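
    The idea that discrepancies in a mapping between two queries' provenance point to causes of the differing results can be illustrated with a toy sketch. This is not the Explain3D algorithm: the function name, the record layouts, and the lowercase-name matching rule are all assumptions made for illustration.

```python
from collections import Counter

def explain_difference(prov_a, prov_b, key_a, key_b):
    """Match provenance records from two datasets on a normalised key
    and return the keys left without a counterpart on each side."""
    count_a = Counter(key_a(r) for r in prov_a)
    count_b = Counter(key_b(r) for r in prov_b)
    # Counter subtraction keeps only positive counts, i.e. unmatched records.
    return dict(count_a - count_b), dict(count_b - count_a)

# Hypothetical provenance of two "count the CS staff" queries
# over two datasets with different schemas.
dataset_a = [{"name": "Ann Lee"}, {"name": "Bo Chen"}, {"name": "Cy Diaz"}]
dataset_b = [{"full_name": "ann lee"}, {"full_name": "cy diaz"}]

only_a, only_b = explain_difference(
    dataset_a, dataset_b,
    key_a=lambda r: r["name"].lower(),       # assumed mapping for schema A
    key_b=lambda r: r["full_name"].lower(),  # assumed mapping for schema B
)
print(only_a, only_b)
```

    In this sketch the unmatched record in the first dataset accounts for the two queries returning different counts.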

    Service Abstractions for Scalable Deep Learning Inference at the Edge

    The deep-learning-driven intelligent edge has already become a reality, where millions of mobile, wearable, and IoT devices analyze real-time data and transform it into actionable insights on-device. Typical approaches for optimizing deep learning inference mostly focus on accelerating the execution of individual inference tasks, without considering the contextual correlation unique to edge environments and the statistical nature of learning-based computation. Specifically, they treat inference workloads as individual black boxes and apply canonical system optimization techniques, developed over the last few decades, to handle them as yet another type of computation-intensive application. As a result, deep learning inference on edge devices still faces ever-increasing challenges: customization to edge device heterogeneity, fuzzy computation redundancy between inference tasks, and end-to-end deployment at scale. In this thesis, we propose the first framework that automates and scales the end-to-end process of deploying efficient deep learning inference from the cloud to heterogeneous edge devices. The framework consists of a series of service abstractions that handle DNN model tailoring, model indexing and query, and computation reuse for runtime inference, respectively. Together, these services bridge the gap between deep learning training and inference, eliminate computation redundancy during inference execution, and further lower the barrier for deep learning algorithm and system co-optimization. To build efficient and scalable services, we take a unique algorithmic approach of harnessing the semantic correlation between the learning-based computation.
Rather than viewing individual tasks as isolated black boxes, we optimize them collectively in a white-box approach, proposing primitives to formulate the semantics of the deep learning workloads, and algorithms to assess their hidden correlation (in terms of the input data, the neural network models, and the deployment trials) and merge common processing steps to minimize redundancy.

    The 14th Overture Workshop: Towards Analytical Tool Chains

    This report contains the proceedings from the 14th Overture workshop, organized in connection with the Formal Methods 2016 symposium. It includes nine papers describing technological progress in Overture/VDM tool support and its connection with other tools such as Crescendo, Symphony, INTO-CPS, TASTE and ViennaTalk.

    Iconic Indexing for Video Search

    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London

    Emerging technologies for learning report (volume 3)


    Three dimensional compact abstract cell complexes topological data structure for buildings in CityGML

    As the significance of visualising objects in three dimensions (3D) is now recognised, most city modelling approaches support 3D primitives in the construction and visualisation of objects. Although city models are visualised in 3D, the topological information they maintain remains in two dimensions (2D). This prevents the 3D model from serving its full potential, as the topological information that gives meaning to the objects is not preserved explicitly. The support of 3D topology is crucial for 3D spatial analysis that requires connectivity information and adjacencies in order to produce accurate output in 3D. This research investigates the implementation of a 3D topological model, specifically using the Compact Abstract Cell Complexes (CACC) topological data structure, for preserving the topological information of buildings in City Geographic Markup Language (CityGML). CityGML is the international standard for city modelling, yet its topological component is in 2D via the simple topology-incidence mechanism. This mechanism allows only explicitly stored surfaces to be referenced. This brings up the issue of inconsistent visualisation, which is usually resolved by modelling two connected buildings with two separate surfaces representing the common surface. However, the connectivity information between the two connected buildings is then not preserved in CityGML, as they do not share the same explicitly stored surface. Three objectives were established for the study, namely to determine the specifications of a topological data structure for preserving topological information of buildings in CityGML, to implement a topological structure for buildings in CityGML that supports connectivity queries and adjacency analyses for city modelling, and to validate the proposed topological data structure in terms of geometric and topological properties in comparison to the existing CityGML topology mechanism.
Several tasks were carried out to complete this research, including extraction of geometrical properties from CityGML, generation of topological links, adjacency analysis using topological information, and visualisation of the 3D model and adjacency analysis results. The absence of a comprehensive topological model within CityGML made it necessary to use the geometric properties of the buildings in CityGML as a stand-in model to extract the topological properties that would subsequently be the basis for generating topological links. The CACC topological model preserves topological information by building topological links: points are connected to build alpha-0 links (1D lines), alpha-0 links are connected to build alpha-1 links (2D surfaces), alpha-1 links are connected to build alpha-2 links (3D volumes), and alpha-3 links represent the connectivity between 3D buildings. This allows connectivity between elements of different dimensions, as any link can be decomposed into its related lower-dimension elements. By implementing the CACC topological model, the connectivity information for two buildings that are connected but modelled with two separate surfaces can be preserved. The support of topological information via the CACC topological model also allows the seamless execution of adjacency queries between building elements, including elements of different dimensions.
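
    The adjacency idea described above can be sketched in a few lines. This is a much simplified illustration, not the CACC data structure itself: surfaces are reduced to their vertex sets, buildings are axis-aligned boxes, and the helper names (`box_faces`, `surface_key`, `adjacent_buildings`) are assumptions introduced here. It shows how two buildings that each store their own copy of a common wall can still be linked through the geometry of that wall.

```python
from itertools import combinations

def box_faces(x0, x1):
    """Six faces of a box spanning x0..x1 in x and 0..1 in y and z,
    each face given as a list of its four corner points."""
    corners = {(x, y, z) for x in (x0, x1) for y in (0, 1) for z in (0, 1)}
    faces = []
    for axis, values in ((0, (x0, x1)), (1, (0, 1)), (2, (0, 1))):
        for v in values:
            faces.append([p for p in corners if p[axis] == v])
    return faces

def surface_key(surface):
    """Order-insensitive identity of a surface: its set of vertices."""
    return frozenset(surface)

def adjacent_buildings(buildings):
    """Return pairs of buildings whose separately stored surfaces coincide."""
    owners = {}
    for name, surfaces in buildings.items():
        for s in surfaces:
            owners.setdefault(surface_key(s), set()).add(name)
    pairs = set()
    for names in owners.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs

# Hypothetical city: A and B share the wall at x = 1, C stands apart.
city = {"A": box_faces(0, 1), "B": box_faces(1, 2), "C": box_faces(3, 4)}
adjacent = adjacent_buildings(city)
print(adjacent)
```

    Matching on exact vertex sets is a deliberate simplification; real CityGML geometry would need coordinate tolerances and support for partially overlapping surfaces.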