Explain3D: Explaining Disagreements in Disjoint Datasets
Data plays an important role in applications, analytic processes, and many
aspects of human activity. As data grows in size and complexity, we are met
with an imperative need for tools that promote understanding and explanations
over data-related operations. Data management research on explanations has
focused on the assumption that data resides in a single dataset, under one
common schema. But the reality of today's data is that it is frequently
un-integrated, coming from different sources with different schemas. When
different datasets provide different answers to semantically similar questions,
understanding the reasons for the discrepancies is challenging and cannot be
handled by the existing single-dataset solutions.
In this paper, we propose Explain3D, a framework for explaining the
disagreements across disjoint datasets (3D). Explain3D focuses on identifying
the reasons for the differences in the results of two semantically similar
queries operating on two datasets with potentially different schemas. Our
framework leverages the queries to perform a semantic mapping across the
relevant parts of their provenance; discrepancies in this mapping point to
causes of the queries' differences. Exploiting the queries gives Explain3D an
edge over traditional schema matching and record linkage techniques, which are
query-agnostic. Our work makes the following contributions: (1) We formalize
the problem of deriving optimal explanations for the differences of the results
of semantically similar queries over disjoint datasets. (2) We design a 3-stage
framework for solving the optimal explanation problem. (3) We develop a
smart-partitioning optimizer that improves the efficiency of the framework by
orders of magnitude. (4) We experiment with real-world and synthetic data to
demonstrate that Explain3D can derive precise explanations efficiently.
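The core idea above, comparing two semantically similar queries and tracing disagreements back to their source tuples, can be illustrated with a toy sketch. This is not the authors' algorithm; the datasets, schemas, and function names below are invented for illustration, and the "provenance" here is simply the source row each answer came from.

```python
# Toy illustration (not Explain3D itself): two datasets answer the
# "same" question under different schemas; we compare the results and
# surface the provenance tuples behind each disagreement.

# Dataset A: schema (city, pop_thousands)
ds_a = [{"city": "Springfield", "pop_thousands": 120},
        {"city": "Shelbyville", "pop_thousands": 80}]

# Dataset B: schema (name, population) -- note the unit difference
ds_b = [{"name": "Springfield", "population": 118000},
        {"name": "Shelbyville", "population": 80000}]

def query_a():
    # "Population of each city", keeping the source row as provenance.
    return {r["city"]: (r["pop_thousands"] * 1000, r) for r in ds_a}

def query_b():
    return {r["name"]: (r["population"], r) for r in ds_b}

def explain_disagreements(res_a, res_b):
    """For each key where the answers differ, return the source tuples
    (provenance) that produced each side's answer."""
    explanations = []
    for key in res_a.keys() & res_b.keys():
        val_a, prov_a = res_a[key]
        val_b, prov_b = res_b[key]
        if val_a != val_b:
            explanations.append((key, val_a, prov_a, val_b, prov_b))
    return explanations

for key, va, pa, vb, pb in explain_disagreements(query_a(), query_b()):
    print(f"{key}: A says {va} (from {pa}), B says {vb} (from {pb})")
```

A schema-agnostic record linker would only match the rows; exploiting the queries, as the paper argues, is what lets the explanation point at the specific source tuples behind the disagreement.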
Service Abstractions for Scalable Deep Learning Inference at the Edge
Deep learning driven intelligent edge has already become a reality, where millions of mobile, wearable, and IoT devices analyze real-time data and transform it into actionable insights on-device. Typical approaches for optimizing deep learning inference mostly focus on accelerating the execution of individual inference tasks, without considering the contextual correlation unique to edge environments and the statistical nature of learning-based computation. Specifically, they treat inference workloads as individual black boxes and apply canonical system optimization techniques, developed over the last few decades, to handle them as yet another type of computation-intensive application. As a result, deep learning inference on edge devices still faces the ever-increasing challenges of customization to edge device heterogeneity, fuzzy computation redundancy between inference tasks, and end-to-end deployment at scale. In this thesis, we propose the first framework that automates and scales the end-to-end process of deploying efficient deep learning inference from the cloud to heterogeneous edge devices. The framework consists of a series of service abstractions that handle DNN model tailoring, model indexing and query, and computation reuse for runtime inference, respectively. Together, these services bridge the gap between deep learning training and inference, eliminate computation redundancy during inference execution, and further lower the barrier for deep learning algorithm and system co-optimization. To build efficient and scalable services, we take a unique algorithmic approach of harnessing the semantic correlation between learning-based computations.
Rather than viewing individual tasks as isolated black boxes, we optimize them collectively in a white-box approach, proposing primitives to formulate the semantics of deep learning workloads, and algorithms to assess their hidden correlation (in terms of the input data, the neural network models, and the deployment trials) and to merge common processing steps to minimize redundancy.
The 14th Overture Workshop: Towards Analytical Tool Chains
This report contains the proceedings of the 14th Overture workshop, organized in connection with the Formal Methods 2016 symposium. It includes nine papers describing technological progress in relation to the Overture/VDM tool support and its connection with other tools such as Crescendo, Symphony, INTO-CPS, TASTE and ViennaTalk.
Iconic Indexing for Video Search
Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London
Three dimensional compact abstract cell complexes topological data structure for buildings in CityGML
As the significance of visualising objects in three dimensions (3D) is now recognised, most city modelling approaches support 3D primitives in the construction and visualisation of objects. Although the visualisation of city models is in 3D, the topological information maintained remains two dimensional (2D). This prevents the 3D model from serving its full potential, as the topological information that gives meaning to the objects is not preserved explicitly. Support for 3D topology is crucial for 3D spatial analysis, which requires connectivity and adjacency information in order to produce accurate output in 3D. This research investigates the implementation of a 3D topological model, specifically using the Compact Abstract Cell Complexes (CACC) topological data structure, for preserving the topological information of buildings in City Geography Markup Language (CityGML). As the international standard for city modelling, CityGML handles topology in 2D via a simple topology-incidence mechanism, which allows only explicitly stored surfaces to be referenced. This raises the issue of inconsistent visualisation, which is usually resolved by modelling two adjacent buildings with two separate surfaces representing the common surface between them. However, the connectivity information between the two connected buildings is then not preserved in CityGML, as they do not share the same explicitly stored surface. Three objectives were established for the study: to determine the specifications of a topological data structure for preserving the topological information of buildings in CityGML, to implement a topological structure for buildings in CityGML that supports connectivity queries and adjacency analyses for city modelling, and to validate the proposed topological data structure in terms of geometric and topological properties in comparison to the existing CityGML topology mechanism.
Several tasks were carried out to complete this research, including the extraction of geometric properties from CityGML, the generation of topological links, adjacency analysis using topological information, and the visualisation of the 3D model and adjacency analysis results. The absence of a comprehensive topological model within CityGML made it necessary to use the geometric properties of the buildings in CityGML as a stand-in model from which to extract the topological properties that would subsequently be the basis for generating topological links. The CACC topological model preserves topological information by building topological links: points are connected to build alpha-0 links (1D lines), alpha-0 links are connected to build alpha-1 links (2D surfaces), alpha-1 links are connected to build alpha-2 links (3D volumes), and alpha-3 links represent the connectivity between 3D buildings. This allows connectivity between elements of different dimensions, as any link can be decomposed into its related lower-dimension elements. Next, by implementing the CACC topological model, the connectivity information for two buildings that are connected but modelled with two separate surfaces can be preserved. The support of topological information via the CACC topological model also allows the seamless execution of adjacency queries between building elements, including elements of different dimensions.
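The link hierarchy described above (points, alpha-0 lines, alpha-1 surfaces, alpha-2 volumes, alpha-3 building connectivity) can be sketched with plain dictionaries. This is an illustrative toy layout only, with assumed names and a deliberately flat geometry; it is not the thesis's actual CACC implementation, and a real cell complex would carry far more structure.

```python
# Illustrative CACC-style link hierarchy: each level is built by
# connecting elements of the level below.

points = {0: (0, 0, 0), 1: (1, 0, 0), 2: (1, 1, 0), 3: (0, 1, 0),
          4: (2, 0, 0), 5: (2, 1, 0)}

# alpha-0 links: 1D lines connecting points
alpha0 = {"e01": (0, 1), "e12": (1, 2), "e23": (2, 3), "e30": (3, 0),
          "e14": (1, 4), "e45": (4, 5), "e52": (5, 2)}

# alpha-1 links: 2D surfaces connecting alpha-0 links
alpha1 = {"sA": ("e01", "e12", "e23", "e30"),   # wall of building A
          "sB": ("e14", "e45", "e52", "e12")}   # wall of building B

# alpha-2 links: 3D volumes connecting alpha-1 links (one wall each here)
alpha2 = {"bldgA": ("sA",), "bldgB": ("sB",)}

# alpha-3 links: connectivity between buildings whose separate surfaces
# coincide (sA and sB share the edge e12)
alpha3 = {("bldgA", "bldgB"): ("sA", "sB")}

def adjacent(b1, b2):
    """Connectivity query answered from the alpha-3 links alone."""
    return (b1, b2) in alpha3 or (b2, b1) in alpha3

def decompose(building):
    """Walk a 3D volume down to its 0D points via the lower-level links."""
    pts = set()
    for surf in alpha2[building]:
        for edge in alpha1[surf]:
            pts.update(alpha0[edge])
    return pts

print(adjacent("bldgA", "bldgB"))  # True
print(sorted(decompose("bldgA")))  # [0, 1, 2, 3]
```

The point of the sketch is the two query paths the abstract emphasises: adjacency between whole buildings is answered by the top-level links, while any link can still be decomposed into its lower-dimension elements.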