139 research outputs found

    Approximating Distance Measures for the Skyline

    In multi-parameter decision making, data is usually modeled as a set of points whose dimension is the number of parameters, and the skyline or Pareto points represent the possible optimal solutions for various optimization problems. The structure and computation of such points have been well studied, particularly in the database community. As the skyline can be quite large in high dimensions, one often seeks a compact summary. In particular, for a given integer parameter k, a subset of k points is desired which best approximates the skyline under some measure. Various measures have been proposed, but they mostly treat the skyline as a discrete object. By viewing the skyline as a continuous geometric hull, we propose a new measure that evaluates the quality of a subset by the Hausdorff distance of its hull to the full hull. We argue that in many ways our measure more naturally captures what it means to approximate the skyline. For our new geometric skyline approximation measure, we provide a plethora of results. Specifically, we provide (1) a near-linear-time exact algorithm in two dimensions, (2) APX-hardness results for dimensions three and higher, (3) approximation algorithms for related variants of our problem, and (4) a practical and efficient heuristic which uses our geometric insights into the problem, as well as various experimental results to show the efficacy of our approach.
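    To make the notion of skyline (Pareto) points in the abstract above concrete, here is a minimal sketch. It is not the paper's algorithm: it is a plain quadratic-time filter, and it assumes a "larger is better" convention in every coordinate. The names `dominates` and `skyline` are illustrative only.

```python
def dominates(p, q):
    """True if p is at least as good as q in every coordinate and strictly
    better in at least one (assumption: larger values are better)."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def skyline(points):
    """Return the points that no other point dominates (quadratic-time sketch)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(1, 5), (2, 4), (3, 3), (2, 2), (4, 1), (1, 1)]
print(skyline(points))  # [(1, 5), (2, 4), (3, 3), (4, 1)]
```

    Here (2, 2) and (1, 1) drop out because (3, 3) dominates both. Choosing the k-point subset that best approximates this set is the problem the abstract addresses, and the sketch does not attempt it.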

    The Internet of Things as a Privacy-Aware Database Machine

    Instead of using a computer cluster with homogeneous nodes and very fast, high-bandwidth connections, we present the vision of using the Internet of Things (IoT) as a database machine. This is, among other things, a key factor for smart (assistive) systems in apartments (AAL, ambient assisted living), offices (AAW, ambient assisted working), Smart Cities, and factories (IIoT, Industry 4.0). It is important to massively distribute the calculation of analysis results across sensor nodes and other low-resource appliances in the environment, not only for reasons of performance but also for privacy and the protection of corporate knowledge. Thus, functions crucial for assistive systems, such as situation, activity, and intention recognition, are to be automatically transformed not only into database queries but also onto local nodes of lower performance. From a database-specific perspective, analysis operations on large quantities of distributed sensor data, currently based on classical big-data techniques and executed on large, homogeneously equipped parallel computers, have to be automatically transformed to run on billions of processors with energy and capacity restrictions. In this visionary paper, we focus on the database-specific perspective and the fundamental research questions in the underlying database theory.

    CPS Transformation of Beta-Redexes

    The extra compaction of the most compacting CPS transformation in existence, which is due to Sabry and Felleisen, is generally attributed to (1) making continuations occur first in CPS terms and (2) classifying more redexes as administrative. We show that this extra compaction is actually independent of the relative positions of values and continuations and furthermore that it is solely due to a context-sensitive transformation of beta-redexes. We stage the more compact CPS transformation into a first-order uncurrying phase and a context-insensitive CPS transformation. We also define a context-insensitive CPS transformation that provides the extra compaction. This CPS transformation operates in one pass and is dependently typed.

    Rewriting with Acyclic Queries: Mind Your Head

    The paper studies the rewriting problem, that is, the decision problem whether, for a given conjunctive query Q and a set V of views, there is a conjunctive query Q' over V that is equivalent to Q, for cases where the query, the views, and/or the desired rewriting are acyclic or even more restricted. It shows that, if Q itself is acyclic, an acyclic rewriting exists if there is any rewriting. An analogous statement also holds for free-connex acyclic, hierarchical, and q-hierarchical queries. Regarding the complexity of the rewriting problem, the paper identifies a border between tractable and (presumably) intractable variants of the rewriting problem: for schemas of bounded arity, the acyclic rewriting problem is NP-hard, even if both Q and the views in V are acyclic or hierarchical. However, it becomes tractable if the views are free-connex acyclic (i.e., in a nutshell, their body is (i) acyclic and (ii) remains acyclic if their head is added as an additional atom).
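    The two conditions for free-connex acyclicity quoted in the abstract above can be sketched as follows. This is not taken from the paper: it assumes "acyclic" means alpha-acyclic and tests it with the standard GYO reduction. The function names and the encoding of atoms as tuples of variable names are hypothetical.

```python
def is_acyclic(atoms):
    """GYO reduction. `atoms` is a list of variable collections, one per atom.
    Returns True if the hypergraph of the atoms is alpha-acyclic."""
    edges = [set(a) for a in atoms]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove variables that occur in exactly one atom.
        for e in edges:
            lonely = {v for v in e if sum(v in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: remove one atom whose variables are contained in another atom.
        for i, e in enumerate(edges):
            if any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    # Acyclic iff nothing but (at most one) empty atom is left.
    return all(not e for e in edges)


def is_free_connex_acyclic(body, head):
    """Conditions (i) and (ii) from the abstract: the body is acyclic and
    stays acyclic when the head is added as an additional atom."""
    return is_acyclic(body) and is_acyclic(list(body) + [head])


path = [("x", "y"), ("y", "z")]
triangle = [("x", "y"), ("y", "z"), ("z", "x")]
print(is_acyclic(path), is_acyclic(triangle))
print(is_free_connex_acyclic(path, ("x", "z")))
```

    With the path body R(x, y), S(y, z), the head (x, z) closes a triangle, so that query is acyclic but not free-connex acyclic. The head (x, y) keeps it free-connex.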

    Consistent Query Answering for Expressive Constraints under Tuple-Deletion Semantics

    We study consistent query answering in relational databases. We consider an expressive class of schema constraints that generalizes both tuple-generating dependencies and equality-generating dependencies. We establish the complexity of consistent query answering and repair checking under tuple-deletion semantics for different fragments of the above constraint language. In particular, we identify new subclasses of constraints in which the above problems are tractable or even first-order rewritable.

    What people study when they study Tumblr: Classifying Tumblr-related academic research

    Purpose: Since its launch in 2007, research has been carried out on the popular social networking website Tumblr. This paper identifies published Tumblr-based research, classifies it to understand approaches and methods, and provides methodological recommendations for others. / Design/methodology/approach: Research regarding Tumblr was identified. Following a review of the literature, a classification scheme was adapted and applied to understand research focus. Papers were quantitatively classified using open coded content analysis of method, subject, approach, and topic. / Findings: The majority of published work relating to Tumblr concentrates on conceptual issues, followed by aspects of the messages sent. This has evolved over time. Perceived benefits are the platform’s long-form text posts, ability to track tags, and the multimodal nature of the platform. Severe research limitations are caused by the lack of demographic, geo-spatial, and temporal metadata attached to individual posts, the limited API, restricted access to data, and the large number of ephemeral posts on the site. / Research limitations/implications: This study focuses on Tumblr: the applicability of the approach to other media is not considered. We focus on published research and conference papers: there will be book content which was not found using our method. Tumblr as a platform has falling user numbers, which may be of concern to researchers. / Practical implications: We identify practical barriers to research on the Tumblr platform, including lack of metadata and access to big data, explaining why Tumblr is not as popular as Twitter in academic studies. / Social implications: This paper highlights the breadth of topics covered by social media researchers, which allows us to understand popular online platforms. / Originality/value: There has not yet been an overarching study to look at the methods and purpose of those who study Tumblr. We identify Tumblr-related research papers from the first appearing in July 2011 until July 2015. Our classification derived here provides a framework that can be used to analyse social media research, and in which to position Tumblr-related work, with recommendations on benefits and limitations of the platform for researchers.