4,265 research outputs found

    Measuring Visual Complexity of Cluster-Based Visualizations

    Handling visual complexity is a challenging problem in visualization owing to the subjectiveness of its definition and the difficulty in devising generalizable quantitative metrics. In this paper we address this challenge by measuring the visual complexity of two common forms of cluster-based visualizations: scatter plots and parallel coordinates. We conceptualize visual complexity as a form of visual uncertainty, which is a measure of the degree of difficulty for humans to interpret a visual representation correctly. We propose an algorithm for estimating visual complexity for the aforementioned visualizations using Allen's interval algebra. We first establish a set of primitive 2-cluster cases in scatter plots and another set for parallel coordinates based on symmetric isomorphism. We confirm that both are the minimal sets and verify the correctness of their members computationally. We score the uncertainty of each primitive case based on its topological properties, including the existence of overlapping regions, splitting regions and meeting points or edges. We compare a few optional scoring schemes against a set of subjective scores by humans, and identify the one that is the most consistent with the subjective scores. Finally, we extend the 2-cluster measure to a k-cluster measure as a general purpose estimator of visual complexity for these two forms of cluster-based visualization.
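
As a rough illustration of the interval-algebra idea in the abstract above, the sketch below classifies the Allen relation between two intervals, such as the extents of two clusters along one axis. Representing a cluster by its axis-aligned extent is an assumption made here for illustration; the paper's primitive cases and scoring schemes are not reproduced, and `allen_relation` is a hypothetical helper name.

```python
from typing import Tuple

Interval = Tuple[float, float]


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation of interval a with respect to interval b."""
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must have positive length")
    # Disjoint cases: a gap separates the two intervals.
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    # Touching cases: the intervals share exactly one endpoint.
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    # Cases with at least one aligned endpoint.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    # Strict containment.
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Remaining cases: a proper partial overlap.
    return "overlaps" if a0 < b0 else "overlapped-by"
```

For a scatter plot, one could apply the function once per axis to the extents of a cluster pair; how the resulting relations map to an uncertainty score is left open here.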

    The visual uncertainty paradigm for controlling screen-space information in visualization

    The information visualization pipeline serves as a lossy communication channel for presentation of data on a screen-space of limited resolution. The lossy communication is not just a machine-only phenomenon due to information loss caused by translation of data, but also a reflection of the degree to which the human user can comprehend visual information. The common entity in both aspects is the uncertainty associated with the visual representation. However, in the current linear model of the visualization pipeline, visual representation is mostly considered as the end rather than the means for facilitating the analysis process. While the perceptual side of visualization is also being studied, little attention is paid to the way the visualization appears on the display. Thus, we believe there is a need to study the appearance of the visualization on a limited-resolution screen in order to understand its own properties and how they influence the way it represents the data. I argue that the visual uncertainty paradigm for controlling screen-space information will enable us to achieve user-centric optimization of a visualization in different application scenarios. Conceptualization of visual uncertainty enables us to integrate the encoding and decoding aspects of visual representation into a holistic framework facilitating the definition of metrics that serve as a bridge between the last stages of the visualization pipeline and the user's perceptual system.
    The goal of this dissertation is three-fold: i) conceptualize a visual uncertainty taxonomy in the context of pixel-based, multi-dimensional visualization techniques that supports the systematic definition of screen-space metrics, ii) apply the taxonomy to identify sources of useful visual uncertainty that help protect the privacy of sensitive data, and to identify the types of uncertainty that can be reduced through interaction techniques, and iii) apply the metrics to design information-assisted models that help in the visualization of high-dimensional, temporal data.

    Privacy-Friendly Mobility Analytics using Aggregate Location Data

    Location data can be extremely useful to study commuting patterns and disruptions, as well as to predict real-time traffic volumes. At the same time, however, the fine-grained collection of user locations raises serious privacy concerns, as this can reveal sensitive information about the users, such as lifestyle, political and religious inclinations, or even identities. In this paper, we study the feasibility of crowd-sourced mobility analytics over aggregate location information: users periodically report their location, using a privacy-preserving aggregation protocol, so that the server can only recover aggregates -- i.e., how many, but not which, users are in a region at a given time. We experiment with real-world mobility datasets obtained from the Transport For London authority and the San Francisco Cabs network, and present a novel methodology based on time series modeling that is geared to forecast traffic volumes in regions of interest and to detect mobility anomalies in them. In the presence of anomalies, we also make enhanced traffic volume predictions by feeding our model with additional information from correlated regions. Finally, we present and evaluate a mobile app prototype, called Mobility Data Donors (MDD), in terms of computation, communication, and energy overhead, demonstrating the real-world deployability of our techniques. Comment: Published at ACM SIGSPATIAL 201
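
To make the "how many, but not which" property in the abstract above concrete, here is a toy sketch of aggregation with pairwise cancelling masks. This is a generic textbook-style construction chosen for illustration, not the protocol used in the paper (which the abstract does not specify). The function names, the modulus, and the single seeded generator standing in for pairwise shared secrets are all assumptions.

```python
import random

MODULUS = 2**32  # assumed modulus; any value larger than the user count works


def masked_reports(locations, num_regions, seed=0):
    """Build one masked report per user from that user's region index."""
    # NOTE: a seeded PRNG stands in for pairwise shared secrets. A real
    # deployment would need cryptographically secure key agreement.
    rng = random.Random(seed)
    n = len(locations)
    # Each user starts from a one-hot vector over regions.
    reports = [[1 if r == loc else 0 for r in range(num_regions)]
               for loc in locations]
    # Every pair (i, j) shares a mask: i adds it and j subtracts it,
    # so the masks cancel once all reports are summed.
    for i in range(n):
        for j in range(i + 1, n):
            for r in range(num_regions):
                mask = rng.randrange(MODULUS)
                reports[i][r] = (reports[i][r] + mask) % MODULUS
                reports[j][r] = (reports[j][r] - mask) % MODULUS
    return reports


def aggregate(reports):
    """Sum masked reports per region; only the totals are recovered."""
    return [sum(column) % MODULUS for column in zip(*reports)]
```

With five users in regions `[0, 2, 2, 1, 2]`, `aggregate(masked_reports([0, 2, 2, 1, 2], 3))` yields the per-region counts, while each individual report on its own looks like random values.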

    SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search

    The k-Nearest Neighbor Search (k-NNS) is the backbone of several cloud-based services such as recommender systems, face recognition, and database search on text and images. In these services, the client sends the query to the cloud server and receives the response, in which case the query and response are revealed to the service provider. Such data disclosures are unacceptable in several scenarios due to the sensitivity of data and/or privacy laws. In this paper, we introduce SANNS, a system for secure k-NNS that keeps the client's query and the search result confidential. SANNS comprises two protocols: an optimized linear scan and a protocol based on a novel sublinear time clustering-based algorithm. We prove the security of both protocols in the standard semi-honest model. The protocols are built upon several state-of-the-art cryptographic primitives such as lattice-based additively homomorphic encryption, distributed oblivious RAM, and garbled circuits. We provide several contributions to each of these primitives which are applicable to other secure computation tasks. Both of our protocols rely on a new circuit for the approximate top-k selection from n numbers that is built from O(n + k^2) comparators. We have implemented our proposed system and performed extensive experiments on four datasets in two different computation environments, demonstrating more than 18-31× faster response time compared to optimally implemented protocols from the prior work. Moreover, SANNS is the first work that scales to the database of 10 million entries, pushing the limit by more than two orders of magnitude. Comment: 18 pages, to appear at USENIX Security Symposium 202
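
For orientation, the sketch below shows the plaintext functionality that a linear-scan k-NNS computes: score every database entry against the query and keep the k closest. It involves no cryptography and is not the SANNS protocol. The squared Euclidean distance and the helper names are assumptions made for illustration.

```python
import heapq
from typing import List, Sequence


def squared_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def knn_linear_scan(database: Sequence[Sequence[float]],
                    query: Sequence[float],
                    k: int) -> List[int]:
    """Return indices of the k entries nearest to the query, closest first."""
    # A single pass over the database; nsmallest keeps only k candidates.
    return heapq.nsmallest(
        k,
        range(len(database)),
        key=lambda i: squared_distance(database[i], query),
    )
```

In a secure setting, both the distance computation and the top-k selection would have to run without revealing the query or the result to the server, which is where the cryptographic primitives named in the abstract come in.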

    Methods for deriving and calibrating privacy-preserving heat maps from mobile sports tracking application data

    Utilization of movement data from mobile sports tracking applications is affected by its inherent biases and sensitivity, which need to be understood when developing value-added services for, e.g., application users and city planners. We have developed a method for generating a privacy-preserving heat map with user diversity (ppDIV), in which the density of trajectories, as well as the diversity of users, is taken into account, thus preventing the bias effects caused by participation inequality. The method is applied to public cycling workouts and compared with privacy-preserving kernel density estimation (ppKDE) focusing only on the density of the recorded trajectories and privacy-preserving user count calculation (ppUCC), which is similar to the quadrat-count of individual application users. An awareness of privacy was introduced to all methods as a data pre-processing step following the principle of k-Anonymity. Calibration results for our heat maps using bicycle counting data gathered by the city of Helsinki are good (R^2 > 0.7) and raise high expectations for utilizing heat maps in a city planning context. This is further supported by the diurnal distribution of the workouts indicating that, in addition to sports-oriented cyclists, many utilitarian cyclists are tracking their commutes. However, sports tracking data can only enrich official in-situ counts with its high spatio-temporal resolution and coverage, not replace them.
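
A simplified sketch of the user-count idea with a k-anonymity style threshold follows: count distinct users per grid cell and suppress cells visited by fewer than k users. This is a grid-based approximation written for illustration, not the ppUCC or ppDIV method of the paper. The function name, the `(user_id, x, y)` tuple layout, and the suppression rule are assumptions.

```python
from collections import defaultdict


def user_count_grid(points, cell_size, k):
    """Count distinct users per grid cell, keeping only cells with >= k users.

    points: iterable of (user_id, x, y) tuples (hypothetical layout).
    """
    users_per_cell = defaultdict(set)
    for user_id, x, y in points:
        # Map each point to the grid cell that contains it.
        cell = (int(x // cell_size), int(y // cell_size))
        # A set counts each user once per cell, however many points they log,
        # which limits the influence of very active users.
        users_per_cell[cell].add(user_id)
    # Suppress cells with fewer than k distinct users.
    return {cell: len(users)
            for cell, users in users_per_cell.items()
            if len(users) >= k}
```

Counting distinct users rather than raw points is what separates a user-count map from a plain density map; user "a" logging two points in one cell still contributes a count of one.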