Measuring Visual Complexity of Cluster-Based Visualizations
Handling visual complexity is a challenging problem in visualization owing to
the subjectiveness of its definition and the difficulty in devising
generalizable quantitative metrics. In this paper we address this challenge by
measuring the visual complexity of two common forms of cluster-based
visualizations: scatter plots and parallel coordinates. We conceptualize
visual complexity as a form of visual uncertainty, which is a measure of the
degree of difficulty for humans to interpret a visual representation correctly.
We propose an algorithm for estimating visual complexity for the aforementioned
visualizations using Allen's interval algebra. We first establish a set of
primitive 2-cluster cases in scatter plots and another set for parallel
coordinates based on symmetric isomorphism. We confirm that both are the
minimal sets and verify the correctness of their members computationally. We
score the uncertainty of each primitive case based on its topological
properties, including the existence of overlapping regions, splitting regions
and meeting points or edges. We compare a few optional scoring schemes against
a set of subjective scores by humans, and identify the one that is the most
consistent with the subjective scores. Finally, we extend the 2-cluster measure
to a k-cluster measure as a general-purpose estimator of visual complexity for
these two forms of cluster-based visualization.
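The abstract names Allen's interval algebra as the basis for the 2-cluster primitive cases. As a minimal sketch, assuming each cluster is summarized by its extent along an axis as a closed interval `(start, end)`, the thirteen Allen relations can be classified as follows. The function names and the bounding-box representation are illustrative assumptions, not the authors' algorithm, and no complexity scoring is attempted here.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a relative to interval b.

    Intervals are (start, end) tuples with start < end. This is a generic
    classifier for the 13 Allen relations, not the paper's scoring scheme.
    """
    (a0, a1), (b0, b1) = a, b
    if a0 >= a1 or b0 >= b1:
        raise ValueError("intervals must satisfy start < end")
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "met-by"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started-by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished-by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Only proper partial overlaps remain at this point.
    return "overlaps" if a0 < b0 else "overlapped-by"


def cluster_relation_2d(box_a, box_b):
    """Hypothetical helper: relate two scatter-plot clusters by their
    axis-aligned extents, given as ((x0, x1), (y0, y1)) pairs."""
    return (allen_relation(box_a[0], box_b[0]),
            allen_relation(box_a[1], box_b[1]))


if __name__ == "__main__":
    print(cluster_relation_2d(((0, 2), (0, 1)), ((1, 3), (1, 2))))
```

A pair of per-axis relations such as `("overlaps", "meets")` is one way to describe how two clusters touch or overlap; how such cases map to uncertainty scores is specific to the paper.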
The visual uncertainty paradigm for controlling screen-space information in visualization
The information visualization pipeline serves as a lossy communication channel for presentation of data on a screen-space of limited resolution. The lossy communication is not just a machine-only phenomenon due to information loss caused by translation of data, but also a reflection of the degree to which the human user can comprehend visual information. The common entity in both aspects is the uncertainty associated with the visual representation. However, in the current linear model of the visualization pipeline, visual representation is mostly considered as the ends rather than the means for facilitating the analysis process. While the perceptual side of visualization is also being studied, little attention is paid to the way the visualization appears on the display. Thus, we believe there is a need to study the appearance of the visualization on a limited-resolution screen in order to understand its own properties and how they influence the way they represent the data.
I argue that the visual uncertainty paradigm for controlling screen-space information will enable us to achieve user-centric optimization of a visualization in different application scenarios. Conceptualizing visual uncertainty enables us to integrate the encoding and decoding aspects of visual representation into a holistic framework, facilitating the definition of metrics that serve as a bridge between the last stages of the visualization pipeline and the user's perceptual system. The goal of this dissertation is three-fold: i) conceptualize a visual uncertainty taxonomy in the context of pixel-based, multi-dimensional visualization techniques that helps the systematic definition of screen-space metrics, ii) apply the taxonomy to identify sources of useful visual uncertainty that help protect the privacy of sensitive data, and to identify the types of uncertainty that can be reduced through interaction techniques, and iii) apply the metrics to design information-assisted models that help in the visualization of high-dimensional, temporal data.
Privacy-Friendly Mobility Analytics using Aggregate Location Data
Location data can be extremely useful to study commuting patterns and
disruptions, as well as to predict real-time traffic volumes. At the same time,
however, the fine-grained collection of user locations raises serious privacy
concerns, as this can reveal sensitive information about the users, such as
lifestyle, political and religious inclinations, or even identities. In this
paper, we study the feasibility of crowd-sourced mobility analytics over
aggregate location information: users periodically report their location, using
a privacy-preserving aggregation protocol, so that the server can only recover
aggregates -- i.e., how many, but not which, users are in a region at a given
time. We experiment with real-world mobility datasets obtained from the
Transport For London authority and the San Francisco Cabs network, and present
a novel methodology based on time series modeling that is geared to forecast
traffic volumes in regions of interest and to detect mobility anomalies in
them. In the presence of anomalies, we also make enhanced traffic volume
predictions by feeding our model with additional information from correlated
regions. Finally, we present and evaluate a mobile app prototype, called
Mobility Data Donors (MDD), in terms of computation, communication, and energy
overhead, demonstrating the real-world deployability of our techniques.
Comment: Published at ACM SIGSPATIAL 201
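The abstract describes a server that recovers only how many users are in each region, not which ones. A minimal sketch of that idea, assuming a simple pairwise additive-masking scheme, is shown below. The paper's actual aggregation protocol is not specified in this abstract, so the scheme, the modulus, and all function names here are illustrative assumptions, and a single function simulates all users for brevity.

```python
import random

MODULUS = 2 ** 32  # assumed modulus for the masked arithmetic


def masked_reports(user_regions, num_regions, seed=0):
    """Simulate every user's masked report.

    user_regions[i] is the region index of user i. Each user starts from a
    one-hot vector over regions. Every pair of users (i, j) shares a random
    mask per region: user i adds it and user j subtracts it, so the masks
    cancel in the sum. In a real deployment the masks would be derived from
    pairwise keys rather than one shared generator (assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(user_regions)
    reports = [[1 if r == region else 0 for r in range(num_regions)]
               for region in user_regions]
    for i in range(n):
        for j in range(i + 1, n):
            for r in range(num_regions):
                mask = rng.randrange(MODULUS)
                reports[i][r] = (reports[i][r] + mask) % MODULUS
                reports[j][r] = (reports[j][r] - mask) % MODULUS
    return reports


def aggregate(reports):
    """Server side: sum the masked vectors to obtain per-region counts."""
    return [sum(column) % MODULUS for column in zip(*reports)]


if __name__ == "__main__":
    reports = masked_reports([0, 2, 2, 1, 2], num_regions=3)
    print(aggregate(reports))
```

Each individual report looks random to the server, while the column sums equal the true per-region counts. Those counts are the kind of aggregate time series on which forecasting and anomaly detection could then operate.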
SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search
The k-Nearest Neighbor Search (k-NNS) is the backbone of several
cloud-based services such as recommender systems, face recognition, and
database search on text and images. In these services, the client sends the
query to the cloud server and receives the response, so both the query and
the response are revealed to the service provider. Such data disclosures are
unacceptable in several scenarios due to the sensitivity of data and/or privacy
laws.
In this paper, we introduce SANNS, a system for secure k-NNS that keeps
the client's query and the search result confidential. SANNS comprises two
protocols: an optimized linear scan and a protocol based on a novel sublinear
time clustering-based algorithm. We prove the security of both protocols in the
standard semi-honest model. The protocols are built upon several
state-of-the-art cryptographic primitives such as lattice-based additively
homomorphic encryption, distributed oblivious RAM, and garbled circuits. We
provide several contributions to each of these primitives which are applicable
to other secure computation tasks. Both of our protocols rely on a new circuit
for the approximate top-k selection from numbers that is built from comparators.
We have implemented our proposed system and performed extensive experiments
on four datasets in two different computation environments,
demonstrating more than faster response time compared to
optimally implemented protocols from the prior work. Moreover, SANNS is the
first work that scales to a database of 10 million entries, pushing the limit
by more than two orders of magnitude.
Comment: 18 pages, to appear at USENIX Security Symposium 202
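For orientation, the functionality that the linear-scan protocol computes can be written in the clear. The sketch below is a plaintext, exact k-NN linear scan with squared Euclidean distance. It contains none of the cryptographic machinery (homomorphic encryption, oblivious RAM, garbled circuits) and is not the approximate top-k circuit from the paper. The function names and the choice of distance are assumptions.

```python
import heapq


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def knn_linear_scan(database, query, k):
    """Return the indices of the k database points closest to the query.

    Plaintext reference only: every point is compared against the query,
    and heapq.nsmallest keeps the k smallest distances.
    """
    return heapq.nsmallest(
        k,
        range(len(database)),
        key=lambda i: squared_distance(database[i], query),
    )


if __name__ == "__main__":
    points = [(0, 0), (5, 5), (1, 1), (10, 10)]
    print(knn_linear_scan(points, query=(0, 0), k=2))
```

A secure protocol has to produce the same kind of answer without the server learning the query or the client learning more of the database than the result reveals.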
Scalable and privacy-respectful interactive discovery of place semantics from human mobility traces
Mobility diaries of a large number of people are needed for assessing transportation infrastructure and for spatial development planning. Acquisition of personal mobility diaries through population surveys is a costly and error-prone endeavour. We examine an alternative approach to obtaining similar information from episodic digital traces of people’s presence in various locations, which appear when people use their mobile devices for making phone calls, accessing the internet, or posting georeferenced contents (texts, photos, or videos) in social media. Having episodic traces of a person over a long time period, it is possible to detect significant (repeatedly visited) personal places and identify them as home, work, or place of social activities based on temporal patterns of a person’s presence in these places. Such analysis, however, can lead to compromising personal privacy. We have investigated the feasibility of deriving place meanings and reconstructing personal mobility diaries while preserving the privacy of individuals whose data are analysed. We have devised a visual analytics approach and a set of supporting tools making such privacy-preserving analysis possible. The approach was tested in two case studies with publicly available data: simulated tracks from the VAST Challenge 2014 and real traces built from georeferenced Twitter posts.
Methods for deriving and calibrating privacy-preserving heat maps from mobile sports tracking application data
Utilization of movement data from mobile sports tracking applications is affected by its inherent biases and sensitivity, which need to be understood when developing value-added services for, e.g., application users and city planners. We have developed a method for generating a privacy-preserving heat map with user diversity (ppDIV), in which the density of trajectories, as well as the diversity of users, is taken into account, thus preventing the bias effects caused by participation inequality. The method is applied to public cycling workouts and compared with privacy-preserving kernel density estimation (ppKDE), which focuses only on the density of the recorded trajectories, and privacy-preserving user count calculation (ppUCC), which is similar to the quadrat-count of individual application users. An awareness of privacy was introduced to all methods as a data pre-processing step following the principle of k-Anonymity. Calibration results for our heat maps using bicycle counting data gathered by the city of Helsinki are good (R² > 0.7) and raise high expectations for utilizing heat maps in a city planning context. This is further supported by the diurnal distribution of the workouts, indicating that, in addition to sports-oriented cyclists, many utilitarian cyclists are tracking their commutes. However, sports tracking data can only enrich official in-situ counts with its high spatio-temporal resolution and coverage, not replace them.
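A user-count heat map with a k-anonymity style threshold can be sketched as follows. This is a minimal illustration, assuming that points are binned into square grid cells and that cells visited by fewer than k distinct users are suppressed. The abstract does not describe the actual pre-processing procedure, so the cell-level suppression rule, the grid representation, and the function name are assumptions.

```python
from collections import defaultdict


def user_count_grid(points, cell_size, k):
    """Count distinct users per grid cell and suppress sparse cells.

    points is an iterable of (user_id, x, y) tuples in projected
    coordinates. Cells with fewer than k distinct users are dropped,
    which is the assumed k-anonymity style rule of this sketch.
    """
    users_per_cell = defaultdict(set)
    for user_id, x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        users_per_cell[cell].add(user_id)
    return {
        cell: len(users)
        for cell, users in users_per_cell.items()
        if len(users) >= k
    }


if __name__ == "__main__":
    track_points = [
        ("a", 1, 1), ("b", 2, 3), ("a", 4, 4),  # two users share cell (0, 0)
        ("c", 12, 1), ("c", 13, 2),             # one user alone in cell (1, 0)
    ]
    print(user_count_grid(track_points, cell_size=10, k=2))
```

Counting distinct users rather than raw points keeps one very active user from dominating a cell, which is the participation-inequality concern the abstract raises.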