"If You Can't Beat them, Join them": A Usability Approach to Interdependent Privacy in Cloud Apps
Cloud storage services, like Dropbox and Google Drive, have growing
ecosystems of third-party apps that are designed to work with users' cloud files.
Such apps often request full access to users' files, including files shared
with collaborators. Hence, whenever a user grants access to a new vendor, she
is inflicting a privacy loss on herself and on her collaborators too. Based on
analyzing a real dataset of 183 Google Drive users and 131 third-party apps, we
discover that collaborators inflict a privacy loss which is at least 39% higher
than what users themselves cause. We take a step toward minimizing this loss by
introducing the concept of History-based decisions. Simply put, users are
informed at decision time about the vendors which have been previously granted
access to their data. Thus, they can reduce their privacy loss by not
installing apps from new vendors whenever possible. Next, we realize this
concept by introducing a new privacy indicator, which can be integrated within
the cloud apps' authorization interface. Via a web experiment with 141
participants recruited from CrowdFlower, we show that our privacy indicator can
significantly increase the user's likelihood of choosing the app that minimizes
her privacy loss. Finally, we explore the network effect of History-based
decisions via a simulation on top of large collaboration networks. We
demonstrate that adopting such a decision-making process is capable of reducing
the growth of users' privacy loss by 70% in a Google Drive-based network and by
40% in an author collaboration network. This is despite the fact that we
neither assume that users cooperate nor that they exhibit altruistic behavior.
To our knowledge, our work is the first to provide quantifiable evidence of the
privacy risk that collaborators pose in cloud apps. We are also the first to
mitigate this problem via a usable privacy approach.
Comment: Authors' extended version of the paper published at CODASPY 201
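The idea of History-based decisions described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rank_apps_by_history`, the `(app_name, vendor_name)` pair format, and the example app and vendor names are all hypothetical. The sketch only shows the basic step of preferring apps whose vendor already has access to the user's data.

```python
def rank_apps_by_history(candidates, authorized_vendors):
    """Order candidate apps so that apps from already-authorized vendors come first.

    candidates: list of (app_name, vendor_name) pairs (hypothetical format).
    authorized_vendors: set of vendors that already have access to the user's data.
    """
    known = [c for c in candidates if c[1] in authorized_vendors]
    new = [c for c in candidates if c[1] not in authorized_vendors]
    # Apps from known vendors add no new vendor, so they are listed first.
    return known + new


# Hypothetical example: two apps with similar functionality from different vendors.
candidates = [("PDF Merger", "VendorA"), ("PDF Joiner", "VendorB")]
authorized = {"VendorB"}
ranked = rank_apps_by_history(candidates, authorized)
```

Here `ranked[0]` is the app from the vendor that was already granted access, which is the choice that avoids exposing data to a new vendor.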
Kronecker Graphs: An Approach to Modeling Networks
How can we model networks with a mathematically tractable model that allows
for rigorous analysis of network properties? Networks exhibit a long list of
surprising properties: heavy tails for the degree distribution; small
diameters; and densification and shrinking diameters over time. Most present
network models either fail to match several of the above properties, are
complicated to analyze mathematically, or both. In this paper we propose a
generative model for networks that is both mathematically tractable and can
generate networks that have the above mentioned properties. Our main idea is to
use the Kronecker product to generate graphs that we refer to as "Kronecker
graphs".
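As a rough illustration of the main idea, the sketch below builds a graph by repeatedly taking the Kronecker product of a small initiator matrix with itself, using NumPy's `np.kron`. The function names and the initiator values are assumptions made for this example, and the probabilistic sampling step is one common way to turn such a matrix into a graph, not necessarily the exact procedure used in the paper.

```python
import numpy as np


def kronecker_power(initiator, k):
    """Return the k-th Kronecker power of a square initiator matrix (k >= 1)."""
    initiator = np.asarray(initiator, dtype=float)
    result = initiator.copy()
    for _ in range(k - 1):
        result = np.kron(result, initiator)
    return result


def sample_graph(prob_matrix, seed=0):
    """Sample a 0/1 adjacency matrix, treating each entry as an edge probability."""
    rng = np.random.default_rng(seed)
    return (rng.random(prob_matrix.shape) < prob_matrix).astype(int)


# Deterministic case: a 0/1 initiator yields a 0/1 adjacency matrix directly.
deterministic = kronecker_power([[1, 1], [1, 0]], 2)  # 4 x 4 matrix

# Probabilistic case (illustrative initiator values, not taken from the paper).
probs = kronecker_power([[0.9, 0.5], [0.5, 0.1]], 3)  # 8 x 8 matrix
adjacency = sample_graph(probs)
```

An initiator with n nodes produces a graph with n^k nodes after k steps, so the model is described by only a handful of parameters.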
First, we prove that Kronecker graphs naturally obey common network
properties. We also provide empirical evidence showing that Kronecker graphs
can effectively model the structure of real networks.
We then present KronFit, a fast and scalable algorithm for fitting the
Kronecker graph generation model to large real networks. A naive approach to
fitting would take super-exponential time. In contrast, KronFit takes linear
time, by exploiting the structure of Kronecker matrix multiplication and by
using statistical simulation techniques.
Experiments on large real and synthetic networks show that KronFit finds
accurate parameters that closely mimic the properties of target
networks. Once fitted, the model parameters can be used to gain insights about
the network structure, and the resulting synthetic graphs can be used for
null-models, anonymization, extrapolations, and graph summarization.
Information system support in construction industry with semantic web technologies and/or autonomous reasoning agents
Information technology support is hard to find for the early design phases of the architectural design process. Many of the existing issues in such design decision support tools appear to be caused by a mismatch between the ways in which designers think and the ways in which information systems aim to give support. We therefore investigated existing theories of design thinking and compared them with the way in which design decision support systems provide information to the designer. We identify two main strategies towards information system support in the early design phase: (1) applications for making design try-outs, and (2) applications as autonomous reasoning agents. We outline preview implementations for both approaches and indicate to what extent these strategies can be used to improve information system support for the architectural designer.
Cascading Behavior in Large Blog Graphs
How do blogs cite and influence each other? How do such links evolve? Does
the popularity of old blog posts drop exponentially with time? These are some
of the questions that we address in this work. Our goal is to build a model
that generates realistic cascades, so that it can help us with link prediction
and outlier detection.
Blogs (weblogs) have become an important medium of information because of
their timely publication, ease of use, and wide availability. In fact, they
often make headlines, by discussing and discovering evidence about political
events and facts. Often blogs link to one another, creating a publicly
available record of how information and influence spreads through an underlying
social network. Aggregating links from several blog posts creates a directed
graph which we analyze to discover the patterns of information propagation in
blogspace, and thereby understand the underlying social network. Not only are
blogs interesting on their own merit, but our analysis also sheds light on how
rumors, viruses, and ideas propagate over social and computer networks.
Here we report some surprising findings of the blog linking and information
propagation structure, after we analyzed one of the largest available datasets,
with 45,000 blogs and ~2.2 million blog postings. We also present a simple
model that mimics the spread of information on the blogosphere, and produces
information cascades very similar to those found in real life.
Random Surfing Without Teleportation
In the standard Random Surfer Model, the teleportation matrix is necessary to
ensure that the final PageRank vector is well-defined. The introduction of this
matrix, however, results in serious problems and imposes fundamental
limitations to the quality of the ranking vectors. In this work, building on
the recently proposed NCDawareRank framework, we exploit the decomposition of
the underlying space into blocks, and we derive easy-to-check necessary and
sufficient conditions for random surfing without teleportation.
Comment: 13 pages. Published in the Volume: "Algorithms, Probability, Networks
and Games, Springer-Verlag, 2015". (The updated version corrects small
typos/errors)
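For context, the standard Random Surfer Model with teleportation mentioned at the start of this abstract can be sketched as a power iteration. This is the textbook formulation, not the NCDawareRank approach of the paper. The damping factor `alpha = 0.85`, the uniform teleportation vector, and the uniform handling of dangling nodes are conventional assumptions chosen for this example.

```python
import numpy as np


def pagerank(adjacency, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank with a uniform teleportation vector."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_degree = A.sum(axis=1)
    # Row-normalise the adjacency matrix; rows of dangling nodes
    # (no out-links) are replaced by the uniform distribution.
    H = np.where(out_degree[:, None] > 0,
                 A / np.maximum(out_degree, 1)[:, None],
                 1.0 / n)
    v = np.full(n, 1.0 / n)  # teleportation vector
    rank = v.copy()
    for _ in range(max_iter):
        # Follow a link with probability alpha, teleport otherwise.
        new_rank = alpha * (rank @ H) + (1 - alpha) * v
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank


# A directed 3-cycle: by symmetry every node receives the same rank.
cycle_rank = pagerank([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
# Node 1 is dangling; node 0 links to it, so node 1 ranks higher.
dangling_rank = pagerank([[0, 1], [0, 0]])
```

The teleportation term `(1 - alpha) * v` is what guarantees a unique ranking vector for any link structure; the abstract above concerns the conditions under which it can be dropped.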
BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking
Data generation is a key issue in big data benchmarking that aims to generate
application-specific data sets to meet the 4V requirements of big data.
Specifically, big data generators need to generate scalable data (Volume) of
different types (Variety) under controllable generation rates (Velocity) while
keeping the important characteristics of raw data (Veracity). This gives rise
to various new challenges about how we design generators efficiently and
successfully. To date, most existing techniques can only generate limited types
of data and support specific big data systems such as Hadoop. Hence we develop
a tool, called Big Data Generator Suite (BDGS), to efficiently generate
scalable big data while employing data models derived from real data to
preserve data veracity. The effectiveness of BDGS is demonstrated by developing
six data generators covering three representative data types (structured,
semi-structured and unstructured) and three data sources (text, graph, and
table data).
Network Sampling: From Static to Streaming Graphs
Network sampling is integral to the analysis of social, information, and
biological networks. Since many real-world networks are massive in size,
continuously evolving, and/or distributed in nature, the network structure is
often sampled in order to facilitate study. For these reasons, a more thorough
and complete understanding of network sampling is critical to support the field
of network science. In this paper, we outline a framework for the general
problem of network sampling, by highlighting the different objectives,
population and units of interest, and classes of network sampling methods. In
addition, we propose a spectrum of computational models for network sampling
methods, ranging from the traditionally studied model based on the assumption
of a static domain to a more challenging model that is appropriate for
streaming domains. We design a family of sampling methods based on the concept
of graph induction that generalize across the full spectrum of computational
models (from static to streaming) while efficiently preserving many of the
topological properties of the input graphs. Furthermore, we demonstrate how
traditional static sampling algorithms can be modified for graph streams for
each of the three main classes of sampling methods: node, edge, and
topology-based sampling. Our experimental results indicate that our proposed
family of sampling methods more accurately preserves the underlying properties
of the graph for both static and streaming graphs. Finally, we study the impact
of network sampling algorithms on the parameter estimation and performance
evaluation of relational classification algorithms.
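To give a concrete sense of edge sampling over a graph stream, the sketch below uses classic reservoir sampling (Algorithm R) to keep a uniform sample of k edges from a stream of unknown length. This is a standard technique chosen for illustration, not the graph-induction family of methods proposed in the paper, and the function name and edge format are assumptions.

```python
import random


def reservoir_sample_edges(edge_stream, k, seed=None):
    """Keep a uniform random sample of k edges from a single pass over a stream."""
    rng = random.Random(seed)
    reservoir = []
    for i, edge in enumerate(edge_stream):
        if i < k:
            # Fill the reservoir with the first k edges.
            reservoir.append(edge)
        else:
            # Replace a stored edge with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = edge
    return reservoir


# Hypothetical stream: edges of a path graph on 1001 nodes.
stream = [(u, u + 1) for u in range(1000)]
sample = reservoir_sample_edges(iter(stream), k=50, seed=42)
```

The sampler stores only k edges at any time, which is the property that makes it usable when the full graph cannot be held in memory.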