Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
Modeling visual search not only offers an opportunity to predict the
usability of an interface before actually testing it on real users, but also
advances scientific understanding about human behavior. In this work, we first
conduct a set of analyses on a large-scale dataset of visual search tasks on
realistic webpages. We then present a deep neural network that learns to
predict the scannability of webpage content, i.e., how easy it is for a user to
find a specific target. Our model leverages both heuristic-based features such
as target size and unstructured features such as raw image pixels. This
approach allows us to model complex interactions that might be involved in a
realistic visual search task, which cannot be easily achieved by traditional
analytical models. We analyze the model behavior to offer our insights into how
the salience map learned by the model aligns with human intuition and how the
learned semantic representation of each target type relates to its visual
search performance.
Comment: the 2020 CHI Conference on Human Factors in Computing Systems
Fast Shortest Path Distance Estimation in Large Networks
We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications.
In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks.
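The landmark scheme described above can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph stored as an adjacency dictionary, and it assumes the precomputed distances are combined through the triangle inequality to give a lower and an upper bound; the abstract does not state the exact combination rule, and the names `bfs_distances`, `build_index`, and `estimate_distance` are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from `source` to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def build_index(graph, landmarks):
    """Offline step: one BFS per landmark, stored as landmark -> {node: distance}."""
    return {u: bfs_distances(graph, u) for u in landmarks}


def estimate_distance(index, s, t):
    """Query step: triangle-inequality bounds on d(s, t) (an assumed combination rule).

    For every landmark u: |d(s,u) - d(u,t)| <= d(s,t) <= d(s,u) + d(u,t).
    """
    lower, upper = 0, float("inf")
    for dist in index.values():
        if s in dist and t in dist:
            lower = max(lower, abs(dist[s] - dist[t]))
            upper = min(upper, dist[s] + dist[t])
    return lower, upper


# Toy path graph 0-1-2-3-4-5 with a single landmark at node 2.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
index = build_index(graph, landmarks=[2])
print(estimate_distance(index, 0, 5))  # bounds on the true distance, which is 5
```

In this toy graph the upper bound is exact for the pair (0, 5) because the landmark lies on the shortest path between them, which hints at why the choice of landmarks matters so much for accuracy.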
We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which selects landmarks at random.
Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection.
Yahoo! Research (internship
NAIS: Neural Attentive Item Similarity Model for Recommendation
Item-to-item collaborative filtering (a.k.a. item-based CF) has long been used
for building recommender systems in industrial settings, owing to its
interpretability and efficiency in real-time personalization. It builds a
user's profile as her historically interacted items, recommending new items
that are similar to the user's profile. As such, the key to an item-based CF
method is in the estimation of item similarities. Early approaches use
statistical measures such as cosine similarity and Pearson coefficient to
estimate item similarities, which are less accurate since they lack tailored
optimization for the recommendation task. In recent years, several works
attempt to learn item similarities from data, by expressing the similarity as
an underlying model and estimating model parameters by optimizing a
recommendation-aware objective function. While extensive efforts have been made
to use shallow linear models for learning item similarities, there has been
relatively little work exploring nonlinear neural network models for item-based
CF.
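As an illustration of the early statistical approach mentioned above, here is a minimal sketch of item-based CF with cosine similarity. It assumes binary interactions stored as a mapping from each item to the set of users who interacted with it; the function names and the toy data are hypothetical and do not come from the paper.

```python
import math


def cosine(users_a, users_b):
    """Cosine similarity between two items, each represented by its set of users."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))


def recommend(interactions, user, top_k=2):
    """Score each unseen item by its summed similarity to the user's profile."""
    # The user's profile is the set of items she has interacted with.
    profile = {item for item, users in interactions.items() if user in users}
    scores = {}
    for candidate, users in interactions.items():
        if candidate in profile:
            continue
        scores[candidate] = sum(
            cosine(users, interactions[item]) for item in profile
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data (hypothetical): item -> users who interacted with it.
interactions = {
    "A": {"u1", "u2", "u3"},
    "B": {"u1", "u2"},
    "C": {"u3", "u4"},
    "D": {"u4"},
}
print(recommend(interactions, "u1"))
```

Here every historical item contributes to the score with a fixed, data-independent similarity, which is the limitation that learned similarity models aim to address.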
In this work, we propose a neural network model named Neural Attentive Item
Similarity model (NAIS) for item-based CF. The key to our design of NAIS is an
attention network, which is capable of distinguishing which historical items in
a user profile are more important for a prediction. Compared to the
state-of-the-art item-based CF method Factored Item Similarity Model (FISM),
our NAIS has stronger representation power with only a few additional
parameters brought by the attention network. Extensive experiments on two
public benchmarks demonstrate the effectiveness of NAIS. This work is the first
attempt to design neural network models for item-based CF, opening up new
research possibilities for future developments of neural recommender systems.
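The idea of weighting historical items by their importance for a prediction can be sketched as follows. This is not the NAIS attention network, whose form the abstract does not specify; it is a generic illustration that assumes items are embedded as plain vectors, uses a dot product both as the similarity and as the attention score, and normalizes with a standard softmax. All names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attentive_score(target, history):
    """Attention-weighted sum of similarities between a target item and past items.

    Assumption: the attention score of a historical item is its dot product
    with the target item, so more similar items receive larger weights.
    """
    sims = [dot(target, h) for h in history]
    weights = softmax(sims)
    score = sum(w * s for w, s in zip(weights, sims))
    return score, weights


# Toy embeddings (hypothetical): two historical items and one target item.
history = [[1.0, 0.0], [0.0, 1.0]]
target = [2.0, 0.0]
score, weights = attentive_score(target, history)
print(score, weights)
```

The first historical item points in the same direction as the target, so it receives the larger weight and dominates the prediction, whereas a uniform average would treat both items equally.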
On the Internet Delay Space Dimensionality
We investigate the dimensionality properties of the Internet delay space, i.e., the matrix of measured round-trip latencies between Internet hosts. Previous work on network coordinates has indicated that this matrix can be embedded, with reasonably low distortion, in a low-dimensional Euclidean space. Our work addresses the question: to what extent is the dimensionality an intrinsic property of the distance matrix, defined without reference to a host metric such as Euclidean space? Does the intrinsic dimensionality of the Internet delay space match the dimension determined using embedding techniques? If not, what explains the discrepancy? What properties of the network contribute to its overall dimensionality? Using a dataset obtained via the King method, we compare three intrinsically-defined measures of dimensionality with the dimension obtained using network embedding techniques to establish the following conclusions. First, the structure of the delay space is best described by fractal measures of dimension rather than by integer-valued parameters, such as the embedding dimension. Second, the intrinsic dimension is inherently lower than the embedding dimension; in fact, by some measures it is less than 2. Third, the Internet dimensionality can be reduced by decomposing its delay space into pieces consisting of hosts which share an upstream Tier-1 autonomous system in common. Finally, we argue that fractal dimensionality measures and non-linear embedding algorithms are capable of detecting subtle features of the delay space geometry which are not detected by other embedding techniques.