Romantic Partnerships and the Dispersion of Social Ties: A Network Analysis of Relationship Status on Facebook
A crucial task in the analysis of on-line social-networking systems is to
identify important people --- those linked by strong social ties --- within an
individual's network neighborhood. Here we investigate this question for a
particular category of strong ties, those involving spouses or romantic
partners. We organize our analysis around a basic question: given all the
connections among a person's friends, can you recognize his or her romantic
partner from the network structure alone? Using data from a large sample of
Facebook users, we find that this task can be accomplished with high accuracy,
but doing so requires the development of a new measure of tie strength that we
term `dispersion' --- the extent to which two people's mutual friends are not
themselves well-connected. The results offer methods for identifying types of
structurally significant people in on-line applications, and suggest a
potential expansion of existing theories of tie strength.
Comment: Proc. 17th ACM Conference on Computer Supported Cooperative Work and
Social Computing (CSCW), 201
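The abstract above describes dispersion only informally, as the extent to which two people's mutual friends are not themselves well-connected. The following is a minimal sketch of that idea, not the paper's exact measure: it assumes an undirected graph stored as a dictionary of neighbor sets and simply counts pairs of mutual friends that share no direct link. The function name `simple_dispersion` and the toy graph are hypothetical.

```python
from itertools import combinations


def simple_dispersion(graph, u, v):
    """Count pairs of mutual friends of u and v that are not directly linked.

    Simplifying assumption: this is a rough stand-in for the informal
    description in the abstract; the paper's precise definition may differ.
    """
    mutual = (graph[u] & graph[v]) - {u, v}
    return sum(
        1
        for s, t in combinations(sorted(mutual), 2)
        if t not in graph[s]
    )


# Hypothetical toy network: u and v share friends a, b, c;
# only a and b know each other.
graph = {
    "u": {"a", "b", "c", "v"},
    "v": {"a", "b", "c", "u"},
    "a": {"u", "v", "b"},
    "b": {"u", "v", "a"},
    "c": {"u", "v"},
}

print(simple_dispersion(graph, "u", "v"))
```

Under this simplified count, a tie scores higher when the mutual friends come from otherwise unconnected parts of the network, which is the intuition the abstract attaches to romantic partners.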
Algorithmic Identification of Probabilities
The problem is to identify a probability associated with a set of natural
numbers, given an infinite data sequence of elements from the set. If the given
sequence is drawn i.i.d. and the probability mass function involved (the
target) belongs to a computably enumerable (c.e.) or co-computably enumerable
(co-c.e.) set of computable probability mass functions, then there is an
algorithm to almost surely identify the target in the limit. The technical tool
is the strong law of large numbers. If the set is finite and the elements of
the sequence are dependent while the sequence is typical in the sense of
Martin-L\"of for at least one measure belonging to a c.e. or co-c.e. set of
computable measures, then there is an algorithm to identify in the limit a
computable measure for which the sequence is typical (there may be more than
one such measure). The technical tool is the theory of Kolmogorov complexity.
We give the algorithms and consider the associated predictions.
Comment: 19 pages LaTeX. Corrected errors and rewrote the entire paper. arXiv
admin note: text overlap with arXiv:1208.500
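The i.i.d. case in the abstract above rests on the strong law of large numbers: empirical frequencies converge to the target mass function, so a guess based on them eventually stabilizes. The sketch below illustrates only that intuition and is not the paper's algorithm. It assumes a finite, explicitly listed set of candidate mass functions (the paper treats c.e. and co-c.e. sets), and the names `identify` and `candidates` are hypothetical.

```python
import random
from collections import Counter


def identify(candidates, samples):
    """Return the index of the candidate pmf closest to the empirical frequencies.

    Assumption: candidates is a finite list of dicts mapping outcomes to
    probabilities. Distance is the largest absolute difference over outcomes.
    """
    counts = Counter(samples)
    n = len(samples)

    def distance(pmf):
        support = set(pmf) | set(counts)
        return max(abs(pmf.get(x, 0.0) - counts[x] / n) for x in support)

    return min(range(len(candidates)), key=lambda i: distance(candidates[i]))


# Hypothetical candidate set over the outcomes {0, 1}.
candidates = [
    {0: 0.5, 1: 0.5},
    {0: 0.2, 1: 0.8},
    {0: 0.9, 1: 0.1},
]

# Draw i.i.d. samples from the second candidate and guess after each prefix.
rng = random.Random(0)
samples = rng.choices([0, 1], weights=[0.2, 0.8], k=5000)
for n in (10, 100, 1000, 5000):
    print(n, identify(candidates, samples[:n]))
```

With few samples the guess may change from prefix to prefix; as the prefix grows, the empirical frequencies settle and the guess is expected to stop changing, which is what identification in the limit asks for.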
Efficient Optimization for Rank-based Loss Functions
The accuracy of information retrieval systems is often measured using complex
loss functions such as the average precision (AP) or the normalized discounted
cumulative gain (NDCG). Given a set of positive and negative samples, the
parameters of a retrieval system can be estimated by minimizing these loss
functions. However, the non-differentiability and non-decomposability of these
loss functions do not allow for simple gradient-based optimization
algorithms. This issue is generally circumvented by either optimizing a
structured hinge-loss upper bound to the loss function or by using asymptotic
methods like the direct-loss minimization framework. Yet, the high
computational complexity of loss-augmented inference, which is necessary for
both frameworks, prohibits their use on large training data sets. To
alleviate this deficiency, we present a novel quicksort-flavored algorithm for
a large class of non-decomposable loss functions. We provide a complete
characterization of the loss functions that are amenable to our algorithm, and
show that it includes both AP and NDCG based loss functions. Furthermore, we
prove that no comparison-based algorithm can improve upon the computational
complexity of our approach asymptotically. We demonstrate the effectiveness of
our approach in the context of optimizing the structured hinge loss upper bound
of AP and NDCG loss for learning models for a variety of vision tasks. We show
that our approach provides significantly better results than simpler
decomposable loss functions, while requiring a comparable training time.
Comment: 15 pages, 2 figure
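To make the non-decomposability mentioned in the abstract above concrete, here is a minimal sketch of how average precision (AP) is commonly computed from scores and binary relevance labels. It shows only the metric, not the paper's quicksort-flavored optimization algorithm; the function names `average_precision` and `ap_loss` are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision of the ranking induced by sorting scores descending.

    labels[i] is truthy for positive samples. Returns 0.0 if there are no
    positives (an assumption made here to avoid division by zero).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits if hits else 0.0


def ap_loss(scores, labels):
    """AP loss, taken here as 1 - AP."""
    return 1.0 - average_precision(scores, labels)


print(ap_loss([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Each positive's contribution depends on how many other samples are ranked above it, so the loss cannot be written as a sum of independent per-sample terms, and the sort makes it piecewise constant in the scores.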
Proof-Pattern Recognition and Lemma Discovery in ACL2
We present a novel technique for combining statistical machine learning for
proof-pattern recognition with symbolic methods for lemma discovery. The
resulting tool, ACL2(ml), gathers proof statistics and uses statistical
pattern-recognition to pre-process data from libraries, and then suggests
auxiliary lemmas in new proofs by analogy with already seen examples. This
paper presents the implementation of ACL2(ml) alongside theoretical
descriptions of the proof-pattern recognition and lemma discovery methods
involved in it.
Compositional Distributional Semantics with Long Short Term Memory
We propose an extension of the recursive neural network that makes use
of a variant of the long short-term memory architecture. The extension allows
information low in parse trees to be stored in a memory register (the `memory
cell') and used much later higher up in the parse tree. This provides a
solution to the vanishing gradient problem and allows the network to capture
long range dependencies. Experimental results show that our composition
outperformed the traditional neural-network composition on the Stanford
Sentiment Treebank.
Comment: 10 pages, 7 figure