39,864 research outputs found
Threesomes, Degenerates, and Love Triangles
The 3SUM problem is to decide, given a set of real numbers, whether any
three sum to zero. It is widely conjectured that a trivial -time
algorithm is optimal and over the years the consequences of this conjecture
have been revealed. This 3SUM conjecture implies lower bounds on
numerous problems in computational geometry and a variant of the conjecture
implies strong lower bounds on triangle enumeration, dynamic graph algorithms,
and string matching data structures.
In this paper we refute the 3SUM conjecture. We prove that the decision tree
complexity of 3SUM is and give two subquadratic 3SUM
algorithms, a deterministic one running in
time and a randomized one running in time with
high probability. Our results lead directly to improved bounds for -variate
linear degeneracy testing for all odd . The problem is to decide, given
a linear function and a set , whether . We show the
decision tree complexity of this problem is .
Finally, we give a subcubic algorithm for a generalization of the
-product over real-valued matrices and apply it to the problem of
finding zero-weight triangles in weighted graphs. We give a
depth- decision tree for this problem, as well as an
algorithm running in time
A New Approach to Speeding Up Topic Modeling
Latent Dirichlet allocation (LDA) is a widely-used probabilistic topic
modeling paradigm, and recently finds many applications in computer vision and
computational biology. In this paper, we propose a fast and accurate batch
algorithm, active belief propagation (ABP), for training LDA. Usually batch LDA
algorithms require repeated scanning of the entire corpus and searching the
complete topic space. To process massive corpora having a large number of
topics, the training iteration of batch LDA algorithms is often inefficient and
time-consuming. To accelerate the training speed, ABP actively scans the subset
of corpus and searches the subset of topic space for topic modeling, therefore
saves enormous training time in each iteration. To ensure accuracy, ABP selects
only those documents and topics that contribute to the largest residuals within
the residual belief propagation (RBP) framework. On four real-world corpora,
ABP performs around to times faster than state-of-the-art batch LDA
algorithms with a comparable topic modeling accuracy.Comment: 14 pages, 12 figure
Finding groups in data: Cluster analysis with ants
Wepresent in this paper a modification of Lumer and Faieta’s algorithm for data clustering. This approach
mimics the clustering behavior observed in real ant colonies. This algorithm discovers automatically
clusters in numerical data without prior knowledge of possible number of clusters. In this paper we focus
on ant-based clustering algorithms, a particular kind of a swarm intelligent system, and on the effects on
the final clustering by using during the classification differentmetrics of dissimilarity: Euclidean, Cosine,
and Gower measures. Clustering with swarm-based algorithms is emerging as an alternative to more
conventional clustering methods, such as e.g. k-means, etc. Among the many bio-inspired techniques, ant
clustering algorithms have received special attention, especially because they still require much
investigation to improve performance, stability and other key features that would make such algorithms
mature tools for data mining.
As a case study, this paper focus on the behavior of clustering procedures in those new approaches.
The proposed algorithm and its modifications are evaluated in a number of well-known benchmark
datasets. Empirical results clearly show that ant-based clustering algorithms performs well when
compared to another techniques
- …