Threesomes, Degenerates, and Love Triangles

Authors: Allan Grønlund, Seth Pettie
Publication date: 30/05/2014

The 3SUM problem is to decide, given a set of $n$ real numbers, whether any three sum to zero. It is widely conjectured that a trivial $O(n^2)$-time algorithm is optimal and over the years the consequences of this conjecture have been revealed. This 3SUM conjecture implies $\Omega(n^2)$ lower bounds on numerous problems in computational geometry and a variant of the conjecture implies strong lower bounds on triangle enumeration, dynamic graph algorithms, and string matching data structures. In this paper we refute the 3SUM conjecture. We prove that the decision tree complexity of 3SUM is $O(n^{3/2}\sqrt{\log n})$ and give two subquadratic 3SUM algorithms, a deterministic one running in $O(n^2 / (\log n/\log\log n)^{2/3})$ time and a randomized one running in $O(n^2 (\log\log n)^2 / \log n)$ time with high probability. Our results lead directly to improved bounds for $k$-variate linear degeneracy testing for all odd $k\ge 3$. The problem is to decide, given a linear function $f(x_1,\ldots,x_k) = \alpha_0 + \sum_{1\le i\le k} \alpha_i x_i$ and a set $A \subset \mathbb{R}$, whether $0\in f(A^k)$. We show the decision tree complexity of this problem is $O(n^{k/2}\sqrt{\log n})$. Finally, we give a subcubic algorithm for a generalization of the $(\min,+)$-product over real-valued matrices and apply it to the problem of finding zero-weight triangles in weighted graphs. We give a depth-$O(n^{5/2}\sqrt{\log n})$ decision tree for this problem, as well as an algorithm running in time $O(n^3 (\log\log n)^2/\log n)$.
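For orientation, the "trivial" quadratic baseline that the abstract refers to can be sketched as follows. This is the textbook sort-and-two-pointer method, not the subquadratic algorithm of the paper; the function name `has_three_sum` is a hypothetical label chosen for this illustration, and the sketch assumes the three numbers must come from distinct positions in the input.

```python
def has_three_sum(values):
    """Return True if three entries at distinct positions sum to zero.

    Textbook O(n^2) baseline: sort once, then for each smallest element
    sweep two pointers inward over the remaining suffix.
    """
    a = sorted(values)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1   # sum too small: move to a larger middle element
            else:
                hi -= 1   # sum too large: move to a smaller top element
    return False


if __name__ == "__main__":
    print(has_three_sum([-5, 1, 4, 10]))
    print(has_three_sum([1, 2, 4, -8]))
```

The comparison `s == 0` is exact, so the sketch is best read with integer or rational inputs in mind; with floating-point inputs, rounding can hide a true zero sum.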

arXiv.org e-Print Archive

CiteSeerX

Crossref

A New Approach to Speeding Up Topic Modeling

Authors: Jia Zeng, Xiao-qin Cao, Zhi-qiang Liu
Publication date: 07/04/2014

Latent Dirichlet allocation (LDA) is a widely used probabilistic topic modeling paradigm that has recently found many applications in computer vision and computational biology. In this paper, we propose a fast and accurate batch algorithm, active belief propagation (ABP), for training LDA. Batch LDA algorithms usually require repeated scanning of the entire corpus and searching of the complete topic space. When processing massive corpora with a large number of topics, each training iteration of a batch LDA algorithm is therefore often inefficient and time-consuming. To accelerate training, ABP actively scans a subset of the corpus and searches a subset of the topic space for topic modeling, and therefore saves enormous training time in each iteration. To ensure accuracy, ABP selects only those documents and topics that contribute the largest residuals within the residual belief propagation (RBP) framework. On four real-world corpora, ABP performs around 10 to 100 times faster than state-of-the-art batch LDA algorithms with comparable topic modeling accuracy.

Comment: 14 pages, 12 figures

arXiv.org e-Print Archive

CiteSeerX

Finding groups in data: Cluster analysis with ants

Authors: Berger, Bonabeau, Brito, Brucker, Chu, Deneubourg, Dorigo, Dubes, Ester, Franks, Ganti, Gibson, Guha, Halkidi, Handl, Hansen, Jain, Karypis, Kaufman, Kennedy, Lee, Lumer, MacQueen, Ng, Oprisan, Rijsbergen, Urszula Boryczka, Welch, Zait
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We present in this paper a modification of Lumer and Faieta’s algorithm for data clustering. This approach mimics the clustering behavior observed in real ant colonies. The algorithm automatically discovers clusters in numerical data without prior knowledge of the possible number of clusters. In this paper we focus on ant-based clustering algorithms, a particular kind of swarm intelligent system, and on how the final clustering is affected by the metric of dissimilarity used during classification: the Euclidean, Cosine, and Gower measures. Clustering with swarm-based algorithms is emerging as an alternative to more conventional clustering methods such as k-means. Among the many bio-inspired techniques, ant clustering algorithms have received special attention, especially because they still require much investigation to improve performance, stability, and other key features that would make such algorithms mature tools for data mining. As a case study, this paper focuses on the behavior of clustering procedures in those new approaches. The proposed algorithm and its modifications are evaluated on a number of well-known benchmark datasets. Empirical results clearly show that ant-based clustering algorithms perform well when compared to other techniques.
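As a rough illustration of two of the dissimilarity measures named in the abstract, the sketch below computes Euclidean and cosine dissimilarity between two numeric feature vectors. The function names are hypothetical, the cosine form `1 - cos(x, y)` is one common convention and may differ from the normalization used in the paper, and the Gower measure is omitted because it depends on per-attribute ranges and types that the abstract does not specify.

```python
import math


def euclidean_dissimilarity(x, y):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_dissimilarity(x, y):
    """One minus the cosine of the angle between x and y.

    Assumed convention: 0 for parallel vectors, 1 for orthogonal ones.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine dissimilarity is undefined for a zero vector")
    return 1.0 - dot / (norm_x * norm_y)


if __name__ == "__main__":
    p, q = [0.0, 0.0], [3.0, 4.0]
    print(euclidean_dissimilarity(p, q))
    print(cosine_dissimilarity([1.0, 0.0], [0.0, 1.0]))
```

The two measures can rank the same pair of objects very differently: vectors `[1, 2]` and `[2, 4]` point the same way, so their cosine dissimilarity is near zero even though their Euclidean distance is not, which is the kind of effect on the final clustering that the abstract says the paper examines.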

Crossref

Bournemouth University Research Online