Typical solution time for a vertex-covering algorithm on finite-connectivity random graphs
In this letter, we analytically describe the typical solution time needed by
a backtracking algorithm to solve the vertex-cover problem on
finite-connectivity random graphs. We find two different transitions: The first
one is algorithm-dependent and marks the dynamical transition from linear to
exponential solution times. The second one gives the maximum computational
complexity, and is found exactly at the threshold where the system undergoes an
algorithm-independent phase transition in its solvability. Analytical results
are corroborated by numerical simulations. Comment: 4 pages, 2 figures, to appear in Phys. Rev. Let
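As a rough illustration of what a backtracking search for vertex cover looks like, the sketch below decides whether a graph has a cover of at most `k` vertices by branching on the two endpoints of an uncovered edge. This is a generic textbook-style branching scheme, not the specific algorithm analysed in the letter; the function name `has_vertex_cover` and the edge-list representation are assumptions made for the example.

```python
def has_vertex_cover(edges, k):
    """Return True if at most k vertices can touch every edge.

    edges: list of (u, v) pairs; k: number of vertices still available.
    Hypothetical helper for illustration only.
    """
    if not edges:
        return True   # nothing left to cover
    if k == 0:
        return False  # edges remain but the budget is used up
    u, v = edges[0]
    # Any cover must contain u or v, so branch on both choices
    # and backtrack when a branch fails.
    for chosen in (u, v):
        remaining = [e for e in edges if chosen not in e]
        if has_vertex_cover(remaining, k - 1):
            return True
    return False
```

On a triangle, for instance, the call with `k = 1` explores both branches and fails, while `k = 2` succeeds. The worst-case number of branches grows exponentially with `k`.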
Estimation of intrinsic dimension via clustering
Estimating the intrinsic dimension of a set of points in high-dimensional space is a critical problem for a wide range of disciplines, including genomics, finance, and networking. The computational complexity of current estimation techniques depends on either the ambient or the intrinsic dimension, which may make these methods intractable for large data sets. In this paper, we present a clustering-based methodology that exploits the inherent self-similarity of data to efficiently estimate the intrinsic dimension of a set of points. When the data satisfies a specified general clustering condition, we prove that the estimated dimension approaches the true Hausdorff dimension. Experiments show that the clustering-based approach allows for more efficient and accurate intrinsic dimension estimation compared with all prior techniques, even when the data does not conform to an obvious self-similarity structure. Finally, we present empirical results which show that the clustering-based estimation allows for a natural partitioning of data points that lie on separate manifolds of varying intrinsic dimension.
A new sequential covering strategy for inducing classification rules with ant colony algorithms
Ant colony optimization (ACO) algorithms have been successfully applied to discover a list of classification rules. In general, these algorithms follow a sequential covering strategy, where a single rule is discovered at each iteration of the algorithm in order to build a list of rules. The sequential covering strategy has the drawback of not coping with the problem of rule interaction, i.e., the outcome of a rule affects the rules that can be discovered subsequently, since the search space is modified by the removal of examples covered by previous rules. This paper proposes a new sequential covering strategy for ACO classification algorithms to mitigate the problem of rule interaction, where the order of the rules is implicitly encoded as pheromone values and the search is guided by the quality of a candidate list of rules. Our experiments using 18 publicly available data sets show that the predictive accuracy obtained by a new ACO classification algorithm implementing the proposed sequential covering strategy is statistically significantly higher than the predictive accuracy of state-of-the-art rule induction classification algorithms.
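The conventional sequential covering loop described in this abstract (one rule per iteration, covered examples removed before the next rule is learned) can be sketched as follows. This shows only the standard strategy, not the new pheromone-based strategy the paper proposes; `sequential_covering` and the toy learner `learn_exact_match` are hypothetical names introduced for the example.

```python
def sequential_covering(examples, learn_one_rule):
    """Build an ordered rule list by learning one rule at a time.

    learn_one_rule: callable that takes the remaining examples and
    returns a rule, i.e. a predicate mapping an example to True/False.
    """
    rules = []
    remaining = list(examples)
    while remaining:
        rule = learn_one_rule(remaining)
        covered = [ex for ex in remaining if rule(ex)]
        if not covered:
            break  # the rule covers nothing, so stop to avoid looping forever
        rules.append(rule)
        # Removing covered examples changes the search space for later
        # rules; this is the source of the rule-interaction problem.
        remaining = [ex for ex in remaining if not rule(ex)]
    return rules


def learn_exact_match(remaining):
    """Toy learner (assumption): cover everything equal to the first example."""
    target = remaining[0]
    return lambda ex: ex == target


rules = sequential_covering([1, 1, 2, 3, 3], learn_exact_match)
```

Because each rule is learned only on what earlier rules left uncovered, the quality of a later rule depends on every earlier choice, which is the interaction the abstract refers to.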
Boosting search by rare events
Randomized search algorithms for hard combinatorial problems exhibit a large
variability of performances. We study the different types of rare events which
occur in such out-of-equilibrium stochastic processes and we show how they
cooperate in determining the final distribution of running times. As a
byproduct of our analysis we show how search algorithms are optimized by random
restarts. Comment: 4 pages, 3 eps figures. References update
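A minimal sketch of a random-restart wrapper is given below, assuming a randomized search routine that gives up after a fixed cutoff. The names `search_with_restarts` and `toy_attempt`, and the fixed-cutoff policy itself, are assumptions for illustration; the letter's own analysis of restart strategies is not reproduced here.

```python
import random


def search_with_restarts(attempt, cutoff, max_restarts, seed=0):
    """Run a randomized search repeatedly until one run succeeds.

    attempt: callable (rng, cutoff) -> result, or None if the run
    hit the cutoff without finding a solution.
    Returns (result, number_of_failed_runs).
    """
    rng = random.Random(seed)
    for failed_runs in range(max_restarts):
        result = attempt(rng, cutoff)
        if result is not None:
            return result, failed_runs
    return None, max_restarts


def toy_attempt(rng, cutoff):
    """Toy search (assumption): succeed if a random digit equals 7 in time."""
    for step in range(cutoff):
        if rng.randrange(10) == 7:
            return step
    return None


result, failed_runs = search_with_restarts(toy_attempt, cutoff=5, max_restarts=1000)
```

The idea is that when running times vary widely between runs, abandoning a long run early and starting afresh can reach one of the rare fast runs sooner than waiting for the current run to finish.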
Rule-based Machine Learning Methods for Functional Prediction
We describe a machine learning method for predicting the value of a
real-valued function, given the values of multiple input variables. The method
induces solutions from samples in the form of ordered disjunctive normal form
(DNF) decision rules. A central objective of the method and representation is
the induction of compact, easily interpretable solutions. This rule-based
decision model can be extended to search efficiently for similar cases prior to
approximating function values. Experimental results on real-world data
demonstrate that the new techniques are competitive with existing machine
learning and statistical methods and can sometimes yield superior regression
performance. Comment: See http://www.jair.org/ for any accompanying file