Solving Medium to Large Sized Euclidean Generalized Minimum Spanning Tree Problems
The generalized minimum spanning tree problem is a generalization of the minimum spanning tree problem. This network design problem has several practical applications, especially in the design of a large-capacity backbone network connecting several individual networks. In this paper we study the performance of six neighborhood search heuristics based on tabu search and variable neighborhood search on this problem domain. Our principal finding is that a tabu search heuristic almost always provides the best quality solution for small to medium-sized instances within short execution times, while variable neighborhood decomposition search provides the best quality solutions for most large instances.
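To make the problem concrete, the following is a minimal sketch of a generic tabu search for the Euclidean version, assuming the common formulation in which exactly one vertex is picked from each cluster and the picked vertices are joined by a minimum spanning tree. It is not one of the six heuristics studied in the paper; the function names, the move definition (swap the chosen vertex of one cluster), and the `tenure` parameter are illustrative assumptions.

```python
import math
import random

def mst_cost(points):
    """Prim's algorithm on the complete Euclidean graph over `points`."""
    n = len(points)
    if n <= 1:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total

def gmst_cost(clusters, choice):
    """Cost of the tree spanning the vertex choice[k] taken from each cluster k."""
    return mst_cost([clusters[k][choice[k]] for k in range(len(clusters))])

def tabu_search(clusters, iterations=200, tenure=5, seed=0):
    """Hypothetical generic tabu search; a move replaces one cluster's vertex."""
    rng = random.Random(seed)
    current = [rng.randrange(len(c)) for c in clusters]
    best, best_cost = current[:], gmst_cost(clusters, current)
    tabu = {}  # (cluster, vertex) -> last iteration at which the move is forbidden
    for it in range(iterations):
        candidates = []
        for k, cluster in enumerate(clusters):
            for v in range(len(cluster)):
                if v == current[k]:
                    continue
                trial = current[:]
                trial[k] = v
                cost = gmst_cost(clusters, trial)
                is_tabu = tabu.get((k, v), -1) >= it
                # Aspiration: a tabu move is allowed if it beats the best so far.
                if not is_tabu or cost < best_cost:
                    candidates.append((cost, k, v))
        if not candidates:
            break
        cost, k, v = min(candidates)
        tabu[(k, current[k])] = it + tenure  # forbid moving back for a while
        current[k] = v
        if cost < best_cost:
            best, best_cost = current[:], cost
    return best, best_cost
```

Calling `tabu_search([[(0, 0), (0, 5)], [(1, 0), (9, 9)], [(2, 0), (7, 3)]])` returns a list with one chosen index per cluster together with the cost of the corresponding spanning tree. Every candidate move re-runs Prim's algorithm, which is affordable only for small instances.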
Suffix Tree of Alignment: An Efficient Index for Similar Data
We consider an index data structure for similar strings. The generalized
suffix tree can be a solution for this. The generalized suffix tree of two
strings A and B is a compacted trie representing all suffixes in A and
B. It has |A|+|B| leaves and can be constructed in O(|A|+|B|) time.
However, if the two strings are similar, the generalized suffix tree is not
efficient because it does not exploit the similarity which is usually
represented as an alignment of A and B.
In this paper we propose a space/time-efficient suffix tree of alignment
which exploits the similarity in an alignment. Our suffix tree for an
alignment of A and B has |A| + l_d + l_1 leaves where l_d is the sum of
the lengths of all parts of B different from A and l_1 is the sum of the
lengths of some common parts of A and B. We did not compromise the pattern
search to reduce the space. Our suffix tree can be searched for a pattern P
in O(|P|+occ) time where occ is the number of occurrences of P in A and
B. We also present an efficient algorithm to construct the suffix tree of
alignment. When the suffix tree is constructed from scratch, the algorithm
requires O(|A| + l_d + l_1 + l_2) time where l_2 is the sum of the lengths
of other common substrings of A and B. When the suffix tree of A is
already given, it requires O(l_d + l_1 + l_2) time. Comment: 12 pages
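As a rough illustration of the query such an index answers, the sketch below stores every suffix of every input string in one sorted list and reports all occurrences of a pattern. This is a naive stand-in, not the suffix tree of alignment or a real generalized suffix tree: construction here is quadratic or worse, and the function names are hypothetical.

```python
from bisect import bisect_left

def build_generalized_suffix_list(strings):
    """Collect and sort every suffix of every string (naive, illustration only).

    Each entry is (suffix, string_id, start_position). The number of entries
    equals the total length of the inputs, matching the number of leaves a
    generalized suffix tree of the same strings would have.
    """
    suffixes = []
    for sid, s in enumerate(strings):
        for pos in range(len(s)):
            suffixes.append((s[pos:], sid, pos))
    suffixes.sort()
    return suffixes

def find_occurrences(suffixes, pattern):
    """Return sorted (string_id, position) pairs where `pattern` occurs."""
    keys = [suf for suf, _, _ in suffixes]
    i = bisect_left(keys, pattern)  # first suffix >= pattern
    hits = []
    # Suffixes that start with the pattern are contiguous in sorted order.
    while i < len(suffixes) and suffixes[i][0].startswith(pattern):
        hits.append((suffixes[i][1], suffixes[i][2]))
        i += 1
    return sorted(hits)

index = build_generalized_suffix_list(["banana", "bandana"])
print(find_occurrences(index, "ana"))  # [(0, 1), (0, 3), (1, 4)]
```

For two similar strings, most suffixes in this list are near-duplicates, which is the redundancy the abstract says an alignment-aware index avoids.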
Entity Personalized Talent Search Models with Tree Interaction Features
Talent Search systems aim to recommend potential candidates who are a good
match to the hiring needs of a recruiter expressed in terms of the recruiter's
search query or job posting. Past work in this domain has focused on linear and
nonlinear models which lack preference personalization at the user level due to
being trained only with globally collected recruiter activity data. In this
paper, we propose an entity-personalized Talent Search model which utilizes a
combination of generalized linear mixed (GLMix) models and gradient boosted
decision tree (GBDT) models, and provides personalized talent recommendations
using nonlinear tree interaction features generated by the GBDT. We also
present the offline and online system architecture for the productionization of
this hybrid model approach in our Talent Search systems. Finally, we provide
offline and online experiment results benchmarking our entity-personalized
model with tree interaction features, which demonstrate significant
improvements in our precision metrics compared to globally trained
non-personalized models. Comment: This paper has been accepted for publication at ACM WWW 201
A Probabilistic Tabu Search Algorithm for the Generalized Minimum Spanning Tree Problem
In this paper we present a probabilistic tabu search algorithm for the generalized minimum spanning tree problem. The basic idea behind the algorithm is to use preprocessing operations to arrive at a probability value for each vertex which roughly corresponds to its probability of being included in an optimal solution, and to use such probability values to shrink the size of the neighborhood of solutions to manageable proportions. We report results from computational experiments that demonstrate the superiority of this method over the generic tabu search method.
S-TREE: Self-Organizing Trees for Data Clustering and Online Vector Quantization
This paper introduces S-TREE (Self-Organizing Tree), a family of models that use unsupervised learning to construct hierarchical representations of data and online tree-structured vector quantizers. The S-TREE1 model, which features a new tree-building algorithm, can be implemented with various cost functions. An alternative implementation, S-TREE2, which uses a new double-path search procedure, is also developed. S-TREE2 implements an online procedure that approximates an optimal (unstructured) clustering solution while imposing a tree-structure constraint. The performance of the S-TREE algorithms is illustrated with data clustering and vector quantization examples, including a Gauss-Markov source benchmark and an image compression application. S-TREE performance on these tasks is compared with the standard tree-structured vector quantizer (TSVQ) and the generalized Lloyd algorithm (GLA). The image reconstruction quality with S-TREE2 approaches that of GLA while taking less than 10% of computer time. S-TREE1 and S-TREE2 also compare favorably with the standard TSVQ in both the time needed to create the codebook and the quality of image reconstruction. Office of Naval Research (N00014-95-10409, N00014-95-0G57
Phase Transition in the Aldous-Shields Model of Growing Trees
We study analytically the late time statistics of the number of particles in
a growing tree model introduced by Aldous and Shields. In this model, a cluster
grows in continuous time on a binary Cayley tree, starting from the root, by
absorbing new particles at the empty perimeter sites at a rate proportional to
c^{-l} where c is a positive parameter and l is the distance of the perimeter
site from the root. For c=1, this model corresponds to random binary search
trees and for c=2 it corresponds to digital search trees in computer science.
By introducing a backward Fokker-Planck approach, we calculate the mean and the
variance of the number of particles at large times and show that the variance
undergoes a `phase transition' at a critical value c=sqrt{2}. While for
c>sqrt{2} the variance is proportional to the mean and the distribution is
normal, for c<sqrt{2} the variance is anomalously large and the distribution is
non-Gaussian due to the appearance of extreme fluctuations. The model is
generalized to one where growth occurs on a tree with m branches and, in this
more general case, we show that the critical point occurs at c=sqrt{m}. Comment: Latex 17 pages, 6 figures
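The growth rule described in this abstract can be simulated directly. The sketch below is a continuous-time (Gillespie-style) simulation under two assumptions not stated in the abstract: the root is occupied at time zero, so the initial perimeter consists of its m children at distance 1, and the proportionality constant of the rate c^{-l} is 1. The function name and the `max_particles` safety cap are hypothetical.

```python
import random

def simulate_growth(c, m=2, t_max=5.0, seed=0, max_particles=100_000):
    """Simulate cluster growth on an m-ary tree up to time t_max.

    Each empty perimeter site at distance l from the root absorbs a
    particle at rate c**(-l). Returns (particle_count, perimeter), where
    perimeter maps depth -> number of empty perimeter sites at that depth.
    """
    rng = random.Random(seed)
    perimeter = {1: m}  # assumption: the root (depth 0) starts occupied
    particles = 1
    t = 0.0
    while particles < max_particles:
        depths = list(perimeter)
        weights = [perimeter[d] * c ** (-d) for d in depths]
        total_rate = sum(weights)
        # Waiting time to the next absorption is exponential in the total rate.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        # Choose the depth of the absorbing site in proportion to its rate.
        d = rng.choices(depths, weights=weights)[0]
        perimeter[d] -= 1
        if perimeter[d] == 0:
            del perimeter[d]
        # The newly occupied site exposes its m children as perimeter sites.
        perimeter[d + 1] = perimeter.get(d + 1, 0) + m
        particles += 1
    return particles, perimeter
```

Running `simulate_growth(c, seed=s)` over many seeds for a fixed `t_max` gives an empirical mean and variance of the particle count, which can be compared on either side of c=sqrt{m}. With c=1 every perimeter site is equally likely to be filled next, which is the random binary search tree case mentioned in the abstract.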