Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs
We develop data structures for dynamic closest pair problems with arbitrary
distance functions, that do not necessarily come from any geometric structure
on the objects. Based on a technique previously used by the author for
Euclidean closest pairs, we show how to insert and delete objects from an
n-object set, maintaining the closest pair, in O(n log^2 n) time per update and
O(n) space. With quadratic space, we can instead use a quadtree-like structure
to achieve an optimal time bound, O(n) per update. We apply these data
structures to hierarchical clustering, greedy matching, and TSP heuristics, and
discuss other potential applications in machine learning, Groebner bases, and
local improvement algorithms for partition and placement problems. Experiments
show our new methods to be faster in practice than previously used heuristics.
Comment: 20 pages, 9 figures. A preliminary version of this paper appeared at
the 9th ACM-SIAM Symp. on Discrete Algorithms, San Francisco, 1998, pp.
619-628. For source code and experimental results, see
http://www.ics.uci.edu/~eppstein/projects/pairs
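For orientation, the problem described in the abstract above (insert and delete objects while maintaining the closest pair under an arbitrary distance function) can be sketched with a simple baseline in which every object caches its nearest neighbor. This is not the paper's data structure and does not achieve its time bounds. The class name `NaiveClosestPair` and its methods are hypothetical, and objects are assumed to be hashable.

```python
import math


class NaiveClosestPair:
    """Baseline only: each object caches its nearest neighbor.

    Hypothetical illustration, not the data structure from the paper.
    """

    def __init__(self, dist):
        self.dist = dist      # arbitrary distance function dist(a, b)
        self.nearest = {}     # object -> (distance, neighbor)

    def _recompute(self, p):
        # Linear scan over all other objects to find p's nearest neighbor.
        best = (math.inf, None)
        for q in self.nearest:
            if q != p:
                d = self.dist(p, q)
                if d < best[0]:
                    best = (d, q)
        self.nearest[p] = best

    def insert(self, p):
        self.nearest[p] = (math.inf, None)
        self._recompute(p)
        # The new object may be closer to q than q's cached neighbor.
        for q in list(self.nearest):
            if q != p:
                d = self.dist(p, q)
                if d < self.nearest[q][0]:
                    self.nearest[q] = (d, p)

    def delete(self, p):
        del self.nearest[p]
        # Only objects whose cached neighbor was p need a fresh scan.
        for q, (_, neighbor) in list(self.nearest.items()):
            if neighbor == p:
                self._recompute(q)

    def closest_pair(self):
        # Requires at least two stored objects.
        p = min(self.nearest, key=lambda x: self.nearest[x][0])
        d, q = self.nearest[p]
        return d, p, q


# Usage: numbers on a line with absolute difference as the distance.
pairs = NaiveClosestPair(lambda a, b: abs(a - b))
for value in (0, 10, 11, 30):
    pairs.insert(value)
before = pairs.closest_pair()
pairs.delete(10)
after = pairs.closest_pair()
```

In this baseline an insertion costs a linear number of distance evaluations, but a deletion can trigger a rescan for many objects at once, which is the kind of cost a dedicated dynamic closest-pair structure is meant to control.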
Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping
We consider the problem of estimating a sparse multi-response regression
function, with an application to expression quantitative trait locus (eQTL)
mapping, where the goal is to discover genetic variations that influence
gene-expression levels. In particular, we investigate a shrinkage technique
capable of capturing a given hierarchical structure over the responses, such as
a hierarchical clustering tree with leaf nodes for responses and internal nodes
for clusters of related responses at multiple granularities, and we seek to
leverage this structure to recover covariates relevant to each
hierarchically-defined cluster of responses. We propose a tree-guided group
lasso, or tree lasso, for estimating such structured sparsity under
multi-response regression by employing a novel penalty function constructed
from the tree. We describe a systematic weighting scheme for the overlapping
groups in the tree-penalty such that each regression coefficient is penalized
in a balanced manner despite the inhomogeneous multiplicity of group
memberships of the regression coefficients due to overlaps among groups. For
efficient optimization, we employ a smoothing proximal gradient method that was
originally developed for a general class of structured-sparsity-inducing
penalties. Using simulated and yeast data sets, we demonstrate that our method
shows superior performance in terms of both prediction errors and recovery of
true sparsity patterns, compared to other methods for learning a
multivariate-response regression.
Comment: Published at http://dx.doi.org/10.1214/12-AOAS549 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
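As a rough illustration of a penalty built from overlapping groups, as in the abstract above, the sketch below sums weighted Euclidean norms of each covariate's coefficients over the response groups defined by the tree nodes. The function name `tree_group_penalty`, the data layout, and the unit weights in the usage are assumptions made for illustration. The paper's own weighting scheme is not reproduced here.

```python
import math


def tree_group_penalty(B, groups, weights):
    """Weighted sum of group-wise L2 norms (generic overlapping group lasso form).

    B       : list of rows, one per covariate; each row holds one
              coefficient per response.
    groups  : list of lists of response indices, one list per tree node
              (leaves are singletons, internal nodes are clusters).
    weights : one nonnegative weight per group. These are placeholders,
              not the weighting scheme described in the paper.
    """
    total = 0.0
    for row in B:
        for group, weight in zip(groups, weights):
            total += weight * math.sqrt(sum(row[k] ** 2 for k in group))
    return total


# Usage: three responses; the tree has three leaves, one internal node
# joining responses 0 and 1, and a root covering all responses.
groups = [[0], [1], [2], [0, 1], [0, 1, 2]]
weights = [1.0] * len(groups)        # placeholder weights (assumption)
B = [[3.0, 4.0, 0.0]]                # one covariate
penalty = tree_group_penalty(B, groups, weights)
```

Because each coefficient appears in every group along its path to the root, coefficients deeper in the tree are counted more often, which is why a weighting scheme that balances the penalty across coefficients matters.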
Impact of different time series aggregation methods on optimal energy system design
Modelling renewable energy systems is a computationally-demanding task due to
the high fluctuation of supply and demand time series. To reduce the scale of
these, this paper discusses different methods for their aggregation into
typical periods. Each aggregation method is applied to a different type of
energy system model, making the methods fairly incomparable. To overcome this,
the different aggregation methods are first extended so that they can be
applied to all types of multidimensional time series and then compared by
applying them to different energy system configurations and analyzing their
impact on the cost optimal design. It was found that regardless of the method,
time series aggregation allows for significantly reduced computational
resources. Nevertheless, averaged values lead to underestimation of the real
system cost in comparison to the use of representative periods from the
original time series. The aggregation method itself, e.g. k-means clustering,
plays a minor role. More significant is the system considered: Energy systems
utilizing centralized resources require fewer typical periods for a feasible
system design in comparison to systems with a higher share of renewable
feed-in. Furthermore, for energy systems based on seasonal storage, currently
existing models' integration of typical periods is not suitable.
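The distinction drawn in the abstract above between averaged values and representative periods taken from the original time series can be sketched as follows. A series is cut into fixed-length periods, the periods are clustered with a minimal k-means loop, and each cluster is summarized either by its centroid (an average) or by its medoid (the original period closest to the centroid). All function names, the deterministic initialization, and the toy data are assumptions for illustration and are not taken from the paper.

```python
def split_into_periods(series, length):
    """Cut a flat series into consecutive periods of equal length."""
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, length)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length periods."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_periods(periods, k, iterations=20):
    """Minimal Lloyd-style k-means; the first k periods seed the centroids
    (a deterministic choice made here for reproducibility)."""
    centroids = [list(p) for p in periods[:k]]
    labels = [0] * len(periods)
    for _ in range(iterations):
        # Assignment step: each period goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in periods]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(periods, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


def medoids(periods, labels, centroids):
    """For each cluster, pick the original period closest to its centroid."""
    reps = []
    for c, centroid in enumerate(centroids):
        members = [p for p, l in zip(periods, labels) if l == c]
        reps.append(min(members, key=lambda p: sq_dist(p, centroid))
                    if members else None)
    return reps


# Usage: four toy "days" of two time steps each, aggregated to two typical days.
series = [0, 0, 10, 10, 0, 2, 10, 12]
periods = split_into_periods(series, 2)
labels, centroids = kmeans_periods(periods, k=2)
representatives = medoids(periods, labels, centroids)
```

In this toy case the centroids are averages that do not occur in the data, while the medoids are actual periods from the series, which is the difference between the two summarization styles compared in the abstract.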
A clustering particle swarm optimizer for locating and tracking multiple optima in dynamic environments
This article is posted here with permission from the IEEE - Copyright @ 2010 IEEE
In the real world, many optimization problems are dynamic. This requires an optimization algorithm to not only find the global optimal solution under a specific environment but also to track the trajectory of the changing optima over dynamic environments. To address this requirement, this paper investigates a clustering particle swarm optimizer (PSO) for dynamic optimization problems. This algorithm employs a hierarchical clustering method to locate and track multiple peaks. A fast local search method is also introduced to search for optimal solutions in a promising subregion found by the clustering method. An experimental study is conducted based on the moving peaks benchmark to test the performance of the clustering PSO in comparison with several state-of-the-art algorithms from the literature. The experimental results show the efficiency of the clustering PSO for locating and tracking multiple optima in dynamic environments in comparison with other particle swarm optimization models based on the multiswarm method.
This work was supported by the Engineering and Physical Sciences Research Council of U.K., under Grant EP/E060722/1.