Asymptotically Exact, Embarrassingly Parallel MCMC
Communication costs, resulting from synchronization requirements during
learning, can greatly slow down many parallel machine learning algorithms. In
this paper, we present a parallel Markov chain Monte Carlo (MCMC) algorithm in
which subsets of data are processed independently, with very little
communication. First, we arbitrarily partition data onto multiple machines.
Then, on each machine, any classical MCMC method (e.g., Gibbs sampling) may be
used to draw samples from a posterior distribution given the data subset.
Finally, the samples from each machine are combined to form samples from the
full posterior. This embarrassingly parallel algorithm allows each machine to
act independently on a subset of the data (without communication) until the
final combination stage. We prove that our algorithm generates asymptotically
exact samples and empirically demonstrate its ability to parallelize burn-in
and sampling in several models.
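The partition, sample, combine pattern described in this abstract can be sketched on a toy problem. Everything below is an illustrative assumption rather than the authors' implementation: the model is a Gaussian mean with known variance, each machine's "MCMC" is replaced by exact draws from its subposterior (with the prior raised to the power 1/M so that the product of subposteriors equals the full posterior), and the combination step fits a Gaussian to each machine's samples and multiplies the fitted densities. The abstract does not specify the combination rule, so this Gaussian product is only one simple possibility.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (assumption): x_i ~ Normal(theta, sigma^2) with known sigma,
# and prior theta ~ Normal(0, tau^2).
sigma, tau, true_theta = 1.0, 10.0, 2.0
data = rng.normal(true_theta, sigma, size=4000)

# Step 1: arbitrarily partition the data onto M "machines".
M = 4
shards = np.array_split(data, M)

def subposterior_samples(x, n_samples):
    """Draw from one machine's subposterior.

    Stand-in for a real MCMC run: the model is conjugate, so we sample
    exactly. The prior is raised to 1/M, i.e. Normal(0, M * tau^2).
    """
    prec = 1.0 / (M * tau**2) + len(x) / sigma**2
    mean = (x.sum() / sigma**2) / prec
    return rng.normal(mean, prec**-0.5, size=n_samples)

# Step 2: each machine samples independently, with no communication.
samples = [subposterior_samples(x, 5000) for x in shards]

# Step 3: combine. Fit a Gaussian to each sample set and multiply the
# fitted densities: precisions add, means are precision-weighted.
means = np.array([s.mean() for s in samples])
precs = np.array([1.0 / s.var(ddof=1) for s in samples])
combined_prec = precs.sum()
combined_mean = (precs * means).sum() / combined_prec
combined_var = 1.0 / combined_prec
combined_samples = rng.normal(combined_mean, combined_var**0.5, size=5000)

# Reference: the exact full-data posterior for this conjugate model.
full_prec = 1.0 / tau**2 + len(data) / sigma**2
full_mean = (data.sum() / sigma**2) / full_prec
full_var = 1.0 / full_prec
```

In this toy setting the combined mean and variance should closely track `full_mean` and `full_var`. For non-Gaussian posteriors a Gaussian product is only an approximation.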
An improved Ant Colony System for the Sequential Ordering Problem
It is not rare that the performance of one metaheuristic algorithm can be
improved by incorporating ideas taken from another. In this article we present
how Simulated Annealing (SA) can be used to improve the efficiency of the Ant
Colony System (ACS) and Enhanced ACS when solving the Sequential Ordering
Problem (SOP). Moreover, we show how the very same ideas can be applied to
improve the convergence of a dedicated local search, i.e. the SOP-3-exchange
algorithm. A statistical analysis of the proposed algorithms both in terms of
finding suitable parameter values and the quality of the generated solutions is
presented based on a series of computational experiments conducted on SOP
instances from the well-known TSPLIB and SOPLIB2006 repositories. The proposed
ACS-SA and EACS-SA algorithms often generate solutions of better quality than
the ACS and EACS, respectively. Moreover, the EACS-SA algorithm combined with
the proposed SOP-3-exchange-SA local search was able to find 10 new best
solutions for the SOP instances from the SOPLIB2006 repository, thus improving
the state-of-the-art results as known from the literature. Overall, the best
known or improved solutions were found in 41 out of 48 cases.
Comment: 30 pages, 8 tables, 11 figures
An approach for selecting cost estimation techniques for innovative high value manufacturing products
This paper presents an approach for determining the most appropriate technique for cost estimation of innovative high value manufacturing products depending on the amount of prior data available. Case study data from the United States Scheduled Annual Summary Reports for the Joint Strike Fighter (1997-2010) is used to exemplify how, depending on the attributes of a priori data, certain techniques for cost estimation are more suitable than others. The data attribute focused on is the computational complexity involved in identifying whether or not there are patterns suited for propagation. Computational complexity is calculated based upon established mathematical principles for pattern recognition, which argue that at least 42 data sets are required for the application of standard regression analysis techniques. The paper proposes that below this threshold a generic dependency model and starting conditions should be used and iteratively adapted to the context. In the special case of having fewer than four datasets available, it is suggested that no contemporary cost estimating techniques other than analogy or expert opinion are currently applicable, and alternate techniques must be explored if more quantitative results are desired. By applying the mathematical principles of complexity groups, the paper argues that when fewer than four consecutive datasets are available the principles of topological data analysis should be applied. The precondition is that the cost variance of at least three cost variance types for one to three time discrete continuous intervals is available, so that it can be quantified based upon its geometrical attributes, visualised as an n-dimensional point cloud and then evaluated based upon the symmetrical properties of the evolving shape. Further work is suggested to validate the provided decision-trees in cost estimation practice.
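The thresholds stated in this abstract can be read as a simple selection rule. The sketch below is a hypothetical illustration, not the paper's decision tree: the function name `suggest_technique` and the returned labels are invented here, and only the two cut-offs (42 data sets and four data sets) come from the abstract.

```python
def suggest_technique(n_datasets: int) -> str:
    """Map the number of available prior data sets to a technique family.

    Hypothetical helper: only the thresholds (42 and 4) are taken from
    the abstract; the labels are paraphrases, not the paper's terms.
    """
    if n_datasets < 0:
        raise ValueError("n_datasets must be non-negative")
    if n_datasets >= 42:
        # At least 42 data sets: standard regression analysis applies.
        return "standard regression analysis"
    if n_datasets >= 4:
        # Below the regression threshold: generic dependency model and
        # starting conditions, iteratively adapted to the context.
        return "generic dependency model, iteratively adapted"
    # Fewer than four data sets: analogy or expert opinion, with
    # topological data analysis proposed as the alternate route.
    return "analogy or expert opinion; consider topological data analysis"


if __name__ == "__main__":
    for n in (60, 42, 10, 3):
        print(n, "->", suggest_technique(n))
```

A real selection procedure would also need to check the precondition the abstract names for topological data analysis (availability of cost variance data), which this sketch leaves out.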