
    Asymptotically Exact, Embarrassingly Parallel MCMC

    Communication costs, resulting from synchronization requirements during learning, can greatly slow down many parallel machine learning algorithms. In this paper, we present a parallel Markov chain Monte Carlo (MCMC) algorithm in which subsets of data are processed independently, with very little communication. First, we arbitrarily partition data onto multiple machines. Then, on each machine, any classical MCMC method (e.g., Gibbs sampling) may be used to draw samples from a posterior distribution given the data subset. Finally, the samples from each machine are combined to form samples from the full posterior. This embarrassingly parallel algorithm allows each machine to act independently on a subset of the data (without communication) until the final combination stage. We prove that our algorithm generates asymptotically exact samples and empirically demonstrate its ability to parallelize burn-in and sampling in several models.
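    The combination stage described above can be illustrated with a minimal sketch. The snippet below uses a simple parametric (Gaussian) combination rule, not the paper's full estimator: each machine's "subposterior" samples are summarized by a mean and variance, and the product of Gaussians is recovered by summing precisions and precision-weighting the means. The setup (a scalar parameter, simulated subposterior samples) is entirely hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical setup: the full posterior over a scalar parameter is N(2.0, 0.5^2).
    # We mimic M machines, each holding samples from a subposterior (the posterior
    # given its data shard). Each subposterior is wider by a factor sqrt(M), so the
    # product of all M subposteriors recovers the full posterior.
    M = 4
    true_mean, true_sd = 2.0, 0.5
    sub_sd = true_sd * np.sqrt(M)
    sub_means = rng.normal(true_mean, 0.1, size=M)  # shard-to-shard variation

    # Stage 1-2: each machine draws samples independently (simulated here).
    subposterior_samples = [rng.normal(m, sub_sd, size=10_000) for m in sub_means]

    # Stage 3: Gaussian combination. A product of Gaussians has
    # precision = sum of precisions, mean = precision-weighted average of means.
    precisions = np.array([1.0 / s.var() for s in subposterior_samples])
    means = np.array([s.mean() for s in subposterior_samples])
    comb_prec = precisions.sum()
    comb_mean = (precisions * means).sum() / comb_prec
    comb_sd = 1.0 / np.sqrt(comb_prec)

    print(comb_mean, comb_sd)  # close to the full posterior N(2.0, 0.5)
    ```

    No inter-machine communication is needed until the final three lines, which is what makes the scheme embarrassingly parallel.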

    An improved Ant Colony System for the Sequential Ordering Problem

    It is not rare that the performance of one metaheuristic algorithm can be improved by incorporating ideas taken from another. In this article we present how Simulated Annealing (SA) can be used to improve the efficiency of the Ant Colony System (ACS) and Enhanced ACS when solving the Sequential Ordering Problem (SOP). Moreover, we show how the very same ideas can be applied to improve the convergence of a dedicated local search, i.e. the SOP-3-exchange algorithm. A statistical analysis of the proposed algorithms, both in terms of finding suitable parameter values and the quality of the generated solutions, is presented based on a series of computational experiments conducted on SOP instances from the well-known TSPLIB and SOPLIB2006 repositories. The proposed ACS-SA and EACS-SA algorithms often generate solutions of better quality than the ACS and EACS, respectively. Moreover, the EACS-SA algorithm combined with the proposed SOP-3-exchange-SA local search was able to find 10 new best solutions for the SOP instances from the SOPLIB2006 repository, thus improving the state-of-the-art results as known from the literature. Overall, the best known or improved solutions were found in 41 out of 48 cases. (Comment: 30 pages, 8 tables, 11 figures)
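    The core SA idea borrowed here, accepting some worsening moves with a temperature-controlled probability so the search can escape local optima, can be sketched in isolation. This is a generic Metropolis acceptance rule with a geometric cooling schedule, not the paper's specific ACS-SA integration; the move generator is a stand-in.

    ```python
    import math
    import random

    random.seed(42)

    def sa_accept(delta: float, temperature: float) -> bool:
        """Metropolis acceptance rule: always accept improving moves (delta <= 0);
        accept a worsening move with probability exp(-delta / temperature)."""
        if delta <= 0:
            return True
        return random.random() < math.exp(-delta / temperature)

    # Hypothetical usage inside a local-search loop (e.g. evaluating candidate
    # edge exchanges on a tour): early on, high temperature lets clearly worse
    # moves through; as the temperature cools, acceptance becomes greedy.
    T, cooling = 10.0, 0.95
    accepted = 0
    for step in range(100):
        delta = random.uniform(-1.0, 2.0)  # stand-in for a real move's cost change
        if sa_accept(delta, T):
            accepted += 1
        T *= cooling

    print(accepted)
    ```

    Embedding such a rule in ACS-style solution construction or in a 3-exchange local search changes only the accept/reject decision, which is why the same idea transfers between the construction phase and the dedicated local search.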

    An approach for selecting cost estimation techniques for innovative high value manufacturing products

    This paper presents an approach for determining the most appropriate technique for cost estimation of innovative high value manufacturing products depending on the amount of prior data available. Case study data from the United States Scheduled Annual Summary Reports for the Joint Strike Fighter (1997-2010) is used to exemplify how, depending on the attributes of a priori data, certain techniques for cost estimation are more suitable than others. The data attribute focused on is the computational complexity involved in identifying whether or not there are patterns suited for propagation. Computational complexity is calculated based upon established mathematical principles for pattern recognition, which argue that at least 42 datasets are required for the application of standard regression analysis techniques. The paper proposes that below this threshold a generic dependency model and starting conditions should be used and iteratively adapted to the context. In the special case of having fewer than four datasets available, it is suggested that no contemporary cost estimating techniques other than analogy or expert opinion are currently applicable, and alternate techniques must be explored if more quantitative results are desired. By applying the mathematical principles of complexity groups, the paper argues that when fewer than four consecutive datasets are available the principles of topological data analysis should be applied. The preconditions are that the cost variance of at least three cost variance types for one to three time-discrete continuous intervals is available, so that it can be quantified based upon its geometrical attributes, visualised as an n-dimensional point cloud, and then evaluated based upon the symmetrical properties of the evolving shape. Further work is suggested to validate the provided decision-trees in cost estimation practice.
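    The selection logic described in the abstract reduces to thresholds on the number of available prior datasets. The function below is a hypothetical encoding of that decision rule for illustration; the function name and return strings are invented, and the thresholds (4 and 42) are the ones stated in the abstract.

    ```python
    def select_cost_estimation_technique(n_datasets: int) -> str:
        """Choose a cost estimation technique based on how many prior
        datasets are available (hypothetical encoding of the decision rule)."""
        if n_datasets < 4:
            # Too little data for contemporary quantitative techniques:
            # fall back to analogy or expert opinion, or explore topological
            # data analysis on the cost-variance point cloud.
            return "analogy / expert opinion (or topological data analysis)"
        if n_datasets < 42:
            # Below the threshold for standard regression: use a generic
            # dependency model with starting conditions, iteratively adapted.
            return "generic dependency model, iteratively adapted"
        # 42 or more datasets: enough for pattern recognition via
        # standard regression analysis.
        return "standard regression analysis"

    print(select_cost_estimation_technique(3))
    print(select_cost_estimation_technique(10))
    print(select_cost_estimation_technique(50))
    ```

    A real decision tree would also inspect the attributes of the data (e.g. whether consecutive cost-variance intervals are available), not just its count; the sketch captures only the headline thresholds.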