
    Diversified Late Acceptance Search

    The well-known Late Acceptance Hill Climbing (LAHC) search aims to overcome the main downside of traditional Hill Climbing (HC) search, which is often quickly trapped in a local optimum because it strictly accepts only non-worsening moves in each iteration. In contrast, LAHC also accepts worsening moves, by keeping a circular array of fitness values of previously visited solutions and comparing the fitness values of candidate solutions against the least recent element in the array. While this straightforward strategy has proven effective, there are nevertheless situations where LAHC can behave in a similar manner to HC. For example, when a new local optimum is found, the same fitness value is often stored many times in the array. To address this shortcoming, we propose new acceptance and replacement strategies that take into account worsening, improving, and sideways movement scenarios, with the aim of improving the diversity of values in the array. Compared to LAHC, the proposed Diversified Late Acceptance Search approach is shown to lead to better quality solutions that are obtained with a lower number of iterations on benchmark Travelling Salesman Problems and Quadratic Assignment Problems.
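
The baseline LAHC acceptance rule described in this abstract can be sketched as follows. This is a minimal illustration of the basic variant only, not the paper's Diversified Late Acceptance Search; it assumes a minimisation problem (lower cost is better), and the names `lahc`, `cost`, `neighbour`, `history_length`, `sphere`, and `step` are hypothetical choices made for the example.

```python
import random


def lahc(initial, cost, neighbour, history_length=50, max_iters=10_000, seed=0):
    """Basic Late Acceptance Hill Climbing for minimisation (illustrative sketch).

    `cost` maps a solution to a number; `neighbour(solution, rng)` returns a
    new candidate solution. Both are supplied by the caller (assumed names).
    """
    rng = random.Random(seed)
    current = initial
    current_cost = cost(current)
    best, best_cost = current, current_cost

    # Circular array of costs of previously visited solutions.
    history = [current_cost] * history_length

    for i in range(max_iters):
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)

        # The slot about to be overwritten holds the least recent value.
        slot = i % history_length

        # Accept if not worse than the current solution (plain hill climbing)
        # or not worse than the cost recorded `history_length` iterations ago.
        if candidate_cost <= current_cost or candidate_cost <= history[slot]:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost

        # Record the cost of the current solution in the circular array.
        history[slot] = current_cost

    return best, best_cost


def sphere(x):
    # Toy objective: sum of squares, minimised at the all-zero vector.
    return sum(v * v for v in x)


def step(x, rng):
    # Toy move: change one randomly chosen coordinate by +1 or -1.
    y = list(x)
    j = rng.randrange(len(y))
    y[j] += rng.choice((-1, 1))
    return y


best, best_cost = lahc([5, -3, 4], sphere, step, history_length=10, max_iters=5000)
```

Once the search stalls at a local optimum, every slot of `history` ends up holding the same cost, so the second acceptance test adds nothing beyond the first. That is the HC-like behaviour the abstract sets out to avoid.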

    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national censuses, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare today generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, because it draws on an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques.
    Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figure

    Global Landscape Structure and the Random MAX-SAT Phase Transition

    We revisit the fitness landscape structure of random MAX-SAT instances, and address the question: what structural features change when we go from easy underconstrained instances to hard overconstrained ones? Some standard techniques such as autocorrelation analysis fail to explain what makes instances hard to solve for stochastic local search algorithms, indicating that deeper landscape features are required to explain the observed performance differences. We address this question by means of local optima network (LON) analysis and visualisation. Our results reveal that the number, size, and, most importantly, the connectivity pattern of local and global optima change significantly over the easy-hard transition. Our empirical results suggest that the landscape of hard MAX-SAT instances may feature sub-optimal funnels, that is, clusters of sub-optimal solutions where stochastic local search methods can get trapped.

    Portfolio Approaches for Constraint Optimization Problems

    Within the Constraint Satisfaction Problems (CSP) context, a methodology that has proved to be particularly performant consists in using a portfolio of different constraint solvers. Nevertheless, comparatively few studies and investigations have been done in the world of Constraint Optimization Problems (COP). In this work, we provide a generalization to COP as well as an empirical evaluation of different state-of-the-art CSP portfolio approaches properly adapted to deal with COP. Experimental results confirm the effectiveness of portfolios even in the optimization field, and could give rise to some interesting future research.

    Integrating Quantitative Knowledge into a Qualitative Gene Regulatory Network

    Despite recent improvements in molecular techniques, biological knowledge remains incomplete. Any theorizing about living systems is therefore necessarily based on the use of heterogeneous and partial information. Much current research has focused successfully on the qualitative behaviors of macromolecular networks. Nonetheless, it is not capable of taking into account available quantitative information such as time-series protein concentration variations. The present work proposes a probabilistic modeling framework that integrates both kinds of information. Average case analysis methods are used in combination with Markov chains to link qualitative information about transcriptional regulations to quantitative information about protein concentrations. The approach is illustrated by modeling the carbon starvation response in Escherichia coli. It accurately predicts the quantitative time-series evolution of several protein concentrations using only knowledge of discrete gene interactions and a small number of quantitative observations on a single protein concentration. From this, the modeling technique also derives a ranking of interactions with respect to their importance during the experiment considered. Such a classification is confirmed by the literature. Therefore, our method is principally novel in that it allows (i) a hybrid model that integrates both a qualitative discrete model and quantities to be built, even using a small amount of quantitative information, (ii) new quantitative predictions to be derived, (iii) the robustness and relevance of interactions with respect to phenotypic criteria to be precisely quantified, and (iv) the key features of the model to be extracted, which can be used as guidance for designing future experiments.

    An evaluation of three DoE-guided meta-heuristic-based solution methods for a three-echelon sustainable distribution network

    This article evaluates the efficiency of three meta-heuristic optimiser (viz. MOGA-II, MOPSO and NSGA-II)-based solution methods for designing a sustainable three-echelon distribution network. The distribution network employs a bi-objective location-routing model. Due to the NP-hard nature of the model, a commercial multi-disciplinary optimisation platform, modeFRONTIER®, is adopted to utilise the solution methods. The proposed Design of Experiment (DoE)-guided solution methods are two-phased and solve the NP-hard model to attain minimal total costs and total CO2 emission from transportation. Convergence of the optimisers is tested and compared. Ranking of the realistic results is examined using Pareto frontiers and the Technique for Order Preference by Similarity to Ideal Solution approach, followed by determination of the optimal transportation routes. A case of an Irish dairy processing industry’s three-echelon logistics network is considered to validate the solution methods. The results obtained through the proposed methods provide information on open/closed distribution centres (DCs), vehicle routing patterns connecting plants to DCs, open DCs to retailers and retailers to retailers, and the number of trucks required in each route to transport the products. It is found that the DoE-guided NSGA-II optimiser-based solution is more efficient when compared with the DoE-guided MOGA-II and MOPSO optimiser-based solution methods in solving the bi-objective NP-hard three-echelon sustainable model. This efficient solution method enables managers to structure the physical distribution network on the demand side of a logistics network, minimising total cost and total CO2 emission from transportation while satisfying all operational constraints.