20 research outputs found

    Bundle methods in nonsmooth DC optimization

    Get PDF
    Due to the complexity of many practical applications, we encounter optimization problems with nonsmooth functions, that is, functions which are not continuously differentiable everywhere. Classical gradient-based methods are not applicable to solve such problems, since they may fail in the nonsmooth setting. Therefore, it is imperative to develop numerical methods specifically designed for nonsmooth optimization. To date, bundle methods are considered to be the most efficient and reliable general purpose solvers for this type of problems. The idea in bundle methods is to approximate the subdifferential of the objective function by a bundle of subgradients. This information is then used to build a model for the objective. However, this model is typically convex and, due to this, it may be inaccurate and unable to adequately reflect the behaviour of the objective function in the nonconvex case. These circumstances motivate to design new bundle methods based on nonconvex models of the objective function. In this dissertation, the main focus is on nonsmooth DC optimization that constitutes an important and broad subclass of nonconvex optimization problems. A DC function can be presented as a difference of two convex functions. Thus, we can obtain a model that utilizes explicitly both the convexity and concavity of the objective by approximating separately the convex and concave parts. This way we end up with a nonconvex DC model describing the problem more accurately than the convex one. Based on the new DC model we introduce three different bundle methods. Two of them are designed for unconstrained DC optimization and the third one is capable of solving also multiobjective and constrained DC problems. The finite convergence is proved for each method. The numerical results demonstrate the efficiency of the methods and show the benefits obtained from the utilization of the DC decomposition. Even though the usage of the DC decomposition can improve the performance of the bundle methods, it is not always available or possible to construct. Thus, we present another bundle method for a general objective function implicitly collecting information about the DC structure. This method is developed for large-scale nonsmooth optimization and its convergence is proved for semismooth functions. The efficiency of the method is shown with numerical results. As an application of the developed methods, we consider the clusterwise linear regression (CLR) problems. By applying the support vector machines (SVM) approach a new model for these problems is proposed. The objective in the new formulation of the CLR problem is expressed as a DC function and a method based on one of the presented bundle methods is designed to solve it. Numerical results demonstrate robustness of the new approach to outliers.Monissa käytännön sovelluksissa tarkastelun kohteena oleva ongelma on monimutkainen ja joudutaan näin ollen mallintamaan epäsileillä funktioilla, jotka eivät välttämättä ole jatkuvasti differentioituvia kaikkialla. Klassisia gradienttiin perustuvia optimointimenetelmiä ei voida käyttää epäsileisiin tehtäviin, sillä epäsileillä funktioilla ei ole olemassa klassista gradienttia kaikkialla. Näin ollen epäsileään optimointiin on välttämätöntä kehittää omia numeerisia ratkaisumenetelmiä. Näistä kimppumenetelmiä pidetään tällä hetkellä kaikista tehokkaimpina ja luotettavimpina yleismenetelminä kyseisten tehtävien ratkaisemiseksi. Ideana kimppumenetelmissä on approksimoida kohdefunktion alidifferentiaalia kimpulla, joka on muodostettu keräämällä kohdefunktion aligradientteja edellisiltä iteraatiokierroksilta. Tätä tietoa hyödyntämällä voidaan muodostaa kohdefunktiolle malli, joka on alkuperäistä tehtävää helpompi ratkaista. Käytetty malli on tyypillisesti konveksi ja näin ollen se voi olla epätarkka ja kykenemätön esittämään alkuperäisen tehtävän rakennetta epäkonveksissa tapauksessa. Tästä syystä väitöskirjassa keskitytään kehittämään uusia kimppumenetelmiä, jotka mallinnusvaiheessa muodostavat kohdefunktiolle epäkonveksin mallin. Pääpaino väitöskirjassa on epäsileissä optimointitehtävissä, joissa funktiot voidaan esittää kahden konveksin funktion erotuksena (difference of two convex functions). Kyseisiä funktioita kutsutaan DC-funktioiksi ja ne muodostavat tärkeän ja laajan epäkonveksien funktioiden osajoukon. Tämä valinta mahdollistaa kohdefunktion konveksisuuden ja konkaavisuuden eksplisiittisen hyödyntämisen, sillä uusi malli kohdefunktiolle muodostetaan yhdistämällä erilliset konveksille ja konkaaville osalle rakennetut mallit. Tällä tavalla päädytään epäkonveksiin DC-malliin, joka pystyy kuvaamaan ratkaistavaa tehtävää tarkemmin kuin konveksi arvio. Väitöskirjassa esitetään kolme erilaista uuden DC-mallin pohjalta kehitettyä kimppumenetelmää sekä todistetaan menetelmien konvergenssit. Kaksi näistä menetelmistä on suunniteltu rajoitteettomaan DC-optimointiin ja kolmannella voidaan ratkaista myös monitavoitteisia ja rajoitteellisia DC-optimointitehtäviä. Numeeriset tulokset havainnollistavat menetelmien tehokkuutta sekä DC-hajotelman käytöstä saatuja etuja. Vaikka DC-hajotelman käyttö voi parantaa kimppumenetelmien suoritusta, sitä ei aina ole saatavilla tai mahdollista muodostaa. Tästä syystä väitöskirjassa esitetään myös neljäs kimppumenetelmä konvergenssitodistuksineen yleiselle kohdefunktiolle, jossa kerätään implisiittisesti tietoa kohdefunktion DC-rakenteesta. Menetelmä on kehitetty erityisesti suurille epäsileille optimointitehtäville ja sen tehokkuus osoitetaan numeerisella testauksella Sovelluksena väitöskirjassa tarkastellaan datalle klustereittain tehtävää lineaarista regressiota (clusterwise linear regression). Kyseiselle sovellukselle muodostetaan uusi malli hyödyntäen koneoppimisessa käytettyä SVM-lähestymistapaa (support vector machines approach) ja saatu kohdefunktio esitetään DC-funktiona. Näin ollen yhtä kehitetyistä kimppumenetelmistä sovelletaan tehtävän ratkaisemiseen. Numeeriset tulokset havainnollistavat uuden lähestymistavan robustisuutta ja tehokkuutta

    A unified framework for bivariate clustering and regression problems via mixed-integer linear programming

    Get PDF
    Clustering and regression are two of the most important problems in data analysis and machine learning. Recently, mixed-integer linear programs (MILPs) have been presented in the literature to solve these problems. By modelling the problems as MILPs, they are able to be solved very quickly by commercial solvers. In particular, MILPs for bivariate clusterwise linear regression (CLR) and (continuous) piecewise linear regression (PWLR) have recently appeared. These MILP models make use of binary variables and logical implications modelled through big-M\mathcal{M} constraints. In this paper, we present these models in the context of a unifying MILP framework for bivariate clustering and regression problems. We then present two new formulations within this framework, the first for ordered CLR, and the second for clusterwise piecewise linear regression (CPWLR). The CPWLR problem concerns simultaneously clustering discrete data, while modelling each cluster with a continuous PWL function. Extending upon the framework, we discuss how outlier detection can be implemented within the models, and how specific decomposition methods can be used to find speedups in the runtime. Experimental results show when each model is the most effective

    Generalized Clusterwise Regression for Simultaneous Estimation of Optimal Pavement Clusters and Performance Models

    Full text link
    The existing state-of-the-art approach of Clusterwise Regression (CR) to estimate pavement performance models (PPMs) pre-specifies explanatory variables without testing their significance; as an input, this approach requires the number of clusters for a given data set. Time-consuming ‘trial and error’ methods are required to determine the optimal number of clusters. A common objective function is the minimization of the total sum of squared errors (SSE). Given that SSE decreases monotonically as a function of the number of clusters, the optimal number of clusters with minimum SSE always is the total number of data points. Hence, the minimization of SSE is not the best objective function to seek for an optimal number of clusters. In previous studies, the PPMs were restricted to be either linear or nonlinear, irrespective of which functional form provided the best results. The existing mathematical programming formulations did not include constraints that ensured the minimum number of observations required in each cluster to achieve statistical significance. In addition, a pavement sample could be associated with multiple performance models. Hence, additional modeling was required to combine the results from multiple models. To address all these limitations, this research proposes a generalized CR that simultaneously 1) finds the optimal number of pavement clusters, 2) assigns pavement samples into clusters, 3) estimates the coefficients of cluster-specific explanatory variables, and 4) determines the best functional form between linear and nonlinear models. Linear and nonlinear functional forms were investigated to select the best model specification. A mixed-integer nonlinear mathematical program was formulated with the Bayesian Information Criteria (BIC) as the objective function. The advantage of using BIC is that it penalizes for including additional parameters (i.e., number of clusters and/or explanatory variables). Hence, the optimal CR models provided a balance between goodness of fit and model complexity. In addition, the search process for the best model specification using BIC has the property of consistency, which asymptotically selects this model with a probability of ‘1’. Comprehensive solution algorithms – Simulated Annealing coupled with Ordinary Least Squares for linear models and All Subsets Regression for nonlinear models – were implemented to solve the proposed mathematical problem. The algorithms selected the best model specification for each cluster after exploring all possible combinations of potentially significant explanatory variables. Potential multicollinearity issues were investigated and addressed as required. Variables identified as significant explanatory variables were average daily traffic, pavement age, rut depth along the pavement, annual average precipitation and minimum temperature, road functional class, prioritization category, and the number of lanes. All these variables were considered in the literature as the most critical factors for pavement deterioration. In addition, the predictive capability of the estimated models was investigated. The results showed that the models were robust without any overfitting issues, and provided small prediction errors. The models developed using the proposed approach provided superior explanatory power compared to those that were developed using the existing state-of-the-art approach of clusterwise regression. In particular, for the data set used in this research, nonlinear models provided better explanatory power than did the linear models. As expected, the results illustrated that different clusters might require different explanatory variables and associated coefficients. Similarly, determining the optimal number of clusters while estimating the corresponding PPMs contributed significantly to reduce the estimation error

    Mine evaluation optimisation

    Get PDF
    The definition of a mineral resource during exploration is a fundamental part of lease evaluation, which establishes the fair market value of the entire asset being explored in the open market. Since exact prediction of grades between sampled points is not currently possible by conventional methods, an exact agreement between predicted and actual grades will nearly always contain some error. These errors affect the evaluation of resources so impacting on characterisation of risks, financial projections and decisions about whether it is necessary to carry on with the further phases or not. The knowledge about minerals below the surface, even when it is based upon extensive geophysical analysis and drilling, is often too fragmentary to indicate with assurance where to drill, how deep to drill and what can be expected. Thus, the exploration team knows only the density of the rock and the grade along the core. The purpose of this study is to improve the process of resource evaluation in the exploration stage by increasing prediction accuracy and making an alternative assessment about the spatial characteristics of gold mineralisation. There is significant industrial interest in finding alternatives which may speed up the drilling phase, identify anomalies, worthwhile targets and help in establishing fair market value. Recent developments in nonconvex optimisation and high-dimensional statistics have led to the idea that some engineering problems such as predicting gold variability at the exploration stage can be solved with the application of clusterwise linear and penalised maximum likelihood regression techniques. This thesis attempts to solve the distribution of the mineralisation in the underlying geology using clusterwise linear regression and convex Least Absolute Shrinkage and Selection Operator (LASSO) techniques. The two presented optimisation techniques compute predictive solutions within a domain using physical data provided directly from drillholes. The decision-support techniques attempt a useful compromise between the traditional and recently introduced methods in optimisation and regression analysis that are developed to improve exploration targeting and to predict the gold occurrences at previously unsampled locations.Doctor of Philosoph

    Branching strategies for mixed-integer programs containing logical constraints and decomposable structure

    Get PDF
    Decision-making optimisation problems can include discrete selections, e.g. selecting a route, arranging non-overlapping items or designing a network of items. Branch-and-bound (B&B), a widely applied divide-and-conquer framework, often solves such problems by considering a continuous approximation, e.g. replacing discrete variable domains by a continuous superset. Such approximations weaken the logical relations, e.g. for discrete variables corresponding to Boolean variables. Branching in B&B reintroduces logical relations by dividing the search space. This thesis studies designing B&B branching strategies, i.e. how to divide the search space, for optimisation problems that contain both a logical and a continuous structure. We begin our study with a large-scale, industrially-relevant optimisation problem where the objective consists of machine-learnt gradient-boosted trees (GBTs) and convex penalty functions. GBT functions contain if-then queries which introduces a logical structure to this problem. We propose decomposition-based rigorous bounding strategies and an iterative heuristic that can be embedded into a B&B algorithm. We approach branching with two strategies: a pseudocost initialisation and strong branching that target the structure of GBT and convex penalty aspects of the optimisation objective, respectively. Computational tests show that our B&B approach outperforms state-of-the-art solvers in deriving rigorous bounds on optimality. Our second project investigates how satisfiability modulo theories (SMT) derived unsatisfiable cores may be utilised in a B&B context. Unsatisfiable cores are subsets of constraints that explain an infeasible result. We study two-dimensional bin packing (2BP) and develop a B&B algorithm that branches on SMT unsatisfiable cores. We use the unsatisfiable cores to derive cuts that break 2BP symmetries. Computational results show that our B&B algorithm solves 20% more instances when compared with commercial solvers on the tested instances. Finally, we study convex generalized disjunctive programming (GDP), a framework that supports logical variables and operators. Convex GDP includes disjunctions of mathematical constraints, which motivate branching by partitioning the disjunctions. We investigate separation by branching, i.e. eliminating solutions that prevent rigorous bound improvement, and propose a greedy algorithm for building the branches. We propose three scoring methods for selecting the next branching disjunction. We also analyse how to leverage infeasibility to expedite the B&B search. Computational results show that our scoring methods can reduce the number of explored B&B nodes by an order of magnitude when compared with scoring methods proposed in literature. Our infeasibility analysis further reduces the number of explored nodes.Open Acces

    Rainfall prediction in Australia : Clusterwise linear regression approach

    Get PDF
    Accurate rainfall prediction is a challenging task because of the complex physical processes involved. This complexity is compounded in Australia as the climate can be highly variable. Accurate rainfall prediction is immensely benecial for making informed policy, planning and management decisions, and can assist with the most sustainable operation of water resource systems. Short-term prediction of rainfall is provided by meteorological services; however, the intermediate to long-term prediction of rainfall remains challenging and contains much uncertainty. Many prediction approaches have been proposed in the literature, including statistical and computational intelligence approaches. However, finding a method to model the complex physical process of rainfall, especially in Australia where the climate is highly variable, is still a major challenge. The aims of this study are to: (a) develop an optimization based clusterwise linear regression method, (b) develop new prediction methods based on clusterwise linear regression, (c) assess the influence of geographic regions on the performance of prediction models in predicting monthly and weekly rainfall in Australia, (d) determine the combined influence of meteorological variables on rainfall prediction in Australia, and (e) carry out a comparative analysis of new and existing prediction techniques using Australian rainfall data. In this study, rainfall data with five input meteorological variables from 24 geographically diverse weather stations in Australia, over the period January 1970 to December 2014, have been taken from the Scientific Information for Land Owners (SILO). We also consider the climate zones when selecting weather stations, because Australia experiences a variety of climates due to its size. The data was divided into training and testing periods for evaluation purposes. In this study, optimization based clusterwise linear regression is modified and new prediction methods are developed for rainfall prediction. The proposed method is applied to predict monthly and weekly rainfall. The prediction performance of the clusterwise linear regression method was evaluated by comparing observed and predicted rainfall values using the performance measures: root mean squared error, the mean absolute error, the mean absolute scaled error and the Nash-Sutclie coefficient of efficiency. The proposed method is also compared with the clusterwise linear regression based on the maximum likelihood estimation, linear support vector machines for regression, support vector machines for regression with radial basis kernel function, multiple linear regression, artificial neural networks with and without hidden layer and k-nearest neighbours methods using computational results. Initially, to determine the appropriate input variables to be used in the investigation, we assessed all combinations of meteorological variables. The results confirm that single meteorological variables alone are unable to predict rainfall accurately. The prediction performance of all selected models was improved by adding the input variables in most locations. To assess the influence of geographic regions on the performance of prediction models and to compare the prediction performance of models, we trained models with the best combination of input variables and predicted monthly and weekly rainfall over the test periods. The results of this analysis confirm that the prediction performance of all selected models varied considerably with geographic regions for both weekly and monthly rainfall predictions. It is found that models have the lowest prediction error in the desert climate zone and highest in subtropical and tropical zones. The results also demonstrate that the proposed algorithm is capable of finding the patterns and trends of the observations for monthly and weekly rainfall predictions in all geographic regions. In desert, tropical and subtropical climate zones, the proposed method outperform other methods in most locations for both monthly and weekly rainfall predictions. In temperate and grassland zones the prediction performance of the proposed model is better in some locations while in the remaining locations it is slightly lower than the other models.Doctor of Philosoph

    From metaheuristics to learnheuristics: Applications to logistics, finance, and computing

    Get PDF
    Un gran nombre de processos de presa de decisions en sectors estratègics com el transport i la producció representen problemes NP-difícils. Sovint, aquests processos es caracteritzen per alts nivells d'incertesa i dinamisme. Les metaheurístiques són mètodes populars per a resoldre problemes d'optimització difícils en temps de càlcul raonables. No obstant això, sovint assumeixen que els inputs, les funcions objectiu, i les restriccions són deterministes i conegudes. Aquests constitueixen supòsits forts que obliguen a treballar amb problemes simplificats. Com a conseqüència, les solucions poden conduir a resultats pobres. Les simheurístiques integren la simulació a les metaheurístiques per resoldre problemes estocàstics d'una manera natural. Anàlogament, les learnheurístiques combinen l'estadística amb les metaheurístiques per fer front a problemes en entorns dinàmics, en què els inputs poden dependre de l'estructura de la solució. En aquest context, les principals contribucions d'aquesta tesi són: el disseny de les learnheurístiques, una classificació dels treballs que combinen l'estadística / l'aprenentatge automàtic i les metaheurístiques, i diverses aplicacions en transport, producció, finances i computació.Un gran número de procesos de toma de decisiones en sectores estratégicos como el transporte y la producción representan problemas NP-difíciles. Frecuentemente, estos problemas se caracterizan por altos niveles de incertidumbre y dinamismo. Las metaheurísticas son métodos populares para resolver problemas difíciles de optimización de manera rápida. Sin embargo, suelen asumir que los inputs, las funciones objetivo y las restricciones son deterministas y se conocen de antemano. Estas fuertes suposiciones conducen a trabajar con problemas simplificados. Como consecuencia, las soluciones obtenidas pueden tener un pobre rendimiento. Las simheurísticas integran simulación en metaheurísticas para resolver problemas estocásticos de una manera natural. De manera similar, las learnheurísticas combinan aprendizaje estadístico y metaheurísticas para abordar problemas en entornos dinámicos, donde los inputs pueden depender de la estructura de la solución. En este contexto, las principales aportaciones de esta tesis son: el diseño de las learnheurísticas, una clasificación de trabajos que combinan estadística / aprendizaje automático y metaheurísticas, y varias aplicaciones en transporte, producción, finanzas y computación.A large number of decision-making processes in strategic sectors such as transport and production involve NP-hard problems, which are frequently characterized by high levels of uncertainty and dynamism. Metaheuristics have become the predominant method for solving challenging optimization problems in reasonable computing times. However, they frequently assume that inputs, objective functions and constraints are deterministic and known in advance. These strong assumptions lead to work on oversimplified problems, and the solutions may demonstrate poor performance when implemented. Simheuristics, in turn, integrate simulation into metaheuristics as a way to naturally solve stochastic problems, and, in a similar fashion, learnheuristics combine statistical learning and metaheuristics to tackle problems in dynamic environments, where inputs may depend on the structure of the solution. The main contributions of this thesis include (i) a design for learnheuristics; (ii) a classification of works that hybridize statistical and machine learning and metaheuristics; and (iii) several applications for the fields of transport, production, finance and computing

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    Get PDF
    The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group - a renowned Portuguese hotel chain. An efficiency ranking is established from these four hotel units located in Portugal using Stochastic Frontier Analysis. This methodology allows to discriminate between measurement error and systematic inefficiencies in the estimation process enabling to investigate the main inefficiency causes. Several suggestions concerning efficiency improvement are undertaken for each hotel studied.info:eu-repo/semantics/publishedVersio
    corecore