12 research outputs found

    Rank-based Decomposable Losses in Machine Learning: A Survey

    Recent works have revealed an essential paradigm in designing loss functions that differentiates individual losses from aggregate losses. The individual loss measures the quality of the model on a sample, while the aggregate loss combines individual losses/scores over each training sample. Both have a common procedure that aggregates a set of individual values to a single numerical value. The ranking order reflects the most fundamental relation among individual values in designing losses. In addition, decomposability, in which a loss can be decomposed into an ensemble of individual terms, becomes a significant property of organizing losses/scores. This survey provides a systematic and comprehensive review of rank-based decomposable losses in machine learning. Specifically, we provide a new taxonomy of loss functions that follows the perspectives of aggregate loss and individual loss. We identify the aggregator to form such losses, which are examples of set functions. We organize the rank-based decomposable losses into eight categories. Following these categories, we review the literature on rank-based aggregate losses and rank-based individual losses. We describe general formulas for these losses and connect them with existing research topics. We also suggest future research directions spanning unexplored, remaining, and emerging issues in rank-based decomposable losses. Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
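To make the idea of a rank-based aggregate loss concrete, here is a minimal sketch of one well-known aggregator, the average of the k largest individual losses. The abstract does not say which aggregators the survey covers in detail, so this choice is an assumption, and the function name `average_top_k` and the sample values are hypothetical.

```python
def average_top_k(losses, k):
    """Aggregate individual losses by averaging the k largest ones.

    Illustrative only: the aggregator depends on the losses through
    their ranking, and it is a sum of individual terms.
    """
    if not 1 <= k <= len(losses):
        raise ValueError("k must satisfy 1 <= k <= len(losses)")
    ranked = sorted(losses, reverse=True)  # largest individual loss first
    return sum(ranked[:k]) / k


# Hypothetical per-sample losses for four training examples.
individual = [0.25, 2.0, 0.5, 1.25]

mean_loss = average_top_k(individual, len(individual))  # k = n: ordinary average
max_loss = average_top_k(individual, 1)                 # k = 1: worst-case loss
top2_loss = average_top_k(individual, 2)                # in between the two
```

Setting k to the number of samples gives the usual average loss and k = 1 gives the maximum loss, so a single parameter moves between the two familiar aggregates.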

    Efficient and Modular Implicit Differentiation

    Automatic differentiation (autodiff) has revolutionized machine learning. It allows expressing complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently, differentiation of optimization problem solutions has attracted widespread attention with applications such as optimization as a layer, and in bi-level problems such as hyper-parameter optimization and meta-learning. However, the formulas for these derivatives often involve case-by-case tedious mathematical derivations. In this paper, we propose a unified, efficient and modular approach for implicit differentiation of optimization problems. In our approach, the user defines (in Python in the case of our implementation) a function F capturing the optimality conditions of the problem to be differentiated. Once this is done, we leverage autodiff of F and implicit differentiation to automatically differentiate the optimization problem. Our approach thus combines the benefits of implicit differentiation and autodiff. It is efficient as it can be added on top of any state-of-the-art solver and modular as the optimality condition specification is decoupled from the implicit differentiation mechanism. We show that seemingly simple principles make it possible to recover many recently proposed implicit differentiation methods and create new ones easily. We demonstrate the ease of formulating and solving bi-level optimization problems using our framework. We also showcase an application to the sensitivity analysis of molecular dynamics. Comment: V2: some corrections and link to software.
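The principle described in this abstract can be illustrated on a scalar toy problem. The sketch below is not the authors' software: the partial derivatives of F are written by hand where an autodiff tool would normally supply them, and every name in it (`F`, `dF_dx`, `dF_dtheta`, `solve`, `implicit_derivative`) is hypothetical. It assumes the problem min_x x^4/4 + x^2/2 - theta*x, whose optimality condition is F(x, theta) = x^3 + x - theta = 0, and applies the implicit function theorem, dx*/dtheta = -(dF/dx)^(-1) dF/dtheta.

```python
def F(x, theta):
    """Optimality condition of f(x, theta) = x**4/4 + x**2/2 - theta*x."""
    return x**3 + x - theta


def dF_dx(x, theta):
    # Hand-derived partial derivative (stand-in for autodiff).
    return 3 * x**2 + 1


def dF_dtheta(x, theta):
    # Hand-derived partial derivative (stand-in for autodiff).
    return -1.0


def solve(theta, x0=0.0, tol=1e-12, max_iter=100):
    """Any solver works here; Newton's method on F(x, theta) = 0 is used."""
    x = x0
    for _ in range(max_iter):
        step = F(x, theta) / dF_dx(x, theta)
        x -= step
        if abs(step) < tol:
            break
    return x


def implicit_derivative(theta):
    """dx*/dtheta from the implicit function theorem, evaluated at the solution."""
    x_star = solve(theta)
    return -dF_dtheta(x_star, theta) / dF_dx(x_star, theta)


theta = 2.0
x_star = solve(theta)              # solves x**3 + x - 2 = 0
grad = implicit_derivative(theta)  # derivative of the solution w.r.t. theta

# Sanity check against central finite differences through the solver.
eps = 1e-6
fd_grad = (solve(theta + eps) - solve(theta - eps)) / (2 * eps)
```

The solver and the derivative computation are decoupled: `implicit_derivative` only needs the solution and the partial derivatives of F, not the iterations that produced it.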

    Supervised classification and mathematical optimization

    Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data. Ministerio de Ciencia e Innovación; Junta de Andalucía.

    Supervised Classification and Mathematical Optimization

    Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data.

    A vision-based optical character recognition system for real-time identification of tractors in a port container terminal

    Automation has been seen as a promising solution to increase the productivity of modern sea port container terminals. The potential for increased throughput and work efficiency and for reduced labor cost has led stakeholders to strive for the introduction of automation in the overall terminal operation. A specific container handling process that is readily amenable to automation is the deployment and control of gantry cranes in the container yard, where truck identification, loading and unloading of containers, and job management are primarily performed manually in a typical terminal. To facilitate the overall automation of the gantry crane operation, we devised an approach for the real-time identification of tractors through the recognition of the corresponding number plates, which are located on top of the tractor cabin. With this crucial piece of information, remote or automated yard operations can then be performed. A machine vision-based system is introduced whereby these number plates are read and identified in real time while the tractors are operating in the terminal. In this paper, we present the design and implementation of the system and highlight the major difficulties encountered, including the recognition of character information printed on the number plates under poor image integrity. Working solutions are proposed to address these problems and are incorporated in the overall identification system. Postprint.

    Job shop scheduling with artificial immune systems

    Job shop scheduling is complex due to the dynamic environment. When the information on the jobs and machines is pre-defined and no unexpected events occur, the job shop is static. However, the real scheduling environment is always dynamic due to constantly changing information and various uncertainties. This study discusses this complex job shop scheduling environment and applies AIS theory together with a switching strategy that changes the sequencing approach to the dispatching approach, taking into account the system status, to solve this problem. AIS is a biologically inspired computational paradigm that simulates the mechanisms of the biological immune system. AIS therefore presents appealing features of the immune system that distinguish it from other evolutionary intelligent algorithms, such as self-learning, long-lasting memory, cross-reactive response, discrimination of self from non-self, fault tolerance, and strong adaptability to the environment. These features of AIS are successfully used in this study to solve the job shop scheduling problem. When the job shop environment is static, a sequencing approach based on the clonal selection theory and immune network theory of AIS is applied. This approach achieves great performance, especially for small-size problems in terms of computation time. The feature of long-lasting memory is shown to accelerate the convergence rate of the algorithm and reduce the computation time. When unexpected events occasionally arrive at the job shop and disrupt the static environment, an extended deterministic dendritic cell algorithm (DCA) based on the DCA theory of AIS is proposed to arrange the rescheduling process and balance the efficiency and stability of the system. When disturbances occur continuously, such as continuous job arrivals, the sequencing approach is changed to the dispatching approach, which involves priority dispatching rules (PDRs). The immune network theory of AIS is applied to propose an idiotypic network model of PDRs to arrange the application of various dispatching rules. The experiments show that the proposed network model presents strong adaptability to the dynamic job shop scheduling environment. Postprint.

    Learning Bayesian networks based on optimization approaches

    Learning accurate classifiers from preclassified data is a very active research topic in machine learning and artificial intelligence. There are numerous classifier paradigms, among which Bayesian Networks are very effective and well known in domains with uncertainty. Bayesian Networks are widely used representation frameworks for reasoning with probabilistic information. These models use graphs to capture dependence and independence relationships between feature variables, allowing a concise representation of the knowledge as well as efficient graph-based query processing algorithms. This representation is defined by two components: structure learning and parameter learning. The structure of this model represents a directed acyclic graph. The nodes in the graph correspond to the feature variables in the domain, and the arcs (edges) show the causal relationships between feature variables. A directed edge relates the variables so that the variable corresponding to the terminal node (child) will be conditioned on the variable corresponding to the initial node (parent). The parameter learning represents probabilities and conditional probabilities based on prior information or past experience. The set of probabilities is represented in the conditional probability table. Once the network structure is constructed, the probabilistic inferences are readily calculated, and can be performed to predict the outcome of some variables based on the observations of others. However, structure learning is a complex problem since the number of candidate structures grows exponentially as the number of feature variables increases. This thesis is devoted to the development of learning structures and parameters in Bayesian Networks. Different models based on optimization techniques are introduced to construct an optimal structure of a Bayesian Network. These models also consider the improvement of the Naive Bayes structure by developing new algorithms to alleviate the independence assumptions. We present various models to learn parameters of Bayesian Networks; in particular we propose optimization models for the Naive Bayes and the Tree Augmented Naive Bayes by considering different objective functions. To solve corresponding optimization problems in Bayesian Networks, we develop new optimization algorithms. Local optimization methods are introduced based on the combination of the gradient and Newton methods. It is proved that the proposed methods are globally convergent and have superlinear convergence rates. As a global search we use the global optimization method, AGOP, implemented in the open software library GANSO. We apply the proposed local methods in combination with AGOP. Therefore, the main contributions of this thesis include (a) new algorithms for learning an optimal structure of a Bayesian Network; (b) new models for learning the parameters of Bayesian Networks with the given structures; and finally (c) new optimization algorithms for optimizing the proposed models in (a) and (b). To validate the proposed methods, we conduct experiments across a number of real world problems. Print version is available at: http://library.federation.edu.au/record=b1804607~S4 Doctor of Philosophy.