12 research outputs found
Rank-based Decomposable Losses in Machine Learning: A Survey
Recent works have revealed an essential paradigm in designing loss functions
that differentiates individual losses from aggregate losses. The individual loss
measures the quality of the model on a sample, while the aggregate loss
combines individual losses/scores over each training sample. Both have a common
procedure that aggregates a set of individual values to a single numerical
value. The ranking order reflects the most fundamental relation among
individual values in designing losses. In addition, decomposability, in which a
loss can be decomposed into an ensemble of individual terms, becomes a
significant property of organizing losses/scores. This survey provides a
systematic and comprehensive review of rank-based decomposable losses in
machine learning. Specifically, we provide a new taxonomy of loss functions
that follows the perspectives of aggregate loss and individual loss. We
identify the aggregator to form such losses, which are examples of set
functions. We organize the rank-based decomposable losses into eight
categories. Following these categories, we review the literature on rank-based
aggregate losses and rank-based individual losses. We describe general formulas
for these losses and connect them with existing research topics. We also
suggest future research directions spanning unexplored, remaining, and emerging
issues in rank-based decomposable losses.
Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI)
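To make the idea of a rank-based aggregate loss from the abstract above concrete, here is a minimal sketch of one well-known example, the average top-k loss, which ranks the individual losses and averages the k largest. The function name `average_top_k_loss` and the sample values are illustrative assumptions, not notation taken from the survey.

```python
def average_top_k_loss(individual_losses, k):
    """Mean of the k largest individual losses (a rank-based aggregate)."""
    if not 1 <= k <= len(individual_losses):
        raise ValueError("k must be between 1 and the number of samples")
    ranked = sorted(individual_losses, reverse=True)  # rank losses, largest first
    return sum(ranked[:k]) / k


losses = [0.1, 2.0, 0.4, 1.5]  # made-up per-sample losses
print(average_top_k_loss(losses, 1))  # k = 1 reduces to the maximum loss
print(average_top_k_loss(losses, 2))  # mean of the two largest losses
print(average_top_k_loss(losses, 4))  # k = n reduces to the ordinary average loss
```

Varying k moves the aggregate between the maximum loss and the average loss, which is why the ranking order of the individual values matters for this family.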
Efficient and Modular Implicit Differentiation
Automatic differentiation (autodiff) has revolutionized machine learning. It
allows expressing complex computations by composing elementary ones in creative
ways and removes the burden of computing their derivatives by hand. More
recently, differentiation of optimization problem solutions has attracted
widespread attention with applications such as optimization as a layer, and in
bi-level problems such as hyper-parameter optimization and meta-learning.
However, the formulas for these derivatives often involve case-by-case tedious
mathematical derivations. In this paper, we propose a unified, efficient and
modular approach for implicit differentiation of optimization problems. In our
approach, the user defines (in Python in the case of our implementation) a
function capturing the optimality conditions of the problem to be
differentiated. Once this is done, we leverage autodiff of this function and implicit
differentiation to automatically differentiate the optimization problem. Our
approach thus combines the benefits of implicit differentiation and autodiff.
It is efficient as it can be added on top of any state-of-the-art solver and
modular as the optimality condition specification is decoupled from the
implicit differentiation mechanism. We show that seemingly simple principles
make it possible to recover many recently proposed implicit differentiation methods and
create new ones easily. We demonstrate the ease of formulating and solving
bi-level optimization problems using our framework. We also showcase an
application to the sensitivity analysis of molecular dynamics.
Comment: V2: some corrections and link to software
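The principle described in the abstract above can be sketched on a one-dimensional toy problem: write down the optimality condition F(x, theta) = 0, solve it with any solver, and obtain the derivative of the solution from the implicit function theorem, dx*/dtheta = -(dF/dx)^(-1) dF/dtheta. This is a hedged illustration only, not the paper's implementation: the objective, the names `optimality`, `solve`, and `implicit_derivative`, and the constant `a` are assumptions, and central finite differences stand in for autodiff so that the sketch needs nothing beyond the Python standard library.

```python
def optimality(x, theta, a=3.0):
    """Gradient in x of 0.5*(x - a)**2 + 0.5*theta*x**2; zero at the solution."""
    return (x - a) + theta * x


def solve(theta, steps=200, lr=0.1):
    """Stand-in solver: plain gradient descent on x (any solver could be used)."""
    x = 0.0
    for _ in range(steps):
        x -= lr * optimality(x, theta)
    return x


def implicit_derivative(theta, eps=1e-6):
    """dx*/dtheta = -(dF/dx)^(-1) * dF/dtheta, evaluated at the solution x*."""
    x_star = solve(theta)
    # Assumption: central finite differences replace autodiff of F here.
    dF_dx = (optimality(x_star + eps, theta) - optimality(x_star - eps, theta)) / (2 * eps)
    dF_dtheta = (optimality(x_star, theta + eps) - optimality(x_star, theta - eps)) / (2 * eps)
    return -dF_dtheta / dF_dx


theta = 1.0
print(solve(theta))                # closed form for this toy problem: a / (1 + theta)
print(implicit_derivative(theta))  # closed form: -a / (1 + theta)**2
```

Note that `implicit_derivative` never differentiates through the iterations of `solve`; it only needs the optimality condition at the solution, which is the decoupling between solver and differentiation that the abstract emphasizes.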
Supervised classification and mathematical optimization
Data Mining techniques often ask for the resolution of optimization problems. Supervised Classification, and, in particular, Support Vector Machines, can be seen as a paradigmatic instance. In this paper, some links between Mathematical Optimization methods and Supervised Classification are emphasized. It is shown that many different areas of Mathematical Optimization play a central role in off-the-shelf Supervised Classification methods. Moreover, Mathematical Optimization turns out to be extremely
useful to address important issues in Classification, such as identifying relevant variables, improving the interpretability of classifiers or dealing with vagueness/noise in the data.
Ministerio de Ciencia e Innovación; Junta de Andalucía
A vision-based optical character recognition system for real-time identification of tractors in a port container terminal
Automation has been seen as a promising solution to increase the productivity of modern sea port container terminals. The potential for increased throughput and work efficiency and for reduced labor cost has lured stakeholders to strive for the introduction of automation in the overall terminal operation. A specific container handling process that is readily amenable to automation is the deployment and control of gantry cranes in the container yard, where the typical operations of truck identification, loading and unloading of containers, and job management are primarily performed manually. To facilitate the overall automation of the gantry crane operation, we devised an approach for the real-time identification of tractors through the recognition of the corresponding number plates that are located on top of the tractor cabin. With this crucial piece of information, remote or automated yard operations can then be performed. A machine vision-based system is introduced whereby these number plates are read and identified in real time while the tractors are operating in the terminal. In this paper, we present the design and implementation of the system and highlight the major difficulties encountered, including recognizing the character information printed on the number plates when image integrity is poor. Working solutions to these problems are proposed and incorporated in the overall identification system.
Job shop scheduling with artificial immune systems
Job shop scheduling is complex because of the dynamic environment. When the information about the jobs and machines is pre-defined and no unexpected events occur, the job shop is static. However, the real scheduling environment is always dynamic because of constantly changing information and various uncertainties. This study discusses this complex job shop scheduling environment and, to solve the problem, applies artificial immune system (AIS) theory together with a switching strategy that changes the sequencing approach to the dispatching approach by taking the system status into account. AIS is a biologically inspired computational paradigm that simulates the mechanisms of the biological immune system. AIS therefore presents appealing features of the immune system that distinguish it from other evolutionary intelligent algorithms, such as self-learning, long-lasting memory, cross-reactive response, discrimination of self from non-self, fault tolerance, and strong adaptability to the environment. These features of AIS are successfully used in this study to solve the job shop scheduling problem. When the job shop environment is static, a sequencing approach based on the clonal selection theory and immune network theory of AIS is applied. This approach performs well in terms of computation time, especially for small problems. The long-lasting memory feature is shown to accelerate the convergence of the algorithm and reduce the computation time. When unexpected events occasionally arrive at the job shop and disrupt the static environment, an extended deterministic dendritic cell algorithm (DCA) based on the DCA theory of AIS is proposed to arrange the rescheduling process so as to balance the efficiency and stability of the system. When disturbances occur continuously, such as the continuous arrival of jobs, the sequencing approach is changed to the dispatching approach, which involves priority dispatching rules (PDRs).
The immune network theory of AIS is applied to propose an idiotypic network model of PDRs to arrange the application of various dispatching rules. The experiments show that the proposed network model presents strong adaptability to the dynamic job shop scheduling environment.
Learning Bayesian networks based on optimization approaches
Learning accurate classifiers from preclassified data is a very active research topic in machine learning and artificial intelligence. There are numerous classifier paradigms, among which Bayesian Networks are very effective and well known in domains with uncertainty. Bayesian Networks are widely used representation frameworks for reasoning with probabilistic information. These models use graphs to capture dependence and independence relationships between feature variables, allowing a concise representation of the knowledge as well as efficient graph-based query processing algorithms. Learning this representation involves two components: structure learning and parameter learning. The structure of the model is a directed acyclic graph. The nodes in the graph correspond to the feature variables in the domain, and the arcs (edges) show the causal relationships between feature variables. A directed edge relates the variables so that the variable corresponding to the terminal node (child) is conditioned on the variable corresponding to the initial node (parent). Parameter learning determines the probabilities and conditional probabilities based on prior information or past experience. The set of probabilities is represented in the conditional probability table. Once the network structure is constructed, probabilistic inferences are readily calculated and can be used to predict the outcome of some variables based on the observations of others. However, structure learning is a complex problem since the number of candidate structures grows exponentially as the number of feature variables increases. This thesis is devoted to the development of methods for learning structures and parameters in Bayesian Networks. Different models based on optimization techniques are introduced to construct an optimal structure of a Bayesian Network.
These models also consider the improvement of the Naive Bayes structure by developing new algorithms to alleviate the independence assumptions. We present various models to learn parameters of Bayesian Networks; in particular, we propose optimization models for the Naive Bayes and the Tree Augmented Naive Bayes by considering different objective functions. To solve the corresponding optimization problems in Bayesian Networks, we develop new optimization algorithms. Local optimization methods are introduced based on the combination of the gradient and Newton methods. It is proved that the proposed methods are globally convergent and have superlinear convergence rates. As a global search we use the global optimization method, AGOP, implemented in the open software library GANSO. We apply the proposed local methods in combination with AGOP. Therefore, the main contributions of this thesis include (a) new algorithms for learning an optimal structure of a Bayesian Network; (b) new models for learning the parameters of Bayesian Networks with the given structures; and finally (c) new optimization algorithms for optimizing the proposed models in (a) and (b). To validate the proposed methods, we conduct experiments across a number of real world problems. Print version is available at: http://library.federation.edu.au/record=b1804607~S4
Doctor of Philosophy
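As a point of reference for the parameter learning described in the abstract above, the following minimal sketch estimates the conditional probability tables of a Naive Bayes classifier by frequency counting with Laplace smoothing. This is the standard baseline, not the optimization-based models the thesis proposes; the function names, the smoothing constant `alpha`, and the toy data are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Estimate class counts and per-feature conditional counts by counting."""
    class_counts = Counter(labels)
    # cond_counts[(feature_index, label)][value] = number of occurrences
    cond_counts = defaultdict(Counter)
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            cond_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, cond_counts, values, alpha


def predict(model, row):
    """Return the class maximizing P(class) * prod_i P(feature_i | class)."""
    class_counts, cond_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, value in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            score *= (cond_counts[(i, label)][value] + alpha) / (count + alpha * len(values[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data: two categorical features per row.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = fit_naive_bayes(rows, labels)
print(predict(model, ("sunny", "hot")))
print(predict(model, ("rainy", "cool")))
```

In this structure every feature node has the class as its only parent, which is the independence assumption that the thesis sets out to alleviate.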