The primal power affine scaling method
In this paper, we present a variant of the primal affine scaling method, which we call the primal power affine scaling method. This method is defined by choosing a real r > 0.5, and is similar to the power barrier variant of the primal-dual homotopy methods considered by den Hertog, Roos and Terlaky and by Sheu and Fang. Here, we analyze the methods for r > 1. The analysis for 0.50 2/(2 r -1) and with a variable asymptotic step size α_k uniformly bounded away from 2/(2r + 1), the primal sequence converges to the relative interior of the optimal primal face, and the dual sequence converges to the power center of the optimal dual face. We also present an accelerated version of the method. We show that the two-step superlinear convergence rate of the method is 1 + r/(r + 1), while the three-step convergence rate is 1 + 3r/(r + 2). Using the measure of Ostrowski, we note that the three-step method for r = 4 is more efficient than the two-step quadratically convergent method, which is the limit of the two-step method as r approaches infinity. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/44270/1/10479_2005_Article_BF02206824.pd
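The efficiency comparison at the end of this abstract can be checked with a few lines of arithmetic. The sketch below assumes that the "measure of Ostrowski" is the usual efficiency index, order ** (1 / steps), and that each step costs one unit of work; the abstract states neither, and the function names are illustrative only.

```python
def two_step_order(r: float) -> float:
    """Two-step convergence rate quoted in the abstract: 1 + r/(r + 1)."""
    return 1.0 + r / (r + 1.0)


def three_step_order(r: float) -> float:
    """Three-step convergence rate quoted in the abstract: 1 + 3r/(r + 2)."""
    return 1.0 + 3.0 * r / (r + 2.0)


def ostrowski_efficiency(order: float, steps: int) -> float:
    """Assumed efficiency index: order ** (1 / steps), one work unit per step."""
    return order ** (1.0 / steps)


r = 4.0
# Three-step method at r = 4 has order 1 + 12/6 = 3.
three_step = ostrowski_efficiency(three_step_order(r), 3)
# The two-step method tends to order 2 (quadratic) as r grows without bound.
two_step_limit = ostrowski_efficiency(2.0, 2)

print(f"three-step, r=4: {three_step:.4f}")
print(f"two-step limit:  {two_step_limit:.4f}")
```

Under these assumptions the three-step index is the cube root of 3 and the two-step limit is the square root of 2, so the ordering agrees with the claim in the abstract.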
FROM OPTIMIZATION TO EQUILIBRATION: UNDERSTANDING AN EMERGING PARADIGM IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
Many existing machine learning (ML) algorithms cannot be viewed as gradient descent on some single objective. The solution trajectories taken by these algorithms naturally exhibit rotation, sometimes forming cycles, a behavior that is not expected with (full-batch) gradient descent. However, these algorithms can be viewed more generally as solving for the equilibrium of a game with possibly multiple competing objectives. Moreover, some recent ML models, specifically generative adversarial networks (GANs) and their variants, are now explicitly formulated as equilibrium problems. Equilibrium problems present challenges beyond those encountered in optimization, such as limit cycles and chaotic attractors, and are able to abstract away some of the difficulties encountered when training models like GANs.
In this thesis, I aim to advance our understanding of equilibrium problems so as to improve the state of the art in GANs and related domains. In the following chapters, I will present work on designing a no-regret framework for solving monotone equilibrium problems in online or streaming settings (with applications to Reinforcement Learning), ensuring convergence when training a GAN to fit a normal distribution to data by Crossing-the-Curl, improving state-of-the-art image generation with techniques derived from theory, and borrowing tools from dynamical systems theory for analyzing the complex dynamics of GAN training.
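The rotation described in this abstract shows up already in a two-variable toy game. The sketch below is not taken from the thesis: it assumes the bilinear game f(x, y) = x * y, in which x minimizes and y maximizes, and applies simultaneous gradient steps. The function names, step size, and starting point are arbitrary choices.

```python
import math


def simultaneous_gradient_step(x: float, y: float, lr: float) -> tuple[float, float]:
    """One simultaneous update on the assumed toy game f(x, y) = x * y."""
    grad_x = y  # df/dx, used by the minimizing player
    grad_y = x  # df/dy, used by the maximizing player
    return x - lr * grad_x, y + lr * grad_y


def run(steps: int = 100, lr: float = 0.1, x: float = 1.0, y: float = 0.0) -> list[float]:
    """Return the distance from the equilibrium (0, 0) after each step."""
    radii = []
    for _ in range(steps):
        x, y = simultaneous_gradient_step(x, y, lr)
        radii.append(math.hypot(x, y))
    return radii


radii = run()
print(f"radius after 1 step:    {radii[0]:.4f}")
print(f"radius after 100 steps: {radii[-1]:.4f}")
```

Each step multiplies the distance from the equilibrium by sqrt(1 + lr**2), so the iterates rotate around (0, 0) while slowly spiraling outward instead of converging.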
Making decisions based on context: models and applications in cognitive sciences and natural language processing
It is known that humans are capable of making decisions based on context and generalizing what they have learned. This dissertation considers two related problem areas and proposes different models that take context information into account. By including the context, the proposed models exhibit strong performance in each of the problem areas considered.
The first problem area focuses on a context association task studied in cognitive science, which evaluates the ability of a learning agent to associate specific stimuli with an appropriate response in particular spatial contexts. Four neural circuit models are proposed to model how the stimulus and context information are processed to produce a response. The neural networks are trained by modifying the strength of neural connections (weights) using principles of Hebbian learning. Such learning is considered biologically plausible, in contrast to backpropagation techniques that do not have a solid neurophysiological basis. A series of theoretical results for the neural circuit models are established, guaranteeing convergence to an optimal configuration when all the stimulus-context pairs are provided during training. Among all the models, a specific model based on ideas from recommender systems and trained with a primal-dual update rule achieves perfect performance in learning and generalizing the mapping from context-stimulus pairs to correct responses.
The second problem area considered in the thesis focuses on clinical natural language processing (NLP). A particular application is the development of deep-learning models for analyzing radiology reports. Four NLP tasks are considered, including anatomy named entity recognition, negation detection, incidental finding detection, and clinical concept extraction. A hierarchical Recurrent Neural Network (RNN) is proposed for anatomy named entity recognition, which is then used to produce a set of features for incidental finding detection of pulmonary nodules. A clinical context word embedding model is obtained, which is used with an RNN to model clinical concept extraction. Finally, feature-enriched RNN and transformer-based models with contextual word embedding are proposed for negation detection. All these models take the (clinical) context information into account. The models are evaluated on different datasets and are shown to achieve strong performance, largely outperforming the state of the art.
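The Hebbian weight update mentioned for the first problem area can be illustrated with a generic associative memory. This is a minimal sketch of the textbook rule (weight change proportional to the product of input and output activity), not one of the four circuit models from the dissertation. The one-hot encoding, the toy mapping, and all names are hypothetical, and the example shows only recall of trained pairs, not generalization.

```python
def one_hot(index: int, size: int) -> list[float]:
    """Hypothetical encoding: one unit per stimulus-context pair or response."""
    return [1.0 if k == index else 0.0 for k in range(size)]


def train_hebbian(pairs, n_inputs: int, n_outputs: int, lr: float = 1.0):
    """Generic Hebbian rule: weights[i][j] += lr * output[i] * input[j]."""
    weights = [[0.0] * n_inputs for _ in range(n_outputs)]
    for x, y in pairs:
        for i in range(n_outputs):
            for j in range(n_inputs):
                weights[i][j] += lr * y[i] * x[j]
    return weights


def respond(weights, x) -> int:
    """Pick the response unit with the largest weighted input."""
    activations = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
    return max(range(len(activations)), key=lambda i: activations[i])


# Toy task (assumed): 2 stimuli x 2 contexts give 4 input patterns, 2 responses.
mapping = {0: 0, 1: 1, 2: 1, 3: 0}
pairs = [(one_hot(k, 4), one_hot(v, 2)) for k, v in mapping.items()]
weights = train_hebbian(pairs, n_inputs=4, n_outputs=2)

for k, v in mapping.items():
    print(k, respond(weights, one_hot(k, 4)), v)
```

Because the input patterns are orthogonal, one pass over all pairs is enough for every trained pair to be recalled correctly.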
Conditional Gradient Methods
The purpose of this survey is to serve both as a gentle introduction and a
coherent overview of state-of-the-art Frank--Wolfe algorithms, also called
conditional gradient algorithms, for function minimization. These algorithms
are especially useful in convex optimization when linear optimization is
cheaper than projections.
The selection of the material has been guided by the principle of
highlighting crucial ideas as well as presenting new approaches that we believe
might become important in the future, with ample citations even of old works
imperative in the development of newer methods. Yet, our selection is sometimes
biased, and need not reflect consensus of the research community, and we have
certainly missed recent important contributions. After all, the research area of
Frank--Wolfe is very active, making it a moving target. We apologize sincerely
in advance for any such distortions and we fully acknowledge: We stand on the
shoulders of giants.
Comment: 238 pages with many figures. The FrankWolfe.jl Julia package
(https://github.com/ZIB-IOL/FrankWolfe.jl) provides state-of-the-art
implementations of many Frank--Wolfe methods.
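The remark that these algorithms replace projections with linear optimization can be made concrete with a small example. The sketch below is plain Python, not the FrankWolfe.jl API. It assumes the objective f(x) = 0.5 * ||x - b||^2 over the probability simplex, where the linear minimization step reduces to picking the vertex with the smallest gradient entry, and it uses the standard open-loop step size 2/(t + 2). The function names and the target point b are illustrative.

```python
def objective(x, b) -> float:
    """Assumed test objective: f(x) = 0.5 * ||x - b||^2."""
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def frank_wolfe_simplex(b, iterations: int = 200):
    """Vanilla Frank-Wolfe for f over the probability simplex."""
    n = len(b)
    x = [1.0 / n] * n  # start at the barycenter of the simplex
    for t in range(iterations):
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Linear minimization oracle: the best vertex e_i has the smallest gradient entry.
        i = min(range(n), key=lambda j: grad[j])
        gamma = 2.0 / (t + 2.0)
        # Convex combination x <- (1 - gamma) * x + gamma * e_i; no projection needed.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


b = [0.2, 0.5, 0.3]  # lies in the simplex, so the minimizer is b itself
x = frank_wolfe_simplex(b)
print([round(xi, 3) for xi in x], objective(x, b))
```

Every iterate is a convex combination of simplex vertices, so feasibility is maintained without any projection step.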
Conformational Ensembles and Sampled Energy Landscapes: Analysis and Comparison
We present novel algorithms and software addressing four core problems in computational structural biology, namely analyzing a conformational ensemble, comparing two conformational ensembles, analyzing a sampled energy landscape, and comparing two sampled energy landscapes. Using recent developments in computational topology, graph theory, and combinatorial optimization, we make two notable contributions. First, we present a generic algorithm analyzing height fields. We then use this algorithm to perform density-based clustering of conformations, and to analyze a sampled energy landscape in terms of basins and transitions between them. In both cases, topological persistence is used to manage ruggedness. Second, we introduce two algorithms to compare transition graphs. The first is the classical earth mover distance metric, which depends only on local minimum energy configurations along with their statistical weights, while the second incorporates topological constraints inherent to conformational transitions. Illustrations are provided on a simplified protein model (BLN69), whose frustrated potential energy landscape has been thoroughly studied. The software implementing our tools is also made available, and should prove valuable wherever conformational ensembles and energy landscapes are used.