Continuation Path Learning for Homotopy Optimization
Homotopy optimization is a traditional method to deal with a complicated
optimization problem by solving a sequence of easy-to-hard surrogate
subproblems. However, this method can be very sensitive to the continuation
schedule design and might lead to a suboptimal solution to the original
problem. In addition, the intermediate solutions, often ignored by classic
homotopy optimization, could be useful for many real-world applications. In
this work, we propose a novel model-based approach to learn the whole
continuation path for homotopy optimization, which contains infinitely many
intermediate solutions for the surrogate subproblems. Rather than the classic
unidirectional easy-to-hard optimization, our method can simultaneously
optimize the original problem and all surrogate subproblems in a collaborative
manner. The proposed model also supports real-time generation of any
intermediate solution, which could be desirable for many applications.
Experimental studies on different problems show that our proposed method can
significantly improve the performance of homotopy optimization and provide
extra helpful information to support better decision-making.
Comment: Accepted by the 40th International Conference on Machine Learning (ICML 2023).
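The easy-to-hard scheme the abstract contrasts against can be sketched in a few lines. The code below is an illustrative toy, not the paper's method: the objective, the fixed linear schedule, and the step sizes are all our own assumptions. It blends an easy convex surrogate with a hard multimodal objective and warm-starts each stage from the previous stage's solution.

```python
import numpy as np

def homotopy_optimize(f_hard, f_easy, x0, n_stages=10, n_steps=300, lr=0.02):
    """Classic unidirectional homotopy optimization (toy sketch).

    Minimizes f(x, t) = (1 - t) * f_easy(x) + t * f_hard(x), moving t
    from 0 (easy surrogate) to 1 (original problem) on a fixed schedule
    and warm-starting each stage from the previous stage's solution.
    """
    def num_grad(f, x, eps=1e-6):
        # central finite differences, enough for a 1-D illustration
        return (f(x + eps) - f(x - eps)) / (2 * eps)

    x = x0
    for t in np.linspace(0.0, 1.0, n_stages):
        f_t = lambda z, t=t: (1 - t) * f_easy(z) + t * f_hard(z)
        for _ in range(n_steps):
            x = x - lr * num_grad(f_t, x)
    return x

# Hard objective with many local minima; easy surrogate: a convex bowl.
f_hard = lambda x: x**2 + 2.0 * np.sin(5.0 * x)
f_easy = lambda x: x**2

x_star = homotopy_optimize(f_hard, f_easy, x0=2.0)
print(x_star, f_hard(x_star))
```

Plain gradient descent from the same starting point gets trapped in a local minimum near the start, while the continuation path tracks down to the global basin. Note how the outcome hinges on the hand-crafted schedule `np.linspace(0, 1, n_stages)`; the paper's point is that a learned model can replace this schedule and expose every intermediate solution along the path.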
All You Need is a Good Functional Prior for Bayesian Deep Learning
The Bayesian treatment of neural networks dictates that a prior distribution
is specified over their weight and bias parameters. This poses a challenge
because modern neural networks are characterized by a large number of
parameters, and the choice of these priors has an uncontrolled effect on the
induced functional prior, which is the distribution of the functions obtained
by sampling the parameters from their prior distribution. We argue that this is
a hugely limiting aspect of Bayesian deep learning, and this work tackles this
limitation in a practical and effective way. Our proposal is to reason in terms
of functional priors, which are easier to elicit, and to "tune" the priors of
neural network parameters in a way that they reflect such functional priors.
Gaussian processes offer a rigorous framework to define prior distributions
over functions, and we propose a novel and robust framework to match their
prior with the functional prior of neural networks based on the minimization of
their Wasserstein distance. We provide vast experimental evidence that coupling
these priors with scalable Markov chain Monte Carlo sampling offers
systematically large performance improvements over alternative choices of
priors and state-of-the-art approximate Bayesian deep learning approaches. We
consider this work a considerable step in the direction of making the
long-standing challenge of carrying out a fully Bayesian treatment of neural
networks, including convolutional neural networks, a concrete possibility.
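The core operation, matching a network's induced functional prior to a GP prior by minimizing a Wasserstein distance, can be illustrated in one dimension, where the 1-Wasserstein distance between two equal-size empirical samples is just the mean gap between their sorted values. Everything below is a simplifying assumption for illustration: a one-hidden-layer tanh network, a unit-variance GP marginal at a single input, and grid search in place of the paper's gradient-based minimization.

```python
import numpy as np

rng = np.random.default_rng(0)

def bnn_prior_samples(x, sigma_w, n=2000, hidden=50):
    """Draws of f(x) from a one-hidden-layer tanh network whose weights
    and biases carry the N(0, sigma_w^2) prior we are trying to tune."""
    w1 = rng.normal(0.0, sigma_w, (n, hidden))
    b1 = rng.normal(0.0, sigma_w, (n, hidden))
    w2 = rng.normal(0.0, sigma_w / np.sqrt(hidden), (n, hidden))
    return np.sum(w2 * np.tanh(w1 * x + b1), axis=1)

def w1_distance(a, b):
    """Empirical 1-Wasserstein distance between equal-size 1-D samples."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

# Target functional prior: a GP marginal, f(x) ~ N(0, 1) at the test input.
gp_samples = rng.normal(0.0, 1.0, 2000)

# Tune the weight-prior scale so the induced functional prior matches the GP.
grid = np.linspace(0.1, 3.0, 30)
best = min(grid, key=lambda s: w1_distance(bnn_prior_samples(0.5, s), gp_samples))
print(best)
```

With a tiny weight scale the network's outputs collapse toward zero; with a large one they saturate. The selected scale sits in between, which is exactly the "uncontrolled effect" of weight priors on the functional prior that the abstract describes.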
Dynamic Mathematics for Automated Machine Learning Techniques
Machine Learning and Neural Networks have been gaining popularity and are widely considered the driving force of the Fourth Industrial Revolution. Yet modern machine learning techniques are hardly new: backpropagation training was firmly established in 1986, and computer vision was revolutionised in 2012 with the introduction of AlexNet. Given all these accomplishments, why are neural networks still not an integral part of our society? "Because they are difficult to implement in practice." "I'd like to use machine learning, but I can't invest much time." The concept of Automated Machine Learning (AutoML) was first proposed by Professor Frank Hutter of the University of Freiburg. Machine learning is not simple; it requires a practitioner to have a thorough understanding of the attributes of their data and of the components their model entails. AutoML is the effort to automate all tedious aspects of machine learning to form a clean data analysis pipeline, and this thesis is our effort to develop and understand ways to automate machine learning.
Specifically, we focused on Recurrent Neural Networks (RNNs), Meta-Learning, and Continual Learning. We studied continual learning to enable a network to sequentially acquire skills in a dynamic environment; we studied meta-learning to understand how a network can be configured efficiently; and we studied RNNs to understand the consequences of consecutive actions. Our RNN study focused on mathematical interpretability: we described a large variety of RNNs as one mathematical class in order to understand their core network mechanism. This enabled us to extend meta-learning beyond network configuration to network pruning and continual learning. It also provided insights into how a single network should be consecutively configured, and led us to create a simple generic patch that is compatible with several existing continual learning archetypes.
This patch enhanced the robustness of continual learning techniques and allowed them to generalise better. By and large, this thesis presented a series of extensions to make AutoML simple, efficient, and robust. More importantly, all of our methods are motivated by mathematical understanding through the lens of dynamical systems; thus, we also increased the interpretability of AutoML concepts.
An Initial Framework Assessing the Safety of Complex Systems
Work presented at the Conference on Complex Systems, held online from 7 to 11 December 2020.
Atmospheric blocking events, that is, large-scale nearly stationary atmospheric pressure patterns, are often associated with extreme weather in the mid-latitudes, such as heat waves and cold spells, which have significant consequences for ecosystems, human health and the economy. The high impact of blocking events has motivated numerous studies. However, there is not yet a comprehensive theory explaining their onset, maintenance and decay, and their numerical prediction remains a challenge. In recent years, a number of studies have successfully employed complex network descriptions of fluid transport to characterize dynamical patterns in geophysical flows. The aim of the current work is to investigate the potential of so-called Lagrangian flow networks for the detection and perhaps forecasting of atmospheric blocking events. The network is constructed by associating nodes with regions of the atmosphere and establishing links based on the flux of material between these nodes during a given time interval. One can then use effective tools and metrics developed in the context of graph theory to explore the atmospheric flow properties. In particular, Ser-Giacomi et al. [1] showed how optimal paths in a Lagrangian flow network highlight distinctive circulation patterns associated with atmospheric blocking events. We extend these results by studying the behavior of selected network measures (such as degree, entropy and harmonic closeness centrality) at the onset of and during blocking situations, demonstrating their ability to trace the spatio-temporal characteristics of these events.
This research was conducted as part of the CAFE (Climate Advanced Forecasting of sub-seasonal Extremes) Innovative Training Network, which has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 813844.
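The construction described above, nodes as atmospheric regions and links weighted by the material flux between them, fits in a few lines. The transport matrix below is random stand-in data (real applications derive it from advected trajectories), and node degree and flux entropy are two of the network measures the abstract tracks.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy transport matrix: P[i, j] is the fraction of material carried from
# region i to region j over the chosen time interval (random stand-in data).
n = 6
P = rng.random((n, n))
P[P < 0.5] = 0.0                       # sparsify: not every pair exchanges mass
np.fill_diagonal(P, 0.3)               # some material always stays in place
P /= P.sum(axis=1, keepdims=True)      # rows become outgoing flux fractions

# Out-degree: how many regions each node feeds during the interval.
out_degree = (P > 0).sum(axis=1)

# Shannon entropy of outgoing fluxes: low entropy means channelized,
# nearly stationary transport, a candidate signature of blocking.
logs = np.log(np.where(P > 0, P, 1.0))  # log(1) = 0 where flux is zero
entropy = -(P * logs).sum(axis=1)

print(out_degree, entropy)
```

Harmonic closeness and optimal paths would be computed on the same weighted matrix; the point is that once the flow is encoded as a network, standard graph metrics become diagnostics of the atmospheric state.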
Networked Data Analytics: Network Comparison And Applied Graph Signal Processing
Networked data structures have been getting big, ubiquitous, and pervasive. As our day-to-day activities become more incorporated with and influenced by the digital world, we rely more on our intuition to provide us with a high-level idea and subconscious understanding of the data we encounter. This thesis aims at translating the qualitative intuitions we have about networked data into quantitative and formal tools by designing rigorous yet reasonable algorithms. In a nutshell, this thesis constructs models to compare and cluster networked data, to simplify a complicated networked structure, and to formalize the notion of smoothness and variation for domain-specific signals on a network. The thesis consists of two interrelated thrusts, exploring both the scenario where networks have intrinsic value and are themselves the object of study, and the scenario where the interest lies in signals defined on top of the networks, so that we leverage the information in the network to analyze the signals. Our results suggest that the intuition we have in analyzing huge data sets can be transformed into rigorous algorithms, and that this intuition often results in superior performance, new observations, better complexity, and/or a bridge between two commonly implemented methods. Even though the two thrusts differ in the principles they investigate, both are built on what we see as a contemporary shift in data analytics: from building an algorithm and then understanding it, to having an intuition and then building an algorithm around it.
We show that, in order to formalize the intuitive idea of measuring the difference between a pair of networks of arbitrary sizes, we can design two algorithms based on the intuition of finding mappings between the node sets, or of mapping one network into a subset of another network. Such methods also lead to a clustering algorithm to categorize networked data structures. In addition, we can define the notion of frequencies of a given network by ordering features in the network according to how important they are to the overall information conveyed by the network. These proposed algorithms succeed in comparing the collaboration histories of researchers, clustering research communities via their publication patterns, categorizing moving objects from uncertain measurements, and separating networks constructed from different processes.
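The "map one network into a subset of another network" intuition can be made concrete by brute force on tiny graphs: try every injection of the smaller node set into the larger one and count edge disagreements. The thesis develops efficient algorithms for this; the exhaustive version below, on toy graphs of our own choosing, is only meant to pin down the quantity being minimized.

```python
import itertools
import numpy as np

def embedding_cost(A_small, A_big):
    """Minimum number of edge disagreements over all injections of the
    small network's nodes into the big network (exponential brute force,
    for intuition only)."""
    n, m = len(A_small), len(A_big)
    best = float("inf")
    for nodes in itertools.permutations(range(m), n):
        sub = A_big[np.ix_(nodes, nodes)]          # induced subnetwork
        best = min(best, np.abs(A_small - sub).sum() / 2)
    return best

triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
cycle4 = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])

print(embedding_cost(triangle, triangle))  # identical: 0 disagreements
print(embedding_cost(triangle, cycle4))    # any 3 nodes of C4 miss one edge
```

A pairwise cost of this kind is exactly what makes the clustering of networked data structures mentioned above possible: once networks can be compared, they can be grouped.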
In the context of data analytics on top of networks, we design domain-specific tools by leveraging recent advances in graph signal processing, which formalizes the intuitive notion of smoothness and variation of signals defined on top of networked structures and generalizes conventional Fourier analysis to the graph domain. Specifically, we show how these tools can be used to better classify cancer subtypes by considering genetic profiles as signals on top of gene-to-gene interaction networks; to gain new insights into the differences between human beings in learning new tasks and switching attention by considering brain activities as signals on top of brain connectivity networks; and to demonstrate that common methods in rating prediction are special graph filters and, building on this observation, to design novel recommendation system algorithms.
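The "smoothness and variation" notion that graph signal processing formalizes has a compact definition: for a graph Laplacian L, the quadratic form x^T L x sums the squared differences of the signal across every edge, and the eigenvectors of L play the role of Fourier modes, with eigenvalues acting as frequencies. A minimal sketch on a 5-node path graph (our own toy example, not from the thesis):

```python
import numpy as np

# Path graph on 5 nodes: adjacency and combinatorial Laplacian L = D - A.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Graph Fourier basis: Laplacian eigenvectors; eigenvalues act as frequencies.
eigvals, U = np.linalg.eigh(L)

smooth = np.array([1.0, 1.1, 1.2, 1.3, 1.4])   # varies slowly along the graph
rough = np.array([1.0, -1.0, 1.0, -1.0, 1.0])  # flips sign across every edge

# Total variation x^T L x = sum over edges of (x_i - x_j)^2.
tv = lambda x: float(x @ L @ x)
print(tv(smooth), tv(rough))
```

The smooth signal has near-zero total variation while the alternating one concentrates on the highest graph frequency; replacing the path graph with a gene-interaction or brain-connectivity network gives the applications the abstract lists.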
The structure and dynamics of multilayer networks
In the past years, network theory has successfully characterized the
interaction among the constituents of a variety of complex systems, ranging
from biological to technological and social systems. However, until recently,
attention was almost exclusively given to networks in which all components
were treated on an equivalent footing, while neglecting all the extra
information about the temporal- or context-related properties of the
interactions under study. Only in recent years, taking advantage of the
enhanced resolution in real data sets, network scientists have directed their
interest to the multiplex character of real-world systems, and explicitly
considered the time-varying and multilayer nature of networks. We offer here a
comprehensive review on both structural and dynamical organization of graphs
made of diverse relationships (layers) between their constituents, and cover
several relevant issues, from a full redefinition of the basic structural
measures, to understanding how the multilayer nature of the network affects
processes and dynamics.
Comment: In Press, Accepted Manuscript, Physics Reports 201
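A standard way to write down "graphs made of diverse relationships (layers)" is the supra-adjacency matrix: intra-layer adjacency blocks on the diagonal, and inter-layer couplings linking each node to its replica in the other layers. The two layers and the coupling strength below are toy assumptions for illustration:

```python
import numpy as np

# Two layers over the same 3 nodes (e.g., two relationship types).
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A2 = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=float)

# Supra-adjacency: intra-layer blocks on the diagonal, identity-coupled
# inter-layer blocks (each node linked to its replica in the other layer).
omega = 1.0  # inter-layer coupling strength
I = np.eye(3)
supra = np.block([[A1, omega * I], [omega * I, A2]])

# Redefined structural measures read off one 6x6 matrix: each row sum
# mixes a node's intra-layer degree with its inter-layer coupling.
print(supra.sum(axis=1))
```

The "full redefinition of the basic structural measures" the review covers starts from objects like this one: degrees, clustering, and spectral properties are recomputed on the supra matrix, and the spectrum of the corresponding supra-Laplacian governs diffusion and other dynamical processes across layers.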
Statistical Foundations of Actuarial Learning and its Applications
This open access book discusses the statistical modeling of insurance problems, a process which comprises data collection, data analysis and statistical model building to forecast insured events that may happen in the future. It presents the mathematical foundations behind these fundamental statistical concepts and shows how they can be applied in daily actuarial practice. Statistical modeling has a wide range of applications, and, depending on the application, the theoretical aspects may be weighted differently; here the main focus is on prediction rather than explanation. Starting with a presentation of state-of-the-art actuarial models, such as generalized linear models, the book then dives into modern machine learning tools such as neural networks and text recognition to improve predictive modeling with complex features. By providing practitioners with detailed guidance on how to apply machine learning methods to real-world data sets, and on how to interpret the results without losing sight of the mathematical assumptions on which these methods are based, the book can serve as a modern basis for an actuarial education syllabus.
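As a concrete instance of the "state-of-the-art actuarial models" the blurb starts from, a Poisson generalized linear model for claim frequencies can be fitted by iteratively reweighted least squares in a few lines. The data are simulated and the details (log link, a single rating factor, no exposure offset) are our illustrative choices, not taken from the book:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated claim-count data: Poisson GLM with a log link, the standard
# actuarial frequency model.
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one factor
beta_true = np.array([-1.0, 0.5])
y = rng.poisson(np.exp(X @ beta_true))

# Fit by iteratively reweighted least squares (IRLS).
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)                     # fitted means under the log link
    W = mu                                    # Poisson working weights
    z = X @ beta + (y - mu) / mu              # working response
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

print(beta)
```

The fitted coefficients recover the simulating values up to sampling noise; in practice one adds an exposure offset and many rating factors, which is where the book's machine learning extensions to such baseline models come in.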