Convex Optimization for Machine Learning
This book offers an introduction to convex optimization, a powerful and tractable class of optimization problems that can be solved efficiently on a computer. The goal of the book is to help develop a sense of what convex optimization is, and how it can be used in a widening array of practical contexts, with a particular emphasis on machine learning.
The first part of the book covers core concepts of convex sets, convex functions, and related basic definitions that support the understanding of convex optimization and its corresponding models. The second part deals with one very useful theory, called duality, which enables us to: (1) gain algorithmic insights; and (2) obtain an approximate solution to non-convex optimization problems, which are often difficult to solve. The last part focuses on modern applications in machine learning and deep learning.
A defining feature of this book is that it succinctly relates the “story” of how convex optimization plays a role, via historical examples and trending machine learning applications. Another key feature is that it includes programming implementations of a variety of machine learning algorithms inspired by optimization fundamentals, together with a brief tutorial on the programming tools used. The implementation is based on Python, CVXPY, and TensorFlow.
This book does not follow a traditional textbook-style organization, but is streamlined via a series of lecture notes that are intimately related, centered around coherent themes and concepts. It serves as a textbook mainly for a senior-level undergraduate course, yet is also suitable for a first-year graduate course. Readers benefit from having a good background in linear algebra, some exposure to probability, and basic familiarity with Python.
Predicting and Understanding Binding Affinities of Synthetic Anion Receptors
Anion receptors are molecules that can recognise and bind anions. They have applications in organocatalysis, anion sensing and the removal of anions from wastewater. Some anion receptors are also able to transport anions across cell membranes and show promise for the treatment of diseases such as cystic fibrosis and cancer. As such, it is of interest to develop computational methods that can reliably predict the physicochemical properties and anion binding affinities of these molecules. However, efforts to computationally model these molecules are hampered by the sheer size of typical receptors, which makes them too expensive to treat using accurate quantum chemical methods. Whilst efficient approximations such as local-correlation methods have been developed, the broader accuracy of these methods, particularly in their application to ionic non-covalent systems, remains unclear. To address this gap, this thesis carries out an extensive validation of local-correlation methods and economical density functional theory (DFT) methods for receptors with different binding motifs. Additionally, multiscale models were examined with a view to extending the scope of these methods to very large anion receptors. DFT methods giving good agreement with highly accurate calculations at a fraction of the cost were identified. The use of semiempirical methods combined with DFT in a multiscale model for calculating anion binding affinities led to unexpectedly large errors with only modest savings of computational time, while some "three-fold corrected" methods show promise in reducing the cost of geometry optimisations of large receptors. These validated protocols were subsequently applied to investigate the structure-binding relationships of a wide range of dual-hydrogen bonding receptors.
Notably, different receptor motifs were found to have different conformational preferences, which could explain why, experimentally, thioureas, thiosquaramides and croconamides show weaker chloride binding affinities than would be expected based on their acidity. The results suggest that pre-organising anion receptors in the conformer that facilitates hydrogen bond formation could be a promising strategy for the development of anion receptors. It is envisaged that these findings will aid in the design and screening of novel anion receptors with increased binding affinity and selectivity.
Dynamic Feature Engineering and model selection methods for temporal tabular datasets with regime changes
The application of deep learning algorithms to temporal panel datasets is
difficult due to heavy non-stationarities which can lead to over-fitted models
that under-perform under regime changes. In this work we propose a new machine
learning pipeline for ranking predictions on temporal panel datasets which is
robust under regime changes of data. Different machine-learning models,
including Gradient Boosting Decision Trees (GBDTs) and Neural Networks with and
without simple feature engineering are evaluated in the pipeline with different
settings. We find that GBDT models with dropout display high performance,
robustness and generalisability with relatively low complexity and reduced
computational cost. We then show that online learning techniques can be used in
post-prediction processing to enhance the results. In particular, dynamic
feature neutralisation, an efficient procedure that requires no retraining of
models and can be applied post-prediction to any machine learning model,
improves robustness by reducing drawdown in regime changes. Furthermore, we
demonstrate that the creation of model ensembles through dynamic model
selection based on recent model performance leads to improved performance over
baseline by improving the Sharpe and Calmar ratios of out-of-sample prediction
performances. We also evaluate the robustness of our pipeline across different
data splits and random seeds with good reproducibility of results.
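The abstract does not spell out the neutralisation procedure, so the following is only a minimal sketch of the general idea, assuming the common linear form: subtract from the predictions a chosen fraction of their least-squares projection onto a feature. The function name `neutralise`, the `proportion` parameter and the sample values are illustrative assumptions, not taken from the paper.

```python
from statistics import fmean


def neutralise(predictions, feature, proportion=1.0):
    """Remove a fraction of the linear exposure of predictions to one feature.

    Assumed form (not from the paper): predictions minus
    proportion * (least-squares projection onto the centred feature).
    """
    if len(predictions) != len(feature):
        raise ValueError("predictions and feature must have the same length")
    feature_mean = fmean(feature)
    centred = [x - feature_mean for x in feature]
    denom = sum(c * c for c in centred)
    if denom == 0.0:
        # A constant feature carries no linear exposure to remove.
        return list(predictions)
    # Least-squares slope of the predictions on the centred feature.
    beta = sum(p * c for p, c in zip(predictions, centred)) / denom
    return [p - proportion * beta * c for p, c in zip(predictions, centred)]


def exposure(predictions, feature):
    """Inner product of predictions with the centred feature."""
    feature_mean = fmean(feature)
    return sum(p * (x - feature_mean) for p, x in zip(predictions, feature))


# Hypothetical predictions and one feature column for a single time period.
preds = [0.10, 0.40, 0.35, 0.80]
feat = [1.0, 2.0, 3.0, 4.0]
fully_neutral = neutralise(preds, feat, proportion=1.0)
half_neutral = neutralise(preds, feat, proportion=0.5)
```

Because the step operates only on existing predictions, it needs no retraining, which matches the post-prediction use described above. With `proportion=1.0` the remaining linear exposure to the feature is zero up to rounding, and with `proportion=0.5` it is halved.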
Efficient Covariance Matrix Reconstruction with Iterative Spatial Spectrum Sampling
This work presents a cost-effective technique for designing robust adaptive
beamforming algorithms based on efficient covariance matrix reconstruction with
iterative spatial power spectrum (CMR-ISPS). The proposed CMR-ISPS approach
reconstructs the interference-plus-noise covariance (INC) matrix based on a
simplified maximum entropy power spectral density function that can be used to
shape the directional response of the beamformer. Firstly, we estimate the
directions of arrival (DoAs) of the interfering sources with the available
snapshots. We then develop an algorithm to reconstruct the INC matrix using a
weighted sum of outer products of steering vectors whose coefficients can be
estimated in the vicinity of the DoAs of the interferences which lie in a small
angular sector. We also devise a cost-effective adaptive algorithm based on
conjugate gradient techniques to update the beamforming weights and a method to
obtain estimates of the signal of interest (SOI) steering vector from the
spatial power spectrum. The proposed CMR-ISPS beamformer can suppress
interferers close to the direction of the SOI by producing notches in the
directional response of the array with sufficient depths. Simulation results
are provided to confirm the validity of the proposed method and to compare it
with existing approaches. Comment: 14 pages, 8 figures
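The reconstruction step described above, a weighted sum of outer products of steering vectors, can be sketched as follows. This is a simplified illustration under stated assumptions, not the CMR-ISPS algorithm itself: it assumes a uniform linear array with half-wavelength spacing, treats each interferer as a single point source with a known power, and adds white noise on the diagonal. The paper instead estimates the coefficients from a maximum entropy spectrum over a small angular sector around each DoA. All function and parameter names are hypothetical.

```python
import cmath
import math


def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths)."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(n_sensors)]


def reconstruct_inc(doas_deg, powers, n_sensors, noise_power):
    """Build R = sum_k p_k a(theta_k) a(theta_k)^H + noise_power * I."""
    if len(doas_deg) != len(powers):
        raise ValueError("one power value is needed per direction of arrival")
    # Start from the noise contribution on the diagonal.
    matrix = [
        [complex(noise_power) if i == j else 0j for j in range(n_sensors)]
        for i in range(n_sensors)
    ]
    for theta, power in zip(doas_deg, powers):
        a = steering_vector(theta, n_sensors)
        for i in range(n_sensors):
            for j in range(n_sensors):
                # Outer product a a^H, weighted by the interferer power.
                matrix[i][j] += power * a[i] * a[j].conjugate()
    return matrix


# Hypothetical scenario: two interferers seen by a four-sensor array.
inc = reconstruct_inc([-30.0, 45.0], [10.0, 5.0], n_sensors=4, noise_power=1.0)
```

Each steering-vector entry has unit modulus, so every diagonal entry of the result equals the total interferer power plus the noise power, and the matrix is Hermitian by construction.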
Low carbon multi-vector energy systems: a case study of the University of Edinburgh's 2040 'Net Zero' target
The ultimate goal of this research was to develop a methodology to support
decision-making by large (public sector) organisations regarding future energy
technology choices to reduce carbon emissions. This culminated in the
development of a multi-vector campus energy systems modelling tool that was
applied to the University of Edinburgh as a case study. To deliver this, a series of
objectives was addressed. Machine learning models were applied to model
building heat and electrical energy use for extrapolation to campus level. This
was applied to explore the scope to reduce campus level emissions through
operational changes; this demonstrated that it is difficult to further reduce the
carbon emissions without technological changes given the University’s heavy
reliance on natural gas-fired combined heat and power and boilers. As part of the
analysis of alternative energy sources, the scope for off-campus wind farms was
considered; specifically this focussed on estimation of wind farm generation at
the planning stage and employed a model transfer strategy to facilitate use of
metered data from wind farms. One of the key issues in making decisions about
future energy sources on campus is the simultaneous changes in the wider
energy system and specifically the decarbonisation of electricity; to facilitate
better choices about onsite production and imports from the grid, a fundamental
electricity model was developed to translate the National Grid Future Energy
Scenarios into plausible patterns of electricity prices. The learning from these
activities was incorporated into a model able to develop possible configurations
for campus-level multi-vector energy systems given a variety of future pathways
and uncertainties. The optimal planning model is formulated as a mixed-integer
linear programming model with the objective to minimize the overall cost including
carbon emissions. A numerical case study for the planning of three real-world
campuses is presented to demonstrate the effectiveness of the proposed method.
The conclusion highlights the importance of energy storage and a remote wind
farm in these energy systems. Also, it is noted that there is no single solution that
works in all cases where there are differences in factors such as device cost and
performance, the gap between gas and electricity prices, weather conditions and
the use (or otherwise) of cross-campus local energy balancing.
Efficiency and Sustainability of the Distributed Renewable Hybrid Power Systems Based on the Energy Internet, Blockchain Technology and Smart Contracts-Volume II
The climate changes that are becoming visible today are a challenge for the global research community. In this context, renewable energy sources, fuel cell systems, and other energy generating sources must be optimally combined and connected to the grid system using advanced energy transaction methods. As this reprint presents the latest solutions for implementing fuel cells and renewable energy in mobile and stationary applications, such as hybrid and microgrid power systems based on the Energy Internet, Blockchain technology, and smart contracts, we hope that they will be of interest to readers working in the related fields mentioned above.
Rational Function Simplification for Integration-by-Parts Reduction and Beyond
We present FUEL (Fractional Universal Evaluation Library), a C++ library for
performing rational function arithmetic with a flexible choice of third-party
computer algebra systems as simplifiers. FUEL is an outgrowth of a C++
interface to Fermat which was originally part of the FIRE code for
integration-by-parts (IBP) reduction for Feynman integrals, now promoted to be
a standalone library and with access to simplifiers other than Fermat. We
compare the performance of various simplifiers for standalone benchmark
problems as well as IBP reduction runs with FIRE. Comment: 18 pages, 1 figure, 6 tables
Application of Conventional Feedforward and Deep Neural Networks to Power Distribution System State Estimation and State Forecasting
Classical neural networks such as feedforward multilayer perceptron models (MLPs) are well established as universal approximators and, as such, show promise in applications such as static state estimation in power transmission systems. This research investigates the application of conventional neural networks (MLPs) and deep learning based models such as convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) to mitigate challenges in power distribution system state estimation and forecasting based upon conventional analytic methods. The ability of MLPs to perform regression for power system state estimation will be investigated. MLPs are considered based upon their promise to learn complex functional mappings between datasets with many features. CNNs and LSTMs are considered based upon their promise to perform time-series forecasting by learning the autocorrelation of the dataset being predicted. The performance of MLPs will be presented in terms of root-mean-square error (RMSE) between actual and predicted voltage magnitudes and voltage phase angles, and training execution time, for distribution system state estimation (DSSE). The performance of CNNs and LSTMs will be presented in terms of RMSE between actual and predicted real power demand, and execution time, when performing distribution system state forecasting (DSSF). Additionally, Bayesian Optimization with Gaussian Processes is used to optimize MLPs for regression. An IEEE standard 34-bus test system is used to illustrate the proposed conventional neural network and deep learning methods and their effectiveness in performing power system state estimation and power system state forecasting, respectively.
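For reference, the RMSE metric used above to compare actual and predicted values can be computed as in the short sketch below. The sample voltage magnitudes are made-up per-unit values for illustration only and do not come from the study.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical per-unit voltage magnitudes at three buses.
actual_voltages = [1.00, 0.98, 0.97]
predicted_voltages = [1.01, 0.97, 0.97]
voltage_rmse = rmse(actual_voltages, predicted_voltages)
```

Squaring the errors before averaging makes the metric more sensitive to occasional large errors than the mean absolute error.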
Higher-order interactions in single-cell gene expression: towards a cybergenetic semantics of cell state
Finding and understanding patterns in gene expression guides our understanding of living organisms, their development, and diseases, but is a challenging and high-dimensional problem as there are many molecules involved. One way to learn about the structure of a gene regulatory network is by studying the interdependencies among its constituents in transcriptomic data sets. These interdependencies could be arbitrarily complex, but almost all current models of gene regulation contain pairwise interactions only, despite experimental evidence existing for higher-order regulation that cannot be decomposed into pairwise mechanisms. I set out to capture these higher-order dependencies in single-cell RNA-seq data using two different approaches. First, I fitted maximum entropy (or Ising) models to expression data by training restricted Boltzmann machines (RBMs). On simulated data, RBMs faithfully reproduced both pairwise and third-order interactions. I then trained RBMs on 37 genes from a scRNA-seq data set of 70k astrocytes from an embryonic mouse. While pairwise and third-order interactions were revealed, the estimates contained a strong omitted variable bias, and there was no statistically sound and tractable way to quantify the uncertainty in the estimates. As a result I next adopted a model-free approach. Estimating model-free interactions (MFIs) in single-cell gene expression data required a quasi-causal graph of conditional dependencies among the genes, which I inferred with an MCMC graph-optimisation algorithm on an initial estimate found by the Peter-Clark algorithm. As the estimates are model-free, MFIs can be interpreted either as mechanistic relationships between the genes, or as substructures in the cell population. On simulated data, MFIs revealed synergy and higher-order mechanisms in various logical and causal dynamics more accurately than any correlation- or information-based quantities. 
I then estimated MFIs among 1,000 genes, at up to seventh-order, in 20k neurons and 20k astrocytes from two different mouse brain scRNA-seq data sets: one developmental, and one adolescent. I found strong evidence for up to fifth-order interactions, and the MFIs mostly disambiguated direct from indirect regulation by preferentially coupling causally connected genes, whereas correlations persisted across causal chains. Validating the predicted interactions against the Pathway Commons database, gene ontology annotations, and semantic similarity, I found that pairwise MFIs contained different mechanistic information from networks based on correlation, but a similar amount of it. Furthermore, third-order interactions provided evidence of combinatorial regulation by transcription factors and immediate early genes.
I then switched focus from mechanism to population structure. Each significant MFI can be assigned a set of single cells that most influence its value. Hierarchical clustering of the MFIs by cell assignment revealed substructures in the cell population corresponding to diverse cell states. This offered a new, purely data-driven view on cell states because the inferred states are not required to localise in gene expression space. Across the four data sets, I found 69 significant and biologically interpretable cell states, of which only 9 could be obtained by standard approaches. I identified immature neurons among developing astrocytes and radial glial cells, D1 and D2 medium spiny neurons, D1 MSN subtypes, and cell-cycle related states present across four data sets. I further found evidence for states defined by genes associated with neuropeptide signalling, neuronal activity, myelin metabolism, and genomic imprinting. MFIs thus provide a new, statistically sound method to detect substructure in single-cell gene expression data, identifying cell types, subtypes, or states that can be delocalised in gene expression space and whose hierarchical structure provides a new view on the semantics of cell state. The estimation of the quasi-causal graph, the MFIs, and inference of the associated states is implemented as a publicly available Nextflow pipeline called Stator.