Maximum Entropy Production Principle for Stock Returns
In our previous studies we have investigated the structural complexity of
time series describing stock returns on New York's and Warsaw's stock
exchanges, by employing two estimators of Shannon's entropy rate based on
Lempel-Ziv and Context Tree Weighting algorithms, which were originally used
for data compression. Such structural complexity of the time series describing
logarithmic stock returns can be used as a measure of the inherent (model-free)
predictability of the underlying price formation processes, testing the
Efficient-Market Hypothesis in practice. We have also correlated the estimated
predictability with the profitability of standard trading algorithms, and found
that these do not use the structure inherent in the stock returns to any
significant degree. To find a way to use the structural complexity of the stock
returns for the purpose of prediction, we propose the Maximum Entropy
Production Principle as applied to stock returns, and test it on the two
markets mentioned above, asking whether this principle, combined with the
structural complexity of the returns, can enhance the prediction of stock returns.
Comment: 14 pages, 5 figures
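The entropy-rate estimation underlying this predictability measure can be illustrated with a small sketch. Below is a minimal Python example, assuming an LZ76-style phrase-counting estimator applied to sign-binarised log-returns; the function names and the random placeholder series are illustrative only and are not the authors' estimators (which also include Context Tree Weighting).

```python
import numpy as np

def lz76_complexity(symbols):
    """Number of phrases in a simple LZ76-style parsing of a symbol sequence."""
    s = ''.join(str(x) for x in symbols)
    i, c, n = 0, 0, len(s)
    while i < n:
        l = 1
        # Extend the current phrase while it still occurs in the preceding text.
        while i + l <= n and s[i:i + l] in s[:i + l - 1]:
            l += 1
        c += 1
        i += l
    return c

def lz_entropy_rate(symbols):
    """Crude LZ-based estimate of the entropy rate in bits per symbol."""
    n = len(symbols)
    return lz76_complexity(symbols) * np.log2(n) / n

# Toy usage: binarise daily log-returns by sign and estimate their predictability.
rng = np.random.default_rng(0)
log_returns = rng.normal(0.0, 0.01, size=2000)   # placeholder for real market returns
bits = (log_returns > 0).astype(int)
print(f"estimated entropy rate: {lz_entropy_rate(bits):.3f} bits/symbol")
```

For a binary alphabet the maximum is 1 bit per symbol, so values close to 1 indicate little exploitable structure in the return signs.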
OCReP: An Optimally Conditioned Regularization for Pseudoinversion Based Neural Training
In this paper we consider the training of single hidden layer neural networks
by pseudoinversion, which, in spite of its popularity, is sometimes affected by
numerical instability issues. Regularization is known to be effective in such
cases, so we introduce, in the framework of Tikhonov regularization, a
matricial reformulation of the problem which allows us to use the condition
number as a diagnostic tool for identification of instability. By imposing
well-conditioning requirements on the relevant matrices, our theoretical
analysis allows the identification of an optimal value for the regularization
parameter from the standpoint of stability. We compare it with the value derived
by cross-validation for overfitting control and optimisation of the
generalization performance. We test our method for both regression and
classification tasks. The proposed method is quite effective in terms of
predictivity, often with some improvement in performance with respect to the
reference cases considered. This approach, due to analytical determination of
the regularization parameter, dramatically reduces the computational load
required by many other techniques.
Comment: Published in Neural Networks
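As a rough illustration of pseudoinversion-based training with Tikhonov regularization, the sketch below trains a single-hidden-layer network with random hidden weights and solves for the output weights in closed form. The regularization parameter lam is a user-supplied constant and the condition number is merely reported as a diagnostic; the paper's contribution is to choose this parameter analytically from a well-conditioning requirement, which is not reproduced here.

```python
import numpy as np

def train_pseudoinverse(X, T, n_hidden=50, lam=1e-3, seed=0):
    """Train a single-hidden-layer network by regularized pseudoinversion.

    Hidden weights are random and fixed; only the output weights are learned
    by solving a Tikhonov-regularized least-squares problem in closed form.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input-to-hidden weights
    b = rng.normal(size=n_hidden)                 # hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    # Output weights: beta = (H^T H + lam I)^{-1} H^T T
    A = H.T @ H + lam * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ T)
    cond = np.linalg.cond(A)                      # diagnostic: conditioning of the system
    return W, b, beta, cond

def predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```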
Scale-invariant segmentation of dynamic contrast-enhanced perfusion MR-images with inherent scale selection
Selection of the best set of scales is problematic when developing signal-driven
approaches for pixel-based image segmentation. Often, different,
possibly conflicting criteria need to be fulfilled in order to obtain the best trade-off
between uncertainty (variance) and location accuracy. The optimal set of
scales depends on several factors: the noise level present in the image material,
the prior distribution of the different types of segments, the class-conditional
distributions associated with each type of segment as well as the actual size of
the (connected) segments. We analyse, theoretically and through experiments,
the possibility of using the overall and class-conditional error rates as criteria
for selecting the optimal sampling of the linear and morphological scale spaces.
It is shown that the overall error rate is optimised by taking the prior class
distribution in the image material into account. However, a uniform (ignorant)
prior distribution ensures constant class-conditional error rates. Consequently,
we advocate for a uniform prior class distribution when an uncommitted, scale-invariant
segmentation approach is desired.
Experiments with a neural net classifier developed for segmentation of
dynamic MR images, acquired with a paramagnetic tracer, support the
theoretical results. Furthermore, the experiments show that the addition of
spatial features to the classifier, extracted from the linear or morphological
scale spaces, improves the segmentation result compared to a signal-driven
approach based solely on the dynamic MR signal. The segmentation results
obtained from the two types of features are compared using two novel quality
measures that characterise spatial properties of labelled images.
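As a rough sketch of pixel-based segmentation from scale-space features, the example below stacks Gaussian-smoothed copies of an image over a few scales and feeds them to a generic per-pixel classifier. The scale values and the scikit-learn-style classifier interface are placeholder assumptions; the thesis additionally uses morphological scale spaces and a neural net classifier on the dynamic MR signal.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def linear_scale_space_features(image, scales=(1.0, 2.0, 4.0, 8.0)):
    """Stack Gaussian-smoothed versions of the image as per-pixel features."""
    feats = [gaussian_filter(image.astype(float), sigma=s) for s in scales]
    return np.stack(feats, axis=-1)                # shape: (H, W, n_scales)

def segment(image, classifier, scales=(1.0, 2.0, 4.0, 8.0)):
    """Pixel-wise segmentation from scale-space features with an already-fitted classifier."""
    F = linear_scale_space_features(image, scales)
    H, W, d = F.shape
    labels = classifier.predict(F.reshape(-1, d))  # any scikit-learn-style classifier
    return labels.reshape(H, W)
```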
Improved prediction accuracy for disease risk mapping using Gaussian process stacked generalization.
Maps of infectious disease (charting spatial variations in the force of infection, degree of endemicity and the burden on human health) provide an essential evidence base to support planning towards global health targets. Contemporary disease mapping efforts have embraced statistical modelling approaches to properly acknowledge uncertainties in both the available measurements and their spatial interpolation. The most common such approach is Gaussian process regression, a mathematical framework composed of two components: a mean function harnessing the predictive power of multiple independent variables, and a covariance function yielding spatio-temporal shrinkage against residual variation from the mean. Though many techniques have been developed to improve the flexibility and fitting of the covariance function, models for the mean function have typically been restricted to simple linear terms. For infectious diseases, known to be driven by complex interactions between environmental and socio-economic factors, improved modelling of the mean function can greatly boost predictive power. Here, we present an ensemble approach based on stacked generalization that allows for multiple nonlinear algorithmic mean functions to be jointly embedded within the Gaussian process framework. We apply this method to mapping Plasmodium falciparum prevalence data in sub-Saharan Africa and show that the generalized ensemble approach markedly outperforms any individual method.
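A generic, simplified sketch of the stacking idea is given below, assuming scikit-learn estimators: out-of-fold predictions from nonlinear base learners are supplied to a Gaussian process alongside spatial coordinates, so the base learners act as the flexible mean component while the Matern kernel provides spatial shrinkage. The actual study embeds the stacked means in a bespoke geostatistical model rather than this off-the-shelf regressor.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

def stacked_gp(X_covariates, coords, y):
    """Stacked generalization: nonlinear base learners feed the GP via extra covariates."""
    learners = [RandomForestRegressor(n_estimators=200, random_state=0),
                GradientBoostingRegressor(random_state=0)]
    # Out-of-fold predictions avoid leaking the training labels into the stack.
    level1 = np.column_stack([cross_val_predict(m, X_covariates, y, cv=5) for m in learners])
    # The GP sees the base-learner predictions plus spatial coordinates;
    # the Matern kernel supplies the spatial shrinkage of the covariance function.
    Z = np.column_stack([level1, coords])
    gp = GaussianProcessRegressor(kernel=Matern(nu=1.5) + WhiteKernel(), normalize_y=True)
    gp.fit(Z, y)
    for m in learners:
        m.fit(X_covariates, y)   # refit on all data for use at prediction time
    return learners, gp
```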
Essays on distance metric learning
Many machine learning methods, such as the k-nearest neighbours algorithm, heavily depend on the distance measure between data points. As each task has its own notion of distance, distance metric learning has been proposed. It learns a distance metric to assign a small distance to semantically similar instances and a large distance to dissimilar instances by formulating an optimisation problem. While many loss functions and regularisation terms have been proposed to improve the discrimination and generalisation ability of the learned metric, the metric may be sensitive to a small perturbation in the input space. Moreover, these methods implicitly assume that features are numerical variables and labels are deterministic. However, categorical variables and probabilistic labels are common in real-world applications. This thesis develops three metric learning methods to enhance robustness against input perturbation and applicability for categorical variables and probabilistic labels. In Chapter 3, I identify that many existing methods maximise a margin in the feature space and such a margin is insufficient to withstand perturbation in the input space. To address this issue, a new loss function is designed to penalise the input-space margin for being small and hence improve the robustness of the learned metric. In Chapter 4, I propose a metric learning method for categorical data. Classifying categorical data is difficult due to high feature ambiguity, and to this end, the technique of adversarial training is employed. Moreover, the generalisation bound of the proposed method is established, which informs the choice of the regularisation term. In Chapter 5, I adapt a classical probabilistic approach for metric learning to utilise information on probabilistic labels. The loss function is modified for training stability, and new evaluation criteria are suggested to assess the effectiveness of different methods. At the end of this thesis, two publications on hyperspectral target detection are appended as additional work during my PhD.
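The basic mechanics of metric learning can be sketched as follows: learn a linear map L defining a Mahalanobis distance so that same-class pairs end up closer than different-class pairs by a margin. The triplet hinge loss and stochastic update below are a standard baseline for illustration, not the specific formulations developed in the thesis.

```python
import numpy as np

def learn_mahalanobis(X, y, n_iter=200, lr=0.01, margin=1.0, seed=0):
    """Learn L (so that d(x, x') = ||L(x - x')||) from random same/different-class
    triplets with a hinge loss: similar pairs should be closer than dissimilar
    ones by at least `margin` (in squared distance)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    L = np.eye(d)
    for _ in range(n_iter):
        i = rng.integers(len(X))
        same = np.flatnonzero((y == y[i]) & (np.arange(len(X)) != i))
        diff = np.flatnonzero(y != y[i])
        if len(same) == 0 or len(diff) == 0:
            continue
        j, k = rng.choice(same), rng.choice(diff)
        dp, dn = L @ (X[i] - X[j]), L @ (X[i] - X[k])
        loss = margin + dp @ dp - dn @ dn
        if loss > 0:
            # Gradient of the active hinge term with respect to L.
            grad = 2 * (np.outer(dp, X[i] - X[j]) - np.outer(dn, X[i] - X[k]))
            L -= lr * grad
    return L
```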
The Foundations of Infinite-Dimensional Spectral Computations
Spectral computations in infinite dimensions are ubiquitous in the sciences. However, their many applications and theoretical studies depend on computations which are infamously difficult. This thesis, therefore, addresses the broad question,
"What is computationally possible within the field of spectral theory of separable Hilbert spaces?"
The boundaries of what computers can achieve in computational spectral theory and mathematical physics are unknown, leaving many open questions that have been unsolved for decades. This thesis provides solutions to several such long-standing problems.
To determine these boundaries, we use the Solvability Complexity Index (SCI) hierarchy, an idea which has its roots in Smale's comprehensive programme on the foundations of computational mathematics. The Smale programme led to a real-number counterpart of the Turing machine, yet left a substantial gap between theory and practice. The SCI hierarchy encompasses both these models and provides universal bounds on what is computationally possible. What makes spectral problems particularly delicate is that many of the problems can only be computed by using several limits, a phenomenon also shared in the foundations of polynomial root-finding as shown by McMullen. We develop and extend the SCI hierarchy to prove optimality of algorithms and construct a myriad of different methods for infinite-dimensional spectral problems, solving many computational spectral problems for the first time.
For arguably almost any operator of applicable interest, we solve the long-standing computational spectral problem and construct algorithms that compute spectra with error control. This is done for partial differential operators with coefficients of locally bounded total variation and also for discrete infinite matrix operators. We also show how to compute spectral measures of normal operators (when the spectrum is a subset of a regular enough Jordan curve), including spectral measures of classes of self-adjoint operators with error control and the construction of high-order rational kernel methods. We classify the problems of computing measures, measure decompositions, types of spectra (pure point, absolutely continuous, singular continuous), functional calculus, and Radon–Nikodym derivatives in the SCI hierarchy. We construct algorithms for, and classify, fractal dimensions of spectra, Lebesgue measures of spectra, spectral gaps, discrete spectra, eigenvalue multiplicities, capacity, different spectral radii and the problem of detecting algorithmic failure of previous methods (the finite section method). The infinite-dimensional QR algorithm is also analysed, recovering extremal parts of spectra, corresponding eigenvectors, and invariant subspaces, with convergence rates and error control. Finally, we analyse pseudospectra of pseudoergodic operators (a generalisation of random operators) on vector-valued spaces.
All of the algorithms developed in this thesis are sharp in the sense of the SCI hierarchy. In other words, we prove that they are optimal, realising the boundaries of what digital computers can achieve. They are also implementable and practical, and the majority are parallelisable. Extensive numerical examples are given throughout, demonstrating efficiency and tackling difficult problems taken from mathematics and also physical applications.
In summary, this thesis allows scientists to rigorously and efficiently compute many spectral properties for the first time. The framework provided by this thesis also encompasses a vast number of areas in computational mathematics, including the classical problem of polynomial root-finding, as well as optimisation, neural networks, PDEs and computer-assisted proofs. This framework will be explored in the future work of the author within these settings.
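For context, the finite section method mentioned above (whose algorithmic failure the thesis shows how to detect) can be sketched in a few lines: truncate an infinite matrix operator to an n x n block and diagonalise it. The tridiagonal toy operator and truncation size below are arbitrary assumptions; the thesis's point is precisely that such truncations carry no error control and can produce spurious eigenvalues.

```python
import numpy as np

def finite_section_spectrum(diag_fn, offdiag=1.0, n=500):
    """Approximate the spectrum of an infinite tridiagonal (Jacobi-type) operator
    by diagonalising its n x n finite section."""
    d = np.array([diag_fn(k) for k in range(n)], dtype=float)
    A = np.diag(d) + offdiag * (np.eye(n, k=1) + np.eye(n, k=-1))
    return np.linalg.eigvalsh(A)

# Toy usage: a discrete Schroedinger-type operator with a decaying potential.
evals = finite_section_spectrum(lambda k: 2.0 + 1.0 / (1 + k), n=500)
print(evals.min(), evals.max())   # crude bracket of the approximate spectrum
```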
Laser Based Mid-Infrared Spectroscopic Imaging – Exploring a Novel Method for Application in Cancer Diagnosis
A number of biomedical studies have shown that mid-infrared spectroscopic images can provide
both morphological and biochemical information that can be used for the diagnosis of cancer. Whilst
this technique has shown great potential, it has yet to be employed by the medical profession. By
replacing the conventional broadband thermal source employed in modern FTIR spectrometers with
high-brightness, broadly tuneable laser-based sources (QCLs and OPGs), we aim to overcome one of the
main obstacles to the transfer of this technology to the medical arena, namely poor signal-to-noise
ratios at high spatial resolutions and short image acquisition times. In this thesis we take the first
steps towards developing the optimum experimental configuration, the data processing algorithms
and the spectroscopic image contrast and enhancement methods needed to utilise these high
intensity laser-based sources. We show that a QCL system is better suited to providing numerical
absorbance values (biochemical information) than an OPG system, primarily due to the pulse
stability of the QCL. We also discuss practical protocols for the application of spectroscopic imaging
to cancer diagnosis and present spectroscopic imaging results from our laser-based experiments
on oesophageal cancer tissue.
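The numerical absorbance values mentioned above are conventionally obtained from the Beer-Lambert relation A = -log10(I / I0). The sketch below, with hypothetical array inputs, shows this conversion for a single wavenumber channel; it is a generic illustration rather than the processing pipeline developed in the thesis.

```python
import numpy as np

def absorbance_image(sample_intensity, background_intensity, eps=1e-12):
    """Convert transmitted intensities to absorbance, A = -log10(I / I0), for one
    wavenumber channel; stacking channels gives the spectroscopic image cube."""
    I = np.asarray(sample_intensity, dtype=float)
    I0 = np.asarray(background_intensity, dtype=float)
    transmittance = np.clip(I / np.maximum(I0, eps), eps, None)
    return -np.log10(transmittance)
```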
Full Covariance Modelling for Speech Recognition
HMM-based systems for Automatic Speech Recognition typically model
the acoustic features using mixtures of multivariate Gaussians. In this
thesis, we consider the problem of learning a suitable covariance matrix
for each Gaussian. A variety of schemes have been proposed for
controlling the number of covariance parameters per Gaussian, and
studies have shown that in general, the greater the number of parameters
used in the models, the better the recognition performance. We
therefore investigate systems with full covariance Gaussians. However,
in this case, the obvious choice of parameters, given by the sample
covariance matrix, leads to matrices that are poorly conditioned and
do not generalise well to unseen test data. The problem is particularly
acute when the amount of training data is limited.
We propose two solutions to this problem: firstly, we impose the requirement
that each matrix should take the form of a Gaussian graphical
model, and introduce a method for learning the parameters and
the model structure simultaneously. Secondly, we explain how an
alternative estimator, the shrinkage estimator, is preferable to the
standard maximum likelihood estimator, and derive formulae for the
optimal shrinkage intensity within the context of a Gaussian mixture
model. We show how this relates to the use of a diagonal covariance
smoothing prior.
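The shrinkage estimator can be sketched as a convex combination of the sample covariance and its diagonal. In the example below the shrinkage intensity lam is a user-supplied constant, whereas the thesis derives an optimal intensity in closed form for Gaussian mixture models; the dimensions in the toy usage are arbitrary.

```python
import numpy as np

def shrink_covariance(X, lam):
    """Shrink the sample covariance of the rows of X towards its diagonal:
    Sigma = (1 - lam) * S + lam * diag(S), with lam in [0, 1] trading variance for bias."""
    S = np.cov(X, rowvar=False)
    return (1.0 - lam) * S + lam * np.diag(np.diag(S))

# Toy usage: few samples in high dimension, where the sample covariance is ill-conditioned.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 39))   # e.g. 39-dimensional acoustic feature vectors
for lam in (0.0, 0.3):
    cond = np.linalg.cond(shrink_covariance(X, lam))
    print(f"lambda={lam}: condition number {cond:.2e}")
```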
We compare the effectiveness of these techniques to standard methods
on a phone recognition task where the quantity of training data is
artificially constrained. We then investigate the performance of the
shrinkage estimator on a large-vocabulary conversational telephone
speech recognition task. Discriminative training techniques can be used to compensate for the
invalidity of the model correctness assumption underpinning maximum
likelihood estimation. On the large-vocabulary task, we use discriminative
training of the full covariance models and diagonal priors
to yield improved recognition performance.
- …