
    Maximum Entropy Production Principle for Stock Returns

    In our previous studies we investigated the structural complexity of time series of stock returns on the New York and Warsaw stock exchanges, employing two estimators of Shannon's entropy rate based on the Lempel-Ziv and Context Tree Weighting algorithms, originally developed for data compression. The structural complexity of logarithmic stock-return series can serve as a measure of the inherent (model-free) predictability of the underlying price-formation processes, providing a practical test of the Efficient-Market Hypothesis. We also correlated the estimated predictability with the profitability of standard trading algorithms and found that these algorithms do not exploit the structure present in the stock returns to any significant degree. To put that structure to predictive use, we propose a Maximum Entropy Production Principle for stock returns and test it on the two markets, asking whether predictions of stock returns can be improved by combining their structural complexity with this principle.
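
    For illustration, here is a minimal, self-contained sketch of the kind of compression-based entropy-rate estimation the abstract refers to: a Lempel-Ziv-style parse of a symbolized return series. The discretisation into four symbols, the function names and the toy data are assumptions made for this example, not details taken from the paper; the Context Tree Weighting estimator and the Maximum Entropy Production Principle itself are not reproduced.

```python
import numpy as np

def symbolize_returns(returns, n_bins=4):
    """Discretize log returns into n_bins symbols via empirical quantiles."""
    edges = np.quantile(returns, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(returns, edges)

def lz_entropy_rate(symbols):
    """Lempel-Ziv estimate of the entropy rate in bits per symbol.

    The sequence is parsed into distinct phrases (each phrase is the shortest
    string not parsed before); with c(n) phrases in a string of length n,
    c(n) * log2(n) / n approaches the entropy rate for stationary ergodic sources.
    """
    s = ''.join(map(str, symbols))            # assumes n_bins <= 10 (one char per symbol)
    n = len(s)
    phrases, i = set(), 0
    while i < n:
        j = i + 1
        while j <= n and s[i:j] in phrases:   # extend until the phrase is new
            j += 1
        phrases.add(s[i:j])
        i = j
    return len(phrases) * np.log2(n) / n

# Toy check: i.i.d. Gaussian "returns" should look close to maximally random,
# i.e. roughly 2 bits per symbol for 4 near-equiprobable symbols.
rng = np.random.default_rng(0)
fake_returns = rng.normal(0.0, 0.01, 10_000)
print(lz_entropy_rate(symbolize_returns(fake_returns)))
```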

    OCReP: An Optimally Conditioned Regularization for Pseudoinversion Based Neural Training

    In this paper we consider the training of single-hidden-layer neural networks by pseudoinversion, which, in spite of its popularity, is sometimes affected by numerical instability. Regularization is known to be effective in such cases, so we introduce, within the framework of Tikhonov regularization, a matrix reformulation of the problem that allows the condition number to be used as a diagnostic tool for identifying instability. By imposing well-conditioning requirements on the relevant matrices, our theoretical analysis identifies an optimal value of the regularization parameter from the standpoint of stability. We compare this value with the one obtained by cross-validation for overfitting control and optimisation of generalization performance, and we test the method on both regression and classification tasks. The proposed method is effective in terms of predictive accuracy, often improving on the reference cases considered, and, because the regularization parameter is determined analytically, it dramatically reduces the computational load required by many other techniques.
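
    As a rough illustration of pseudoinversion-based training with Tikhonov regularization and a condition-number diagnostic, the sketch below trains a single-hidden-layer network with random, fixed input weights and a closed-form, regularized solve for the output weights. The architecture, the activation and the fixed regularization parameter `lam` are assumptions for this example; the OCReP rule for choosing the parameter analytically is not reproduced here.

```python
import numpy as np

def train_pinv(X, Y, n_hidden=100, lam=1e-3, seed=None):
    """Single-hidden-layer network trained by regularized pseudoinversion.

    Input weights are drawn at random and kept fixed; the output weights are
    obtained in closed form from the Tikhonov-regularized normal equations
    (H^T H + lam * I) W = H^T Y, with the condition number of the regularized
    matrix printed as a diagnostic for numerical instability.
    """
    rng = np.random.default_rng(seed)
    W_in = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W_in + b)                       # hidden-layer activations

    A = H.T @ H + lam * np.eye(n_hidden)
    print("condition number of regularized matrix:", np.linalg.cond(A))
    W_out = np.linalg.solve(A, H.T @ Y)

    def predict(X_new):
        return np.tanh(X_new @ W_in + b) @ W_out
    return predict

# Toy regression example
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 1))
Y = np.sin(3 * X) + 0.05 * rng.standard_normal((200, 1))
model = train_pinv(X, Y, n_hidden=50, lam=1e-4, seed=1)
print("training MSE:", np.mean((model(X) - Y) ** 2))
```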

    Scale-invariant segmentation of dynamic contrast-enhanced perfusion MR-images with inherent scale selection

    Selection of the best set of scales is problematic when developing signal-driven approaches for pixel-based image segmentation. Often, different and possibly conflicting criteria need to be fulfilled in order to obtain the best trade-off between uncertainty (variance) and location accuracy. The optimal set of scales depends on several factors: the noise level present in the image material, the prior distribution of the different types of segments, the class-conditional distributions associated with each type of segment, as well as the actual size of the (connected) segments. We analyse, theoretically and through experiments, the possibility of using the overall and class-conditional error rates as criteria for selecting the optimal sampling of the linear and morphological scale spaces. It is shown that the overall error rate is optimised by taking the prior class distribution in the image material into account. However, a uniform (ignorant) prior distribution ensures constant class-conditional error rates. Consequently, we advocate a uniform prior class distribution when an uncommitted, scale-invariant segmentation approach is desired. Experiments with a neural-net classifier developed for segmentation of dynamic MR images, acquired with a paramagnetic tracer, support the theoretical results. Furthermore, the experiments show that adding spatial features extracted from the linear or morphological scale spaces to the classifier improves the segmentation result compared to a signal-driven approach based solely on the dynamic MR signal. The segmentation results obtained from the two types of features are compared using two novel quality measures that characterise spatial properties of labelled images.
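
    A minimal sketch of the linear (Gaussian) scale-space feature extraction discussed above: each pixel is described by the image smoothed at several scales, ready to be fed to a pixel-wise classifier. The scale values, library calls (scipy) and toy image are illustrative assumptions; the morphological scale space and the error-rate-based scale selection analysed in the paper are not implemented.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def linear_scale_space_features(image, scales=(1.0, 2.0, 4.0, 8.0)):
    """Per-pixel features from the linear (Gaussian) scale space of a 2-D image.

    Each scale trades location accuracy against noise suppression; the tuple
    `scales` plays the role of the scale-space sampling whose choice the
    paper analyses via overall and class-conditional error rates.
    """
    image = image.astype(float)
    feats = [image] + [gaussian_filter(image, sigma=s) for s in scales]
    # shape (n_pixels, n_features), ready for a pixel-wise classifier
    return np.stack(feats, axis=-1).reshape(-1, len(scales) + 1)

# Example on a noisy synthetic image
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
img += 0.3 * rng.standard_normal(img.shape)
X = linear_scale_space_features(img)
print(X.shape)   # (4096, 5)
```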

    Improved prediction accuracy for disease risk mapping using Gaussian process stacked generalization

    Maps of infectious disease (charting spatial variations in the force of infection, degree of endemicity and the burden on human health) provide an essential evidence base to support planning towards global health targets. Contemporary disease mapping efforts have embraced statistical modelling approaches to properly acknowledge uncertainties in both the available measurements and their spatial interpolation. The most common such approach is Gaussian process regression, a mathematical framework composed of two components: a mean function harnessing the predictive power of multiple independent variables, and a covariance function yielding spatio-temporal shrinkage against residual variation from the mean. Though many techniques have been developed to improve the flexibility and fitting of the covariance function, models for the mean function have typically been restricted to simple linear terms. For infectious diseases, known to be driven by complex interactions between environmental and socio-economic factors, improved modelling of the mean function can greatly boost predictive power. Here, we present an ensemble approach based on stacked generalization that allows multiple nonlinear algorithmic mean functions to be jointly embedded within the Gaussian process framework. We apply this method to mapping Plasmodium falciparum prevalence data in sub-Saharan Africa and show that the generalized ensemble approach markedly outperforms any individual method.
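
    The sketch below illustrates the stacked-generalization idea in a simplified form: out-of-fold predictions from several nonlinear learners are obtained at stage one and then passed, together with the spatial coordinates, to a Gaussian process at stage two. Appending the level-one predictions to the GP inputs is a simplification of the paper's construction, in which they enter through the mean function; the learners, kernel and scikit-learn usage are assumptions for this example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_predict

def stacked_gp(X, y, coords):
    """Stage 1: out-of-fold predictions from nonlinear learners (avoids leaking
    training labels into stage 2). Stage 2: a Gaussian process over spatial
    coordinates plus the stage-1 predictions (a simplification of embedding
    them in the mean function)."""
    learners = [GradientBoostingRegressor(), RandomForestRegressor()]
    Z = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in learners])
    for m in learners:
        m.fit(X, y)                         # refit on all data for prediction time
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(),
                                  normalize_y=True)
    gp.fit(np.column_stack([coords, Z]), y)

    def predict(X_new, coords_new):
        Z_new = np.column_stack([m.predict(X_new) for m in learners])
        return gp.predict(np.column_stack([coords_new, Z_new]))
    return predict

# Toy usage with synthetic covariates and coordinates
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))
coords = rng.uniform(size=(300, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(300)
predict = stacked_gp(X, y, coords)
print(predict(X[:5], coords[:5]))
```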

    Essays on distance metric learning

    Many machine learning methods, such as the k-nearest neighbours algorithm, depend heavily on the distance measure between data points. Because each task has its own notion of distance, distance metric learning has been proposed: it formulates an optimisation problem to learn a distance metric that assigns small distances to semantically similar instances and large distances to dissimilar ones. While many loss functions and regularisation terms have been proposed to improve the discrimination and generalisation ability of the learned metric, the metric may be sensitive to small perturbations in the input space. Moreover, these methods implicitly assume that features are numerical variables and labels are deterministic, whereas categorical variables and probabilistic labels are common in real-world applications. This thesis develops three metric learning methods to enhance robustness against input perturbation and applicability to categorical variables and probabilistic labels. In Chapter 3, I identify that many existing methods maximise a margin in the feature space and that such a margin is insufficient to withstand perturbation in the input space. To address this issue, a new loss function is designed to penalise a small input-space margin and hence improve the robustness of the learned metric. In Chapter 4, I propose a metric learning method for categorical data. Classifying categorical data is difficult due to high feature ambiguity, and to this end the technique of adversarial training is employed. Moreover, the generalisation bound of the proposed method is established, which informs the choice of the regularisation term. In Chapter 5, I adapt a classical probabilistic approach to metric learning to utilise information in probabilistic labels. The loss function is modified for training stability, and new evaluation criteria are suggested to assess the effectiveness of different methods. At the end of the thesis, two publications on hyperspectral target detection are appended as additional work completed during my PhD.
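
    As background for the formulation described above, here is a minimal sketch of a standard Mahalanobis metric learner: a linear map L is fitted by gradient descent so that d(x, y) = ||L(x - y)|| is small for similar pairs and exceeds a margin for dissimilar pairs. The hinge loss, learning rate and toy data are illustrative assumptions; the thesis's input-space-margin loss, adversarial training for categorical data and probabilistic-label extension are not reproduced.

```python
import numpy as np

def learn_metric(X, pairs, labels, n_iter=100, lr=0.05, margin=1.0):
    """Gradient descent on a linear map L so that d(x, y) = ||L(x - y)|| is
    small for similar pairs (label +1) and at least sqrt(margin) for
    dissimilar pairs (label -1), via a hinge loss on squared distances."""
    L = np.eye(X.shape[1])
    for _ in range(n_iter):
        grad = np.zeros_like(L)
        for (i, j), y in zip(pairs, labels):
            diff = X[i] - X[j]
            dist2 = diff @ L.T @ L @ diff
            if y == 1:                      # pull similar pairs together
                grad += 2 * L @ np.outer(diff, diff)
            elif dist2 < margin:            # push dissimilar pairs apart
                grad -= 2 * L @ np.outer(diff, diff)
        L -= lr * grad / len(pairs)
    return L   # distance between u and v: np.linalg.norm(L @ (u - v))

# Toy usage: index pairs labelled +1 (same class) or -1 (different class)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(2, 1, (20, 3))])
pairs = [(i, j) for i in range(40) for j in range(i + 1, 40)]
labels = [1 if (i < 20) == (j < 20) else -1 for i, j in pairs]
L = learn_metric(X, pairs, labels)
print(np.linalg.norm(L @ (X[0] - X[1])), np.linalg.norm(L @ (X[0] - X[25])))
```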

    Laser Based Mid-Infrared Spectroscopic Imaging – Exploring a Novel Method for Application in Cancer Diagnosis

    A number of biomedical studies have shown that mid-infrared spectroscopic images can provide both morphological and biochemical information useful for the diagnosis of cancer. Whilst this technique has shown great potential, it has yet to be adopted by the medical profession. By replacing the conventional broadband thermal source used in modern FTIR spectrometers with high-brightness, broadly tuneable laser-based sources (QCLs and OPGs), we aim to overcome one of the main obstacles to transferring this technology to the medical arena, namely poor signal-to-noise ratios at high spatial resolutions and short image acquisition times. In this thesis we take the first steps towards developing the optimal experimental configuration, the data-processing algorithms, and the spectroscopic image contrast and enhancement methods needed to exploit these high-intensity laser-based sources. We show that a QCL system is better suited than an OPG system to providing numerical absorbance values (biochemical information), primarily because of the QCL's pulse stability. We also discuss practical protocols for applying spectroscopic imaging to cancer diagnosis and present results from our laser-based spectroscopic imaging experiments on oesophageal cancer tissue.
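
    The abstract mentions extracting numerical absorbance values from laser-based intensity measurements; the sketch below shows the standard Beer-Lambert-style conversion of sample and background intensity images into an absorbance image. The array shapes and synthetic data are assumptions for illustration only, not details of the thesis's acquisition or processing pipeline.

```python
import numpy as np

def absorbance_image(sample_counts, background_counts, eps=1e-12):
    """Per-pixel, per-wavenumber absorbance A = -log10(I_sample / I_background)."""
    I_s = np.asarray(sample_counts, dtype=float)
    I_b = np.asarray(background_counts, dtype=float)
    transmittance = np.clip(I_s / np.maximum(I_b, eps), eps, None)
    return -np.log10(transmittance)

# Example with synthetic data cubes of shape (y, x, wavenumber)
rng = np.random.default_rng(0)
I_bg = 1000.0 + 10.0 * rng.standard_normal((64, 64, 50))        # background scan
I_sample = I_bg * np.exp(-rng.uniform(0.0, 1.0, (64, 64, 50)))  # attenuated scan
A = absorbance_image(I_sample, I_bg)
print(A.shape, float(A.mean()))
```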

    Full Covariance Modelling for Speech Recognition

    HMM-based systems for Automatic Speech Recognition typically model the acoustic features using mixtures of multivariate Gaussians. In this thesis, we consider the problem of learning a suitable covariance matrix for each Gaussian. A variety of schemes have been proposed for controlling the number of covariance parameters per Gaussian, and studies have shown that, in general, the greater the number of parameters used in the models, the better the recognition performance. We therefore investigate systems with full covariance Gaussians. However, in this case the obvious choice of parameters, given by the sample covariance matrix, leads to matrices that are poorly conditioned and do not generalise well to unseen test data. The problem is particularly acute when the amount of training data is limited. We propose two solutions to this problem: firstly, we impose the requirement that each matrix should take the form of a Gaussian graphical model, and introduce a method for learning the parameters and the model structure simultaneously. Secondly, we explain how an alternative estimator, the shrinkage estimator, is preferable to the standard maximum likelihood estimator, and derive formulae for the optimal shrinkage intensity within the context of a Gaussian mixture model. We show how this relates to the use of a diagonal covariance smoothing prior. We compare the effectiveness of these techniques to standard methods on a phone recognition task where the quantity of training data is artificially constrained. We then investigate the performance of the shrinkage estimator on a large-vocabulary conversational telephone speech recognition task. Discriminative training techniques can be used to compensate for the invalidity of the model-correctness assumption underpinning maximum likelihood estimation; on the large-vocabulary task, we use discriminative training of the full covariance models and diagonal priors to yield improved recognition performance.
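
    A minimal sketch of the shrinkage idea described above, under the common choice of a diagonal target: the sample covariance is blended with its own diagonal, which keeps each Gaussian's covariance well conditioned when training data are scarce. The fixed shrinkage intensity `rho` and the toy feature dimensions are assumptions for this example; the thesis derives the optimal intensity within a Gaussian mixture model, which is not reproduced here.

```python
import numpy as np

def shrinkage_covariance(X, rho):
    """Shrink the sample covariance towards its diagonal:
    Sigma_hat = (1 - rho) * S + rho * diag(S), with 0 <= rho <= 1.

    Off-diagonal entries, the least reliably estimated when data are scarce,
    are damped, which improves the conditioning of the Gaussian's covariance.
    """
    S = np.cov(X, rowvar=False)
    return (1.0 - rho) * S + rho * np.diag(np.diag(S))

# With few frames relative to the feature dimension, the raw sample covariance
# is ill-conditioned; shrinkage keeps it well conditioned.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 39))          # e.g. 39-dimensional acoustic features
print("sample:   ", np.linalg.cond(np.cov(X, rowvar=False)))
print("shrinkage:", np.linalg.cond(shrinkage_covariance(X, rho=0.3)))
```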
