6,402 research outputs found

    Parsimonious modeling with information filtering networks

    Get PDF
    We introduce a methodology to construct parsimonious probabilistic models. This method makes use of information filtering networks to produce a robust estimate of the global sparse inverse covariance from a simple sum of local inverse covariances computed on small subparts of the network. Being based on local and low-dimensional inversions, this method is computationally very efficient and statistically robust, even for the estimation of inverse covariance of high-dimensional, noisy, and short time series. Applied to financial data our method results are computationally more efficient than state-of-the-art methodologies such as Glasso producing, in a fraction of the computation time, models that can have equivalent or better performances but with a sparser inference structure. We also discuss performances with sparse factor models where we notice that relative performances decrease with the number of factors. The local nature of this approach allows us to perform computations in parallel and provides a tool for dynamical adaptation by partial updating when the properties of some variables change without the need of recomputing the whole model. This makes this approach particularly suitable to handle big data sets with large numbers of variables. Examples of practical application for forecasting, stress testing, and risk allocation in financial systems are also provided

    Parsimonious modeling with information filtering networks

    Get PDF
    We introduce a methodology to construct parsimonious probabilistic models. This method makes use of information filtering networks to produce a robust estimate of the global sparse inverse covariance from a simple sum of local inverse covariances computed on small subparts of the network. Being based on local and low-dimensional inversions, this method is computationally very efficient and statistically robust, even for the estimation of inverse covariance of high-dimensional, noisy, and short time series. Applied to financial data our method results are computationally more efficient than state-of-the-art methodologies such as Glasso producing, in a fraction of the computation time, models that can have equivalent or better performances but with a sparser inference structure. We also discuss performances with sparse factor models where we notice that relative performances decrease with the number of factors. The local nature of this approach allows us to perform computations in parallel and provides a tool for dynamical adaptation by partial updating when the properties of some variables change without the need of recomputing the whole model. This makes this approach particularly suitable to handle big data sets with large numbers of variables. Examples of practical application for forecasting, stress testing, and risk allocation in financial systems are also provided

    Model estimation of cerebral hemodynamics between blood flow and volume changes: a data-based modeling approach

    Get PDF
    It is well known that there is a dynamic relationship between cerebral blood flow (CBF) and cerebral blood volume (CBV). With increasing applications of functional MRI, where the blood oxygen-level-dependent signals are recorded, the understanding and accurate modeling of the hemodynamic relationship between CBF and CBV becomes increasingly important. This study presents an empirical and data-based modeling framework for model identification from CBF and CBV experimental data. It is shown that the relationship between the changes in CBF and CBV can be described using a parsimonious autoregressive with exogenous input model structure. It is observed that neither the ordinary least-squares (LS) method nor the classical total least-squares (TLS) method can produce accurate estimates from the original noisy CBF and CBV data. A regularized total least-squares (RTLS) method is thus introduced and extended to solve such an error-in-the-variables problem. Quantitative results show that the RTLS method works very well on the noisy CBF and CBV data. Finally, a combination of RTLS with a filtering method can lead to a parsimonious but very effective model that can characterize the relationship between the changes in CBF and CBV

    Exploring Topic-based Language Models for Effective Web Information Retrieval

    Get PDF
    The main obstacle for providing focused search is the relative opaqueness of search request -- searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on TREC8 small Web data collection for ad-hoc search.Our experimental results show that the topic-based model outperforms the standard language model and parsimonious model

    Variable selection and updating in model-based discriminant analysis for high dimensional data with food authenticity applications

    Get PDF
    Food authenticity studies are concerned with determining if food samples have been correctly labelled or not. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labeled and unlabeled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method provide information about which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be efficient in terms of computation and achieves excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins
    • ā€¦
    corecore