54 research outputs found

    Fast brain decoding with random sampling and random projections

    Machine learning from brain images is a central tool for image-based diagnosis and disease characterization. Brain decoding, i.e., predicting behavior from functional imaging, analyzes brain activity in terms of the behavior it implies. While these multivariate techniques are becoming standard brain-mapping tools, like mass-univariate analysis, they entail much larger computational costs. In an era of growing data sizes, with larger cohorts and higher-resolution imaging, this cost is increasingly a burden. Here we consider the use of random sampling and random projections as fast data-approximation techniques for brain images. We evaluate their prediction accuracy and computation time on various datasets and discrimination tasks. We show that the weight maps obtained after random sampling are highly consistent with those obtained with the whole feature space, while retaining fair prediction performance. Altogether, we present the practical advantage of random sampling methods in neuroimaging, showing a simple way to embed the reduced coefficients back into voxel space, with only a small loss of information.
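The core idea, reducing dimension with a random projection before fitting the decoder and then embedding the reduced coefficients back into voxel space, can be sketched with scikit-learn on synthetic data. The 5000-voxel setup, the 500 projection components, and the logistic-regression decoder are illustrative choices, not the paper's exact configuration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import SparseRandomProjection

rng = np.random.RandomState(0)
n_samples, n_voxels = 200, 5000
# synthetic "brain images": two classes differing on the first 50 voxels
X = rng.randn(n_samples, n_voxels)
y = rng.randint(0, 2, n_samples)
X[y == 1, :50] += 1.0

# decoder in the full voxel space
full = LogisticRegression(max_iter=1000)
acc_full = cross_val_score(full, X, y, cv=5).mean()

# decoder after random projection to 500 components (hypothetical size)
reduced = make_pipeline(SparseRandomProjection(n_components=500, random_state=0),
                        LogisticRegression(max_iter=1000))
acc_reduced = cross_val_score(reduced, X, y, cv=5).mean()

# embed the reduced coefficients back for a voxel-space weight map
reduced.fit(X, y)
P = reduced.named_steps["sparserandomprojection"].components_  # sparse (500, 5000)
coef = reduced.named_steps["logisticregression"].coef_.ravel()
w_map = np.asarray(P.T @ coef).ravel()                         # back in voxel space
```

The projected decoder trains on a matrix an order of magnitude narrower, while the back-projected `w_map` remains comparable to the full-space weights.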

    Improving sparse recovery on structured images with bagged clustering

    The identification of image regions associated with external variables through discriminative approaches yields ill-posed estimation problems. This estimation challenge can be tackled by imposing sparse solutions. However, the sensitivity of sparse estimators to correlated variables leads to non-reproducible results, and only a subset of the important variables is selected. In this paper, we explore an approach based on bagging of clustering-based data compression in order to alleviate the instability of sparse models. Specifically, we design a new framework in which the estimator is built by averaging multiple models estimated after feature clustering, to improve the conditioning of the model. We show that this combination of model averaging with spatially consistent compression can have the virtuous effect of increasing the stability of the weight maps, allowing a better interpretation of the results. Finally, we demonstrate the benefit of our approach on several predictive modeling problems.
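A minimal sketch of the bagged clustering-compression idea, assuming scikit-learn on synthetic data. The Lasso penalty, the 50 clusters, and the 10 bootstrap replicates are illustrative choices, and Ward-linkage `FeatureAgglomeration` stands in for the paper's clustering:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.linear_model import Lasso
from sklearn.utils import resample

rng = np.random.RandomState(0)
n_samples, n_features = 100, 400
X = rng.randn(n_samples, n_features)
true_w = np.zeros(n_features)
true_w[:20] = 1.0                        # 20 informative features
y = X @ true_w + 0.1 * rng.randn(n_samples)

n_models, n_clusters = 10, 50            # hypothetical ensemble / compression sizes
avg_w = np.zeros(n_features)
for seed in range(n_models):
    Xb, yb = resample(X, y, random_state=seed)          # bootstrap replicate
    agglo = FeatureAgglomeration(n_clusters=n_clusters).fit(Xb)
    sparse = Lasso(alpha=0.05).fit(agglo.transform(Xb), yb)
    # assign each feature its cluster's weight, then average across models
    avg_w += sparse.coef_[agglo.labels_] / n_models
```

Each replicate solves a better-conditioned 50-dimensional problem; averaging the back-projected weights smooths out which correlated variable each sparse fit happened to select.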

    Towards a Faster Randomized Parcellation Based Inference

    In neuroimaging, multi-subject statistical analysis is an essential step, as it makes it possible to draw conclusions for the population under study. However, the lack of power in neuroimaging studies, combined with the lack of stability and sensitivity of voxel-based methods, may lead to non-reproducible results. A method designed to tackle this problem is Randomized Parcellation-Based Inference (RPBI), which has shown good empirical performance. Nevertheless, the agglomerative clustering algorithm proposed in the initial RPBI formulation to build the parcellations entails a large computational cost. In this paper, we explore two strategies to speed up RPBI: first, we use a fast clustering algorithm, Recursive Nearest Agglomeration (ReNA), to find the parcellations; second, we consider aggregating p-values over multiple parcellations to avoid a permutation test. We evaluate the computation time of both strategies, as well as their recovery performance. As a main conclusion, we advocate the use of (permuted) RPBI with ReNA, as it yields very fast models while keeping the performance of slower methods.
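The parcellation-aggregation step can be illustrated with a simplified simulation: random contiguous cut points stand in for the data-driven (ReNA) parcellations, and a Bonferroni-corrected parametric t-test stands in for the permutation machinery; all sizes are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)
n_subjects, n_voxels = 30, 1000
# subject-level contrast maps with a true effect in the first 100 voxels
maps = rng.randn(n_subjects, n_voxels)
maps[:, :100] += 0.8

n_parcellations, n_parcels = 20, 50
counts = np.zeros(n_voxels)     # how often a voxel falls in a significant parcel
for seed in range(n_parcellations):
    r = np.random.RandomState(seed)
    # random contiguous parcellation via sorted cut points (stand-in for ReNA)
    cuts = np.sort(r.choice(np.arange(1, n_voxels), n_parcels - 1, replace=False))
    labels = np.searchsorted(cuts, np.arange(n_voxels), side="right")
    parcel_means = np.stack([maps[:, labels == k].mean(axis=1)
                             for k in range(n_parcels)], axis=1)
    t, p = stats.ttest_1samp(parcel_means, 0.0, axis=0)
    counts += (p < 0.05 / n_parcels)[labels]   # Bonferroni across parcels
score = counts / n_parcellations               # voxel-wise detection frequency
```

Aggregating detections over many cheap parcellations recovers voxel-level localization that no single coarse parcellation provides.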

    Assessing and tuning brain decoders: cross-validation, caveats, and guidelines

    Decoding, i.e., prediction from brain images or signals, calls for empirical evaluation of its predictive power. Such evaluation is achieved via cross-validation, a method also used to tune decoders' hyper-parameters. This paper is a review of cross-validation procedures for decoding in neuroimaging. It includes a didactic overview of the relevant theoretical considerations. Practical aspects are highlighted with an extensive empirical study of the common decoders in within- and across-subject predictions, on multiple datasets (anatomical and functional MRI, and MEG) and simulations. Theory and experiments show that the popular "leave-one-out" strategy leads to unstable and biased estimates, and that a repeated-random-splits method should be preferred. Experiments highlight the large error bars of cross-validation in neuroimaging settings: typical confidence intervals of 10%. Nested cross-validation can tune decoders' parameters while avoiding circularity bias. However, we find that it can be more favorable to use sane defaults, in particular for non-sparse decoders.
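The recommended switch from leave-one-out to repeated random splits is easy to express with scikit-learn; the synthetic data, 50 splits, and 20% test fraction below are illustrative defaults, not the paper's protocol:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, ShuffleSplit, cross_val_score

rng = np.random.RandomState(0)
n_samples = 60
X = rng.randn(n_samples, 20)
# binary target driven by the first feature, with label noise
y = (X[:, 0] + 0.5 * rng.randn(n_samples) > 0).astype(int)

clf = LogisticRegression(max_iter=1000)

# leave-one-out: n single-sample test folds, high-variance estimate
acc_loo = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()

# repeated random splits leaving out 20% of the samples each time
cv = ShuffleSplit(n_splits=50, test_size=0.2, random_state=0)
acc_rep = cross_val_score(clf, X, y, cv=cv).mean()
```

With larger held-out sets, each fold's accuracy is itself less noisy, which is the basis of the paper's preference for repeated splits over leave-one-out.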

    Learning Newton's Laws in Higher Education through Gamification

    This work aims to design and implement a mobile app to improve the learning of Newton's three laws by students taking Physics I in the Engineering programs at the Corporación Universitaria Comfacauca, since experience has shown this to be a difficult topic throughout the course, given the mathematical and physical formalism used to solve problems. This leads to low academic performance in the final grading period of the semester, as students face invisible concepts that are hard to understand and represent. A more complete, simulator-style tool was found to be lacking, with which students could interact and solve the proposed problems. The app is expected to improve students' learning and academic performance.

    Valid population inference for information-based imaging: From the second-level t-test to prevalence inference

    In multivariate pattern analysis of neuroimaging data, ‘second-level’ inference is often performed by entering classification accuracies into a t-test versus chance level across subjects. We argue that while the random-effects analysis implemented by the t-test does provide population inference if applied to activation differences, it fails to do so in the case of classification accuracy or other ‘information-like’ measures, because the true value of such measures can never be below chance level. This constraint changes the meaning of the population-level null hypothesis being tested, which becomes equivalent to the global null hypothesis that there is no effect in any subject in the population. Consequently, rejecting it only allows us to infer that there are some subjects in the population with an information effect, but not that the effect generalizes, rendering it effectively equivalent to a fixed-effects analysis. This statement is supported by theoretical arguments as well as simulations. We review possible alternative approaches to population inference for information-based imaging, converging on the idea that inference should target not the mean, but the prevalence of the effect in the population. One method to do so, ‘permutation-based information prevalence inference using the minimum statistic’, is described in detail and applied to empirical data.
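A toy version of the minimum-statistic prevalence approach can be sketched as follows. Simulated binomial accuracies stand in for real within-subject permutations, and the closed-form bound for the largest rejectable prevalence follows the construction described in this line of work; all numbers are illustrative:

```python
import numpy as np

rng = np.random.RandomState(0)
n_subjects, n_trials, n_perms = 12, 100, 1000
# observed per-subject decoding accuracies (synthetic, true rate 62%)
acc = rng.binomial(n_trials, 0.62, n_subjects) / n_trials
m_obs = acc.min()                     # minimum statistic across subjects

# null distribution of the minimum statistic: all subjects at 50% chance
null_min = np.array([
    (rng.binomial(n_trials, 0.5, n_subjects) / n_trials).min()
    for _ in range(n_perms)
])
# permutation p-value for the global null (add-one correction)
p_global = (1 + np.sum(null_min >= m_obs)) / (1 + n_perms)

# largest prevalence gamma0 whose null can still be rejected at level alpha
alpha = 0.05
pN = p_global ** (1.0 / n_subjects)
gamma0 = (alpha ** (1.0 / n_subjects) - pN) / (1.0 - pN) if p_global < alpha else 0.0
```

Rejecting the global null only licenses "some subjects show an effect"; the prevalence bound `gamma0` quantifies how large a fraction of the population that must be.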

    Ensembles of models in fMRI: stable large-scale learning

    In medical imaging, collaborative worldwide initiatives have begun the acquisition of hundreds of terabytes of data that are made available to the scientific community, in particular functional Magnetic Resonance Imaging (fMRI) data. However, this signal requires extensive fitting and noise-reduction steps to extract useful information. The complexity of these analysis pipelines yields results that are highly dependent on the chosen parameters. The computation cost of this data deluge grows worse than linearly: as datasets no longer fit in cache, standard computational architectures cannot be used efficiently. To speed up computation, we considered dimensionality reduction by feature grouping, using clustering methods to perform this task. We introduce a linear-time agglomerative clustering scheme, Recursive Nearest Agglomeration (ReNA). Unlike existing fast agglomerative schemes, it avoids the creation of giant clusters. We then show empirically how this clustering algorithm yields very fast and accurate models, enabling large datasets to be processed on a budget.

    In neuroimaging, machine learning can be used to understand the cognitive organization of the brain. The idea is to build predictive models that identify the brain regions involved in the cognitive processing of an external stimulus. However, training such estimators is a high-dimensional problem, and one needs to impose a prior to find a suitable model. To handle large datasets and increase the stability of results, we propose to combine clustering with ensembles of models. We study the empirical performance of this pipeline on a large number of brain imaging datasets. The method is highly parallelizable, has a lower computation time than state-of-the-art methods, and we show that it requires fewer data samples to achieve better prediction accuracy. Finally, we show that ensembles of models improve the stability of the weight maps and reduce the variance of prediction accuracy.
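The clustering-plus-ensembles pipeline and the stability measurement can be sketched on synthetic data. The feature counts, Ridge classifier, 60 clusters, and ensemble sizes are illustrative, and scikit-learn's `FeatureAgglomeration` stands in for ReNA:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.linear_model import RidgeClassifier

rng = np.random.RandomState(0)
n_samples, n_features = 200, 300
X = rng.randn(n_samples, n_features)
w_true = np.zeros(n_features)
w_true[:30] = 1.0                       # 30 informative features
y = (X @ w_true + rng.randn(n_samples) > 0).astype(int)

def weight_map(X, y, seed, n_models, n_clusters=60):
    """Average feature-level weights over models fit on clustered bootstraps."""
    w = np.zeros(X.shape[1])
    for s in range(n_models):
        r = np.random.RandomState(seed + s)
        idx = r.randint(0, len(y), len(y))          # bootstrap sample
        agglo = FeatureAgglomeration(n_clusters=n_clusters).fit(X[idx])
        clf = RidgeClassifier().fit(agglo.transform(X[idx]), y[idx])
        w += clf.coef_.ravel()[agglo.labels_] / n_models
    return w

# stability: correlate maps estimated on two disjoint halves of the data
half = n_samples // 2
corr = {}
for n_models in (1, 20):
    w1 = weight_map(X[:half], y[:half], seed=0, n_models=n_models)
    w2 = weight_map(X[half:], y[half:], seed=100, n_models=n_models)
    corr[n_models] = np.corrcoef(w1, w2)[0, 1]
```

A higher cross-half correlation for the 20-model ensemble than for a single model is the kind of weight-map stabilization the thesis reports; each bootstrap fit is independent, so the loop parallelizes trivially.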