62 research outputs found

    Advances in Spectral Learning with Applications to Text Analysis and Brain Imaging

    Spectral learning algorithms are becoming increasingly popular in data-rich domains, driven in part by recent advances in large-scale randomized SVD and in spectral estimation of Hidden Markov Models. Extensions of these methods lead to statistical estimation algorithms that are not only fast, scalable, and useful on real data sets, but also provably correct. Following this line of research, we make two contributions. First, we propose a set of spectral algorithms for text analysis and natural language processing. In particular, we propose fast and scalable spectral algorithms for learning word embeddings: low-dimensional real vectors (called Eigenwords) that capture the “meaning” of words from their context. Second, we show how similar spectral methods can be applied to analyzing brain images. State-of-the-art approaches to learning word embeddings are slow to train or lack theoretical grounding; we propose three spectral algorithms that overcome these limitations. All three algorithms harness the multi-view nature of text data, i.e. the left and right context of each word, and share three characteristics: 1) they are fast to train and scalable; 2) they have strong theoretical properties; 3) they can induce context-specific embeddings, i.e. different embeddings for “river bank” and “Bank of America”. They also have lower sample complexity and hence higher statistical power for rare words. We provide theory that establishes relationships between these algorithms and optimality criteria for the estimates they provide. We also perform a thorough qualitative and quantitative evaluation of Eigenwords and demonstrate their superior performance over state-of-the-art approaches. Next, we turn to the task of using spectral learning methods for brain imaging data. 
Methods like Sparse Principal Component Analysis (SPCA), Non-negative Matrix Factorization (NMF) and Independent Component Analysis (ICA) have been used to obtain state-of-the-art accuracies in a variety of problems in machine learning. However, their usage in brain imaging, though increasing, is limited by the fact that they are used as out-of-the-box techniques and are seldom tailored to the domain specific constraints and knowledge pertaining to medical imaging, which leads to difficulties in interpretation of results. In order to address the above shortcomings, we propose Eigenanatomy (EANAT), a general framework for sparse matrix factorization. Its goal is to statistically learn the boundaries of and connections between brain regions by weighing both the data and prior neuroanatomical knowledge. Although EANAT incorporates some neuroanatomical prior knowledge in the form of connectedness and smoothness constraints, it can still be difficult for clinicians to interpret the results in specific domains where network-specific hypotheses exist. We thus extend EANAT and present a novel framework for prior-constrained sparse decomposition of matrices derived from brain imaging data, called Prior Based Eigenanatomy (p-Eigen). We formulate our solution in terms of a prior-constrained l1 penalized (sparse) principal component analysis. Experimental evaluation confirms that p-Eigen extracts biologically-relevant, patient-specific functional parcels and that it significantly aids classification of Mild Cognitive Impairment when compared to state-of-the-art competing approaches

    False discovery rate smoothing

    We present false discovery rate smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems. FDR smoothing automatically finds spatially localized regions of significant test statistics. It then relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false-discovery rate at a given level. This results in increased power and cleaner spatial separation of signals from noise. The approach requires solving a non-standard high-dimensional optimization problem, for which an efficient augmented-Lagrangian algorithm is presented. In simulation studies, FDR smoothing exhibits state-of-the-art performance at modest computational cost. In particular, it is shown to be far more robust than existing methods for spatially dependent multiple testing. We also apply the method to a data set from an fMRI experiment on spatial working memory, where it detects patterns that are much more biologically plausible than those detected by standard FDR-controlling methods. All code for FDR smoothing is publicly available in Python and R. Comment: Added misspecification analysis, added pathological scenario discussions, additional comparisons, new graph fused lasso algorithm
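For context on the "standard FDR-controlling methods" that the abstract above uses as a baseline, the following is a minimal sketch of the classical Benjamini-Hochberg step-up procedure, which applies one global threshold and ignores spatial structure. It is not FDR smoothing and not the authors' released code; the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical Benjamini-Hochberg step-up procedure (the spatially
    unaware baseline, NOT FDR smoothing).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per test site.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejections = benjamini_hochberg(example_p, alpha=0.05)
```

Because the threshold depends only on the rank of each p-value, a site surrounded by strong signals is treated exactly like an isolated one; that is the limitation a spatially adaptive threshold is meant to address.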

    Deep Learning in Medical Image Analysis

    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and digitalization of medical care have generated enormous amounts of medical images in recent years. In this big data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and understanding the underlying biological process. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis

    Leveraging Computer Vision for Applications in Biomedicine and Geoscience

    Skin cancer is one of the most common types of cancer and is usually classified as either non-melanoma or melanoma skin cancer. Melanoma skin cancer accounts for about half of all skin cancer-related deaths. The 5-year survival rate is 99% when the cancer is detected early but drops to 25% once it becomes metastatic. In other words, the key to preventing death is early detection. Foraminifera are microscopic single-celled organisms that exist in marine environments and are classified as living a benthic or planktic lifestyle. In total, roughly 50,000 species are known to have existed, of which about 9,000 are still living today. Foraminifera are important proxies for reconstructing past ocean and climate conditions and serve as bio-indicators of anthropogenic pollution. Since the 1800s, the identification and counting of foraminifera have been performed manually, a resource-intensive process. In this dissertation, we leverage recent advances in computer vision, driven by breakthroughs in deep learning methodologies and scale-space theory, to make progress towards both early detection of melanoma skin cancer and automation of the identification and counting of microscopic foraminifera. First, we investigate the use of hyperspectral images in skin cancer detection by performing a critical review of relevant, peer-reviewed research. Second, we present a novel scale-space methodology for detecting changes in hyperspectral images. Third, we develop a deep learning model for classifying microscopic foraminifera. Finally, we present a deep learning model for instance segmentation of microscopic foraminifera. The works presented in this dissertation are contributions to the fields of biomedicine and geoscience, more specifically to the challenges of early detection of melanoma skin cancer and automation of the identification, counting, and picking of microscopic foraminifera.

    Effective EEG analysis for advanced AI-driven motor imagery BCI systems

    Developing effective signal processing for brain-computer interfaces (BCIs) and brain-machine interfaces (BMIs) involves factoring in three aspects of functionality: classification performance, execution time, and the number of data channels used. The contributions in this thesis are centered on these three issues. Contributions are focused on the classification of motor imagery (MI) data, which is generated during imagined movements. Typically, EEG time-series data is segmented for data augmentation or to mimic buffering that happens in an online BCI. A multi-segment decision fusion approach is presented, which takes consecutive temporal segments of EEG data, and uses decision fusion to boost classification performance. It was computationally lightweight and improved the performance of four conventional classifiers. Also, an analysis of the contributions of electrodes from different scalp regions is presented, and a subset of channels is recommended. Sparse learning (SL) classifiers have exhibited strong classification performance in the literature. However, they are computationally expensive. To reduce the test-set execution times, a novel EEG classification pipeline consisting of a genetic-algorithm (GA) for channel selection and a dictionary-based SL module for classification, called GABSLEEG, is presented. Subject-specific channel selection was carried out, in which the channels are selected based on training data from the subject. Using the GA-recommended subset of EEG channels reduced the execution time by 60% whilst preserving classification performance. Although subject-specific channel selection is widely used in the literature, effective subject-independent channel selection, in which channels are detected using data from other subjects, is an ideal aim because it leads to lower training latency and reduces the number of electrodes needed. 
A novel convolutional neural network (CNN)-based subject-independent channels selection method is presented, called the integrated channel selection (ICS) layer. It performed on-a-par with or better than subject-specific channel selection. It was computationally efficient, operating 12-17 times faster than the GA channel selection module. The ICS layer method was versatile, performing well with two different CNN architectures and datasets.
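The multi-segment decision fusion idea described in the abstract above can be sketched as follows: a trial is cut into consecutive temporal segments, each segment is classified on its own, and the per-segment decisions are fused into one label. The abstract does not say which fusion rule is used, so majority voting here is an assumption, and `toy_segment_classifier`, the window sizes, and the sample data are hypothetical stand-ins for a real EEG classifier and recording.

```python
from collections import Counter


def split_into_segments(trial, segment_length, step):
    """Cut a trial (a list of samples) into consecutive temporal windows."""
    last_start = len(trial) - segment_length
    return [trial[start:start + segment_length]
            for start in range(0, last_start + 1, step)]


def fuse_by_majority_vote(segment_labels):
    """Fuse per-segment decisions; ties go to the label seen first."""
    return Counter(segment_labels).most_common(1)[0][0]


def classify_trial(trial, classify_segment, segment_length, step):
    """Classify every segment independently, then fuse the decisions."""
    segments = split_into_segments(trial, segment_length, step)
    labels = [classify_segment(segment) for segment in segments]
    return fuse_by_majority_vote(labels)


def toy_segment_classifier(segment):
    """Hypothetical stand-in for a trained classifier: sign of the mean."""
    return "left" if sum(segment) > 0 else "right"


# One noisy middle segment is outvoted by the two clean ones.
example_trial = [1, 1, 1, 1, -5, -5, 1, 1, 1, 1, 1, 1]
example_label = classify_trial(example_trial, toy_segment_classifier,
                               segment_length=4, step=4)
```

Voting over several short windows lets a single corrupted segment be outvoted, which is one plausible reason fusion can lift the accuracy of an otherwise unchanged classifier at little computational cost.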

    Challenges in biomedical data science: data-driven solutions to clinical questions

    Data are influencing every aspect of our lives, from our work activities to our spare time and even our health. In this regard, medical diagnoses and treatments are often supported by quantitative measures and observations, such as laboratory tests, medical imaging or genetic analysis. In medicine, as well as in several other scientific domains, the amount of data involved in each decision-making process has become overwhelming. The complexity of the phenomena under investigation and the scale of modern data collections have long exceeded the capacity of human analysis and insight.

    Algorithms, applications and systems towards interpretable pattern mining from multi-aspect data

    How do people move around in urban space, and how does their movement change when the city undergoes terrorist attacks? How do users behave in Massive Open Online Courses (MOOCs), and how does their behavior differ between those who earn certificates and those who do not? In which areas of the court do elite players, such as Stephen Curry and LeBron James, like to take their shots over the course of a game? How can we uncover the hidden habits that govern our online purchases? Are there unspoken agendas in how different states pass legislation of certain kinds? At the heart of these seemingly unconnected puzzles is the same mystery of multi-aspect mining, i.e., how can we mine and interpret hidden patterns from a dataset that simultaneously reveals the associations, or changes in the associations, among various aspects of the data (e.g., a shot could be described by three aspects: player, time of the game, and area of the court)? Solving this problem could open the gates to a deep understanding of the underlying mechanisms of many real-world phenomena. While much of the research in multi-aspect mining contributes a broad scope of innovations in the mining part, interpretation of patterns from the perspective of users (or domain experts) is often overlooked. Questions such as what users require of patterns, how good the patterns are, or how to read them have barely been addressed. Without efficient and effective ways of involving users in the process of multi-aspect mining, the results are likely to be difficult for them to comprehend. This dissertation proposes the M^3 framework, which consists of multiplex pattern discovery, multifaceted pattern evaluation, and multipurpose pattern presentation, to tackle the challenges of multi-aspect pattern discovery. Based on this framework, we develop algorithms, applications, and analytic systems to enable interpretable pattern discovery from multi-aspect data. 
Following the concept of meaningful multiplex pattern discovery, we propose PairFac to close the gap between human information needs and naive mining optimization. We demonstrate its effectiveness in the context of impact discovery in the aftermath of urban disasters. We develop iDisc to target the crossing of multiplex pattern discovery with multifaceted pattern evaluation. iDisc meets the specific information need of understanding multi-level, contrastive behavior patterns. As an example, we use iDisc to predict student performance outcomes in Massive Open Online Courses given users' latent behaviors. FacIt is an interactive visual analytic system that sits at the intersection of all three components and enables interpretable, fine-tunable, and scrutinizable pattern discovery from multi-aspect data. We demonstrate each work's significance and implications in its respective problem context. As a whole, this series of studies is an effort to instantiate the M^3 framework and push the field of multi-aspect mining towards a more human-centric process in real-world applications.

    Clinical correlates and advanced processing of the dopamine transporter SPECT - applications in parkinsonism

    The imaging of the dopamine transporter (DAT) with [123I]FP-CIT SPECT is a routinely used assessment in the diagnostic pipeline of Parkinson’s disease (PD) and other movement disorders that present with parkinsonian symptoms. In this scan, the levels of striatal DAT can be visualized and quantified, also at the region-of-interest (ROI) level in the putamen and caudate, and it therefore constitutes a useful tool to assess in vivo the state of the dopaminergic presynaptic terminals in the nigrostriatal pathway. In routine clinical practice it is used especially for the differential diagnosis between presynaptic neurodegenerative disorders like PD and non-presynaptic movement disorders like essential tremor. Numerous research studies have also shown that striatal DAT deficits correlate quantitatively with motor and cognitive impairment in PD. Indeed, the image shows a posterior-to-anterior pattern of degeneration that corresponds well with disease progression, owing to the progressive loss of dopaminergic input into the motor and associative loops between the basal ganglia and the cortex. However, despite its known utility and widespread availability, its use with current assessment methods in real clinical practice is limited to determining the presence of nigrostriatal degeneration at a single-subject level in a binary fashion. 
We hypothesized in this thesis that enhanced processing and assessment of this scan with modern image processing and pattern recognition techniques may help to boost its use in the clinic with new and more accurate applications, including symptom risk assessment and differential diagnosis with other parkinsonisms. We collected DAT scans of several hundred clinically well-phenotyped patients with PD and other parkinsonisms, with two main global objectives: i) to investigate some trending hypotheses and concepts about motor and cognitive impairment in PD; and ii) to develop new processing and evaluation strategies with computational techniques to shed light on new clinical applications. A total of 5 publications are presented herein and grouped in two themes, one for each global objective. In the first theme, two works are presented, entitled: 1) Lower levels of uric acid and striatal dopamine in non-tremor dominant Parkinson's disease subtype, PLoS One 2017 Mar 30;12(3):e0174644; and 2) Genetic factors influencing frontostriatal dysfunction and the development of dementia in Parkinson's disease, PLoS One 2017 Apr 11;12(4):e0175560. In work 1 we investigated the differences in uric acid and striatal DAT in PD motor subtypes: tremor-dominant, intermediate, or postural instability and gait disorder (PIGD). We studied 75 PD patients with long-term disease evolution and found that those who presented with a tremor onset and maintained predominance of tremor, or, to a lesser extent, evolved to an intermediate phenotype, had higher levels of uric acid and striatal DAT binding than those who developed a PIGD phenotype. We also found that uric acid and striatal DAT levels were highly correlated. We speculate that low levels of this natural antioxidant may lead to a lesser degree of neuroprotection and could therefore influence the motor phenotype and course. 
In work 2 we investigated the contribution of the main genetic risk factors described in the literature to the dual syndromes of cognitive impairment in PD (frontostriatal, dopamine-mediated; and posterior cortical, leading to dementia). We evaluated the scans, the cognitive status, and the genotypes of 298 PD patients and found that the APOE2 allele, the SNCA rs356219 and COMT Val158Met polymorphisms, and deleterious variants in GBA influenced striatal dopaminergic depletion, and that the APOE4 allele and deleterious variants in GBA influenced dementia, thus suggesting a double-edged role for GBA. We did not find any role for the MAPT H1 haplotype. We conclude that the dichotomy of the dual syndromes may be driven by a broad dichotomy in these genetic factors. In the second theme, we present three other works with more focus on methodology, entitled: 3) Machine learning models for the differential diagnosis of vascular parkinsonism and Parkinson's disease using [123I]FP-CIT SPECT, European Journal of Nuclear Medicine and Molecular Imaging, 2015 Jan;42(1):112-9; 4) A Bayesian spatial model for neuroimaging using multiscale functional parcellations, under review in Neuroimage; and a last piece of work that is in preparation for submission and that I have adapted for this thesis from 5) Probabilistic intensity normalization of PET/SPECT images via Variational mixture of Gamma distributions, 30th Neural Information and Processing Systems Conference, November 2016, Barcelona, Spain. In work 3 we developed analytical models using DAT SPECT data to discriminate vascular parkinsonism (VP) from PD. We collected scans from 80 VP and 164 PD patients and found that a simple logistic regression using the quantification of the striatal subregions putamen and caudate, together with age, sex and disease duration, discriminated both entities with over 90% accuracy. 
We also found that more automated and rater-independent machine learning algorithms, such as support vector machines applied to the voxel-wise data of the striatum, give a discrimination accuracy over 90%. We conclude that the differential diagnosis of both diseases can be aided by automated image-based algorithms. In work 4 we developed a new analysis framework to perform inferences with functional neuroimaging data. We developed a multivariate spatial model by which an imaged brain region can be efficiently represented in low dimensions with a linear superposition of basis functions. To demonstrate, we accurately modeled DATSCAN images from healthy subjects with a linear combination of multi-resolution striatum parcellations derived from functional MRI experiments. We also demonstrate the utility of our model for developing clinical applications by constructing accurate classifiers to differentiate PD from normal controls and from patients with an atypical parkinsonism: progressive supranuclear palsy. This approach offers unprecedented benefits with respect to classical univariate voxel methods, including: i) greater biological interpretability of the detected brain signals; ii) parsimony in the models and hence a gain in statistical power; and iii) multi-range modeling of the spatial dependencies in brain images. Furthermore, we provide a Bayesian analysis framework to automatically identify brain subregions/subnetworks that are meaningful for particular phenotypic variables. In work 5 we developed a voxel-based intensity normalization method for DAT SPECT images aimed at overcoming the limitations of the current ROI-based normalization standard, namely dependence on ROI delineation and dependence of intensity values on the Gamma camera. We found that the intensity histogram of a DAT SPECT image can be modeled as a mixture of Gamma distributions. 
The cumulative distribution function (CDF) of the fitted Gamma distributions can be used to re-cast the voxel intensity values into a new normalized feature space between 0 and 1. We found that this re-parametrization equalized intensity across cameras and drastically improved the accuracy of PD diagnosis (up to 10%) when images from different cameras were pooled. Importantly, our method may constitute a key pre-processing step for group-level and multi-center studies. As a final remark, it is important to stress the relevance of the work. In works 1 and 2, we have provided new insights on biomarkers that have prognostic value in the progression of PD. Works 3, 4 and 5 set the grounds for a new, powerful approach to processing and evaluating these images. The machine learning framework developed in work 4 allows brain regions to be explored at an unprecedented level of spatial complexity and granularity. Thus, challenging tasks such as the differential diagnosis between different parkinsonian disorders or the identification of fine-grained regions/networks responsible for specific parkinsonian symptoms can be tackled with the proposed approach. In fact, we obtained excellent results in works 3 and 4 in the differentiation of PD from other parkinsonian syndromes. Also, work 5 may constitute a fundamental pre-processing step, especially in multi-center studies and studies aiming at developing generalizable clinical applications, regardless of the Gamma camera manufacturer and the site where the scan is made. It is important to note that, besides DATSCAN, these methods could also be applied to other nuclear medicine images and/or brain regions. We hope that this work will have an impact on the assessment of this type of image and on the development of algorithms supporting clinical decisions in movement disorders and potentially in other diseases as well. Premio Extraordinario de Doctorado U
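The CDF-based re-parametrization described for work 5 can be illustrated with a minimal sketch. It assumes the shape and scale of the relevant Gamma component have already been fitted; the variational mixture fitting itself is not shown, and the parameter values, function names, and voxel intensities below are hypothetical. The Gamma CDF is computed with a standard power series for the regularized lower incomplete gamma function, which is adequate for moderate arguments only.

```python
import math


def regularized_lower_gamma(a, x, tol=1e-12, max_terms=10000):
    """P(a, x) via the series
    P(a, x) = x^a e^(-x) / Gamma(a + 1) * sum_n x^n / ((a+1)...(a+n)).
    Suitable for moderate x; very large x would need a different expansion.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0   # n = 0 term of the series
    total = 1.0
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    log_prefactor = a * math.log(x) - x - math.lgamma(a + 1.0)
    return min(1.0, math.exp(log_prefactor) * total)


def gamma_cdf(value, shape, scale):
    """CDF of a Gamma(shape, scale) distribution evaluated at `value`."""
    return regularized_lower_gamma(shape, value / scale)


def normalize_intensities(voxels, shape, scale):
    """Map raw voxel intensities into [0, 1] through the Gamma CDF."""
    return [gamma_cdf(v, shape, scale) for v in voxels]


# Hypothetical intensities and hypothetical fitted parameters.
raw_voxels = [0.0, 1.0, 5.0, 20.0]
normalized_voxels = normalize_intensities(raw_voxels, shape=2.0, scale=3.0)
```

Because the CDF is monotone, the ordering of voxel intensities is preserved while the camera-dependent scale is removed, which is consistent with the abstract's claim that the mapping equalizes intensity across cameras.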