472 research outputs found

    Basic Science to Clinical Research: Segmentation of Ultrasound and Modelling in Clinical Informatics

    The world of basic science is a world of minutiae; it boils down to improving even a fraction of a percent over the baseline standard. It is a domain of peer-reviewed fractions of seconds and of squeezing every last ounce of efficiency from a processor, a storage medium, or an algorithm. The field of health data is based on extracting knowledge from segments of data that may improve some clinical process or practice guideline, and with it the time and quality of care. Clinical informatics and knowledge translation provide this information in order to reveal insights that improve patient treatments, regimens, and overall outcomes. In my world of minutiae, or basic science, the movement of blood served an integral role. The novel detection of sound reverberations maps out the landscape for my research. I have applied my algorithms to the various anatomical structures of the heart and artery system. This serves as a basis for segmentation, active contouring, and shape priors. The algorithms presented leverage novel applications in segmentation by using anatomical features of the heart for shape priors and by integrating optical flow models to improve tracking. The presented techniques show improvements over traditional methods in the estimation of left ventricular size and function, along with plaque estimation in the carotid artery. In my clinical world of data understanding, I have endeavoured to decipher trends in Alzheimer’s disease, sepsis in hospital patients, and the burden of melanoma using mathematical modelling methods. Decision trees, Markov models, and various clustering techniques provide insights into data sets that are otherwise hidden. Finally, I demonstrate how efficient data capture from providers can achieve rapid results and actionable information on patient medical records. This culminated in studies on the burden of illness and its associated costs. 
A selection of published works from my research, spanning basic sciences to clinical informatics, has been included in this thesis to detail my transition. This is my journey from one contented realm to a turbulent one.

    Improved Alzheimer’s disease detection by MRI using multimodal machine learning algorithms

    Dementia is one of the major medical problems challenging public health systems around the world, and it occurs mostly in older adults (age > 60). There are currently no drugs that cure the disease, which over time directly impairs memory and diminishes the capacity to perform daily activities. Health experts and computing scientists have been researching this problem for the past twenty years, yet there remains an urgent need to find the characteristics that enable the identification of dementia. The motivation for the work presented in this thesis is to propose a sophisticated supervised machine learning model for the prediction and classification of AD in older people. To that end, we conducted experiments on open-access brain imaging data, including demographic and MRI data from 373 scan sessions of 150 patients. In the first two works, we applied single ML models, support vector machines (SVM) and pruned decision trees, to the prediction of dementia on the same dataset. In the first experiment, SVM achieved 70% prediction accuracy for late-stage dementia, with a precision (correct classification of true dementia subjects) of 75%. In the second experiment, J48 pruned decision trees improved accuracy to 88.73%, with a precision of 92.4%. To extend this work, we moved from single models to multi-model approaches. In the comparative machine learning study, we applied principal component analysis as a feature reduction technique. This approach identifies the highly correlated features in the dataset that are closely associated with dementia type. 
By applying three models simultaneously, KNN, LR, and SVM, it was possible to identify an ideal model for classifying dementia subjects. Compared with SVM, the KNN and LR models classified AD subjects with 97.6% and 98.3% accuracy respectively, values higher than in the previous experiments. However, because of the severity of AD in older adults, it is essential not to miss true AD positives. To improve the classification of true AD subjects among all subjects, we ran three further independent experiments, incorporating two new models, Naïve Bayes (NB) and Artificial Neural Networks, alongside SVM and KNN. In the first experiment, the models were developed independently with manual feature selection. The outcome suggested that KNN 3 is the optimal model, with 91.32% classification accuracy. In the second experiment, the same models were tested with a limited set of highly correlated features. SVM produced 96.12% classification accuracy and NB a 98.21% classification rate for true AD subjects. Finally, in the third experiment, we combined these four models into a new hybrid model. The hybrid model was validated by an area under the ROC curve (AU-ROC) of 0.991, reported as 99.1% classification accuracy. Together, these experimental results suggest that an ensemble modelling approach with wrapping is an optimal solution for the classification of AD subjects.
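The abstract above reports accuracy and precision, and stresses not missing true AD positives, which corresponds to recall (sensitivity). The following is a minimal sketch of how these three measures differ, assuming a binary AD versus non-AD confusion matrix. The function name and the counts are hypothetical illustrations and are not taken from the thesis.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives for the positive (here: AD) class.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Precision: of the subjects predicted AD, how many truly have AD.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the subjects who truly have AD, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts for illustration only (not results from the thesis).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

A model can score well on accuracy or precision while still missing true positives, so recall is the measure that tracks the concern raised in the abstract.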

    Analytical fusion of multimodal magnetic resonance imaging to identify pathological states in genetically selected Marchigian Sardinian alcohol-preferring (msP) rats

    [EN] Alcohol abuse is one of the most alarming issues for health authorities. It is estimated that at least 23 million European citizens are affected by alcoholism, at a cost of around 270 million euros. Excessive alcohol consumption is related to physical harm and, although it damages most body organs, the liver, pancreas, and brain are the most severely affected. Alcohol-related disorders are associated not only with physical harm but also with comorbid psychiatric disorders such as depression. Alcohol is also involved in many violent behaviors and traffic injuries. Altogether, this reflects the high complexity of alcohol-related disorders and suggests the involvement of multiple brain systems. With the emergence of non-invasive diagnostic techniques such as neuroimaging or EEG, many neurobiological factors have been shown to be fundamental in the acquisition and maintenance of addictive behaviors, relapse risk, and the validity of available treatment alternatives. Alterations in brain structure and function reflected in non-invasive imaging studies have been repeatedly investigated. However, the extent to which imaging measures can precisely characterize and differentiate pathological stages of the disease, which is often accompanied by other pathologies, is not clear. The use of animal models has elucidated the role of neurobiological mechanisms paralleling alcohol misuse. Thus, combining animal research with non-invasive neuroimaging studies is a key tool for advancing the understanding of the disorder. As the volume of data of very diverse nature available in clinical and research settings increases, an integration of data sets and methodologies is required to explore multidimensional aspects of psychiatric disorders. Complementing conventional mass-univariate statistics, interest in the predictive power of statistical machine learning applied to neuroimaging data is currently growing in the scientific community. 
This doctoral thesis has covered most of the aspects mentioned above. Starting from a well-established animal model in alcohol research, Marchigian Sardinian rats, we performed multimodal neuroimaging studies at several stages of the alcohol experimental design, including the etiological mechanisms modulating high alcohol consumption (in comparison to Wistar control rats), alcohol consumption, and treatment with the opioid antagonist Naltrexone, a well-established drug in clinics but with heterogeneous response. Multimodal magnetic resonance imaging acquisition included Diffusion Tensor Imaging, structural imaging, and the calculation of magnetic resonance-derived relaxometry maps. We designed an analytical framework based on two algorithms widely used in the neuroimaging field, Random Forest and Support Vector Machine, combined in a wrapping fashion. The designed approach was applied to the same dataset with two different aims: exploring the validity of the approach to discriminate experimental stages at subject level, and establishing predictive models at voxel level to identify key anatomical regions modified during the course of the experiment. As expected, the combination of multiple magnetic resonance imaging modalities resulted in enhanced predictive power (between 3 and 16%) with heterogeneous modality contribution. Surprisingly, we identified some inborn alterations correlating with high alcohol preference, and thalamic neuroadaptations related to Naltrexone efficacy. In addition, a reproducible contribution of DTI and relaxometry-related biomarkers was repeatedly identified, guiding further studies in alcohol research. In summary, throughout this research we demonstrate the feasibility of incorporating multimodal neuroimaging, machine learning algorithms, and animal research to advance the understanding of alcohol-related disorders. [ES] Alcohol abuse is one of the greatest concerns of health authorities in the European Union. 
Excessive alcohol consumption affects the entire body to a greater or lesser degree, with the pancreas and liver being the most severely affected. In addition, the central nervous system suffers alcohol-related deterioration, which frequently occurs in parallel with other psychiatric pathologies such as depression or other addictions such as compulsive gambling. The presence of these comorbidities demonstrates the complexity of the pathology, in which many neuronal systems interact with one another. The use of magnetic resonance (MR) imaging has helped in the study of psychiatric illnesses, facilitating the discovery of neurological mechanisms fundamental to the development and maintenance of alcohol addiction, relapse, and the effect of available treatments. Despite these advances, more research is still needed to identify the biological bases that contribute to the disease. In this respect, animal models serve to isolate the factors related solely to alcohol while controlling for other factors that facilitate the development of alcoholism. Magnetic resonance studies in laboratory animals and their subsequent evaluation in humans play a fundamental role in understanding psychiatric pathologies such as alcohol addiction. Magnetic resonance imaging has been integrated into clinical settings as a non-invasive diagnostic test. As the volume of data increases, tools and methodologies are needed that can fuse information of very different kinds and thereby establish increasingly accurate diagnostic criteria. The predictive power of tools derived from artificial intelligence, such as machine learning, complements traditional statistical methods. This work has addressed most of these aspects. 
Multimodal magnetic resonance data were obtained from a validated model in research on pathologies derived from alcohol consumption, the Marchigian-Sardinian rats developed at the University of Camerino (Italy), whose alcohol consumption is comparable to that of humans. For each animal, data were acquired before and after alcohol consumption and under two abstinence conditions (with and without treatment with Naltrexone, an anti-relapse medication used as pharmacotherapy in alcoholism). The multimodal magnetic resonance data, consisting of diffusion, relaxometry, and structural images, were fused in a multivariate analytical scheme incorporating two tools commonly used on neuroimaging-derived data, Random Forest and Support Vector Machine. Our scheme was applied with two distinct objectives: on the one hand, to determine which experimental phase the subject is in on the basis of biomarkers and, on the other, to identify brain systems susceptible to alteration due to heavy alcohol intake and their evolution during abstinence. Our results showed that when biomarkers derived from multiple neuroimaging modalities are fused in a single analysis, they produce more accurate diagnoses than those derived from a single modality (up to a 16% improvement). Biomarkers derived from diffusion and relaxometry images discriminate experimental states. Some innate aspects were also identified that are related to later alcohol consumption behaviours, as was the relationship between treatment response and the magnetic resonance data. 
In summary, throughout this thesis it is demonstrated that the use of multimodal magnetic resonance data in animal models, combined in multivariate analytical schemes, is a valid tool for understanding pathologies [CAT] Alcohol abuse is one of the greatest concerns of the health authorities of the European Union. Despite the difficulty of establishing exact figures, it is estimated that some 23 million Europeans currently suffer from illnesses derived from alcoholism, at a cost to society exceeding 150,000 million euros. Excessive alcohol consumption affects the human body to a greater or lesser degree, with the pancreas and liver being the most affected. In addition, the brain suffers deterioration caused by alcohol, which frequently coexists with other pathologies such as depression or other addictions such as compulsive gambling. All of this demonstrates the complexity of the disease, in which multiple neuronal systems interact with one another. Non-invasive techniques such as electroencephalography (EEG) or magnetic resonance (MR) imaging have helped in the study of psychiatric illnesses, facilitating the discovery of neurological mechanisms fundamental to the development and maintenance of addiction, relapse, and the effectiveness of available treatments. Despite the advances, more research is still needed to identify the biological bases that contribute to the disease. In this direction, animal models serve to identify the factors that depend solely on alcohol abuse. Magnetic resonance studies in laboratory animals and their subsequent evaluation in humans would play a fundamental role in understanding alcohol use. Non-invasive diagnostic tests have been integrated into clinical settings. As the volume of data increases, tools and methodologies are needed to fuse information of very different kinds and thereby establish increasingly accurate diagnostic criteria. 
The predictive power of tools developed in the field of artificial intelligence, such as machine learning, complements traditional statistical methods. This research has addressed all of these aspects. Multimodal magnetic resonance data were obtained from an animal model validated in the study of pathologies related to alcohol consumption, the Marchigian-Sardinian rats developed at the University of Camerino (Italy), whose alcohol consumption is comparable to that of humans. For each animal, data were acquired before and after alcohol consumption and under two different abstinence conditions (with and without anti-relapse treatment). Multimodal magnetic resonance data consisting of diffusion, magnetic relaxometry, and structural images were fused in multivariate analytical schemes incorporating two methodologies validated in the neuroimaging field, Random Forest and Support Vector Machine. Our scheme was applied with two distinct objectives. The first is to determine which experimental phase the subject is in from biomarkers obtained by neuroimaging. The second is to identify the brain systems susceptible to alteration during heavy alcohol intake and their evolution during the treatment phase. Our results showed that biomarkers derived from several neuroimaging modalities, fused in a multivariate analysis, produce more accurate diagnoses than those derived from a single modality (up to a 16% improvement). Biomarkers derived from diffusion and relaxometry images contributed to distinguishing experimental states. Innate aspects were also identified that are related to later alcohol preference, as was the relationship between the response to anti-relapse treatment and the magnetic resonance data. 
In summary, throughout this work it is demonstrated that the use of multimodal magnetic resonance data in animal models, combined in multivariate analytical schemes, is a very valid tool for understanding and advancing knowledge of psychiatric pathologies such as alcoholism. Cosa Liñán, A. (2017). Analytical fusion of multimodal magnetic resonance imaging to identify pathological states in genetically selected Marchigian Sardinian alcohol-preferring (msP) rats [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90523

    Diagnosis of Autism Spectrum Disorder Based on Brain Network Clustering

    Developments in magnetic resonance imaging (MRI) provide a new non-invasive approach, functional MRI (fMRI), to study brain function. With the help of fMRI, I can build functional brain networks (FBN) to model correlations of brain activity between cortical regions. Studies of brain diseases, including autism spectrum disorder (ASD), have been conducted by analyzing alterations in the FBNs of patients. New biomarkers have been identified, and new theories and assumptions proposed, on the pathology of brain diseases. Considering that traditional clinical ASD diagnostic instruments, which rely heavily on interviews and observations, can yield large variance, recent studies have started to combine machine learning methods and FBNs to perform auto-classification of ASD. Such studies have achieved relatively good accuracy. However, in most of these studies the features are extracted from whole-brain networks, so the feature dimension can be high. High-dimensional features may cause overfitting and increase computational complexity. Therefore, I need a feature selection strategy that effectively reduces feature dimensions while keeping good classification performance. In this study, I present a new feature selection strategy that extracts features from functional modules rather than from whole-brain networks. I will show that my strategy not only reduces feature dimensions but also improves the performance of auto-classification of ASD. The whole study can be separated into 4 stages: building FBNs, identifying functional modules, statistical analysis of modular alterations and, finally, training classifiers with modular features for auto-classification of ASD. I first demonstrate the whole procedure for building FBNs from fMRI images. To identify functional modules, I propose a new network clustering algorithm based on joint non-negative matrix factorization. 
Unlike traditional brain network clustering algorithms, which mostly operate on an average network of all subjects, my algorithm factorizes multiple brain networks simultaneously, because the clustering results should be valid not only on the average network but also on each individual network. I show that the modules I find are more valid in both views. Then I statistically analyze the alterations in functional modules between the ASD and typically developed (TD) groups to determine which modules to extract features from. Several indices based on graph theory are calculated to measure modular properties. I find significant alterations in two modules. With features from these two modules, I train several widely used classifiers and validate them on a real-world dataset. The classifiers trained on modular features perform better than those trained on whole-brain features, which demonstrates the effectiveness of my feature selection strategy

    Quantifying cognitive and mortality outcomes in older patients following acute illness using epidemiological and machine learning approaches

    Introduction: Cognitive and functional decompensation during acute illness in older people is poorly understood. It remains unclear how delirium, an acute confusional state reflective of cognitive decompensation, is contextualised by baseline premorbid cognition and relates to long-term adverse outcomes. High-dimensional machine learning offers a novel, feasible and enticing approach for stratifying acute illness in older people, improving treatment consistency while optimising future research design. Methods: Longitudinal associations were analysed from the Delirium and Population Health Informatics Cohort (DELPHIC) study, a prospective cohort of Camden residents aged ≥70 years, with cognitive and functional ascertainment at baseline and 2-year follow-up, and daily assessments during incident hospitalisation. Second, using routine clinical data from UCLH, I constructed an extreme gradient-boosted trees model predicting 600-day mortality for unselected acute admissions of oldest-old patients, with mechanistic inferences. Third, hierarchical agglomerative clustering was performed to demonstrate structure within DELPHIC participants, with predictive implications for survival and length of stay. Results: i. Delirium is associated with increased rates of cognitive decline and mortality risk, in a dose-dependent manner, with an interaction between baseline cognition and delirium exposure. Those with the highest delirium exposure but also the best premorbid cognition have the “most to lose”. ii. High-dimensional multimodal machine learning models can predict mortality in oldest-old populations with 0.874 accuracy. The anterior cingulate and angular gyri, and extracranial soft tissue, are the highest contributory intracranial and extracranial features respectively. iii. Clinically useful acute illness subtypes in older people can be described using longitudinal clinical, functional, and biochemical features. 
Conclusions: Interactions between baseline cognition and delirium exposure during acute illness in older patients result in divergent long-term adverse outcomes. Supervised machine learning can robustly predict mortality in oldest-old patients, producing a valuable prognostication tool using routinely collected data, ready for clinical deployment. Preliminary findings suggest possible discernible subtypes within acute illness in older people

    Coding serial position in working memory in the healthy and demented brain


    Computerized tools: a substitute or a supplement when diagnosing Alzheimer's disease?

    Alzheimer’s disease (AD) is the most common form of dementia in the elderly characterized by difficulties in memory, disturbances in language, changes in behavior, and impairments in daily life activities. By the time cognitive impairment manifests, substantial synaptic and neuronal degeneration has already occurred. Therefore, patients need to be diagnosed as early as possible at a preclinical or presymptomatic stage. This will be important when disease-modifying treatments exist in the future. The main focus of this thesis is on the study of structural neuroimaging in AD and in prodromal stages of the disease. We emphasize the use of statistical learning for the analysis of structural neuroimaging data to achieve individual prediction of disease status and conversion from prodromal stages. The main aims of the thesis were to develop and validate computerized tools to identify patterns of atrophy with the potential of becoming markers of AD pathology using structural magnetic resonance imaging (sMRI) data and to develop a segmentation tool for Computed Tomography (CT). Using automated neuroanatomical software we measured multiple brain structures that were given to statistical learning techniques to create discriminative models for prediction of presence of disease and conversion from prodromal stages. Building statistical models based on sMRI data we investigated optimal normalization strategies for the combination of structural measures such as cortical thickness, cortical and subcortical volumes (Study I). A baseline model was created based on the optimal normalization strategy and combination of structural measures. This model was used to compare the discrimination ability of different statistical learning algorithms (decision trees, artificial neural networks, support vector machines and orthogonal partial least squares (OPLS)). 
Additionally, age, years of education and APOE phenotype were added to the baseline model to assess their impact on discrimination ability (Study II). The OPLS classification algorithm was trained on the baseline model to produce a structural index reflecting information about AD-like patterns of atrophy from each individual’s sMRI data. Additional longitudinal information at one-year follow-up was used to characterize the temporal evolution of the derived index (Study III). Since total intracranial volume (ICV) remains a morphological measure of interest and CT is today widely used in routine clinical investigations, we developed and validated an automated segmentation algorithm to estimate ICV from CT scans (Study IV). We believe computerized tools (automated neuroimaging software and statistical discriminative algorithms) have significantly enriched our knowledge and understanding of associated neurodegenerative pathology, its effects on cognition and its interaction with age. These tools were mainly developed for research purposes, but we believe the accumulated knowledge and insights could be translated into clinical settings; that, however, is a challenge that remains open for future studies

    Revealing the differences between normal and pathological ageing using functional magnetic resonance imaging (fMRI)

    The aim of the present study was to use fMRI to examine the brain activation patterns found in normal and pathological ageing on specific cognitive tasks. The cognitive paradigms chosen consisted of an n-back working memory task and a semantic memory and processing task. Manipulation of the n-back task enabled vigilance and working memory load to be investigated. Patients with Alzheimer's Disease (AD) and individuals with amnestic Mild Cognitive Impairment (MCI) were compared to normal elderly and young controls. The experiments showed that the patterns of brain activation found in normal and pathological ageing do not appear to fall along the same continuum. When comparing the elderly group to the young group, areas of under-activation could be seen, in addition to other regions of activation which were thought to be due to compensation. The comparison of the normal to the pathological groups revealed distinct differences in the levels and locations of the significant activations. On the vigilance and working memory tasks, the behavioural scores and reaction times of the pathological groups were not significantly different from those of the normal elderly, yet substantial differences could be identified in the brain activation patterns. The semantic memory task, contrary to expectation, revealed a significant difference in behavioural performance between the young group and the elderly group. However, both the reaction times and the performance scores of the AD group were significantly different from those of the elderly. Significant differences also occurred in the brain activation patterns of both pathological groups (AD and MCI) compared to the elderly. The differences present between the normal and pathological groups on each of the tasks suggest that sensitive cognitive fMRI paradigms might be very useful in resolving diagnostic ambiguity in people at increased risk of developing AD

    Machine learning approaches for the study of AD with brain MRI data

    Final Degree Project in Biomedical Engineering. Faculty of Medicine and Health Sciences, Universitat de Barcelona. Academic year: 2020-2021. Supervisors: Roser Sala Llonch, Agnès Pérez Millan. The use of automated or semi-automated approaches based on imaging data has been suggested to support the diagnosis of some diseases. In this context, Machine Learning (ML) appears as a useful emerging tool, covering tasks from feature extraction to automatic classification. Alzheimer Disease (AD) and Frontotemporal Dementia (FTD) are two common and prevalent forms of early-onset dementia with different, but partly overlapping, symptoms and brain patterns of atrophy. Because of the similarities, there is a need to establish an accurate diagnosis and to obtain good markers for prognosis. This work combines both supervised and unsupervised ML algorithms to classify AD and FTD. The data used consisted of gray matter volumes and cortical thicknesses (CTh) extracted from 3T T1 MRI of 44 healthy controls (HC, age: 57.8±5.4 years), 53 Early-Onset Alzheimer Disease patients (EOAD, age: 59.4±4.4 years) and 64 FTD patients (FTD, age: 64.4±8.8 years). A principal component analysis (PCA) of all volumes and thicknesses was performed, and the principal components (PC) that together accumulated at least 80% of the data variance were entered into a Support Vector Machine (SVM). Overall performance was assessed using 5-fold cross-validation...
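The component-selection step described above (keeping the principal components that together accumulate at least 80% of the variance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function name and the eigenvalues are hypothetical, and the PCA itself and the SVM are left out.

```python
def components_for_variance(eigenvalues, threshold=0.80):
    """Return the smallest number of leading principal components whose
    cumulative explained-variance ratio reaches `threshold`.

    `eigenvalues` are the variances along each principal component
    (the eigenvalues of the data covariance matrix), in any order.
    """
    if not eigenvalues:
        raise ValueError("eigenvalues must not be empty")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues for illustration only (not data from the study).
# Cumulative ratios are 0.4, 0.7, 0.9, so three components are needed.
n_components = components_for_variance([4.0, 3.0, 2.0, 0.5, 0.5])
print(n_components)
```

Only the selected components would then be passed to the classifier, which keeps the feature count small relative to the 161 subjects reported in the abstract.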

    Dealing with heterogeneity in the prediction of clinical diagnosis

    Computer-assisted diagnosis is an emerging field of research at the intersection of medical imaging and machine learning. Medical data are highly heterogeneous in nature and require particular care when training prediction models. In this thesis, I explored two sources of heterogeneity, multisite aggregation and the heterogeneity of clinical labels, in the context of magnetic resonance imaging (MRI) for the diagnosis of Alzheimer's disease (AD). The first part of this work is a general introduction to AD, MRI, and the challenges of machine learning in medical imaging. In the second part, I present the three articles that make up the thesis. Finally, the third part discusses the contributions and future perspectives of this research. The first article shows that aggregating data across several acquisition sites incurs some loss compared with single-site analysis, a loss that tends to diminish as the sample size increases. The second article examines the generalizability of prediction models using various cross-validation schemes. The results show that training and testing on the same set of sites overestimates model accuracy compared with testing on new sites. I also showed that training on a large number of sites improves accuracy on new sites. The third and final article addresses the heterogeneity of clinical labels and proposes a new framework in which it is possible to identify a subgroup of individuals who share a homogeneous signature highly predictive of AD-related dementia. This signature is also found in patients with mild symptoms. 
The results show that 90% of the subjects carrying the signature progressed to dementia within three years. The work in this thesis thus makes new contributions to how we approach heterogeneity in medical diagnosis and proposes possible ways to take advantage of this heterogeneity. Computer assisted diagnosis has emerged as a popular area of research at the intersection of medical imaging and machine learning. Medical data are very heterogeneous in nature and therefore require careful attention when one wants to train prediction models. In this thesis, I explored two sources of heterogeneity, multisite aggregation and clinical label heterogeneity, in an application of magnetic resonance imaging to the diagnosis of Alzheimer’s disease. In the process, I learned about the feasibility of multisite data aggregation and how to leverage that heterogeneity in order to improve the generalizability of prediction models. Part one of the document is a general introduction to Alzheimer’s disease, magnetic resonance imaging, and machine learning challenges in medical imaging. In part two, I present my research through three articles (two published and one in preparation). Finally, part three provides a discussion of my contributions and hints at possible future developments. The first article shows that data aggregation across multiple acquisition sites incurs some loss, compared to single-site analysis, that tends to diminish as the sample size increases. These results were obtained through semisynthetic Monte-Carlo simulations based on real data. The second article investigates the generalizability of prediction models with various cross-validation schemes. I showed that training and testing on the same batch of sites over-estimates the accuracy of the model, compared to testing on unseen sites. However, I also showed that training on a large number of sites improves the accuracy on unseen sites. 
The third article, on clinical label heterogeneity, proposes a new framework where we can identify a subgroup of individuals that share a homogeneous signature highly predictive of AD dementia. That signature could also be found in patients with mild symptoms, 90% of whom progressed to dementia within three years. The thesis thus makes new contributions to dealing with heterogeneity in medical diagnostic applications and proposes ways to leverage that heterogeneity to our benefit
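The contrast drawn above between testing on the same sites and testing on unseen sites can be illustrated with a leave-one-site-out split. The abstract mentions only "various cross-validation schemes", so this particular scheme, the function name, and the site labels are assumptions made for illustration.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) tuples.

    `sites` holds one acquisition-site label per subject. Each site is
    held out once, so test subjects always come from a site that
    contributed nothing to training.
    """
    by_site = defaultdict(list)
    for index, site in enumerate(sites):
        by_site[site].append(index)
    for held_out in sorted(by_site):
        test = by_site[held_out]
        train = sorted(
            i for site, indices in by_site.items()
            if site != held_out
            for i in indices
        )
        yield held_out, train, test


# Hypothetical site labels for six subjects (illustration only).
sites = ["A", "A", "B", "C", "B", "C"]
folds = list(leave_one_site_out(sites))
for held_out, train, test in folds:
    print(held_out, train, test)
```

A pooled split that ignores site labels would let subjects from one scanner appear in both training and test sets, which is the situation the abstract reports as over-estimating accuracy.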