Individuals tell a fascinating story: using unsupervised text mining methods to cluster policyholders based on their medical history

Authors: Gauchon Romain, Hermet Jean-Pascal
Publication venue: HAL CCSD
Publication date: 08/11/2019

Background and objective: Classifying people according to their health profile is crucial for proposing appropriate treatment. However, the medical diagnosis is sometimes unavailable. This is the case in health insurance, for example, which makes it difficult to propose custom prevention plans. In such situations, an unsupervised clustering method is needed. This article compares three different methods by adapting text mining methods to the field of health insurance. A new clustering stability measure is also proposed in order to compare the stability of the tested processes. Methods: Nonnegative Matrix Factorization, the word2vec method, and marginalized Stacked Denoising Autoencoders are used and compared in order to create a high-quality input for a clustering method. A self-organizing map is then used to obtain the final clustering. A real health insurance database is used to test the methods. Results: The marginalized Stacked Denoising Autoencoder outperforms the other methods in both stability and result quality on our data. Conclusions: Text mining methods offer several possibilities for understanding the context of any medical act. On a medical database, the process could reveal unexpected correlations between treatments and, thus, between pathologies. Moreover, this kind of method could exploit the refund dates contained in the data, but the tested method that uses temporality, word2vec, still needs to be improved: its results, although satisfactory, are not as good as those offered by the other methods.
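To make the Nonnegative Matrix Factorization step mentioned in this abstract more concrete, here is a minimal sketch using the classic multiplicative update rules. It is not the authors' implementation: the toy matrix `v` (rows as hypothetical policyholders, columns as hypothetical counts of medical acts), the rank, the iteration count and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V by W H with nonnegative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start, so the updates stay nonnegative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: 4 policyholders x 4 types of medical act (counts).
v = [
    [3, 2, 0, 0],
    [4, 3, 0, 0],
    [0, 0, 5, 1],
    [0, 0, 4, 2],
]

w0, h0 = nmf(v, rank=2, iterations=0)   # random starting point only
initial_error = squared_error(v, w0, h0)

w, h = nmf(v, rank=2)                   # same seed, 200 update steps
final_error = squared_error(v, w, h)

print("error before updates:", round(initial_error, 3))
print("error after updates: ", round(final_error, 3))
```

In a pipeline like the one the abstract describes, the rows of `w` would be the low-dimensional representation handed to the clustering step; the self-organizing map itself is not sketched here.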

Differentiation of Alzheimer's disease dementia, mild cognitive impairment and normal condition using PET-FDG and AV-45 imaging : a machine-learning approach

Author: Anjum Ayesha
Publication date: 25/09/2013

We used PET imaging with the tracers F18-FDG and AV45 in conjunction with classification methods from the field of machine learning. PET images were acquired in dynamic mode, one image every 5 minutes. The images come from three different sources: the ADNI database (Alzheimer's Disease Neuroimaging Initiative, University of California, Los Angeles) and two protocols performed at the PET center of the Purpan Hospital. Classification was applied after processing the dynamic images by Principal Component Analysis and Independent Component Analysis. The data were separated into a training set and a test set. To evaluate classification performance, we used leave-one-out cross-validation (LOOCV). We compare the two most widely used classification methods, Support Vector Machines (SVM) and artificial neural networks (ANN), for both tracers. The combination giving the best classification rate seems to be SVM with the AV45 tracer. However, the largest confusion occurs between MCI patients and normal subjects. Alzheimer's patients are distinguished relatively better, often being correctly identified in more than 90% of cases. We evaluated how well these methods generalize by training on one data set and classifying another. We reached a specificity of 100% and a sensitivity of more than 81%. The SVM method showed better sensitivity than the artificial neural network method. The value of such work is to eventually help clinicians in diagnosing Alzheimer's disease.
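The evaluation procedure named in this abstract, leave-one-out cross-validation followed by sensitivity and specificity, can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: a simple nearest-centroid classifier stands in for the SVM and the neural network, the problem is reduced to two classes, and the feature values, labels (`"AD"`, `"NC"`) and function names are all hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda label: dist2(centroids[label], sample))


def leave_one_out(xs, ys):
    """Hold out each sample in turn, train on the rest, predict the held-out one."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predictions.append(nearest_centroid_predict(train_x, train_y, xs[i]))
    return predictions


def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical two-component feature vectors (e.g. scores after a PCA step).
xs = [
    [0.9, 0.8], [1.0, 0.7], [0.8, 0.9], [0.3, 0.35],     # labelled "AD"
    [0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.25, 0.2],   # labelled "NC"
]
ys = ["AD"] * 4 + ["NC"] * 4

predictions = leave_one_out(xs, ys)
sensitivity, specificity = sensitivity_specificity(ys, predictions, positive="AD")
print("sensitivity:", sensitivity)
print("specificity:", specificity)
```

On this toy data the borderline fourth sample is misclassified when held out, which lowers sensitivity while specificity stays at its maximum; the numbers have no relation to the results reported in the abstract.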

Thèses en ligne de l'Université Toulouse III - Paul Sabatier