101 research outputs found
The Smart Data Extractor, a Clinician Friendly Solution to Accelerate and Improve the Data Collection During Clinical Trials
In medical research, the traditional way to collect data, i.e. browsing
patient files, has been proven to induce bias, errors, human labor and costs.
We propose a semi-automated system able to extract every type of data,
including notes. The Smart Data Extractor pre-populates clinical research forms
by following rules. We performed a cross-testing experiment to compare
semi-automated to manual data collection. 20 target items had to be collected
for 79 patients. The average time to complete one form was 6'81'' for manual
data collection and 3'22'' with the Smart Data Extractor. There were also more
mistakes during manual data collection (163 for the whole cohort) than with the
Smart Data Extractor (46 for the whole cohort). We present an easy to use,
understandable and agile solution to fill out clinical research forms. It
reduces human effort and provides higher quality data, avoiding data re-entry
and fatigue-induced errors. Comment: IOS Press, 2023, Studies in Health Technology and Informatics
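The abstract says only that forms are pre-populated "by following rules"; it does not describe the rule format. As a minimal sketch of what rule-based pre-population could look like, the example below maps form fields to regular expressions applied to a free-text note. The field names, patterns and the `prepopulate` function are hypothetical and are not taken from the Smart Data Extractor.

```python
import re

# Hypothetical rules (assumption): form field -> regex with one capture group.
RULES = {
    "weight_kg": re.compile(r"weight\s*[:=]?\s*(\d+(?:\.\d+)?)\s*kg", re.I),
    "systolic_bp": re.compile(r"\bBP\s*[:=]?\s*(\d{2,3})\s*/\s*\d{2,3}", re.I),
    "smoker": re.compile(r"\b(non-smoker|smoker)\b", re.I),
}


def prepopulate(note):
    """Return a form dict; fields with no matching rule stay None for manual entry."""
    form = {}
    for field, pattern in RULES.items():
        match = pattern.search(note)
        form[field] = match.group(1) if match else None
    return form


# Example note (invented for illustration).
draft = prepopulate("Weight: 72.5 kg. BP 130/85. Non-smoker.")
```

In a semi-automated workflow of this kind, the draft values would still be reviewed and corrected by a clinician before the form is saved.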
Exploring Analogical Inference in Healthcare
An analogy based framework for patient-stay identification in healthcare
The genetic landscape and clinical spectrum of nephronophthisis and related ciliopathies
Nephronophthisis (NPH) is an autosomal-recessive ciliopathy and one of the most frequent causes of kidney failure in childhood, characterized by broad clinical and genetic heterogeneity. Applied to one of the world's largest cohorts of patients with NPH, genetic analysis encompassing targeted and whole exome sequencing identified disease-causing variants in 600 patients from 496 families, a detection rate of 71%. The 788 pathogenic variants identified were distributed across 40 known ciliopathy genes. However, the majority of patients (53%) bore biallelic pathogenic variants in NPHP1. NPH-causing gene alterations affected all ciliary modules defined by structural and/or functional subdomains. Seventy-six percent of these patients had progressed to kidney failure, of whom 18% had an infantile form (under five years) and harbored variants affecting the Inversin compartment or intraflagellar transport complex A. Forty-eight percent of patients showed a juvenile form (5-15 years) and 34% a late-onset disease (over 15 years), the latter mostly carrying variants belonging to the Transition Zone module. Furthermore, while more than 85% of patients with an infantile form presented with extra-kidney manifestations, these concerned only half of juvenile and late-onset cases. Eye involvement represented a predominant feature, followed by cerebellar hypoplasia and other brain abnormalities, and liver and skeletal defects. The phenotypic variability was in large part associated with mutation types, genes and corresponding ciliary modules, with hypomorphic variants in ciliary genes that play a role in early steps of ciliogenesis being associated with juvenile-to-late-onset NPH forms. Thus, our data confirm a considerable proportion of late-onset NPH, suggesting underdiagnosis in adult chronic kidney disease.
AI-based diagnosis and phenotype-genotype correlations in syndromic craniosynostoses
Apert (AS), Crouzon (CS), Muenke (MS), Pfeiffer (PS), and Saethre-Chotzen (SCS) are among the most frequently diagnosed syndromic craniosynostoses. The aims of this study were (1) to train an innovative model using artificial intelligence (AI)-based methods on two-dimensional facial frontal, lateral, and external ear photographs to assist diagnosis for syndromic craniosynostoses vs controls, and (2) to screen for genotype/phenotype correlations in AS, CS, and PS. We included retrospectively and prospectively, from 1979 to 2023, all frontal and lateral pictures of patients genetically diagnosed with AS, CS, MS, PS and SCS syndromes. After a deep learning-based preprocessing, we extracted geometric and textural features and used XGBoost (eXtreme Gradient Boosting) to classify patients. The model was tested on an independent international validation set of genetically confirmed patients and non-syndromic controls. Between 1979 and 2023, we included 2228 frontal and lateral facial photographs corresponding to 541 patients. In all, 70.2% [0.593-0.797] (p < 0.001) of patients in the validation set were correctly diagnosed. Genotypes linked to a splice donor site of FGFR2 in Crouzon-Pfeiffer syndrome (CPS) caused a milder phenotype in CPS. Here we report a new method for the automatic detection of syndromic craniosynostoses using AI.
The HeKA project team
This article describes HeKA, a project team of Inria, Inserm and Université de Paris. HeKA is a joint research project team of Inria, Inserm and Université de Paris. More precisely, HeKA is attached to the Centre de Recherche des Cordeliers and the Centre Inria de Paris. In addition to two Inria and Inserm researchers, HeKA is made up of hospital-university researchers from AP-HP affiliated with departments of the Hôpital Européen Georges Pompidou, the Hôpital Necker and the Institut Imagine. The team's research themes are medical informatics, biostatistics and applied mathematics for clinical decision support. The name HeKA is both a reference to the Egyptian deity of medicine and an acronym for Health data- and model- driven Knowledge Acquisition. The HeKA team succeeds Team 22 (Information Sciences to support Personalized Medicine), led by Anita Burgun at the Centre de Recherche des Cordeliers (Inserm, Université de Paris). HeKA is headed by Sarah Zohar, assisted by Adrien Coulet.
AI-based diagnosis in mandibulofacial dysostosis with microcephaly using external ear shapes
Introduction: Mandibulo-Facial Dysostosis with Microcephaly (MFDM) is a rare disease with a broad spectrum of symptoms, characterized by zygomatic and mandibular hypoplasia, microcephaly, and ear abnormalities. Here, we aimed to describe the external ear phenotype of MFDM patients and to train an Artificial Intelligence (AI)-based model to differentiate MFDM ears from non-syndromic control ears (binary classification), and from ears of the main differential diagnoses of this condition (multi-class classification): Treacher Collins (TC), Nager (NAFD) and CHARGE syndromes. Methods: The training set contained 1,592 ear photographs, corresponding to 550 patients. We extracted 48 patients completely independent of the training set, with only one photograph per ear per patient. After a CNN (Convolutional Neural Network)-based ear detection, the images were automatically landmarked. Generalized Procrustes Analysis was then performed, along with a dimension reduction using PCA (Principal Component Analysis). The principal components were used as inputs in an eXtreme Gradient Boosting (XGBoost) model, optimized using a 5-fold cross-validation. Finally, the model was tested on an independent validation set. Results: We trained the model on 1,592 ear photographs, corresponding to 1,296 control ears, 105 MFDM, 33 NAFD, 70 TC and 88 CHARGE syndrome ears. The model detected MFDM with an accuracy of 0.969 [0.838-0.999] (p < 0.001) and an AUC (Area Under the Curve) of 0.975 within controls (binary classification). Balanced accuracies were 0.811 [0.648-0.920] (p = 0.002) in a first multiclass design (MFDM vs. controls and differential diagnoses) and 0.813 [0.544-0.960] (p = 0.003) in a second multiclass design (MFDM vs. differential diagnoses). Conclusion: This is the first AI-based syndrome detection model in dysmorphology based on the external ear, opening promising clinical applications both for local care and referral, and for expert centers.
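As a rough illustration of the Generalized Procrustes Analysis step named in the Methods, the following minimal sketch aligns 2D landmark configurations by removing translation, scale and rotation. It is not the authors' implementation: representing each landmark as a complex number, the function names, the tolerance, and the omission of reflections are all assumptions made for this example, and the PCA and XGBoost stages are left out.

```python
import cmath
import math


def center_and_scale(shape):
    """Remove translation and scale: centroid at the origin, unit norm."""
    centroid = sum(shape) / len(shape)
    centered = [z - centroid for z in shape]
    norm = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / norm for z in centered]


def rotate_onto(shape, reference):
    """Rotate `shape` to minimise its squared distance to `reference`."""
    s = sum(w * z.conjugate() for z, w in zip(shape, reference))
    rotation = s / abs(s) if abs(s) > 0 else 1
    return [rotation * z for z in shape]


def generalized_procrustes(shapes, max_iter=100, tol=1e-12):
    """Iteratively align all shapes to their evolving mean shape."""
    aligned = [center_and_scale(s) for s in shapes]
    mean = aligned[0]  # initial reference: the first shape
    for _ in range(max_iter):
        aligned = [rotate_onto(s, mean) for s in aligned]
        n = len(aligned)
        new_mean = center_and_scale(
            [sum(s[k] for s in aligned) / n for k in range(len(mean))]
        )
        shift = sum(abs(a - b) ** 2 for a, b in zip(new_mean, mean))
        mean = new_mean
        if shift < tol:
            break
    return aligned, mean


# Toy data (invented): the same four landmarks, then rotated, scaled and shifted.
base = [0 + 0j, 2 + 0j, 2 + 1j, 0 + 3j]
moved = [3 * cmath.exp(0.7j) * z + (5 - 2j) for z in base]
aligned, mean_shape = generalized_procrustes([base, moved])
```

After alignment, the two toy configurations coincide up to floating-point error, so any remaining differences between real ear shapes would reflect shape variation rather than pose or image scale. Those residual coordinates are what a later dimension-reduction step would take as input.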
Textual data warehouse challenge: Dr. Warehouse and translational research on rare diseases
The reuse of clinical care data for research has become widespread with the development of clinical data warehouses. These warehouses are modeled to integrate and explore structured data linked to thesauri. The data come mainly from automated devices (biology, genetics, cardiology, etc.) but also from manually completed structured data forms. Care delivery also produces a large amount of textual data from hospital reports (hospitalization, surgery, imaging, anatomical pathology, etc.) and from free-text fields in electronic forms. This mass of data, little or not used by conventional warehouses, is an indispensable source of information in the context of rare diseases. Indeed, free text makes it possible to describe a patient's clinical picture with greater precision and to express the absence of signs and uncertainty. Particularly for patients who are still undiagnosed, the physician describes the patient's medical history outside any nosological framework. This wealth of information makes clinical text a valuable source for translational research. However, it requires appropriate algorithms and tools to enable optimized reuse by physicians and researchers. In this thesis we present a data warehouse centered on the clinical document, which we modeled, implemented and evaluated. Through three use cases for translational research in the context of rare diseases, we attempted to address the problems inherent in textual data: (i) patient recruitment through a search engine adapted to textual data (handling of negation and family history), (ii) automated phenotyping from textual data, and (iii) diagnostic support through similarity between patients based on phenotyping. We evaluated these methods on the Necker-Enfants Malades data warehouse, created and populated during this thesis, which integrates about 490,000 patients and 4 million reports. These methods and algorithms were integrated into the Dr Warehouse software, developed during the thesis and distributed as open source since September 2017.
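To make use case (i) more concrete, here is a minimal sketch of a text search that ignores negated mentions and mentions attributed to family members. It does not reproduce the Dr Warehouse search engine: the cue lists, the sentence splitting and the function names are hypothetical simplifications chosen for illustration, and a real system would need far richer linguistic handling.

```python
import re

# Hypothetical cue lists (assumption); real clinical text needs many more.
NEGATION = re.compile(r"\b(no|without|absence of|denies)\b")
FAMILY = re.compile(r"\b(mother|father|brother|sister|family history)\b")


def split_sentences(text):
    """Naive sentence splitter on periods, semicolons and newlines."""
    return [s.strip() for s in re.split(r"[.;\n]+", text) if s.strip()]


def mentions_patient_finding(report, term):
    """True if `term` is affirmed for the patient in at least one sentence."""
    term = term.lower()
    for sentence in split_sentences(report.lower()):
        if term not in sentence:
            continue
        before = sentence[: sentence.index(term)]
        negated = NEGATION.search(before) is not None
        familial = FAMILY.search(sentence) is not None
        if not negated and not familial:
            return True
    return False


def search(reports, term):
    """Return the ids of reports in which the patient is affirmed to have `term`."""
    return [pid for pid, text in reports.items()
            if mentions_patient_finding(text, term)]


# Invented toy reports.
reports = {
    "p1": "Patient presents with seizures. No fever.",
    "p2": "No seizures reported.",
    "p3": "Mother had seizures in childhood.",
}
hits = search(reports, "seizures")
```

A plain keyword search would return all three toy reports. Filtering out the negated and the familial mention leaves only the patient who actually has the finding, which is the behaviour a recruitment query needs.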