99 research outputs found
Erratum to: Methods for evaluating medical tests and biomarkers
[This corrects the article DOI: 10.1186/s41512-016-0001-y.]
Evidence synthesis to inform model-based cost-effectiveness evaluations of diagnostic tests: a methodological systematic review of health technology assessments
Background: Evaluations of diagnostic tests are challenging because of the indirect nature of their impact on patient outcomes. Model-based health economic evaluations of tests allow different types of evidence from various sources to be incorporated and enable cost-effectiveness estimates to be made beyond the duration of available study data. To parameterize a health-economic model fully, all the ways a test impacts on patient health must be quantified, including but not limited to diagnostic test accuracy. Methods: We assessed all UK NIHR HTA reports published May 2009-July 2015. Reports were included if they evaluated a diagnostic test, included a model-based health economic evaluation and included a systematic review and meta-analysis of test accuracy. From each eligible report we extracted information on the following topics: 1) what evidence aside from test accuracy was searched for and synthesised, 2) which methods were used to synthesise test accuracy evidence and how did the results inform the economic model, 3) how/whether threshold effects were explored, 4) how the potential dependency between multiple tests in a pathway was accounted for, and 5) for evaluations of tests targeted at the primary care setting, how evidence from differing healthcare settings was incorporated. Results: The bivariate or HSROC model was implemented in 20/22 reports that met all inclusion criteria. Test accuracy data for health economic modelling was obtained from meta-analyses completely in four reports, partially in fourteen reports and not at all in four reports. Only 2/7 reports that used a quantitative test gave clear threshold recommendations. All 22 reports explored the effect of uncertainty in accuracy parameters but most of those that used multiple tests did not allow for dependence between test results. 7/22 tests were potentially suitable for primary care but the majority found limited evidence on test accuracy in primary care settings. 
Conclusions: The uptake of appropriate meta-analysis methods for synthesising evidence on diagnostic test accuracy in UK NIHR HTAs has improved in recent years. Future research should focus on other evidence requirements for cost-effectiveness assessment, threshold effects for quantitative tests and the impact of multiple diagnostic tests.
Structuring, standardisation and enrichment by natural language processing of cancer data within the clinical data warehouse of the Assistance Publique - Hôpitaux de Paris
Cancer is a public health issue for which the improvement of care relies, among other levers, on the use of clinical data warehouses (CDWs). Their use involves overcoming obstacles such as the quality, standardization and structuring of the care data stored there. The objective of this thesis was to demonstrate that the challenges of secondary use of data on cancer patients from the Assistance Publique - Hôpitaux de Paris (AP-HP) CDW can be addressed, for purposes such as monitoring the safety and quality of care and conducting observational and experimental clinical research. First, identifying a minimal data set made it possible to concentrate the effort of formalizing the items of interest specific to the discipline. From 15 identified items, 4 use cases from distinct medical perspectives were successfully developed: automated calculation of the safety and quality-of-care indicators required for the international certification of health establishments; clinical epidemiology of the impact of public health measures during a pandemic on delays in cancer diagnosis; decision support for optimizing patient recruitment in clinical trials; and development of neural networks for prognostication by computer vision. A second condition for using a CDW in oncology is an optimal formalization of this minimal data set that is interoperable between several CDWs. As part of the French PENELOPE initiative, which aims to improve patient recruitment in clinical trials, the thesis assessed the added value of the oncology extension of the OMOP common data model. This version 5.4 of OMOP doubled the rate of formalization of prescreening criteria for phase I to IV clinical trials. Only 23% of these criteria could be queried automatically on the AP-HP CDW, and then with a positive predictive value below 30%.
This work proposed a novel methodology for evaluating the performance of a recruitment support system, based on the usual metrics (sensitivity, specificity, positive predictive value, negative predictive value) and on additional indicators characterizing the fit between the chosen data model and the CDW concerned (rates of query translation and execution). Finally, the work showed how natural language processing could compensate for gaps in the structuring of CDW data and enrich the minimal data set, by documenting the baseline assessment of tumor dissemination at cancer diagnosis and the histoprognostic characteristics of tumors. Comparing text extraction performance metrics with the human and technical resources needed to develop rule-based and machine learning systems favoured the rule-based approach in a number of situations. The thesis identified automatic rule-based preannotation, carried out before a manual annotation phase for training a machine learning model, as an approach that could be optimized. Rules seemed to be sufficient for text extraction tasks on a certain typology of entities that are well characterized at the lexical and semantic level. This typology could be anticipated and modeled upstream of the text extraction phase, in order to determine, for each type of entity, to what extent machine learning should replace the rules. The thesis demonstrated that close attention to a number of data science challenges allowed the efficient use of a CDW for various purposes in oncology.
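The evaluation metrics named in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the names `ConfusionCounts`, `screening_metrics` and `query_coverage` are hypothetical, the counts in the usage lines are made up, and the definition of the two coverage rates (each taken as a share of all criteria) is an assumption, since the abstract does not define them.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Patients flagged by a prescreening query vs. truly eligible patients."""
    tp: int  # flagged and eligible
    fp: int  # flagged but not eligible
    fn: int  # eligible but not flagged
    tn: int  # neither flagged nor eligible


def _ratio(num: int, den: int) -> Optional[float]:
    # Return None rather than dividing by zero when a metric is undefined.
    return num / den if den else None


def screening_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Standard diagnostic-style metrics for a recruitment support query."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


def query_coverage(n_criteria: int, n_translated: int, n_executed: int) -> Dict[str, Optional[float]]:
    """Assumed definitions: share of criteria that could be expressed in the
    data model, and share that could actually be run on the warehouse."""
    return {
        "translation_rate": _ratio(n_translated, n_criteria),
        "execution_rate": _ratio(n_executed, n_criteria),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(screening_metrics(ConfusionCounts(tp=3, fp=7, fn=1, tn=89)))
    print(query_coverage(n_criteria=50, n_translated=20, n_executed=10))
```

Reporting the coverage rates next to the accuracy metrics separates two failure modes: criteria that cannot be queried at all, and criteria that can be queried but return many false positives.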
Public spending in a two-country economy: Stackelberg versus Nash
This paper analyses strategic fiscal policy-making within the standard two-country, two-good real trade model developed by Turnovsky (1988). Introducing asymmetry between the two countries and assuming that one country acts as a Stackelberg leader relative to the other, we compare each country's welfare at the Nash equilibrium with its welfare at the Stackelberg equilibrium. Both countries turn out to benefit from the Stackelberg equilibrium relative to the non-cooperative (Nash) equilibrium in the presence of strategic complementarity, because both governments reduce their public spending. In this model, strategic interactions depend on the relative values of the elasticities of substitution between goods, and either strategic substitutability or strategic complementarity may arise.
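The comparison described in the abstract above can be illustrated with a deliberately simple toy game. This is not Turnovsky's model: the quadratic payoff, the parameters `a`, `b`, `c` and all function names are assumptions chosen only so that spending levels are strategic complements (`b > 0`) and each country is hurt by the other's spending (`c > 0`).

```python
def payoff(own: float, other: float, a: float = 1.0, b: float = 0.5, c: float = 1.0) -> float:
    """Toy welfare: a country wants its spending close to a + b * other
    (complementarity when b > 0) and bears a cost c per unit of the
    other country's spending (negative spillover)."""
    return -(own - a - b * other) ** 2 - c * other


def best_response(other: float, a: float = 1.0, b: float = 0.5) -> float:
    # Maximising the toy payoff over `own` gives a linear reaction function.
    return a + b * other


def nash(a: float = 1.0, b: float = 0.5) -> tuple:
    # Fixed point of the two reaction functions; requires |b| < 1.
    g = a / (1.0 - b)
    return g, g


def stackelberg(a: float = 1.0, b: float = 0.5, c: float = 1.0) -> tuple:
    """The leader maximises its payoff along the follower's reaction function.
    Closed form from the first-order condition of the toy problem."""
    g_leader = a / (1.0 - b) - c * b / (2.0 * (1.0 - b ** 2) ** 2)
    g_follower = best_response(g_leader, a, b)
    return g_leader, g_follower


if __name__ == "__main__":
    gn_1, gn_2 = nash()
    gl, gf = stackelberg()
    print("Nash spending:       ", gn_1, gn_2)
    print("Stackelberg spending:", gl, gf)
    print("Leader welfare   (Nash, Stackelberg):", payoff(gn_1, gn_2), payoff(gl, gf))
    print("Follower welfare (Nash, Stackelberg):", payoff(gn_2, gn_1), payoff(gf, gl))
```

In this toy setting the leader cuts its spending below the Nash level because it anticipates that the follower will cut too. With complementarity, both spending levels fall and both payoffs rise relative to Nash, which mirrors the mechanism the abstract describes without reproducing the paper's model.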