19 research outputs found
Efficient supervised and semi-supervised approaches for affiliations disambiguation
The disambiguation of named entities is a challenge in many fields such as scientometrics, social networks, record linkage, citation analysis and the semantic web. Name ambiguities can arise from misspellings, typographical or OCR mistakes, abbreviations and omissions. Searching for the names of persons or organizations is therefore difficult, since a single name can appear in different forms. This paper proposes two approaches to disambiguate the affiliations of authors of scientific papers in bibliographic databases. The first assumes that a training corpus is available and uses a Naive Bayesian model. The second assumes that no learning resource is available and uses a semi-supervised approach mixing soft clustering and Bayesian learning. The results are encouraging and are already partially applied in a scientific survey department. However, we are aware that our approach may have limitations: it cannot efficiently process highly unbalanced data, but solutions are possible in future developments.
Efficient supervised and semi-supervised approaches for affiliations disambiguation
The disambiguation of named entities is a challenge in many fields such as scientometrics, social networks, record linkage, citation analysis and the semantic web. Name ambiguities can arise from misspellings, typographical or OCR mistakes, abbreviations and omissions. Searching for the names of persons or organizations is therefore difficult, since a single name may appear in many different forms. This paper proposes two approaches to disambiguate the affiliations of authors of scientific papers in bibliographic databases. The first assumes that a training dataset is available and uses a Naive Bayes model. The second assumes that there is no learning resource and uses a semi-supervised approach mixing soft clustering and Bayesian learning. The results are encouraging and the approach is already partially applied in a scientific survey department. However, our experiments also highlight a limitation of the approach: it cannot efficiently process highly unbalanced data. Alternative solutions are possible in future developments, particularly with the use of a recent clustering algorithm relying on feature maximization.
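As a rough illustration of the supervised setting described in this abstract, the sketch below trains a small multinomial Naive Bayes classifier that maps variant affiliation strings to a canonical label. It is not the authors' implementation: the tokenizer, the add-one smoothing, the class name `NaiveBayesAffiliations`, and the institution names and labels are all assumptions made up for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase an affiliation string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesAffiliations:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def fit(self, strings, labels):
        self.class_counts = Counter(labels)        # training strings per label
        self.token_counts = defaultdict(Counter)   # token frequencies per label
        self.vocab = set()
        for text, label in zip(strings, labels):
            tokens = tokenize(text)
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.token_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for tok in tokens:
                # Smoothed log likelihood; unseen tokens get a count of 0.
                likelihood = (self.token_counts[label][tok] + 1) / (total + vocab_size)
                score += math.log(likelihood)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training corpus: variant spellings with their canonical label.
train = [
    ("Institute of Marine Biology, Northtown", "IMB"),
    ("Inst. Marine Biol., Northtown Univ.", "IMB"),
    ("Laboratory of Applied Optics, Southville", "LAO"),
    ("Lab. Applied Optics, Southville Univ.", "LAO"),
]
model = NaiveBayesAffiliations().fit([s for s, _ in train], [y for _, y in train])

print(model.predict("Marine Biology Institute Northtown"))
print(model.predict("Applied Optics Lab, Southville"))
```

With so few, evenly distributed examples the classifier behaves well. The abstract's caveat about highly unbalanced data corresponds here to the class prior and token counts of a dominant label outweighing those of rare ones.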
Semi-supervised approach for the disambiguation of affiliations in bibliographic databases
CorHAL, a path for researchers: simplifying the deposit of publications to increase the full-text rate in HAL
Launched in spring 2021 and supported by the MESRI, corHAL will offer its services at the end of the year. Led by Inist and the CCSD, this project collects metadata on French scientific publications from several repositories. These data are harmonized and enriched through alignments. Duplicate detection ensures the creation of unified records that combine information from the different sources. Through an alert system (push or pull mode), the service shows researchers which of their publications are missing from HAL. The researcher then chooses to automatically import none, one, several or all of the full texts of these publications into the national open archive. CorHAL, a tool serving researchers and open science.
Documentary practices of researchers in the humanities and social sciences (SHS): information seeking
This review first looks at what SHS researchers search for (which documents do they work from?) before addressing their sources of information and the way they carry out their searches. One part is devoted to the difficulties they encounter in their hunt for information and to the criticisms they voice about the electronic environment. The last part offers some avenues for reflection and recommendations. 21 pages
Is a quantitative risk assessment of air quality in underground parking garages possible?
Little information is available about the health risks associated with time spent in underground parking garages. The objective of this study was to determine whether it is possible to quantify the health risks associated with these garages without epidemiologic data on the subject. We followed the standard procedure for health risk assessment. We searched the literature for pollutant concentrations in air samples from underground parking garages, the hazards associated with their inhalation, and their toxicological reference values. Conditions of occupational and user exposure were estimated by scenarios and taken into account to discuss toxicological reference values by modifying (with Haber's law) the adjustment factors for exposure frequency and duration. Risk quantification was possible for 39 pollutants. Acute exposures to CO and NO2 exceed toxicological reference values, as does chronic exposure to benzene for threshold effects. The risk of a carcinogenic effect associated with benzene may be greater than 10^-5. Exposure to air pollution indicators (PM and NO2) is also excessive, judging by the WHO Air Quality Guidelines and by comparison with levels at which effects have been reported in epidemiologic studies. The risk associated with underground parking garages can be evaluated only in part. The information available is nonetheless sufficient to justify actions to reduce exposure.
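The duration adjustment mentioned in this abstract can be sketched with Haber's law, which treats the product of concentration and exposure time as constant for a given effect (C × t = k). The sketch below is only an illustration under that assumption and is not the study's procedure: the function names `haber_adjust` and `hazard_quotient` and all numeric values are hypothetical.

```python
def haber_adjust(reference_conc, reference_hours, exposure_hours):
    """Rescale a reference concentration to another exposure duration.

    Assumes Haber's law, C * t = k, so the adjusted concentration is
    reference_conc * reference_hours / exposure_hours.
    """
    if reference_hours <= 0 or exposure_hours <= 0:
        raise ValueError("durations must be positive")
    return reference_conc * reference_hours / exposure_hours


def hazard_quotient(exposure_conc, adjusted_reference_conc):
    """Ratio of the estimated exposure to the adjusted reference value.

    A quotient above 1 means the scenario exceeds the reference value.
    """
    return exposure_conc / adjusted_reference_conc


# Hypothetical scenario: a reference value of 10 (arbitrary concentration
# units) defined for 8 hours, applied to a 2-hour stay in a garage.
adjusted = haber_adjust(10.0, 8.0, 2.0)
quotient = hazard_quotient(20.0, adjusted)
print(adjusted, quotient)
```

Shorter exposures tolerate proportionally higher concentrations under this assumption, which is why the occupational and user scenarios in the abstract can lead to different conclusions for the same pollutant.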
State of knowledge on the health impact of exposure to molds present in ambient air on the general French population, and recommendations for national surveillance
Three collective expert appraisals on biological agents in ambient air or in indoor environments have been carried out over the past five years by the Agence nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (Anses): pollens in ambient air (referral 2011-SA-0151); molds in buildings (referral 2014-SA-0016); and pollens and molds in the ambient air of the French overseas departments and regions (referral 2016-SA-0100). This work showed that these air pollutants of biological origin are a public health issue, and it gave public authorities recommendations, notably on surveillance, research and management. At the regulatory level, the law of 26 June 2016 on the modernization of our health system and the order of 5 August 2016 provide a framework for setting up surveillance of certain molds in ambient air, coordinated in particular by the Réseau national de surveillance aérobiologique (RNSA). The measurement of molds in ambient air, carried out for several years in metropolitan France, has provided information in particular on the role of certain meteorological conditions (notably humidity and temperature) and of land use in the development of spores (RNSA 2011). However, questions remain about how the results of this surveillance should be used, notably by the general public and health professionals, and more broadly about how this surveillance contributes to preventing mold-related diseases, whether the molds come from ambient air or from indoor environments. In this context, the Agency received a referral from the Direction générale de la santé (DGS), by letter dated 22 January 2018, to carry out an expert appraisal of the health impact of exposure to molds present in ambient air and to formulate possible recommendations for national surveillance.