
    Coupling an automatic dictation system with a grammar checker


    A Knowledge Management Platform for Documentation of Case Reports in Pharmacovigilance

    Most countries have developed information systems to report adverse drug effects. However, as in other domains where systematic reviews are needed, there is little guidance on how systematic documentation of adverse drug effects should be performed. The objective of the VigiTermes project is to develop a platform that improves the documentation of pharmacovigilance case reports for the pharmaceutical industry and regulatory authorities. In order to improve systematic reviews of adverse drug reactions, we developed a prototype that first reproduces and standardizes search strategies, then extracts information from the retrieved Medline abstracts and annotates them. The platform aims to provide transparent access and analysis tools to pharmacovigilance experts investigating the relevance of drug safety signals. Its architecture integrates two vendor tools, ITM® and Luxid®, and one academic web service for knowledge extraction from the medical literature. Whereas a manual search performed by a pharmacovigilance expert retrieved 578 publications, the system proposed a list of 229 publications, decreasing the time required for review by 60%. Recall was 70%, and further development is required to improve exhaustiveness.
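
    As a minimal sketch of how the reported figures relate, assuming the publications judged relevant by the pharmacovigilance expert serve as the gold standard, recall and the review-time saving follow from simple set arithmetic (Python; all names below are hypothetical):

        # Minimal sketch: recall and screening-effort reduction for a
        # literature-retrieval step, assuming a set of PMIDs judged
        # relevant by the expert serves as the gold standard.

        def recall(system_hits: set, gold: set) -> float:
            """Fraction of gold-standard publications the system retrieved."""
            return len(system_hits & gold) / len(gold) if gold else 0.0

        def review_time_reduction(n_manual: int, n_system: int) -> float:
            """Relative reduction in the number of abstracts to screen."""
            return 1 - n_system / n_manual

        # Figures from the abstract: 578 manually retrieved publications
        # vs. 229 proposed by the system, i.e. ~60% fewer to review.
        print(review_time_reduction(578, 229))  # ~0.60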

    Building knowledge components for information extraction and filtering on networks

    With the steady growth of business and scientific activity and recent advances in information technology, huge amounts of electronically available but unstructured data have to be dealt with. New tools able to analyse and structure textual data need to be developed so that non-expert users can understand and evaluate the contents of their documents. This paper describes the first results of an R&D programme set up by the TEMIS company, whose core business is text mining, i.e. the extraction, processing, visualisation and valorisation of all the data issued by or received in a company in its field of activity. The aim of the programme is to build a multilingual knowledge station, called K-Station, that constructs and validates knowledge components (terms and extraction rules) in an iterative way. The knowledge station is independent of the field of activity: it allows a linguist to create knowledge components for a given industrial sector (cars or chemistry, for example) or for a specific job (e.g. journalism or computer science). We focus in particular on the process of information extraction from a corpus, detailing the methodology used. Our approach is based on the recycling of existing term databases and on knowledge discovery from a corpus, following an iterative process that includes extraction rules. All the components are part of a client/server architecture. We explain the various stages of analysis integrated within this architecture and illustrate them with results taken from the field of competitive intelligence. The knowledge components are included in a software suite called Insight Discoverer.
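
    A minimal sketch of the iterative loop described above, in Python, assuming a rule is simply a regular expression whose first group yields a candidate term; the actual K-Station rule formalism is not shown here, and all seed terms, rules and corpus lines below are illustrative:

        import re

        # Hypothetical knowledge components: recycled seed terms plus
        # extraction rules (here, plain regexes with one capture group).
        seed_terms = {"part de marché"}
        rules = [re.compile(r"la société (\w[\w-]*)"),
                 re.compile(r"le marché (?:de la |du |des )?(\w[\w-]*)")]

        def extract_candidates(corpus, rules):
            found = set()
            for sentence in corpus:
                for rule in rules:
                    found.update(m.group(1) for m in rule.finditer(sentence))
            return found

        corpus = ["Selon la société Temis, le marché du text-mining progresse."]
        terms = set(seed_terms)
        for _ in range(3):                  # iterate: extract, validate, enrich
            candidates = extract_candidates(corpus, rules) - terms
            validated = candidates          # stand-in for the linguist's review
            if not validated:
                break
            terms |= validated
        print(sorted(terms))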

    Combining NER Systems via a UIMA-based platform

    In this paper, we present a tool for merging named entity annotations provided by different named entity recognition systems. The tool is based on the UIMA platform and contains a merging module which uses information about the compatibility of the various annotations and can point out conflicts, thus yielding annotations that are more reliable than those of any single annotator. This work was performed as part of the Infom@gic project.
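
    A minimal stand-in for the merging idea, in plain Python rather than UIMA: pool span annotations from several recognizers, keep non-conflicting ones, and flag overlapping spans with incompatible types as conflicts (the data structures and system names below are assumptions, not the project's actual types):

        from collections import namedtuple

        Ann = namedtuple("Ann", "start end type source")

        def overlaps(a, b):
            return a.start < b.end and b.start < a.end

        def merge(annotations):
            merged, conflicts = [], []
            for a in annotations:
                clash = [b for b in merged if overlaps(a, b) and b.type != a.type]
                if clash:
                    conflicts.append((a, clash))  # incompatible types: arbitration needed
                elif not any(overlaps(a, b) for b in merged):
                    merged.append(a)              # new, non-conflicting annotation
            return merged, conflicts

        systems = [Ann(0, 5, "PERSON", "sysA"), Ann(0, 5, "PERSON", "sysB"),
                   Ann(10, 18, "ORG", "sysA"), Ann(12, 18, "LOC", "sysB")]
        merged, conflicts = merge(systems)   # the ORG/LOC overlap is flagged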

    VigiTermes: a platform for searching and analysing scientific publications in support of pharmacovigilance

    Regulations require the systematic identification of adverse drug reactions. Most countries have set up information systems to help document these effects. The task is made difficult by the multiplicity of sources and the scarcity of unifying tools for accessing, searching and analysing drug-related information. The objective of the VigiTermes project is to develop a platform that improves the documentation of pharmacovigilance case reports and offers access and analysis tools to pharmacovigilance experts, whose goal is to investigate the detection of new cases (signals). In this context, we developed a prototype that reproduces and standardizes the literature search strategies formulated by pharmacovigilance experts, retrieves the relevant PubMed abstracts, and extracts information about drugs and their potential side effects.
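
    A minimal sketch of the retrieval step, assuming the public NCBI E-utilities endpoints are used to run a standardized PubMed query and fetch the matching abstracts; the query shown is only an illustration, not the prototype's actual search strategy:

        import requests

        EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
        # Illustrative pharmacovigilance-style query (hypothetical example).
        query = '"rosiglitazone"[Title/Abstract] AND "adverse effects"[Subheading]'

        search = requests.get(f"{EUTILS}/esearch.fcgi",
                              params={"db": "pubmed", "term": query,
                                      "retmax": 50, "retmode": "json"}).json()
        pmids = search["esearchresult"]["idlist"]

        abstracts = requests.get(f"{EUTILS}/efetch.fcgi",
                                 params={"db": "pubmed", "id": ",".join(pmids),
                                         "rettype": "abstract", "retmode": "text"}).text
        # `abstracts` would then feed the annotation / extraction modules.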

    The Légilocal project: the local law simply shared

    The Légilocal project aims to help local authorities improve the quality, interoperability and publication of French local administrative acts, in the same way as Légifrance does at the state and EU level. The originality of the approach is to unify the management of contents and the interactions between actors around those contents. The Légilocal platform combines various tools (content management, networking, semantic annotation and search) and resources to assist clerks in the drafting and publication of local acts.

    CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content

    As the client's first interface, call centres worldwide hold a huge amount of information of all kinds in the form of conversational speech. If made accessible, this information can be used to detect major events and organizational flaws, for example, and to improve customer relations and marketing strategies. An efficient way to exploit the unstructured data of telephone calls is data mining, but current techniques apply to text only. The CALLSURF project gathers a number of academic and industrial partners covering the complete platform, from automatic transcription to information retrieval and data mining. This paper concentrates on the speech recognition module: it discusses the collection and manual transcription of the training corpus and the techniques used to build the language model. The NLP techniques used to pre-process the transcribed corpus for data mining are POS tagging, lemmatization, noun-group detection and named entity recognition, some of which have been specially adapted to the characteristics of conversational speech. POS tagging and preliminary data-mining results obtained on the manually transcribed corpus are briefly discussed.
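
    A minimal sketch of the preprocessing chain named above (POS tagging, lemmatization, noun groups, named entities), using spaCy's French model as a stand-in for the project's own components, which were adapted to conversational speech; the model choice and utterance are assumptions:

        import spacy

        # Requires: python -m spacy download fr_core_news_sm
        nlp = spacy.load("fr_core_news_sm")
        utterance = "euh bonjour je vous appelle pour un problème de facture"
        doc = nlp(utterance)

        tokens = [(t.text, t.pos_, t.lemma_) for t in doc]       # POS tags + lemmas
        noun_groups = [chunk.text for chunk in doc.noun_chunks]  # noun-group detection
        entities = [(e.text, e.label_) for e in doc.ents]        # named entities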

    The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard

    Social media is a potential source of information for postmarketing drug safety surveillance that remains largely unexploited. Information technology solutions that aim to extract adverse drug reactions (ADRs) from posts on health forums require a rigorous evaluation methodology if their results are to be used for decision making. First, a gold standard, consisting of manual annotations of ADRs by human experts in a corpus extracted from social media, must be built and its quality assessed. Second, as in clinical research protocols, the sample size must rest on statistical arguments. Finally, the extraction methods must target the relation between the drug and the disease (which might be either treated or caused by the drug) rather than simple co-occurrence in the posts.
    International Registered Report Identifier (IRRID): RR1-10.2196/11448
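
    A minimal sketch of the gold-standard quality check, assuming agreement between two expert annotators on per-post ADR labels is measured with Cohen's kappa; the labels below are invented for illustration:

        from sklearn.metrics import cohen_kappa_score

        # 1 = post reports an adverse drug reaction, 0 = it does not.
        annotator_a = [1, 0, 1, 1, 0, 0, 1, 0]
        annotator_b = [1, 0, 1, 0, 0, 0, 1, 1]

        kappa = cohen_kappa_score(annotator_a, annotator_b)
        print(f"Cohen's kappa: {kappa:.2f}")  # chance-corrected agreement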