Spanish named entity recognition in the biomedical domain
Named Entity Recognition in the clinical domain, and in languages other than English, is hampered by incomplete dictionaries, informal text, polysemous terms, disagreement over entity boundaries, and the scarcity of corpora and other resources. We present a Named Entity Recognition method for poorly resourced languages. The method was tested on Spanish radiology reports and compared with a conditional random fields system. Peer reviewed. Postprint (author's final draft).
Automatic Detection of Adverse Drug Events in Geriatric Care: Study Proposal
BACKGROUND
One-third of older inpatients experience adverse drug events (ADEs), which increase their mortality, morbidity, and health care use and costs. In particular, antithrombotic drugs are among the most at-risk medications for this population. Reporting systems have been implemented at the national, regional, and provider levels to monitor ADEs and design prevention strategies. Owing to the well-known limitations of these reporting systems, automated detection technologies based on electronic medical records (EMRs) are being developed to routinely detect or predict ADEs.
OBJECTIVE
This study aims to develop and validate an automated detection tool for monitoring antithrombotic-related ADEs using EMRs from 4 large Swiss hospitals. We aim to assess cumulative incidences of hemorrhages and thromboses in older inpatients associated with the prescription of antithrombotic drugs, identify triggering factors, and propose improvements for clinical practice.
METHODS
This project is a multicenter, cross-sectional study based on 2015 to 2016 EMR data from 4 large hospitals in Switzerland: Lausanne, Geneva, and Zürich university hospitals, and Baden Cantonal Hospital. We have included inpatients aged ≥65 years who stayed at 1 of the 4 hospitals during 2015 or 2016, received at least one antithrombotic drug during their stay, and signed or were not opposed to a general consent for participation in research. First, clinical experts selected a list of relevant antithrombotic drugs along with their side effects, risks, and confounding factors. Second, administrative, clinical, prescription, and laboratory data available in the form of free text and structured data were extracted from study participants' EMRs. Third, several automated rule-based and machine learning-based algorithms are being developed, allowing for the identification of hemorrhage and thromboembolic events and their triggering factors from the extracted information. Finally, we plan to validate the developed detection tools (one per ADE type) through manual medical record review. Performance metrics for assessing internal validity will comprise the area under the receiver operating characteristic curve, F-score, sensitivity, specificity, and positive and negative predictive values.
RESULTS
After accounting for the inclusion and exclusion criteria, we will include 34,522 residents aged ≥65 years. The data will be analyzed in 2022, and the research project will run until the end of 2022 to mid-2023.
CONCLUSIONS
This project will allow for the introduction of measures to improve safety in prescribing antithrombotic drugs, which today remain among the drugs most involved in ADEs. The findings will be implemented in clinical practice using indicators of adverse events for risk management and training for health care professionals; the tools and methodologies developed will be disseminated for new research in this field. The increased performance of natural language processing as an important complement to structured data will bring existing tools to another level of efficiency in the detection of ADEs. Currently, such systems are unavailable in Switzerland.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)
DERR1-10.2196/40456
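The validation metrics named in the Methods of the study proposal above (sensitivity, specificity, positive and negative predictive values, F-score) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch, not the study's implementation: the function name and the counts are hypothetical, and the area under the ROC curve is left out because it requires ranked prediction scores rather than counts.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute validation metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives from a manual record review.
    """
    def ratio(num, den):
        # Return None when a metric is undefined (empty denominator).
        return num / den if den else None

    sensitivity = ratio(tp, tp + fn)   # recall: detected ADEs / all true ADEs
    specificity = ratio(tn, tn + fp)   # correctly cleared / all non-ADEs
    ppv = ratio(tp, tp + fp)           # precision
    npv = ratio(tn, tn + fn)
    if ppv is None or sensitivity is None or ppv + sensitivity == 0:
        f1 = None
    else:
        f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not study data).
metrics = confusion_metrics(tp=80, fp=20, tn=880, fn=20)
```

With these illustrative counts, sensitivity and positive predictive value are both 0.8, so the F-score (their harmonic mean) is also 0.8.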
Extracting clinical knowledge from electronic medical records
As healthcare institutions increasingly adopt Electronic Medical Records (EMRs),
these records grow in importance because of all the clinical information they
contain about patients. However, the unstructured information in the form of clinical
narratives present in these records makes it hard to extract and structure useful clinical
knowledge. This unstructured information limits the potential of the EMRs because the
clinical information these records contain can be used to perform essential tasks inside
healthcare institutions such as searching, summarization, decision support and statistical
analysis, as well as be used to support management decisions or serve for research. These
tasks can only be done if the unstructured clinical information from the narratives is
appropriately extracted, structured and processed into clinical knowledge. Usually, this
extraction and structuring of information into clinical knowledge is performed manually by
healthcare practitioners, which is inefficient and error-prone. This research aims to
propose a solution to this problem, by using Machine Translation (MT) from the
Portuguese language to the English language, Natural Language Processing (NLP) and
Information Extraction (IE) techniques. With the help of these techniques, the goal is to
develop a modular prototype pipeline that can extract clinical knowledge from
unstructured clinical information contained in Portuguese EMRs, in an automated way,
in order to help EMRs to fulfil their potential and consequently help the Portuguese
hospital involved in this research. This research also intends to show that this generic
prototype system and approach can potentially be applied to other hospitals, even if they
don’t use the Portuguese language.
Arabic medical entity tagging using distant learning in a Multilingual Framework
A semantic tagger that detects relevant entities in Arabic medical documents and tags them with their appropriate semantic class is presented. The system takes advantage of a Multilingual Framework covering four languages (Arabic, English, French, and Spanish), so that resources available for each language can be used to improve the results for the others; this is especially important for less-resourced languages such as Arabic. The approach has been evaluated against Wikipedia pages of the four languages belonging to the medical domain. The core of the system is the definition of a base tagset consisting of the three most represented classes in the SNOMED-CT taxonomy and the learning of a binary classifier for each semantic category in the tagset and each language, using a distant learning approach over three widely used knowledge resources, namely Wikipedia, DBpedia, and SNOMED-CT.
From Text to Knowledge with Graphs: modelling, querying and exploiting textual content
This paper highlights the challenges, current trends, and open issues related
to the representation, querying and analytics of content extracted from texts.
The internet contains vast text-based information on various subjects,
including commercial documents, medical records, scientific experiments,
engineering tests, and events that impact urban and natural environments.
Extracting knowledge from this text involves understanding the nuances of
natural language and accurately representing the content without losing
information. This allows knowledge to be accessed, inferred, or discovered. To
achieve this, combining results from various fields, such as linguistics,
natural language processing, knowledge representation, data storage, querying,
and analytics, is necessary. The vision in this paper is that graphs can be a
well-suited text content representation once annotated and the right querying
and analytics techniques are applied. This paper discusses this hypothesis from
the perspective of linguistics, natural language processing, graph models and
databases and artificial intelligence provided by the panellists of the DOING
session in the MADICS Symposium 2022
Extracting and Structuring Drug Information to Improve e-Prescription and Streamline Medical Treatment
Currently, physicians are using the patient electronic health record (EHR) to support their practice. The Romanian healthcare system switched to electronic prescription starting in 2012. Physicians use the electronic medical record and health card to access patient data whenever available. To improve medical practice, we propose a tool that supports the prescription process by extracting and structuring important information from drug leaflets (prospectuses). The application processes data extracted from around 3,000 drug leaflets using several Romanian-language Web sources. The leaflet data is structured into sections: therapeutic action, contraindications, mode of administration, adverse reactions, etc. A stemming algorithm is applied to each section, reducing each word to its root for easier search. The result is text stored in an *.xml file. After the structuring step, the application searches the structured file for the information needed to prescribe medication matched as closely as possible to the patient's state. The application suggests all drugs that match the patient's disease and are neither contraindicated nor in conflict with the patient's other diseases, treatments or allergies, and the physician may select the best solution for the given situation.
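The suggestion step described in this abstract (keep drugs that match the disease, drop those that conflict with the patient's other conditions, treatments or allergies) can be sketched as a simple set filter. This is only an illustration under stated assumptions: the `Leaflet` class, the `suggest_drugs` function and all drug and condition names are hypothetical placeholders, and the terms are assumed to be already stemmed and normalized as the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class Leaflet:
    """Structured view of one drug leaflet (hypothetical schema)."""
    name: str
    indications: set = field(default_factory=set)
    contraindications: set = field(default_factory=set)


def suggest_drugs(leaflets, disease, patient_factors):
    """Return names of drugs indicated for `disease` whose
    contraindications do not overlap with `patient_factors`
    (other diseases, current treatments, allergies)."""
    return [
        leaflet.name
        for leaflet in leaflets
        if disease in leaflet.indications
        and not (leaflet.contraindications & patient_factors)
    ]


# Placeholder data, for illustration only (not real drug information).
leaflets = [
    Leaflet("drug_a", {"condition_x"}, {"allergy_1"}),
    Leaflet("drug_b", {"condition_x"}, {"condition_z"}),
    Leaflet("drug_c", {"condition_y"}, set()),
]
suggestions = suggest_drugs(leaflets, "condition_x", {"allergy_1", "treatment_q"})
```

Here only `drug_b` is suggested: `drug_a` is excluded by the allergy and `drug_c` does not treat the condition. The final choice remains with the physician, as the abstract states.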
Underrepresentation of Elderly People in Randomised Controlled Trials. The Example of Trials of 4 Widely Prescribed Drugs
BACKGROUND: We aimed to determine the representation of elderly people in published reports of randomized controlled trials (RCTs). We focused on trials of 4 medications (pioglitazone, rosuvastatin, risedronate, and valsartan) frequently used by elderly patients with chronic medical conditions. METHODS AND FINDINGS: We selected all reports of RCTs indexed in PubMed from 1966 to April 2008 evaluating one of the 4 medications of interest. Estimates of the community-based "on-treatment" population were from a national health insurance database (SNIIR-AM) covering approximately 86% of the population in France. From this database, we evaluated claims data from January 2006 to December 2007 for 1,958,716 patients who received one of the medications of interest for more than 6 months. Of the 155 RCT reports selected, only 3 studies were exclusively of elderly patients (2 assessing valsartan; 1 risedronate). In only 4 of 37 reports (10.8%) for pioglitazone, 4 of 22 (18.2%) for risedronate, 3 of 29 (10.3%) for rosuvastatin and 9 of 67 (13.4%) for valsartan, the proportion of patients aged 65 or older was within or above that treated in clinical practice. In 62.2% of the reports for pioglitazone, 40.9% for risedronate, 37.9% for rosuvastatin, and 70.2% for valsartan, the proportion of patients aged 65 or older was lower than half that in the treated population. The representation of elderly people did not differ by publication date or sample size. CONCLUSIONS: Elderly patients are poorly represented in RCTs of drugs they are likely to receive.
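The per-drug proportions quoted in this abstract can be recomputed directly from the reported counts. The short sketch below uses only figures from the abstract; the variable names are illustrative.

```python
# (reports with adequate elderly representation, total reports) per drug,
# as stated in the abstract.
reports = {
    "pioglitazone": (4, 37),
    "risedronate": (4, 22),
    "rosuvastatin": (3, 29),
    "valsartan": (9, 67),
}

# Percentage of reports per drug, rounded to one decimal place.
shares = {drug: round(100 * k / n, 1) for drug, (k, n) in reports.items()}

# The per-drug totals should add up to the number of selected RCT reports.
total_reports = sum(n for _, n in reports.values())
```

The recomputed shares agree with the quoted 10.8%, 18.2%, 10.3% and 13.4%, and the per-drug totals sum to the 155 selected reports.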
Recognition and normalization of temporal expressions in Serbian medical narratives
The temporal dimension emerges as one of the essential concepts in the field of medicine, providing a basis for the proper interpretation and understanding of medically relevant information, often recorded only in unstructured texts. Automatic processing of temporal expressions involves their identification and formalization in a language understandable to computers. This paper aims to apply the existing system for automatic processing of temporal expressions in Serbian natural language texts to medical narrative texts, to evaluate the system’s efficiency in recognition and normalization of temporal expressions and to determine the degree of necessary adaptation according to the characteristics and requirements of the medical domain