17 research outputs found

    A scoping review of natural language processing of radiology reports in breast cancer

    Various natural language processing (NLP) algorithms have been applied in the literature to analyze radiology reports pertaining to the diagnosis and subsequent care of cancer patients. Applications of this technology include cohort selection for clinical trials, population of large-scale data registries, and quality improvement in radiology workflows, including mammography screening. This scoping review is the first to examine such applications in the specific context of breast cancer. Of the 210 articles initially identified, 44 met our inclusion criteria for this review. Extracted data elements included both clinical and technical details of studies that developed or evaluated NLP algorithms applied to free-text radiology reports of breast cancer. Our review illustrates an emphasis on applications in diagnostic and screening processes over treatment or therapeutic applications and describes growth in deep learning and transfer learning approaches in recent years, although rule-based approaches continue to be useful. Furthermore, we observe increased efforts in code and software sharing but not in data sharing.

    Improving knowledge management through the support of image examination and data annotation using DICOM structured reporting

    Considerable effort has been invested in improving the image diagnosis process in different medical areas using information technologies. The field of medical imaging involves two main data types: medical images and reports. Developments based on the DICOM standard have proven to be a convenient and widespread solution among the medical community. The main objective of this work is to design a Web application prototype that will be able to improve diagnosis and follow-up of breast cancer patients. It is based on the TRENCADIS middleware, which provides a knowledge-oriented storage model composed of federated repositories of DICOM image studies and DICOM-SR medical reports. The full structure and contents of the diagnosis reports are used as metadata for indexing images. The TRENCADIS infrastructure takes full advantage of Grid technologies by deploying multi-resource grid services that enable multiple views (report schemes) of the knowledge database. The paper presents a real deployment of this Web application prototype at the Dr. Peset Hospital, providing radiologists with a tool to create, store and search diagnostic reports based on breast cancer explorations (mammography, magnetic resonance, ultrasound, pre-surgery biopsy and post-surgery biopsy), improving support for diagnostic decisions. Technical details of the use cases (outlining enhanced multi-resource grid services communication and processing steps) and the interactions between actors and the deployed prototype are described. As a result, information is more structured, the logic is clearer, network messages have been reduced and, in general, the system is more resistant to failures.
    The authors wish to thank the financial support received from the Spanish Ministry of Education and Science to develop the project "CodeCloud", with reference TIN2010-17804.
    Salavert Torres, J.; Segrelles Quilis, JD.; Blanquer Espert, I.; Hernández García, V. (2012). Improving knowledge management through the support of image examination and data annotation using DICOM structured reporting. Journal of Biomedical Informatics. 45(6):1066-1074. https://doi.org/10.1016/j.jbi.2012.07.004

    Enhanced Neurologic Concept Recognition using a Named Entity Recognition Model based on Transformers

    Although deep learning has been applied to the recognition of diseases and drugs in electronic health records and the biomedical literature, relatively little study has been devoted to the utility of deep learning for the recognition of signs and symptoms. The recognition of signs and symptoms is critical to the success of deep phenotyping and precision medicine. We have developed a named entity recognition model that uses deep learning to identify text spans containing neurological signs and symptoms and then maps these text spans to the clinical concepts of a neuro-ontology. We compared a model based on convolutional neural networks to one based on bidirectional encoder representations from transformers. Models were evaluated for accuracy of text span identification on three text corpora: physician notes from an electronic health record, case histories from neurologic textbooks, and clinical synopses from an online database of genetic diseases. Both models performed best on the professionally-written clinical synopses and worst on the physician-written clinical notes. Both models performed better when signs and symptoms were represented as shorter text spans. Consistent with prior studies that examined the recognition of diseases and drugs, the model based on bidirectional encoder representations from transformers outperformed the model based on convolutional neural networks for recognizing signs and symptoms. Recall for signs and symptoms ranged from 59.5% to 82.0% and precision ranged from 61.7% to 80.4%. With further advances in NLP, fully automated recognition of signs and symptoms in electronic health records and the medical literature should be feasible.

    The RareDis corpus: A corpus annotated with rare diseases, their signs and symptoms

    Rare diseases affect a small number of people compared to the general population. However, more than 6,000 different rare diseases exist and, in total, they affect more than 300 million people worldwide. A central problem shared by rare diseases is the delay in diagnosis and the sparse information available to researchers, clinicians, and patients. Obtaining a diagnosis can be a very long and frustrating experience for patients and their families. The average diagnostic delay is 6 to 8 years. Many of these diseases manifest differently among patients, which further hampers their detection and the choice of the correct treatment. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment, but most NLP techniques require manually annotated corpora. Therefore, our goal is to create a gold standard corpus annotated with rare diseases and their clinical manifestations. It could be used to train and test NLP approaches, and the information extracted through NLP could enrich the knowledge of rare diseases and thereby help to reduce the diagnostic delay and improve the treatment of rare diseases. The paper describes the selection of 1,041 texts to be included in the corpus, the annotation process and the annotation guidelines. The entities (disease, rare disease, symptom, sign and anaphor) and the relationships (produces, is a, is acron, is synon, increases risk of, anaphora) were annotated. The RareDis corpus contains annotations of more than 5,000 rare diseases and almost 6,000 clinical manifestations. Moreover, the Inter Annotator Agreement evaluation shows a relatively high agreement (F1-measure equal to 83.5% under exact match criteria for the entities and equal to 81.3% for the relations). Based on these results, this corpus is of high quality, representing a significant step for the field since there is a scarcity of available corpora annotated with rare diseases. This could open the door to further NLP applications, which would facilitate the diagnosis and treatment of these rare diseases and, therefore, would dramatically improve the quality of life of these patients.
    This work was supported by the Madrid Government (Comunidad de Madrid) under the Multiannual Agreement with UC3M in the line of "Fostering Young Doctors Research" (NLP4RARE-CM-UC3M) and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation); the Multiannual Agreement with UC3M in the line of "Excellence of University Professors (EPUC3M17)"; and a grant from the Spanish Ministry of Economy and Competitiveness (SAF2017-86810-R).

    Structured and unstructured data integration with electronic medical records

    In recent years there has been considerable population growth and technological progress all over the world. At the same time, areas beyond technology and information technology have also developed, notably medicine, which has led to an increase in average life expectancy and, in turn, to a greater need for healthcare. In order to provide the best possible treatments and healthcare services, hospitals nowadays store in their information systems large amounts of data about patients and diseases (in the form of electronic medical records) and about the logistics of some departments. Computer science techniques such as data mining and natural language processing have therefore been used to extract knowledge and value from these information-rich sources, not only to develop, for example, new models for disease prediction, but also to improve existing processes in healthcare centres and hospitals. These data can be stored in one of three ways: structured, unstructured or semi-structured. In this paper, the author tested the integration of structured and unstructured data from two different departments of the same Portuguese hospital, in order to extract knowledge and improve hospital processes. The aim is to reduce the value lost when data are stored but not used in healthcare providers' systems.

    Enhanced neurologic concept recognition using a named entity recognition model based on transformers

    Although deep learning has been applied to the recognition of diseases and drugs in electronic health records and the biomedical literature, relatively little study has been devoted to the utility of deep learning for the recognition of signs and symptoms. The recognition of signs and symptoms is critical to the success of deep phenotyping and precision medicine. We have developed a named entity recognition model that uses deep learning to identify text spans containing neurological signs and symptoms and then maps these text spans to the clinical concepts of a neuro-ontology. We compared a model based on convolutional neural networks to one based on bidirectional encoder representations from transformers. Models were evaluated for accuracy of text span identification on three text corpora: physician notes from an electronic health record, case histories from neurologic textbooks, and clinical synopses from an online database of genetic diseases. Both models performed best on the professionally-written clinical synopses and worst on the physician-written clinical notes. Both models performed better when signs and symptoms were represented as shorter text spans. Consistent with prior studies that examined the recognition of diseases and drugs, the model based on bidirectional encoder representations from transformers outperformed the model based on convolutional neural networks for recognizing signs and symptoms. Recall for signs and symptoms ranged from 59.5% to 82.0% and precision ranged from 61.7% to 80.4%. With further advances in NLP, fully automated recognition of signs and symptoms in electronic health records and the medical literature should be feasible.

    Mining the Medical and Patent Literature to Support Healthcare and Pharmacovigilance

    Recent advancements in healthcare practices and the increasing use of information technology in the medical domain have led to the rapid generation of free-text data in the form of scientific articles, e-health records, patents, and document inventories. This has spurred the development of sophisticated information retrieval and information extraction technologies. A fundamental requirement for the automatic processing of biomedical text is the identification of information-carrying units such as concepts or named entities. In this context, this work focuses on the identification of medical disorders (such as diseases and adverse effects), which denote an important category of concepts in medical text. Two methodologies were investigated in this regard: dictionary-based and machine learning-based approaches. Furthermore, the capabilities of the concept recognition techniques were systematically exploited to build a semantic search platform for the retrieval of e-health records and patents. The system facilitates conventional text search as well as semantic and ontological searches. Performance of the adapted retrieval platform for e-health records and patents was evaluated within open assessment challenges (i.e. TRECMED and TRECCHEM respectively), wherein the system was best rated in comparison to several other competing information retrieval platforms. Finally, from the medico-pharma perspective, a strategy for the identification of adverse drug events from medical case reports was developed. Qualitative evaluation as well as an expert validation of the developed system's performance showed robust results. In conclusion, this thesis presents approaches for efficient information retrieval and information extraction from various biomedical literature sources in support of healthcare and pharmacovigilance. The applied strategies have the potential to enhance the literature searches performed by biomedical, healthcare, and patent professionals. This can promote literature-based knowledge discovery, improve the safety and effectiveness of medical practices, and drive research and development in the medical and healthcare arena.