
    A Survey on Deep Learning in Medical Image Analysis

    Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed. Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

    PathologyBERT -- Pre-trained Vs. A New Transformer Language Model for Pathology Domain

    Pathology text mining is a challenging task given the reporting variability and constant new findings in cancer sub-type definitions. However, successful text mining of a large pathology database can play a critical role in advancing 'big data' cancer research such as similarity-based treatment selection, case identification, prognostication, surveillance, clinical trial screening, risk stratification, and many others. While there is a growing interest in developing language models for more specific clinical domains, no pathology-specific language space exists to support rapid data-mining development in the pathology space. In the literature, a few approaches fine-tuned general transformer models on specialized corpora while maintaining the original tokenizer, but in fields requiring specialized terminology, these models often fail to perform adequately. We propose PathologyBERT, a pre-trained masked language model which was trained on 347,173 histopathology specimen reports and publicly released in the Huggingface repository. Our comprehensive experiments demonstrate that pre-training a transformer model on pathology corpora yields performance improvements on Natural Language Understanding (NLU) and Breast Cancer Diagnose Classification when compared to nonspecific language models. Comment: submitted to "American Medical Informatics Association (AMIA)" 2022 Annual Symposium

    Metastasis and circulating tumor cells

    Cancer is a prominent cause of death worldwide. In most cases, it is not the primary tumor which causes death, but the metastases. Metastatic tumors are spread over the entire human body and are more difficult to remove or treat than the primary tumor. In a patient with metastatic disease, circulating tumor cells (CTCs) can be found in venous blood. These circulating tumor cells are part of the metastatic cascade. Clinical studies have shown that these cells can be used to predict treatment response and that their presence is strongly associated with poor survival prospects. Enumeration and characterization of CTCs is important, as it can help clinicians make more informed decisions when choosing or evaluating treatment. CTC counts are being included in an increasing number of studies and are thus becoming a bigger part of disease diagnosis and therapy management. We present an overview of the most prominent CTC enumeration and characterization methods and discuss the assumptions made about the CTC phenotype. Extensive CTC characterization of, for example, DNA, RNA, and antigen expression may lead to a better understanding of the metastatic process.

    Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data

    The need for more usable and explainable machine learning models in healthcare increases the importance of developing and utilizing causal discovery algorithms, which aim to discover causal relations by analyzing observational data. Explainable approaches aid clinicians and biologists in predicting the prognosis of diseases and suggesting proper treatments. However, very little research has been conducted at the crossroads between causal discovery, genomics, and breast cancer, and we aim to bridge this gap. Moreover, evaluation of causal discovery methods on real data is in general notoriously difficult because ground-truth causal relations are usually unknown; accordingly, in this paper, we also propose to address the evaluation problem with large language models. In particular, we exploit suitable causal discovery algorithms to investigate how various perturbations in the genome can affect the survival of patients diagnosed with breast cancer. We used three main causal discovery algorithms: PC, Greedy Equivalence Search (GES), and a Generalized Precision Matrix-based one. We experiment with a subset of The Cancer Genome Atlas, which contains information about mutations, copy number variations, protein levels, and gene expressions for 705 breast cancer patients. Our findings reveal important factors related to the vital status of patients using causal discovery algorithms. However, the reliability of these results remains a concern in the medical domain. Accordingly, as another contribution of the work, the results are validated through language models trained on biomedical literature, such as BlueBERT and other large language models trained on medical corpora. Our results support the proper utilization of causal discovery algorithms and language models for revealing reliable causal relations for clinical applications.