Cancer Informatics for Cancer Centers (CI4CC): Building a Community Focused on Sharing Ideas and Best Practices to Improve Cancer Care and Patient Outcomes.
Cancer Informatics for Cancer Centers (CI4CC) is a grassroots, nonprofit 501(c)(3) organization intended to provide a focused national forum for engagement of senior cancer informatics leaders, primarily aimed at academic cancer centers anywhere in the world but with a special emphasis on the 70 National Cancer Institute-funded cancer centers. Although each of the participating cancer centers is structured differently, and leaders' titles vary, we know firsthand there are similarities in both the issues we face and the solutions we achieve. As a consortium, we have initiated a dedicated listserv, an open-initiatives program, and targeted biannual face-to-face meetings. These meetings are a place to review our priorities and initiatives, providing a forum for discussion of the strategic and pragmatic issues we, as informatics leaders, individually face at our respective institutions and cancer centers. Here we provide a brief history of the CI4CC organization and meeting highlights from the latest CI4CC meeting that took place in Napa, California from October 14-16, 2019. The focus of this meeting was "intersections between informatics, data science, and population science." We conclude with a discussion on "hot topics" on the horizon for cancer informatics.
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
PathologyBERT -- Pre-trained Vs. A New Transformer Language Model for Pathology Domain
Pathology text mining is a challenging task given the reporting variability
and constant new findings in cancer sub-type definitions. However, successful
text mining of a large pathology database can play a critical role to advance
'big data' cancer research like similarity-based treatment selection, case
identification, prognostication, surveillance, clinical trial screening, risk
stratification, and many others. While there is a growing interest in
developing language models for more specific clinical domains, no
pathology-specific language space exists to support the rapid data-mining
development in pathology space. In the literature, a few approaches fine-tuned
general transformer models on specialized corpora while maintaining the
original tokenizer, but in fields requiring specialized terminology, these
models often fail to perform adequately. We propose PathologyBERT - a
pre-trained masked language model which was trained on 347,173 histopathology
specimen reports and publicly released in the Huggingface repository. Our
comprehensive experiments demonstrate that pre-training a transformer model on
pathology corpora yields performance improvements on Natural Language
Understanding (NLU) and Breast Cancer Diagnose Classification when compared to
nonspecific language models.
Comment: submitted to "American Medical Informatics Association (AMIA)" 2022
Annual Symposiu
Metastasis and circulating tumor cells
Cancer is a prominent cause of death worldwide. In most cases, it is not the primary tumor which causes death, but the metastases. Metastatic tumors are spread over the entire human body and are more difficult to remove or treat than the primary tumor. In a patient with metastatic disease, circulating tumor cells (CTCs) can be found in venous blood. These circulating tumor cells are part of the metastatic cascade. Clinical studies have shown that these cells can be used to predict treatment response and their presence is strongly associated with poor survival prospects. Enumeration and characterization of CTCs is important as this can help clinicians make more informed decisions when choosing or evaluating treatment. CTC counts are being included in an increasing number of studies and thus are becoming a bigger part of disease diagnosis and therapy management. We present an overview of the most prominent CTC enumeration and characterization methods and discuss the assumptions made about the CTC phenotype. Extensive CTC characterization of, for example, DNA, RNA, and antigen expression may lead to a better understanding of the metastatic process.
Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data
The need for more usable and explainable machine learning models in
healthcare increases the importance of developing and utilizing causal
discovery algorithms, which aim to discover causal relations by analyzing
observational data. Explainable approaches aid clinicians and biologists in
predicting the prognosis of diseases and suggesting proper treatments. However,
very little research has been conducted at the crossroads between causal
discovery, genomics, and breast cancer, and we aim to bridge this gap.
Moreover, evaluation of causal discovery methods on real data is in general
notoriously difficult because ground-truth causal relations are usually
unknown, and accordingly, in this paper, we also propose to address the
evaluation problem with large language models. In particular, we exploit
suitable causal discovery algorithms to investigate how various perturbations
in the genome can affect the survival of patients diagnosed with breast cancer.
We used three main causal discovery algorithms: PC, Greedy Equivalence Search
(GES), and a Generalized Precision Matrix-based one. We experiment with a
subset of The Cancer Genome Atlas, which contains information about mutations,
copy number variations, protein levels, and gene expressions for 705 breast
cancer patients. Our findings reveal important factors related to the vital
status of patients using causal discovery algorithms. However, the reliability
of these results remains a concern in the medical domain. Accordingly, as
another contribution of the work, the results are validated through language
models trained on biomedical literature, such as BlueBERT and other large
language models trained on medical corpora. Our results support the proper
utilization of causal discovery algorithms and language models for revealing
reliable causal relations for clinical applications.