Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks
Interventional cancer clinical trials are generally too restrictive, and some
patients are often excluded on the basis of comorbidity, past or concomitant
treatments, or the fact that they are over a certain age. The efficacy and
safety of new treatments for patients with these characteristics are,
therefore, not defined. In this work, we built a model to automatically predict
whether short clinical statements were considered inclusion or exclusion
criteria. We used protocols from cancer clinical trials that were available in
public registries from the last 18 years to train word-embeddings, and we
constructed a dataset of 6M short free-texts labeled as eligible or not
eligible. A text classifier was trained using deep neural networks, with
pre-trained word-embeddings as inputs, to predict whether or not short
free-text statements describing clinical information were considered eligible.
We additionally analyzed the semantic reasoning of the word-embedding
representations obtained and were able to identify equivalent treatments for a
type of tumor by analogy with the drugs used to treat other tumors. We show that
representation learning using deep neural networks can be successfully
leveraged to extract medical knowledge from clinical trial protocols,
potentially assisting practitioners when prescribing treatments.
Bigram feature extraction and conditional random fields model to improve text classification clinical trial document
In the field of health and medicine, clinical trials are a very important concept. Clinical trials are studies that investigate the safest way to treat patients. Clinical trial documents are usually written as unstructured free text, which requires translation by a computer. The aim of this paper is to classify the texts of cancer clinical trial documents, which consist of unstructured free text taken from cancer clinical trial protocols. The proposed method combines conditional random fields with bigram features. The new classification model for cancer clinical trial document text is compared with other methods in terms of precision, recall, and F1 score. The results improve on previous results, reaching 88.07 precision, 88.05 recall, and an F1 score of 88.06.
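As a rough illustration of what bigram features look like for a short eligibility statement, the sketch below counts adjacent word pairs in a tokenized sentence. The tokenizer, the function name `bigram_features`, and the sample sentence are assumptions made for illustration; the paper's actual feature set and its conditional random fields model are not reproduced here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bigram_features(text):
    """Count each pair of adjacent tokens in the text."""
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical eligibility statement, not taken from the paper's dataset.
features = bigram_features("No prior chemotherapy for metastatic disease")
print(features.most_common(3))
```

Counts like these could then be passed as per-token or per-sentence features to a sequence model or classifier of the reader's choice.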
Evaluating BERT Embeddings for Text Classification in Bio-Medical Domain to Determine Eligibility of Patients in Clinical Trials
Clinical trials are studies conducted by researchers to assess the impact of a new medicine in terms of its efficacy and, most importantly, its safety for human health. For any advancement in the field of medicine, it is very important that clinical trials are conducted ethically and supported by scientific evidence. Not all people who volunteer for clinical trials are allowed to take part. Age, comorbidity, and other health issues present in a patient can be major factors in deciding whether a profile is suitable for the trial. Both the profiles selected for clinical trials and those excluded should be documented. This research, which spanned a long time period, covered trials on 15,000 cancer drugs. Keeping track of so many trials and their outcomes, and formulating a standard health guideline, is easier said than done. This paper discusses text classification, one of the primary tasks in Natural Language Processing (NLP). It is one of the most common problems in NLP, but it becomes complex in a specific domain such as biomedicine, which contains a good deal of medical jargon. This paper proposes a framework with two major components: a transformer architecture that produces embeddings, coupled with a text classifier. A later section shows that pre-trained embeddings generated by BERT (Bidirectional Encoder Representations from Transformers) can perform as efficiently as, and achieve a better F1-score and accuracy than, the current benchmark, which uses embeddings trained on the same dataset. The main contribution of this paper is the framework, which can be extended to different biomedical problems. The design can also be reused for different domains by fine-tuning.
The framework also supports optimization techniques such as Mixed Precision, Dynamic Padding, and Uniform Length Batching, which improve performance by up to 3 times on GPU (Graphics Processing Unit) processors and by 60% on TPU (Tensor Processing Unit).
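To make the idea behind dynamic padding and uniform length batching concrete, here is a minimal plain-Python sketch, not the paper's implementation. It groups sequences of similar length into the same batch and pads each batch only to its own longest sequence, then compares the number of pad tokens against padding everything to the global maximum. The function names, the `pad_id` value, and the sample sequence lengths are all hypothetical.

```python
def uniform_length_batches(sequences, batch_size):
    """Sort sequences by length, then slice them into batches of similar length."""
    ordered = sorted(sequences, key=len)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def pad_batch(batch, pad_id=0):
    """Dynamic padding: pad only up to the longest sequence in this batch."""
    width = max(len(seq) for seq in batch)
    return [seq + [pad_id] * (width - len(seq)) for seq in batch]


def count_padding(batches):
    """Total number of pad tokens needed if each batch is padded to its own maximum."""
    return sum(max(map(len, b)) * len(b) - sum(map(len, b)) for b in batches)


# Hypothetical token-id sequences of varying length.
sequences = [[1] * n for n in (2, 9, 3, 10, 2, 8)]

# Baseline: one batch, so every sequence is padded to the global maximum length.
fixed = count_padding([sequences])

# Uniform length batching combined with per-batch (dynamic) padding.
batches = uniform_length_batches(sequences, batch_size=2)
dynamic = count_padding(batches)
padded = [pad_batch(b) for b in batches]

print(fixed, dynamic)
```

Fewer pad tokens means less wasted computation per batch, which is the intuition behind the reported speedups; the actual gains depend on the model, hardware, and length distribution of the data.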
LLM for Patient-Trial Matching: Privacy-Aware Data Augmentation Towards Better Performance and Generalizability
The process of matching patients with suitable clinical trials is essential
for advancing medical research and providing optimal care. However, current
approaches face challenges such as data standardization, ethical
considerations, and a lack of interoperability between Electronic Health
Records (EHRs) and clinical trial criteria. In this paper, we explore the
potential of large language models (LLMs) to address these challenges by
leveraging their advanced natural language generation capabilities to improve
compatibility between EHRs and clinical trial descriptions. We propose an
innovative privacy-aware data augmentation approach for LLM-based patient-trial
matching (LLM-PTM), which balances the benefits of LLMs while ensuring the
security and confidentiality of sensitive patient data. Our experiments
demonstrate a 7.32% average improvement in performance using the proposed
LLM-PTM method, and the generalizability to new data is improved by 12.12%.
Additionally, we present case studies to further illustrate the effectiveness
of our approach and provide a deeper understanding of its underlying
principles.