
    Breast Cancer Early Detection Comparison with Deep Learning and Machine Learning Models: A Case of Study

    Breast cancer is one of the most widespread cancers in the female population, and predicting its development and capturing the early signs of its onset is one of the main objectives that science is pursuing. In recent decades, Clinical Decision Support Systems (CDSS) have made extensive use of technological tools such as Machine Learning (ML) and Deep Learning (DL). In this paper, two of the main methods from these subsets of AI are compared: an ensemble-type algorithm, XGBoost (Extreme Gradient Boosting), and a deep neural network (DNN), both applied to data from a study conducted on an Indonesian population. The results are notable: although the data are tabular, binary categorical and multiclass, the DNN model achieves much higher performance than XGBoost, which is widely used in the literature for data of this type

    Translational Bioinformatics for Human Reproductive Biology Research: Examples, Opportunities and Challenges for a Future Reproductive Medicine

    Since the birth of the first IVF (in vitro fertilization) baby in Manchester (England) in 1978, more than eight million IVF babies have been born throughout the world, and many new techniques and discoveries have emerged in reproductive medicine. To summarize the modern technology and progress in reproductive medicine, all scientific papers related to reproductive medicine, especially papers related to reproductive translational medicine, were fully searched, manually curated and reviewed. The results indicated that both male and female reproductive medicine have made significant progress, and that their markers have advanced from karyotype analysis to single-cell omics. However, due to the lack of comprehensive databases, especially databases collecting risk exposures, disease markers and models, prevention drugs and effective treatment methods, the application of the latest precision medicine technologies and methods in reproductive medicine is limited. This research was funded by Project of Natural Science Foundation of Gansu Province (20JR5RA363); Project of Gansu Provincial Education Department (2020B-003)

    Machine Learning Approaches for Cancer Analysis

    In addition, we propose many machine learning models that serve as contributions to solve a biological problem. First, we present Zseq, a linear time method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score, and also takes into account other factors, such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters out those with a z-score less than the user-defined threshold. Zseq is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Studying the abundance of select mRNA species throughout prostate cancer progression may provide some insight into the molecular mechanisms that advance the disease. In the second contribution of this dissertation, we show that the combination of proper clustering, a distance function, and index validation for clusters is suitable for identifying outlier transcripts, which trend differently from the majority of the transcripts; here, the trend of a transcript is its abundance throughout the different stages of prostate cancer. We compare this model with a standard hierarchical time-series clustering method based on Euclidean distance. 
Using time-series profile hierarchical clustering methods, we identified stage-specific mRNA species, termed outlier transcripts, that exhibit unique trending patterns compared to most other transcripts during disease progression. Compared with the hierarchical clustering method based on Euclidean distance, this method identifies those outliers rather than finding patterns among the trending transcripts. A wet-lab experiment on a biomarker (CAM2G gene) confirmed the result of the computational model. Genes related to these outlier transcripts were found to be strongly associated with cancer, and in particular, prostate cancer. Further investigation of these outlier transcripts in prostate cancer may identify them as potential stage-specific biomarkers that can predict the progression of the disease. Breast cancer, on the other hand, is a widespread type of cancer in females and accounts for a large share of cancer cases and deaths worldwide. Identifying the subtype of breast cancer plays a crucial role in selecting the best treatment. In the third contribution, we propose an optimized hierarchical classification model that is used to predict the breast cancer subtype. Suitable filter feature selection methods and new hybrid feature selection methods are utilized to find discriminative genes. Our proposed model achieves 100% accuracy for predicting the breast cancer subtypes using the same or even fewer genes. Studying breast cancer survivability among different patients who received various treatments may help understand the relationship between survivability and treatment therapy based on gene expression. In the fourth contribution, we have built a classifier system that predicts whether a given breast cancer patient who underwent some form of treatment (hormone therapy, radiotherapy, or surgery) will survive beyond five years after the treatment therapy. 
Our classifier is a tree-based hierarchical approach that partitions breast cancer patients based on survivability classes; each node in the tree is associated with a treatment therapy and finds a predictive subset of genes that can best predict whether a given patient will survive after that particular treatment. We applied our tree-based method to a gene expression dataset that consists of 347 treated breast cancer patients and identified potential biomarker subsets with prediction accuracies ranging from 80.9% to 100%. We have further investigated the roles of many biomarkers through the literature. Studying gene expression through various time intervals of breast cancer survival may provide insights into the recovery of the patients. Discovery of gene indicators can be a crucial step in predicting survivability and handling of breast cancer patients. In the fifth contribution, we propose a hierarchical clustering method to separate dissimilar groups of genes in time-series data as outliers. These isolated outliers, genes that trend differently from other genes, can serve as potential biomarkers of breast cancer survivability. In the last contribution, we introduce a method that uses machine learning techniques to identify transcripts that correlate with prostate cancer development and progression. We have isolated transcripts that have the potential to serve as prognostic indicators and may have significant value in guiding treatment decisions. Our study also supports PTGFR, NREP, scaRNA22, DOCK9, FLVCR2, IK2F3, USP13, and CLASP1 as potential biomarkers to predict prostate cancer progression, especially between stage II and subsequent stages of the disease
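
The k-mer scoring and z-score filtering described for Zseq in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default `k`, the way ambiguous bases are handled (k-mers containing `N` are simply not counted), and the example reads are all assumptions made for illustration, and the GC-content adjustment mentioned in the abstract is left out.

```python
from statistics import mean, pstdev


def unique_kmer_score(seq, k=4):
    """Score a read by its number of distinct k-mers.

    Assumption: k-mers containing the ambiguous base 'N' are not counted.
    """
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)


def filter_by_zscore(seqs, k=4, threshold=-1.0):
    """Keep reads whose complexity z-score is at or above the threshold."""
    scores = [unique_kmer_score(s, k) for s in seqs]
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All reads are equally complex, so there is nothing to filter.
        return list(seqs)
    return [s for s, sc in zip(seqs, scores) if (sc - mu) / sigma >= threshold]


# Hypothetical reads: two complex, one homopolymer, one mostly ambiguous.
reads = [
    "ACGTTGCAAGCTTAGC",
    "AAAAAAAAAAAAAAAA",
    "ACGTNNNNNNNNACGT",
    "TGCATCGATCGGATCA",
]
kept = filter_by_zscore(reads, k=4, threshold=0.0)
```

With these made-up reads, the homopolymer and the read dominated by `N` fall below the mean complexity and are dropped, while the two varied reads are kept.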

    A review of artificial intelligence in prostate cancer detection on imaging

    A multitude of studies have explored the role of artificial intelligence (AI) in providing diagnostic support to radiologists, pathologists, and urologists in prostate cancer detection, risk-stratification, and management. This review provides a comprehensive overview of relevant literature regarding the use of AI models in (1) detecting prostate cancer on radiology images (magnetic resonance and ultrasound imaging), (2) detecting prostate cancer on histopathology images of prostate biopsy tissue, and (3) assisting in supporting tasks for prostate cancer detection (prostate gland segmentation, MRI-histopathology registration, MRI-ultrasound registration). We discuss both the potential of these AI models to assist in the clinical workflow of prostate cancer diagnosis, as well as the current limitations including variability in training data sets, algorithms, and evaluation criteria. We also discuss ongoing challenges and what is needed to bridge the gap between academic research on AI for prostate cancer and commercial solutions that improve routine clinical care

    Study of microRNAs-21/221 as potential breast cancer biomarkers in Egyptian women

    microRNAs (miRNAs) play an important role in cancer prognosis. They are small molecules, approximately 17–25 nucleotides in length, and their high stability in human serum supports their use as novel diagnostic biomarkers of cancer and other pathological conditions. In this study, we analyzed the expression patterns of miR-21 and miR-221 in the serum from a total of 100 Egyptian female subjects with breast cancer, fibroadenoma, and healthy control subjects. Using microarray-based expression profiling followed by real-time polymerase chain reaction validation, we compared the levels of the two circulating miRNAs in the serum of patients with breast cancer (n = 50), fibroadenoma (n = 25), and healthy controls (n = 25). The miRNA SNORD68 was chosen as the housekeeping endogenous control. We found that the serum levels of miR-21 and miR-221 were significantly overexpressed in breast cancer patients compared to normal controls and fibroadenoma patients. Receiver Operating Characteristic (ROC) curve analysis revealed that miR-21 has greater potential in discriminating between breast cancer patients and the control group, while miR-221 has greater potential in discriminating between breast cancer and fibroadenoma patients. Classification models using k-Nearest Neighbor (kNN), Naïve Bayes (NB), and Random Forests (RF) were developed using expression levels of both miR-21 and miR-221. Best classification performance was achieved by NB Classification models, reaching 91% of correct classification. Furthermore, relative miR-221 expression was associated with histological tumor grades. Therefore, it may be concluded that both miR-21 and miR-221 can be used to differentiate between breast cancer patients and healthy controls, but that the diagnostic accuracy of serum miR-21 is superior to miR-221 for breast cancer prediction. miR-221 has more diagnostic power in discriminating between breast cancer and fibroadenoma patients. 
The overexpression of miR-221 has been associated with the breast cancer grade. We also demonstrated that the combined expression of miR-21 and miR-221 can be successfully applied as breast cancer biomarkers
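
The ROC analysis mentioned in the abstract above summarizes how well a single marker separates two groups. A minimal sketch of the area under the ROC curve, computed in its rank (Mann-Whitney) form, is given below. The function name and the expression values are hypothetical and are not data from the study.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical relative expression levels of one marker (not study data).
patients = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]
auc = roc_auc(patients, controls)
```

An AUC of 0.5 means the marker does no better than chance, and 1.0 means every patient value exceeds every control value.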

    Artificial intelligence in digital pathology: a diagnostic test accuracy systematic review and meta-analysis

    Ensuring diagnostic performance of AI models before clinical use is key to the safe and successful adoption of these technologies. Studies reporting AI applied to digital pathology images for diagnostic purposes have rapidly increased in number in recent years. The aim of this work is to provide an overview of the diagnostic accuracy of AI in digital pathology images from all areas of pathology. This systematic review and meta-analysis included diagnostic accuracy studies using any type of artificial intelligence applied to whole slide images (WSIs) in any disease type. The reference standard was diagnosis through histopathological assessment and/or immunohistochemistry. Searches were conducted in PubMed, EMBASE and CENTRAL in June 2022. We identified 2976 studies, of which 100 were included in the review and 48 in the full meta-analysis. Risk of bias and concerns of applicability were assessed using the QUADAS-2 tool. Data extraction was conducted by two investigators and meta-analysis was performed using a bivariate random effects model. 100 studies were identified for inclusion, equating to over 152,000 whole slide images (WSIs) and representing many disease types. Of these, 48 studies were included in the meta-analysis. These studies reported a mean sensitivity of 96.3% (CI 94.1-97.7) and mean specificity of 93.3% (CI 90.5-95.4) for AI. There was substantial heterogeneity in study design and all 100 studies identified for inclusion had at least one area at high or unclear risk of bias. This review provides a broad overview of AI performance across applications in whole slide imaging. However, there is huge variability in study design and available performance data, with details around the conduct of the study and make-up of the datasets frequently missing. Overall, AI offers good accuracy when applied to WSIs but requires more rigorous evaluation of its performance.
    Comment: 26 pages, 5 figures, 8 tables + Supplementary material
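
For readers less familiar with the two headline metrics in the abstract above, the sketch below shows how sensitivity and specificity follow from a single study's confusion-matrix counts. The counts and the function name are hypothetical, and the sketch does not reproduce the bivariate random effects pooling used in the review.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of diseased slides flagged as diseased.
    specificity = TN / (TN + FP): share of healthy slides flagged as healthy.
    """
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for one study (not taken from the review).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=180, fp=20)
```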

    Prostate Cancer Diagnosis using Magnetic Resonance Imaging - a Machine Learning Approach


    Cancer Health Disparities Drivers with BERTopic Modelling and Pycaret Evaluation

    The complex interplay of social, behavioural, lifestyle, environmental, health system, and natural health variables contributes to disparities in cancer treatment across racial and ethnic groups. Consequently, it is necessary to identify the variables contributing to cancer health inequalities and develop strategies to achieve health equality. PubMed abstracts on cancer health disparities were scraped with the Bio.Entrez Python package. The data were preprocessed with regular expressions and the Natural Language Toolkit (NLTK), and topics were modelled with BERTopic embeddings and c-TF-IDF to construct dense clusters and analyse the top topics linked with cancer health disparities. The model was evaluated with the PyCaret coherence score and deployed as a web app with Streamlit. The results showed that Topic 32, with the terms obese, female, male, school, survey, student, post, and discrepancy, had the best coherence score of 0.3687. In contrast, Topic 8, with the terms prevalence, adult, income, high, usage, diabetes, education, elderly, change, and low, received the lowest coherence score of 0.3255. The model scores each subject word, surfaces the granular topic concerns and trends related to cancer health disparities, investigates the connection between drivers of cancer health disparities, and is evaluated by its coherence score values
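
The c-TF-IDF step named in the abstract above weights a term by how characteristic it is of one topic relative to all topics. The sketch below uses the class-based form described in the BERTopic documentation, term frequency within a topic multiplied by log(1 + A / f), where A is the average number of words per topic and f is the term's frequency across all topics. The normalization of term frequency by topic length, the function name, and the token lists are assumptions made for illustration. This is not the paper's pipeline and does not call BERTopic itself.

```python
import math
from collections import Counter


def c_tf_idf(class_tokens):
    """Class-based TF-IDF weights.

    class_tokens maps a topic label to the non-empty list of tokens from
    all documents assigned to that topic.
    """
    tf = {c: Counter(tokens) for c, tokens in class_tokens.items()}
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # A: average number of words per topic.
    avg_words = sum(len(t) for t in class_tokens.values()) / len(class_tokens)
    return {
        c: {
            w: (n / len(class_tokens[c])) * math.log(1 + avg_words / total[w])
            for w, n in counts.items()
        }
        for c, counts in tf.items()
    }


# Hypothetical token lists for two topics (not the study's data).
topics = {
    "a": ["income", "income", "education"],
    "b": ["obese", "survey", "income"],
}
weights = c_tf_idf(topics)
```

In this toy input, "income" appears in both topics, so within topic "b" it receives a lower weight than "obese" or "survey", which occur only there.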