Utilizing Machine Learning Techniques to Rapidly Identify MUC2 Expression in Colon Cancer Tissues
Colorectal cancer is the third-most common form of cancer among American men and women. Like most tumors, colon cancer is sustained by a subpopulation of “stem cells” that possess the ability to self-renew and differentiate into more specialized cell types. It would be useful to detect stem cells in images of colon cancer tissue, but the first step in being able to do so is to know what genes are expressed in the stem cells and how to detect their expression pattern from the tissue images. Machine learning (ML) is a powerful tool that is widely used in biological research as a novel and innovative technique to facilitate rapid diagnosis of cancer. The current study demonstrates the feasibility and effectiveness of using ML techniques to rapidly detect the expression of the gene MUC2 (mucin 2) in colon cancer tissue images. We analyzed histological images of colon cancer and segmented the nuclei to look for features (area, perimeter, eccentricity, compactness, etc.) that correlate with high or low levels of MUC2. Grid search was then run on this data set to tune the hyper-parameters, and the following models were tested as potential classifiers: random forest, gradient boosting, decision trees with AdaBoost, and support vector machines. Of the tested models, the random forest classifier (F1 score of 0.71) and the gradient boosting classifier (F1 score of 0.72) predicted the output label most accurately. Under certain conditions, we have identified four features that have predictive capabilities. Predicting individual gene expression with machine learning is the first step in detecting genes that are specific to cancer stem cells in the early stages of cancer, while there is still hope for a cure.
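For readers unfamiliar with the metric quoted above, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation only; the function name, the label encoding (1 = MUC2-high, 0 = MUC2-low), and the example labels are illustrative assumptions and are not taken from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight tissue images (assumption, not study data):
# 1 = MUC2-high, 0 = MUC2-low.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Here three of four predicted positives are correct and three of four true positives are recovered, so precision and recall are both 0.75 and so is F1.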
Identification of histological features to predict MUC2 expression in colon cancer tissues
Colorectal cancer (CRC) is the third-most common form of cancer among Americans. Like normal colon tissue, CRC cells are sustained by a subpopulation of “stem cells” that possess the ability to self-renew and differentiate into more specialized cancer cell types. In normal colon tissue, the enterocytes, goblet cells and other epithelial cells in the mucosa region have distinct morphologies that distinguish them from the other cells in the lamina propria, muscularis mucosa, and submucosa. However, in a tumor, the morphology of the cancer cells varies dramatically. Cancer cells that express genes specific to goblet cells significantly differ in shape and size compared to their normal counterparts. Even though a large number of hematoxylin and eosin (H&E)-stained sections and the corresponding RNA sequencing (RNASeq) data from CRC are available from The Cancer Genome Atlas (TCGA), prediction of gene expression patterns from tissue histological features has not been attempted yet. In this manuscript, we identified histological features that are strongly associated with MUC2 expression patterns in a tumor. Specifically, we show that large nuclear area is associated with MUC2-high tumors (p < 0.001). This discovery provides insight into cancer biology and tumor histology and demonstrates that it may be possible to predict certain gene expressions from histological features.
Are There Differences between the Stress Responses of Philippine Men and Women to the COVID-19 Pandemic?
The SARS-CoV-2 pandemic has had a deleterious impact on human health since its beginning in 2019. The purpose of this study was to examine the psychosocial impact of the COVID-19 pandemic in the Philippines and determine if there were differential impacts on women compared to men. A web-based survey was conducted in the Luzon Islands of the Philippines, during the pandemic quarantine. A total of 1879 participants completed online surveys between 28 March and 12 April 2020. A bivariate analysis of both men and women for each psychological measure (stress, anxiety, depression, and impact of COVID-19) was conducted. Multivariable logistic regression models were built for each measure, dichotomized as high or low, separately for men and women. Younger age (p < 0.001), being married (p < 0.001), and being a parent (p < 0.004) were associated with women's poor mental health. Marriage and large household size were protective factors for men (p < 0.002 and p < 0.0012, respectively), but marriage may be a risk factor for women (p < 0.001). Overall, women were disproportionately negatively impacted by the pandemic compared to men.
A deep learning pipeline for automated classification of vocal fold polyps in flexible laryngoscopy.
PURPOSE: To develop and validate a deep learning model for distinguishing healthy vocal folds (HVF) and vocal fold polyps (VFP) on laryngoscopy videos, while demonstrating the ability of a previously developed informative frame classifier in facilitating deep learning development.
METHODS: Following retrospective extraction of image frames from 52 HVF and 77 unilateral VFP videos, two researchers manually labeled each frame as informative or uninformative. A previously developed informative frame classifier was used to extract informative frames from the same video set. Both sets of videos were independently divided into training (60%), validation (20%), and test (20%) by patient. Machine-labeled frames were independently verified by two researchers to assess the precision of the informative frame classifier. Two models, pre-trained on ResNet18, were trained to classify frames as containing HVF or VFP. The accuracy of the polyp classifier trained on machine-labeled frames was compared to that of the classifier trained on human-labeled frames. The performance was measured by accuracy and area under the receiver operating characteristic curve (AUROC).
RESULTS: When evaluated on a hold-out test set, the polyp classifier trained on machine-labeled frames achieved an accuracy of 85% and AUROC of 0.84, whereas the classifier trained on human-labeled frames achieved an accuracy of 69% and AUROC of 0.66.
CONCLUSION: An accurate deep learning classifier for vocal fold polyp identification was developed and validated with the assistance of a peer-reviewed informative frame classifier for dataset assembly. The classifier trained on machine-labeled frames demonstrates improved performance compared to the classifier trained on human-labeled frames.
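The METHODS above split the data 60/20/20 "by patient", which keeps all frames from one patient in a single partition so that test frames never come from a patient seen during training. The following is a minimal sketch of such a split under stated assumptions: the function name, the patient identifiers, the seed, and the rounding rule are hypothetical, and the abstract does not describe how the authors implemented the split.

```python
import random


def split_by_patient(frame_patient_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign frame indices to train/val/test so that every frame from a
    given patient lands in exactly one partition.

    frame_patient_ids: one patient identifier per frame (hypothetical input).
    Returns (frame indices per partition, patient sets per partition).
    """
    patients = sorted(set(frame_patient_ids))
    random.Random(seed).shuffle(patients)  # reproducible shuffle of patients

    n = len(patients)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),  # remainder goes to test
    }

    split = {name: [] for name in groups}
    for idx, pid in enumerate(frame_patient_ids):
        for name, members in groups.items():
            if pid in members:
                split[name].append(idx)
                break
    return split, groups


# Ten hypothetical patients with three frames each (illustrative data only).
frame_patient_ids = [f"patient_{i}" for i in range(10) for _ in range(3)]
split, groups = split_by_patient(frame_patient_ids)
print({name: len(indices) for name, indices in split.items()})
```

Splitting at the patient level rather than the frame level matters because consecutive frames from one video are highly similar, so a frame-level split would tend to inflate test accuracy.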