    Utilizing Machine Learning Techniques to Rapidly Identify MUC2 Expression in Colon Cancer Tissues

    Colorectal cancer is the third-most common form of cancer among American men and women. Like most tumors, colon cancer is sustained by a subpopulation of “stem cells” that can self-renew and differentiate into more specialized cell types. Detecting stem cells in images of colon cancer tissue would be useful, but the first step is to know which genes the stem cells express and how to detect their expression patterns from tissue images. Machine learning (ML) is widely used in biological research to facilitate rapid cancer diagnosis. The current study demonstrates the feasibility and effectiveness of using ML techniques to rapidly detect expression of the gene MUC2 (mucin 2) in colon cancer tissue images. We analyzed histological images of colon cancer and segmented the nuclei to look for features (area, perimeter, eccentricity, compactness, etc.) that correlate with high or low levels of MUC2. Grid search was then run on this data set to tune the hyperparameters, and the following models were tested as potential classifiers: random forest, gradient boosting, decision trees with AdaBoost, and support vector machines. Of the tested models, the random forest classifier (F1 score of 0.71) and the gradient boosting classifier (F1 score of 0.72) predicted the output label most accurately. Under certain conditions, we identified four features that have predictive capabilities. Predicting individual gene expression with machine learning is the first step in detecting genes that are specific to cancer stem cells in the early stages of cancer, while there is still hope for a cure.
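
As a rough illustration of the tuning step described in this abstract, the sketch below runs a grid search that scores each parameter combination by F1. It is not the authors' pipeline: the threshold rule in `predict`, the parameter names `area_min` and `compactness_max`, and the sample values are all hypothetical stand-ins for the real classifiers (random forest, gradient boosting, and so on), and a real study would score each combination with cross-validation rather than on a single data set.

```python
from itertools import product


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if that denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def predict(rows, area_min, compactness_max):
    """Hypothetical toy rule (an assumption, not from the study):
    label a sample MUC2-high (1) when both thresholds are satisfied."""
    return [
        1 if r["area"] >= area_min and r["compactness"] <= compactness_max else 0
        for r in rows
    ]


def grid_search(rows, labels, grid):
    """Try every combination in the grid and keep the first one with the best F1."""
    best_score, best_params = None, None
    for area_min, compactness_max in product(grid["area_min"], grid["compactness_max"]):
        score = f1_score(labels, predict(rows, area_min, compactness_max))
        if best_score is None or score > best_score:
            best_score = score
            best_params = {"area_min": area_min, "compactness_max": compactness_max}
    return best_score, best_params


# Made-up nuclear features, for illustration only.
rows = [
    {"area": 50, "compactness": 0.9},
    {"area": 80, "compactness": 0.6},
    {"area": 90, "compactness": 0.5},
    {"area": 40, "compactness": 0.8},
]
labels = [0, 1, 1, 0]
grid = {"area_min": [30, 60], "compactness_max": [0.7, 1.0]}
best_score, best_params = grid_search(rows, labels, grid)
```

F1 is a natural selection metric here because it balances precision and recall, which matters when the MUC2-high and MUC2-low classes are not evenly represented.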

    Identification of histological features to predict MUC2 expression in colon cancer tissues

    Colorectal cancer (CRC) is the third-most common form of cancer among Americans. Like normal colon tissue, CRC cells are sustained by a subpopulation of “stem cells” that can self-renew and differentiate into more specialized cancer cell types. In normal colon tissue, the enterocytes, goblet cells, and other epithelial cells in the mucosa region have distinct morphologies that distinguish them from the other cells in the lamina propria, muscularis mucosa, and submucosa. In a tumor, however, the morphology of the cancer cells varies dramatically. Cancer cells that express genes specific to goblet cells differ significantly in shape and size from their normal counterparts. Even though a large number of hematoxylin and eosin (H&E)-stained sections and the corresponding RNA sequencing (RNASeq) data from CRC are available from The Cancer Genome Atlas (TCGA), prediction of gene expression patterns from tissue histological features has not yet been attempted. In this manuscript, we identified histological features that are strongly associated with MUC2 expression patterns in a tumor. Specifically, we show that large nuclear area is associated with MUC2-high tumors (p < 0.001). This discovery provides insight into cancer biology and tumor histology and demonstrates that it may be possible to predict the expression of certain genes from histological features.
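
The abstract reports an association between nuclear area and MUC2-high tumors but does not name the statistical test used. As one possible way to check such an association, the sketch below uses a one-sided permutation test on the difference in mean nuclear area. The function name `permutation_p_value` and the area values are hypothetical, chosen only for illustration.

```python
import random


def permutation_p_value(high, low, n_permutations=10000, seed=0):
    """One-sided permutation test: is mean(high) larger than mean(low)
    more often than chance relabeling would produce?"""
    rng = random.Random(seed)
    observed = sum(high) / len(high) - sum(low) / len(low)
    pooled = list(high) + list(low)
    n_high = len(high)
    n_low = len(pooled) - n_high
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_high]) / n_high - sum(pooled[n_high:]) / n_low
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up mean nuclear areas per tumor, for illustration only.
muc2_high = [70, 72, 75, 78, 80, 82, 85, 88]
muc2_low = [40, 42, 45, 48, 50, 52, 55, 58]
p_separated = permutation_p_value(muc2_high, muc2_low)
p_identical = permutation_p_value([1, 2, 3, 4], [1, 2, 3, 4])
```

A permutation test makes no assumption about the distribution of nuclear areas, which is convenient for morphological features that are often skewed.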

    A deep learning pipeline for automated classification of vocal fold polyps in flexible laryngoscopy.

    PURPOSE: To develop and validate a deep learning model for distinguishing healthy vocal folds (HVF) and vocal fold polyps (VFP) on laryngoscopy videos, while demonstrating the ability of a previously developed informative frame classifier in facilitating deep learning development. METHODS: Following retrospective extraction of image frames from 52 HVF and 77 unilateral VFP videos, two researchers manually labeled each frame as informative or uninformative. A previously developed informative frame classifier was used to extract informative frames from the same video set. Both sets of videos were independently divided into training (60%), validation (20%), and test (20%) by patient. Machine-labeled frames were independently verified by two researchers to assess the precision of the informative frame classifier. Two models, pre-trained on ResNet18, were trained to classify frames as containing HVF or VFP. The accuracy of the polyp classifier trained on machine-labeled frames was compared to that of the classifier trained on human-labeled frames. The performance was measured by accuracy and area under the receiver operating characteristic curve (AUROC). RESULTS: When evaluated on a hold-out test set, the polyp classifier trained on machine-labeled frames achieved an accuracy of 85% and AUROC of 0.84, whereas the classifier trained on human-labeled frames achieved an accuracy of 69% and AUROC of 0.66. CONCLUSION: An accurate deep learning classifier for vocal fold polyp identification was developed and validated with the assistance of a peer-reviewed informative frame classifier for dataset assembly. The classifier trained on machine-labeled frames demonstrates improved performance compared to the classifier trained on human-labeled frames.
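
The 60/20/20 split "by patient" described in the METHODS can be sketched as follows. The idea is to assign whole patients, not individual frames, to each subset so that frames from one patient never appear in both training and test data. The record layout (`patient_id`, `frame`) and the function name `split_by_patient` are assumptions made for this example, not details from the paper.

```python
import random


def split_by_patient(frames, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split frame records into train/val/test so that each patient's
    frames land in exactly one subset."""
    patients = sorted({f["patient_id"] for f in frames})
    random.Random(seed).shuffle(patients)
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        # The test set takes whatever patients remain.
        "test": set(patients[n_train + n_val:]),
    }
    return {
        name: [f for f in frames if f["patient_id"] in ids]
        for name, ids in groups.items()
    }


# Hypothetical data: 10 patients with 3 frames each.
frames = [
    {"patient_id": f"p{p}", "frame": i}
    for p in range(10)
    for i in range(3)
]
splits = split_by_patient(frames)
```

Splitting at the patient level avoids leakage: adjacent frames from the same video are highly correlated, so a frame-level split would inflate test accuracy.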