13 research outputs found

    Deep learning-based cross-classifications reveal conserved spatial behaviors within tumor histological images.

    Histopathological images are a rich but incompletely explored data type for studying cancer. Manual inspection is time consuming, making it challenging to use for image data mining. Here we show that convolutional neural networks (CNNs) can be systematically applied across cancer types, enabling comparisons to reveal shared spatial behaviors. We develop CNN architectures to analyze 27,815 hematoxylin and eosin scanned images from The Cancer Genome Atlas for tumor/normal, cancer subtype, and mutation classification. Our CNNs are able to classify TCGA pathologist-annotated tumor/normal status of whole slide images (WSIs) in 19 cancer types with consistently high AUCs (0.995 ± 0.008), as well as subtypes with lower but significant accuracy (AUC 0.87 ± 0.1). Remarkably, tumor/normal CNNs trained on one tissue are effective in others (AUC 0.88 ± 0.11), with classifier relationships also recapitulating known adenocarcinoma, carcinoma, and developmental biology. Moreover, classifier comparisons reveal intra-slide spatial similarities, with an average tile-level correlation of 0.45 ± 0.16 between classifier pairs. Breast cancers, bladder cancers, and uterine cancers have spatial patterns that are particularly easy to detect, suggesting these cancers can be canonical types for image analysis. Patterns for TP53 mutations can also be detected, with WSI self- and cross-tissue AUCs ranging from 0.65-0.80. Finally, we comparatively evaluate CNNs on 170 breast and colon cancer images with pathologist-annotated nuclei, finding that both cellular and intercellular regions contribute to CNN accuracy. These results demonstrate the power of CNNs not only for histopathological classification, but also for cross-comparisons to reveal conserved spatial behaviors across tumors.
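    The tile-based WSI pipeline described above scores individual image tiles and aggregates them to a slide-level tumor/normal prediction evaluated by AUC. A minimal illustrative sketch of that aggregation step follows; the slide IDs, per-tile probabilities, and mean-pooling aggregation are invented for illustration, not taken from the paper's code.

    ```python
    # Hypothetical sketch: aggregate per-tile CNN tumor probabilities to a
    # slide-level score, then evaluate slide-level AUC. All data is made up.

    def slide_score(tile_probs):
        """Aggregate per-tile tumor probabilities into one slide score (mean pooling)."""
        return sum(tile_probs) / len(tile_probs)

    def auc(scores, labels):
        """Plain AUC via the rank (Mann-Whitney) formulation, no libraries needed."""
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
        return wins / (len(pos) * len(neg))

    slides = {  # slide id -> per-tile tumor probabilities (illustrative)
        "s1": [0.9, 0.8, 0.95], "s2": [0.2, 0.1, 0.3],
        "s3": [0.7, 0.6, 0.9],  "s4": [0.4, 0.2, 0.1],
    }
    labels = {"s1": 1, "s2": 0, "s3": 1, "s4": 0}
    scores = {sid: slide_score(p) for sid, p in slides.items()}
    print(auc([scores[s] for s in slides], [labels[s] for s in slides]))  # → 1.0
    ```

    Mean pooling is only one possible aggregation; thresholded tile counts or max pooling are common alternatives in WSI work.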

    Importance of selecting research stimuli: a comparative study of the properties, structure and validity of both standard and novel emotion elicitation techniques

    The principal aim of this doctoral research has been to investigate whether various popular methods of emotion elicitation perform differently in terms of self-reported participant affect - and if so, whether any of them is better able to mimic real-life emotional situations. A secondary goal has been to understand how continuous affect can be classified into discrete categories - whether by using clustering algorithms, or by resorting to human participants for creating the classifications. A variety of research directions subserved these main goals: firstly, developing data-driven strategies for selecting 'appropriate' stimuli, and matching them across various stimulus modalities (i.e., words, sounds, images, films and virtual environments / VEs); secondly, comparing the chosen modalities on various self-report measures (with VEs assessed both with and without a head-mounted display / HMD); thirdly, comparing how humans classify emotional information vs. a clustering algorithm; and finally, comparing all five lab-based stimulus modalities to emotional data collected via an experience sampling phone app. Findings / outputs discussed will include a matched database of stimuli geared towards lab use, how the choice of stimulus modality may affect research results, the links (or discrepancies) between human and machine classification of emotional information, as well as range restriction affecting lab stimuli relative to 'real-life' emotional phenomena.
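    The machine side of the human-vs-clustering comparison above can be sketched in miniature: grouping continuous (valence, arousal) self-reports into discrete categories with k-means. The ratings, cluster count, and plain k-means choice are all assumptions for illustration; the thesis's actual algorithm and data may differ.

    ```python
    # Illustrative sketch (not the thesis code): cluster invented continuous
    # (valence, arousal) self-reports into discrete emotion categories.
    import random

    def kmeans(points, k, iters=20, seed=0):
        """Minimal k-means over 2-D points."""
        rng = random.Random(seed)
        centers = rng.sample(points, k)
        for _ in range(iters):
            groups = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k),
                              key=lambda c: (p[0] - centers[c][0]) ** 2
                                            + (p[1] - centers[c][1]) ** 2)
                groups[nearest].append(p)
            centers = [(sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
                       if g else centers[i] for i, g in enumerate(groups)]
        return centers, groups

    # Invented ratings: a pleasant high-valence cluster and an unpleasant one.
    ratings = [(0.8, 0.7), (0.9, 0.6), (0.7, 0.8),
               (0.1, 0.2), (0.2, 0.1), (0.15, 0.3)]
    centers, groups = kmeans(ratings, 2)
    ```

    The human comparison in the thesis would replace the `kmeans` call with participant-produced category labels over the same stimuli.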

    Dimension-reduction and discrimination of neuronal multi-channel signals

    Dimension reduction and separation of neuronal multi-channel signals

    Applying Kernel Change Point Detection To Financial Markets

    The widespread use of computers in everyday living has created a newfound reliance on data systems to support the decisions people make. From wristwatches that monitor your health to fridges that notify users of potential problems, data is constantly being streamed to help users make more informed choices. Because the data has immediate importance to users, techniques that analyse live data quickly and efficiently are necessary. One such group of methods is online change point detection. Online change point detection is concerned with identifying statistical change points in a datastream as they occur, as quickly as possible. The focus for this thesis is on online kernel change point detection methods. Combining kernel two-sample testing and classic change point algorithms, kernel change point methods provide a robust, non-parametric way to measure changes in probability distributions on a variety of datasets and applications. We compare several kernel change point algorithms on a range of synthetic datasets across measurements that assess online performance. We also provide a novel way to select the kernel bandwidth hyperparameter that adapts to the data in an online fashion. Additionally, we take a look at the intraday market liquidity changes of several financial markets. We focus on futures instruments of different asset classes from the Chicago Mercantile Exchange. Data is sampled for the first four months of 2020, during which the world fell into an economic recession due to a global pandemic. An online kernel change point detection algorithm is applied to detect changes in the market liquidity distribution that are indicative of important macroeconomic events.
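    The core idea above - a kernel two-sample test run online over a stream - can be sketched with a Gaussian-kernel MMD statistic compared between two adjacent sliding windows, with the bandwidth chosen by the standard median heuristic on recent data. The window size, threshold, and synthetic stream are illustrative assumptions, not the thesis's algorithm or its adaptive bandwidth scheme.

    ```python
    # Hedged sketch: flag change points where the squared MMD between two
    # adjacent sliding windows exceeds a threshold. All parameters invented.
    import math
    import statistics

    def gauss(x, y, bw):
        return math.exp(-((x - y) ** 2) / (2 * bw ** 2))

    def mmd2(a, b, bw):
        """Biased squared MMD between samples a and b under a Gaussian kernel."""
        k_aa = sum(gauss(x, y, bw) for x in a for y in a) / len(a) ** 2
        k_bb = sum(gauss(x, y, bw) for x in b for y in b) / len(b) ** 2
        k_ab = sum(gauss(x, y, bw) for x in a for y in b) / (len(a) * len(b))
        return k_aa + k_bb - 2 * k_ab

    def median_bandwidth(data):
        """Median heuristic over pairwise distances; fall back to 1.0 if degenerate."""
        dists = [abs(x - y) for i, x in enumerate(data) for y in data[i + 1:]]
        return statistics.median(dists) or 1.0

    def detect(stream, w=10, thresh=0.5):
        """Slide two adjacent windows; report indices where MMD^2 exceeds thresh."""
        alarms = []
        for t in range(2 * w, len(stream) + 1):
            ref, cur = stream[t - 2 * w:t - w], stream[t - w:t]
            bw = median_bandwidth(ref + cur)  # recomputed online from recent data
            if mmd2(ref, cur, bw) > thresh:
                alarms.append(t - 1)
        return alarms

    stream = [0.0] * 30 + [5.0] * 30  # synthetic mean shift at index 30
    alarms = detect(stream)
    ```

    The alarm fires a few samples after the shift, once enough post-change points have entered the current window - the detection-delay trade-off the thesis's online performance measurements assess.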

    A quality metric to improve wrapper feature selection in multiclass subject invariant brain computer interfaces

    Title from PDF of title page, viewed on June 5, 2012. Dissertation advisor: Reza Derakhshani. Includes vita and bibliographical references (p. 116-129). Thesis (Ph.D.)--School of Computing and Engineering, University of Missouri--Kansas City, 2012.
    Brain computer interface systems based on electroencephalograph (EEG) signals have limitations which challenge their application as a practical device for general use. The signal features generated by the brain states we wish to detect possess a high degree of inter-subject and intra-subject variation. Additionally, these features usually exhibit low variation across each of the target states. Collection of EEG signals using low resolution, non-invasive scalp electrodes further degrades the spatial resolution of these signals. The majority of brain computer interface systems to date require extensive training prior to use by each individual user. The discovery of subject invariant features could reduce or even eliminate individual training requirements. Obtaining suitable subject invariant features requires searching a high dimension feature space consisting of combinations of spatial, spectral and temporal features. Poorly separable features can prevent the search from converging to a usable solution as a result of degenerate classifiers. In such instances the system must detect and compensate for degenerate classifier behavior. This dissertation presents a method to accomplish this search using a wrapper architecture comprised of a sequential forward floating search algorithm coupled with a support vector machine classifier. This is achieved by the introduction of a scalar Quality (Q)-factor metric, calculated from the ratio of sensitivity to specificity of the confusion matrix. This method is successfully applied to a multiclass subject independent BCI using 10 untrained subjects performing 4 motor tasks.
    Contents: Introduction to brain computer interface systems -- Historical perspective and state of the art -- Experimental design -- Degeneracy in support vector machines -- Discussion of research -- Results -- Conclusion -- Appendix A. Information transfer rate -- Appendix B. Additional surface plots for individual tasks and subjects
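    The Q-factor described above is derived from the ratio of sensitivity to specificity of the confusion matrix. A binary sketch of that ratio follows; the dissertation's exact multiclass formulation is its own contribution, so the function below is only an illustration of the underlying idea, with invented counts.

    ```python
    # Hedged sketch of a sensitivity/specificity ratio from a binary
    # confusion matrix; the multiclass Q-factor in the dissertation may differ.

    def q_factor(tp, fn, tn, fp):
        sensitivity = tp / (tp + fn)      # true positive rate
        specificity = tn / (tn + fp)      # true negative rate
        return sensitivity / specificity  # near 1.0 for balanced behavior

    # A degenerate classifier that predicts one class for everything drives
    # specificity toward zero, making the ratio blow up and flag the problem.
    print(q_factor(tp=45, fn=5, tn=40, fp=10))  # ≈ 1.125 (balanced classifier)
    ```

    Monitoring such a scalar inside the wrapper lets the sequential forward floating search reject feature subsets that produce degenerate SVM behavior.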