
    DFDL: Discriminative Feature-oriented Dictionary Learning for Histopathological Image Classification

    In histopathological image analysis, feature extraction for classification is a challenging task due to the diversity of histology features suitable for each problem as well as the presence of rich geometrical structure. In this paper, we propose an automatic feature discovery framework for extracting discriminative class-specific features and present a low-complexity method for classification and disease grading in histopathology. Essentially, our Discriminative Feature-oriented Dictionary Learning (DFDL) method learns class-specific features that represent samples from the same class well but represent samples from other classes poorly. Experiments on three challenging real-world image databases: 1) histopathological images of intraductal breast lesions, 2) mammalian lung images provided by the Animal Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor images from The Cancer Genome Atlas (TCGA) database, show the significance of the DFDL model in a variety of problems over state-of-the-art methods. Comment: Accepted to IEEE International Symposium on Biomedical Imaging (ISBI), 201
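
    The idea of class-specific features that represent their own class well and other classes poorly suggests classifying a sample by which class dictionary reconstructs it with the smallest error. The sketch below illustrates only that decision rule. It is not the DFDL learning algorithm: the dictionaries are hand-written toy values, the 1-sparse (single-atom) approximation is a simplifying assumption, and the names `normalize`, `residual`, and `classify` are hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def residual(sample, dictionary):
    """Squared error of the best single-atom approximation of `sample`.

    For a unit-norm atom d, min over c of ||x - c*d||^2 equals
    ||x||^2 - (d . x)^2, so the best atom maximizes (d . x)^2.
    """
    energy = sum(x * x for x in sample)
    best = max(
        sum(a * x for a, x in zip(atom, sample)) ** 2 for atom in dictionary
    )
    return energy - best


def classify(sample, dictionaries):
    """Return the label whose dictionary gives the smallest residual."""
    return min(dictionaries, key=lambda label: residual(sample, dictionaries[label]))


# Toy, hand-written dictionaries (assumption: real ones would be learned).
dictionaries = {
    "class_a": [normalize([1.0, 0.0, 0.0]), normalize([1.0, 1.0, 0.0])],
    "class_b": [normalize([0.0, 0.0, 1.0]), normalize([0.0, 1.0, 1.0])],
}

label = classify([2.0, 0.1, 0.0], dictionaries)
```

    A discriminative learning step would aim to make the residual small for same-class samples and large for other classes, which is what makes this rule effective.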

    Relatedness Measures to Aid the Transfer of Building Blocks among Multiple Tasks

    Multitask Learning is a learning paradigm that deals with multiple different tasks in parallel and transfers knowledge among them. XOF, a Learning Classifier System using tree-based programs to encode building blocks (meta-features), constructs and collects features with rich discriminative information for classification tasks in an observed list. This paper seeks to facilitate the automation of feature transfer between tasks by utilising the observed list. We hypothesise that the best discriminative features of a classification task carry its characteristics. Therefore, the relatedness between any two tasks can be estimated by comparing their most appropriate patterns. We propose a multiple-XOF system, called mXOF, that can dynamically adapt feature transfer among XOFs. This system utilises the observed list to estimate the task relatedness, which enables feature transfer to be automated. In terms of knowledge discovery, the resemblance estimation provides insightful relations among multiple data. We evaluated mXOF on various scenarios, e.g. representative Hierarchical Boolean problems, classification of distinct classes in the UCI Zoo dataset, and unrelated tasks, to validate its abilities of automatic knowledge transfer and task-relatedness estimation. Results show that mXOF can estimate the relatedness between multiple tasks reasonably well and use dynamic feature transfer to aid learning performance. Comment: accepted by The Genetic and Evolutionary Computation Conference (GECCO 2020)
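
    The abstract does not state how mXOF compares observed lists, so the following is only a hedged illustration of estimating relatedness from two tasks' best features. It assumes each feature can be written as a canonical string and uses Jaccard similarity as a stand-in measure; the function name `relatedness` and the example features are hypothetical.

```python
def relatedness(features_a, features_b):
    """Jaccard similarity between two collections of feature descriptions.

    Returns a value in [0, 1]; 0.0 is returned when both collections are empty.
    """
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical observed lists for two tasks, written as canonical strings.
task_one = {"x0 AND x1", "x2 OR x3"}
task_two = {"x0 AND x1", "x4"}

score = relatedness(task_one, task_two)  # one shared feature out of three
```

    A system could transfer features only between tasks whose score exceeds some threshold, so that unrelated tasks do not exchange building blocks.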

    Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks

    We participated in three of the protein-protein interaction subtasks of the Second BioCreative Challenge: classification of abstracts relevant for protein-protein interaction (IAS), discovery of protein pairs (IPS) and text passages characterizing protein interaction (ISS) in full text documents. We approached the abstract classification task with a novel, lightweight linear model inspired by spam-detection techniques, as well as an uncertainty-based integration scheme. We also used a Support Vector Machine and the Singular Value Decomposition on the same features for comparison purposes. Our approach to the full text subtasks (protein pair and passage identification) includes a feature expansion method based on word-proximity networks. Our approach to the abstract classification task (IAS) was among the top submissions for this task in terms of the measures of performance used in the challenge evaluation (accuracy, F-score and AUC). We also report on a web-tool we produced using our approach: the Protein Interaction Abstract Relevance Evaluator (PIARE). Our approach to the full text tasks resulted in one of the highest recall rates as well as mean reciprocal rank of correct passages. Our approach to abstract classification shows that a simple linear model, using relatively few features, is capable of generalizing and uncovering the conceptual nature of protein-protein interaction from the bibliome. Since the novel approach is based on a very lightweight linear model, it can be easily ported and applied to similar problems. In full text problems, the expansion of word features with word-proximity networks is shown to be useful, though the need for some improvements is discussed.

    Automated Knowledge Discovery from Functional Magnetic Resonance Images using Spatial Coherence

    Functional Magnetic Resonance Imaging (fMRI) has the potential to unlock many of the mysteries of the brain. Although this imaging modality is popular for brain-mapping activities, clinical applications of this technique are relatively rare. For clinical applications, classification models are more useful than the current practice of reporting loci of neural activation associated with particular disorders. Also, since the methods used to account for anatomical variations between subjects are generally imprecise, the conventional voxel-by-voxel analysis limits the types of discoveries that are possible. This work presents a classification-based framework for knowledge discovery from fMRI data. Instead of voxel-centric knowledge discovery, this framework is segment-centric, where functional segments are clumps of voxels that represent a functional unit in the brain. With simulated activation images, it is shown that this segment-based approach can be more successful for knowledge discovery than conventional voxel-based approaches. The spatial coherence principle refers to the homogeneity of behavior of spatially contiguous voxels. Auto-threshold Contrast Enhancing Iterative Clustering (ACEIC), a new algorithm based on the spatial coherence principle, is presented here for functional segmentation. With benchmark data, it is shown that the ACEIC method can achieve higher segmentation accuracy than Probabilistic Independent Component Analysis, a popular method used for fMRI data analysis. The spatial coherence principle can also be exploited for voxel-centric image-classification problems. Spatially Coherent Voxels (SCV) is a new feature selection method that uses the spatial coherence principle to eliminate features that are unlikely to be useful for classification. For a Substance Use Disorder dataset, it is demonstrated that feature selection with SCV can achieve higher classification accuracies than conventional feature selection methods.
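
    The abstract gives no algorithmic details for SCV, so the sketch below only illustrates the general notion of using spatial coherence to discard features. It assumes a 2D grid of per-voxel relevance scores and keeps a voxel only if it and at least `min_neighbors` of its 4-connected neighbours reach a threshold. The function name, the scoring grid, and the neighbourhood rule are all assumptions, not the published method.

```python
def coherent_voxels(scores, threshold, min_neighbors=1):
    """Return the set of (row, col) voxels that pass the threshold and have
    at least `min_neighbors` 4-connected neighbours that also pass it."""
    rows, cols = len(scores), len(scores[0])
    active = [[scores[r][c] >= threshold for c in range(cols)] for r in range(rows)]
    kept = set()
    for r in range(rows):
        for c in range(cols):
            if not active[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            support = sum(
                1
                for nr, nc in neighbours
                if 0 <= nr < rows and 0 <= nc < cols and active[nr][nc]
            )
            if support >= min_neighbors:
                kept.add((r, c))
    return kept


# Toy score grid: two adjacent strong voxels and one isolated strong voxel.
scores = [
    [0.9, 0.8, 0.1],
    [0.1, 0.2, 0.1],
    [0.1, 0.1, 0.7],
]

selected = coherent_voxels(scores, threshold=0.5)
```

    In this toy grid the isolated voxel at the bottom right is dropped even though its score passes the threshold, because no contiguous voxel behaves the same way.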

    ECoFFeS: A Software Using Evolutionary Computation for Feature Selection in Drug Discovery

    Feature selection is of particular importance in the field of drug discovery. Many methods have been put forward for feature selection during recent decades. Among them, evolutionary computation has gained increasing attention owing to its superior global search ability. However, drug developers still lack simple and efficient software that lets them take advantage of evolutionary computation for feature selection. To remedy this issue, in this paper, a user-friendly, standalone software package named ECoFFeS is developed. ECoFFeS is expected to lower the entry barrier for drug developers who want to tackle feature selection problems with evolutionary algorithms. To the best of our knowledge, it is the first software integrating a set of evolutionary algorithms (including two modified evolutionary algorithms proposed by the authors) with various evaluation combinations for feature selection. Specifically, ECoFFeS considers both single-objective and multi-objective evolutionary algorithms, and both regression- and classification-based models to meet different requirements. Five data sets in drug discovery are collected in ECoFFeS. In addition, to reduce the total analysis time, parallel execution is incorporated into ECoFFeS. The source code of ECoFFeS is available from https://github.com/JiaweiHuang/ECoFFeS/
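
    As a rough illustration of evolutionary feature selection in general, the sketch below runs a small single-objective genetic algorithm over boolean feature masks. It is not ECoFFeS code and does not reproduce any of its algorithms: the function names, the parameter defaults, and the toy fitness (hypothetical per-feature relevance minus a size penalty) are all assumptions. In practice the fitness would be the validated score of a regression or classification model trained on the selected features.

```python
import random


def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        mutation_rate=0.1, seed=0):
    """Search for a high-fitness boolean feature mask (needs n_features >= 2)."""
    rng = random.Random(seed)
    population = [
        [rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)
    ]

    def tournament(pop):
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Hypothetical relevance of six features; odd-indexed ones carry no signal.
relevance = [0.9, 0.0, 0.8, 0.0, 0.7, 0.0]


def toy_fitness(mask):
    """Total relevance of the selected features minus 0.1 per selected feature."""
    return sum(r for r, keep in zip(relevance, mask) if keep) - 0.1 * sum(mask)


best_mask = evolve_feature_mask(toy_fitness, n_features=len(relevance))
```

    Elitism keeps the best fitness from decreasing between generations, and the fixed seed makes a run reproducible. A multi-objective variant would instead keep a set of non-dominated masks that trade model quality against the number of features.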

    Deep Cytometry: Deep learning with Real-time Inference in Cell Sorting and Flow Cytometry

    Deep learning has achieved spectacular performance in image and speech recognition and synthesis. It outperforms other machine learning algorithms in problems where large amounts of data are available. In the area of measurement technology, instruments based on the photonic time stretch have established record real-time measurement throughput in spectroscopy, optical coherence tomography, and imaging flow cytometry. These extreme-throughput instruments generate approximately 1 Tbit/s of continuous measurement data and have led to the discovery of rare phenomena in nonlinear and complex systems as well as new types of biomedical instruments. Owing to the abundance of data they generate, time-stretch instruments are a natural fit to deep learning classification. We previously showed that high-throughput label-free cell classification with high accuracy can be achieved through a combination of time-stretch microscopy, image processing and feature extraction, followed by deep learning for finding cancer cells in the blood. Such a technology holds promise for early detection of primary cancer or metastasis. Here we describe a new deep learning pipeline, which entirely avoids the slow and computationally costly signal processing and feature extraction steps by using a convolutional neural network that operates directly on the measured signals. The improvement in computational efficiency enables low-latency inference and makes this pipeline suitable for cell sorting via deep learning. Our neural network takes less than a few milliseconds to classify the cells, fast enough to provide a decision to a cell sorter for real-time separation of individual target cells. We demonstrate the applicability of our new method in the classification of OT-II white blood cells and SW-480 epithelial cancer cells with more than 95% accuracy in a label-free fashion.