DFDL: Discriminative Feature-oriented Dictionary Learning for Histopathological Image Classification
In histopathological image analysis, feature extraction for classification is
a challenging task due to the diversity of histology features suitable for each
problem as well as the presence of rich geometrical structure. In this paper, we
propose an automatic feature discovery framework for extracting discriminative
class-specific features and present a low-complexity method for classification
and disease grading in histopathology. Essentially, our Discriminative
Feature-oriented Dictionary Learning (DFDL) method learns class-specific
features that represent samples from the same class well while
representing samples from other classes poorly. Experiments on
three challenging real-world image databases: 1) histopathological images of
intraductal breast lesions, 2) mammalian lung images provided by the Animal
Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor
images from The Cancer Genome Atlas (TCGA) database, show the significance of
the DFDL model in a variety of problems over state-of-the-art methods.
Comment: Accepted to IEEE International Symposium on Biomedical Imaging
(ISBI), 201
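A minimal sketch of how class-specific dictionaries could be used at classification time, assuming the common residual rule (assign a sample to the class whose dictionary reconstructs it with the smallest error). This is not the DFDL learning procedure from the paper: the dictionaries below are hand-made toy atoms, the sparse coder is plain greedy matching pursuit, and the names `residual_norm` and `classify` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def residual_norm(x, atoms, n_steps=2):
    """Greedy matching pursuit: norm of the part of x that `atoms` cannot explain.

    Assumes every atom has unit norm.
    """
    r = list(x)
    for _ in range(n_steps):
        # Pick the atom most correlated with the current residual.
        best = max(atoms, key=lambda d: abs(dot(r, d)))
        c = dot(r, best)
        r = [ri - c * di for ri, di in zip(r, best)]
    return math.sqrt(dot(r, r))


def classify(x, dictionaries, n_steps=2):
    """Return the label whose dictionary leaves the smallest residual."""
    return min(
        dictionaries,
        key=lambda label: residual_norm(x, dictionaries[label], n_steps),
    )


# Toy, hand-made dictionaries (assumption: a real system would learn these).
dictionaries = {
    "class_a": [normalize([1.0, 0.0, 0.0, 0.0]), normalize([0.0, 1.0, 0.0, 0.0])],
    "class_b": [normalize([0.0, 0.0, 1.0, 0.0]), normalize([0.0, 0.0, 0.0, 1.0])],
}
sample = [0.9, 0.4, 0.1, 0.0]
predicted = classify(sample, dictionaries)
```

The sample lies mostly in the span of the `class_a` atoms, so that dictionary leaves the smaller residual and wins.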
Relatedness Measures to Aid the Transfer of Building Blocks among Multiple Tasks
Multitask Learning is a learning paradigm that deals with multiple different
tasks in parallel and transfers knowledge among them. XOF, a Learning
Classifier System using tree-based programs to encode building blocks
(meta-features), constructs and collects features with rich discriminative
information for classification tasks in an observed list. This paper seeks to
facilitate the automation of feature transfer between tasks by utilising
the observed list. We hypothesise that the best discriminative features of a
classification task carry its characteristics. Therefore, the relatedness
between any two tasks can be estimated by comparing their most appropriate
patterns. We propose a multiple-XOF system, called mXOF, that can dynamically
adapt feature transfer among XOFs. This system utilises the observed list to
estimate the task relatedness. This method enables the automation of
transferring features. In terms of knowledge discovery, the resemblance
estimation provides insightful relations among multiple data. We evaluated
mXOF on various scenarios, e.g. representative Hierarchical Boolean problems,
classification of distinct classes in the UCI Zoo dataset, and unrelated tasks,
to validate its abilities in automatic knowledge transfer and task-relatedness
estimation. Results show that mXOF can estimate the relatedness between
multiple tasks reasonably well and use dynamic feature transfer to aid
learning performance.
Comment: accepted by The Genetic and Evolutionary Computation Conference
(GECCO 2020
Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks
We participated in three of the protein-protein interaction subtasks of the
Second BioCreative Challenge: classification of abstracts relevant for
protein-protein interaction (IAS), discovery of protein pairs (IPS) and text
passages characterizing protein interaction (ISS) in full text documents. We
approached the abstract classification task with a novel, lightweight linear
model inspired by spam-detection techniques, as well as an uncertainty-based
integration scheme. We also used a Support Vector Machine and the Singular
Value Decomposition on the same features for comparison purposes. Our approach
to the full text subtasks (protein pair and passage identification) includes a
feature expansion method based on word-proximity networks. Our approach to the
abstract classification task (IAS) was among the top submissions for this task
in terms of the measures of performance used in the challenge evaluation
(accuracy, F-score and AUC). We also report on a web-tool we produced using our
approach: the Protein Interaction Abstract Relevance Evaluator (PIARE). Our
approach to the full text tasks resulted in one of the highest recall rates as
well as mean reciprocal rank of correct passages. Our approach to abstract
classification shows that a simple linear model, using relatively few features,
is capable of generalizing and uncovering the conceptual nature of
protein-protein interaction from the bibliome. Since the novel approach is
based on a very lightweight linear model, it can be easily ported and applied
to similar problems. In full text problems, the expansion of word features with
word-proximity networks is shown to be useful, though the need for some
improvements is discussed.
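The abstract above does not specify how its word-proximity networks are built, so the following is only a sketch under stated assumptions: words are linked when they occur within a fixed token window, edge weights are co-occurrence counts, and a feature set is expanded by one hop along sufficiently heavy edges. The function names, `window`, and `min_weight` are all hypothetical.

```python
from collections import Counter


def proximity_network(sentences, window=2):
    """Count how often two distinct words occur within `window` tokens."""
    weights = Counter()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + 1 + window]:
                if v != w:
                    weights[frozenset((w, v))] += 1
    return weights


def expand(features, weights, min_weight=2):
    """Add every word linked to a seed feature by an edge of weight >= min_weight."""
    seeds = set(features)
    expanded = set(seeds)
    for pair, count in weights.items():
        if count >= min_weight and pair & seeds:
            expanded |= pair
    return expanded


sentences = [
    "kinase binds receptor".split(),
    "kinase binds ligand".split(),
    "receptor signal".split(),
]
weights = proximity_network(sentences)
expanded = expand({"binds"}, weights)
```

Only the `kinase`/`binds` edge is seen twice, so `binds` is expanded with `kinase` and nothing else.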
Automated Knowledge Discovery from Functional Magnetic Resonance Images using Spatial Coherence
Functional Magnetic Resonance Imaging (fMRI) has the potential to unlock many of the mysteries of the brain. Although this imaging modality is popular for brain-mapping activities, clinical applications of this technique are relatively rare. For clinical applications, classification models are more useful than the current practice of reporting loci of neural activation associated with particular disorders. Also, since the methods used to account for anatomical variations between subjects are generally imprecise, the conventional voxel-by-voxel analysis limits the types of discoveries that are possible. This work presents a classification-based framework for knowledge discovery from fMRI data. Instead of voxel-centric knowledge discovery, this framework is segment-centric, where functional segments are clumps of voxels that represent a functional unit in the brain. With simulated activation images, it is shown that this segment-based approach can be more successful for knowledge discovery than conventional voxel-based approaches. The spatial coherence principle refers to the homogeneity of behavior of spatially contiguous voxels. Auto-threshold Contrast Enhancing Iterative Clustering (ACEIC), a new algorithm based on the spatial coherence principle, is presented here for functional segmentation. With benchmark data, it is shown that the ACEIC method can achieve higher segmentation accuracy than Probabilistic Independent Component Analysis, a popular method used for fMRI data analysis. The spatial coherence principle can also be exploited for voxel-centric image-classification problems. Spatially Coherent Voxels (SCV) is a new feature selection method that uses the spatial coherence principle to eliminate features that are unlikely to be useful for classification. For a Substance Use Disorder dataset, it is demonstrated that feature selection with SCV can achieve higher classification accuracies than conventional feature selection methods.
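The spatial coherence idea can be illustrated with a toy filter. This is a sketch, not the SCV or ACEIC algorithm from the abstract: it assumes a 2D grid instead of a 3D volume, a hypothetical per-voxel relevance score, and a simple rule that a voxel is kept only if enough of its 4-connected neighbours also pass the threshold.

```python
def spatially_coherent(mask, min_neighbors=2):
    """Keep a selected cell only if enough 4-connected neighbours are also selected."""
    rows, cols = len(mask), len(mask[0])
    kept = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbors = 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                    neighbors += 1
            kept[r][c] = neighbors >= min_neighbors
    return kept


# Hypothetical per-voxel relevance scores (e.g. from a univariate test).
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.1],
    [0.0, 0.1, 0.0, 0.8],
    [0.1, 0.0, 0.1, 0.0],
]
threshold = 0.5
mask = [[s > threshold for s in row] for row in scores]
kept = spatially_coherent(mask)
```

The 2x2 block in the top-left corner survives because its members support each other, while the isolated high score at row 2, column 3 is discarded as spatially incoherent.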
ECoFFeS: A Software Using Evolutionary Computation for Feature Selection in Drug Discovery
Feature selection is of particular importance in the field of drug discovery. Many methods have been put forward for feature selection during recent decades. Among them, evolutionary computation has gained increasing attention owing to its superior global search ability. However, there is still no simple and efficient software that lets drug developers take advantage of evolutionary computation for feature selection. To remedy this issue, in this paper, a user-friendly and standalone software, named ECoFFeS, is developed. ECoFFeS is expected to lower the entry barrier for drug developers to deal with feature selection problems at hand by using evolutionary algorithms. To the best of our knowledge, it is the first software integrating a set of evolutionary algorithms (including two modified evolutionary algorithms proposed by the authors) with various evaluation combinations for feature selection. Specifically, ECoFFeS considers both single-objective and multi-objective evolutionary algorithms, and both regression- and classification-based models to meet different requirements. Five data sets in drug discovery are included in ECoFFeS. In addition, to reduce the total analysis time, the parallel execution technique is incorporated into ECoFFeS. The source code of ECoFFeS is available from https://github.com/JiaweiHuang/ECoFFeS/
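For readers unfamiliar with evolutionary feature selection, a minimal single-objective sketch follows. It is not one of the algorithms shipped in ECoFFeS: it assumes a plain genetic algorithm over boolean feature masks, a toy fitness (nearest-centroid training accuracy minus a size penalty), and synthetic data in which only the first two of six features are informative. All names and parameter values are hypothetical.

```python
import random


def fitness(mask, X, y, penalty=0.01):
    """Nearest-centroid training accuracy on selected features, minus a size penalty."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(idx, centroids[c])),
        )
        correct += pred == t
    return correct / len(y) - penalty * len(idx)


def evolve(X, y, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    # Random masks plus the full feature set as a baseline individual.
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size - 1)]
    pop.append([True] * n)
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        parents = pop[: pop_size // 2]      # truncation selection
        children = [pop[0]]                 # elitism: the best mask survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda m: fitness(m, X, y))


# Synthetic data: features 0 and 1 depend on the label, features 2-5 are noise.
data_rng = random.Random(1)
X, y = [], []
for i in range(40):
    label = i % 2
    noise = [data_rng.gauss(0, 1) for _ in range(4)]
    informative = [label * 3 + data_rng.gauss(0, 0.5), -label * 3 + data_rng.gauss(0, 0.5)]
    X.append(informative + noise)
    y.append(label)

best = evolve(X, y)
```

Because the full feature set is in the initial population and the best mask is always carried over, the returned mask can never score worse than using all features. A real study would evaluate fitness with cross-validation rather than training accuracy.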
Deep Cytometry: Deep learning with Real-time Inference in Cell Sorting and Flow Cytometry
Deep learning has achieved spectacular performance in image and speech
recognition and synthesis. It outperforms other machine learning algorithms in
problems where large amounts of data are available. In the area of measurement
technology, instruments based on the photonic time stretch have established
record real-time measurement throughput in spectroscopy, optical coherence
tomography, and imaging flow cytometry. These extreme-throughput instruments
generate approximately 1 Tbit/s of continuous measurement data and have led to
the discovery of rare phenomena in nonlinear and complex systems as well as new
types of biomedical instruments. Owing to the abundance of data they generate,
time-stretch instruments are a natural fit to deep learning classification.
Previously we had shown that high-throughput label-free cell classification
with high accuracy can be achieved through a combination of time-stretch
microscopy, image processing and feature extraction, followed by deep learning
for finding cancer cells in the blood. Such a technology holds promise for
early detection of primary cancer or metastasis. Here we describe a new deep
learning pipeline, which entirely avoids the slow and computationally costly
signal processing and feature extraction steps by a convolutional neural
network that directly operates on the measured signals. The improvement in
computational efficiency enables low-latency inference and makes this pipeline
suitable for cell sorting via deep learning. Our neural network takes less than
a few milliseconds to classify the cells, fast enough to provide a decision to
a cell sorter for real-time separation of individual target cells. We
demonstrate the applicability of our new method in the classification of OT-II
white blood cells and SW-480 epithelial cancer cells with more than 95%
accuracy in a label-free fashion.