Multiple instance learning for sequence data with across bag dependencies
In the Multiple Instance Learning (MIL) problem for sequence data, the instances
inside the bags are sequences. In some real-world applications such as
bioinformatics, comparing an arbitrary pair of sequences makes no sense. In fact,
each instance may have structural and/or functional relations with instances of
other bags. Thus, the classification task should take into account this across
bag relation. In this work, we present two novel MIL approaches for sequence
data classification named ABClass and ABSim. ABClass extracts motifs from
related instances and uses them to encode sequences. A discriminative classifier
is then applied to compute a partial classification result for each set of
related sequences. ABSim uses a similarity measure to discriminate the related
instances and to compute a score matrix. For both approaches, an aggregation
method is applied in order to generate the final classification result. We
applied both approaches to solve the problem of bacterial Ionizing Radiation
Resistance prediction. The experimental results of the presented approaches are
satisfactory.
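The ABSim idea described above (score related instances across bags, then aggregate) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the similarity measure or the aggregation method, so k-mer Jaccard similarity and a per-label mean stand in for them, related instances are assumed to share the same index in every bag, and the function names (`kmers`, `jaccard`, `score_matrix`, `classify`) are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k=3):
    """Return the set of overlapping k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of two sequences' k-mer sets (stand-in measure)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def score_matrix(query_bag, training_bags, k=3):
    """scores[i][j]: similarity between instance i of the query bag and the
    related instance (same index i, by assumption) of training bag j."""
    return [
        [jaccard(q, bag[i], k) for bag, _label in training_bags]
        for i, q in enumerate(query_bag)
    ]


def classify(query_bag, training_bags, k=3):
    """Aggregate the score matrix by mean score per label (assumed rule)."""
    scores = score_matrix(query_bag, training_bags, k)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in scores:
        for s, (_bag, label) in zip(row, training_bags):
            totals[label] += s
            counts[label] += 1
    return max(totals, key=lambda lab: totals[lab] / counts[lab])


# Toy data: each bag is a list of sequences paired with a bag label.
training = [
    (["ACGTAC", "TTGACA"], "resistant"),
    (["GGGCCC", "AAATTT"], "sensitive"),
]
query = ["ACGTAA", "TTGACC"]
print(classify(query, training))
```

Only related instances are ever compared, which is the point the abstract makes about across-bag dependencies; an unrelated pair of sequences never contributes to the score.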
Learning and Designing Stochastic Processes from Logical Constraints
Stochastic processes offer a flexible mathematical formalism to model and
reason about systems. Most analysis tools, however, start from the premise
that models are fully specified, so that any parameters controlling the
system's dynamics must be known exactly. As this is seldom the case, many
methods have been devised over the last decade to infer (learn) such parameters
from observations of the state of the system. In this paper, we depart from
this approach by assuming that our observations are qualitative
properties encoded as satisfaction of linear temporal logic formulae, as
opposed to quantitative observations of the state of the system. An important
feature of this approach is that it unifies naturally the system identification
and the system design problems, where the properties, instead of observations,
represent requirements to be satisfied. We develop a principled statistical
estimation procedure based on maximising the likelihood of the system's
parameters, using recent ideas from statistical machine learning. We
demonstrate the efficacy and broad applicability of our method on a range of
simple but non-trivial examples, including rumour spreading in social networks
and hybrid models of gene regulation.
Machine Learning and Graph Theory Approaches for Classification and Prediction of Protein Structure
Recently, many methods have been proposed for classification and prediction problems in bioinformatics. One of these problems is protein structure prediction. Machine learning approaches and new algorithms have been proposed to solve this problem. Among the machine learning approaches, Support Vector Machines (SVM) have attracted a lot of attention due to their high prediction accuracy. Since protein data consists of sequence and structural information, another widely used approach for modeling this structured data is to use graphs. In computer science, graph theory has been widely studied; however, it has only recently been applied to bioinformatics. In this work, we introduced new algorithms based on statistical methods, graph theory concepts and machine learning for the protein structure prediction problem. A new statistical method based on z-scores has been introduced for seed selection in proteins. A new method based on finding common cliques in protein data for feature selection is also introduced, which reduces noise in the data. We also introduced new binary classifiers for the prediction of structural transitions in proteins. These new binary classifiers achieve much higher accuracy than current traditional binary classifiers.
Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification
Network biology has been successfully used to help reveal complex mechanisms
of disease, especially cancer. On the other hand, network biology requires
in-depth knowledge to construct disease-specific networks, but our current
knowledge is very limited even with the recent advances in human cancer
biology. Deep learning has shown great potential to address difficult
situations like this. However, deep learning technologies conventionally use
grid-like structured data, thus application of deep learning technologies to
the classification of human disease subtypes is yet to be explored. Recently,
graph-based deep learning techniques have emerged, offering an opportunity
to leverage analyses in network biology. In this paper, we propose a hybrid
model that integrates two key components: 1) graph convolution neural network
(graph CNN) and 2) relation network (RN). We utilize graph CNN as a component
to learn expression patterns of cooperative gene communities, and RN as a
component to learn associations between learned patterns. The proposed model is
applied to the PAM50 breast cancer subtype classification task, the standard
breast cancer subtype classification of clinical utility. In experiments of
both subtype classification and patient survival analysis, our proposed method
achieved significantly better performance than existing methods. We believe
that this work is an important starting point for realizing
personalized medicine.
Comment: 8 pages, To be published in the proceedings of IJCAI 201
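As a rough illustration of what a graph convolution computes, the sketch below applies one first-order propagation step, ReLU(Â H W), where Â is the row-normalized adjacency matrix with self-loops. This is a generic simplification chosen for brevity, not the localized filtering used in the paper, and it omits the relation network entirely. The function names and the toy gene graph are hypothetical.

```python
def normalize_adjacency(adj):
    """Add self-loops, then divide each row by its sum (row normalization)."""
    n = len(adj)
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in a_hat]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_conv(adj, features, weights):
    """One propagation step: average each node's neighbourhood features
    (including its own), apply a linear map, then ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Toy graph: two genes connected by one edge, one expression value per gene.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
print(graph_conv(adj, features, weights))
```

With an identity weight, each gene's output is simply the mean expression over itself and its neighbours, which is how the layer mixes signals within a connected gene community.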