Extension of TSVM to Multi-Class and Hierarchical Text Classification Problems With General Losses
Transductive SVM (TSVM) is a well-known semi-supervised large-margin learning method for binary text classification. In this paper we extend this method to multi-class and hierarchical classification problems. We point out that determining the labels of the unlabeled examples with fixed classifier weights is a linear programming problem, and we devise an efficient technique for solving it. The method is applicable to general loss functions. We demonstrate the value of the new method using the large-margin loss on a number of multi-class and hierarchical classification datasets. For the maxent loss we show empirically that our method is better than expectation regularization/constraint and posterior regularization methods, and competitive with the version of the entropy regularization method which uses label constraints.
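As a loose illustration of the label-assignment subproblem (a sketch, not the paper's algorithm): with the classifier weights fixed, choosing fractional label indicators for the unlabeled examples can be posed as a small linear program that maximizes total classifier score subject to class-proportion constraints. The function name and the choice of constraints below are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

def assign_labels(scores, class_fractions):
    """Hypothetical sketch: with classifier weights fixed, picking labels
    for unlabeled examples under class-proportion constraints is a
    transportation-style linear program over indicators y[i, j] in [0, 1].

    scores[i, j]    -- fixed classifier score of example i for class j
    class_fractions -- assumed fraction of examples in each class
    """
    n, k = scores.shape
    cost = -scores.ravel()  # maximize total score = minimize its negative
    A_eq = np.zeros((n + k, n * k))
    b_eq = np.zeros(n + k)
    # Each example receives exactly one (possibly fractional) label.
    for i in range(n):
        A_eq[i, i * k:(i + 1) * k] = 1.0
        b_eq[i] = 1.0
    # Per-class totals match the assumed class proportions.
    for j in range(k):
        A_eq[n + j, j::k] = 1.0
        b_eq[n + j] = n * class_fractions[j]
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
    return res.x.reshape(n, k)
```

Because the constraint matrix has transportation structure, the LP relaxation tends to return integral (0/1) assignments, so the fractional indicators can be read directly as labels.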
Disordered proteins and network disorder in network descriptions of protein structure, dynamics and function. Hypotheses and a comprehensive review
During the last decade, network approaches have become a powerful tool to describe protein structure and dynamics. Here we review the links between disordered proteins and the associated networks, and describe the consequences of local, mesoscopic and global network disorder on changes in protein structure and dynamics. We introduce a new classification of protein networks into ‘cumulus-type’, i.e., those similar to puffy (white) clouds, and ‘stratus-type’, i.e., those similar to flat, dense (dark) low-lying clouds, and relate these network types to protein disorder dynamics and to differences in energy transmission processes. In the first class, there is limited overlap between the modules, which implies higher rigidity of the individual units; there the conformational changes can be described by an ‘energy transfer’ mechanism. In the second class, the topology presents a compact structure with significant overlap between the modules; there the conformational changes can be described by ‘multi-trajectories’, that is, multiple highly populated pathways. We further propose that disordered protein regions evolved to help other protein segments reach ‘rarely visited’ but functionally related states. We also show the role of disorder in ‘spatial games’ of amino acids, highlight the effects of intrinsically disordered proteins (IDPs) on cellular networks, and list some possible studies linking protein disorder and protein structure networks.
Nature-Inspired Learning Models
Intelligent learning mechanisms found in the natural world are still unsurpassed in their learning performance and efficiency in dealing with uncertain information coming in a variety of forms, yet they remain under continuous challenge from human-driven artificial intelligence methods. This work intends to demonstrate how phenomena observed in the physical world can be directly used to guide artificial learning models. Inspiration for the new learning methods has been found in the mechanics of physical fields at both the micro and macro scale. Exploiting the analogies between data and particles subjected to gravity, electrostatic and gas-particle fields, new algorithms have been developed and applied to classification and clustering, while the properties of the field are further reused in regression and in the visualisation of classification and classifier fusion. The paper covers extensive pictorial examples and visual interpretations of the presented techniques, along with some testing over well-known real and artificial datasets, compared where possible to traditional methods.
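As a loose illustration of the gravity-field analogy (a sketch under assumed dynamics, not the paper's algorithm): data points can be treated as unit-mass particles that drift along softened inverse-square attractions, so points in dense regions collapse toward a common centre while well-separated groups stay apart, and the collapsed positions can then be grouped into clusters.

```python
import numpy as np

def gravity_collapse(X, steps=50, dt=0.05, soft=1.0):
    """Illustrative sketch of a gravity-inspired clustering step: each
    point drifts along the sum of softened inverse-square attractions to
    all other points (overdamped dynamics, no momentum), so nearby points
    collapse together; `soft` prevents the force from diverging at r = 0."""
    P = X.astype(float).copy()
    for _ in range(steps):
        diff = P[None, :, :] - P[:, None, :]     # diff[i, j] = P[j] - P[i]
        d2 = (diff ** 2).sum(axis=-1) + soft     # softened squared distance
        force = diff / d2[..., None] ** 1.5      # attraction of i toward j
        P += dt * force.sum(axis=1)              # overdamped position update
    return P
```

After the collapse, any simple proximity rule (e.g. merging points within a small radius of each other) yields the cluster assignment.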
Fast Single-Class Classification and the Principle of Logit Separation
We consider neural network training, in applications in which there are many
possible classes, but at test-time, the task is a binary classification task of
determining whether the given example belongs to a specific class, where the
class of interest can be different each time the classifier is applied. For
instance, this is the case for real-time image search. We define the Single
Logit Classification (SLC) task: training the network so that at test-time, it
would be possible to accurately identify whether the example belongs to a given
class in a computationally efficient manner, based only on the output logit for
this class. We propose a natural principle, the Principle of Logit Separation,
as a guideline for choosing and designing losses suitable for the SLC. We show
that the cross-entropy loss function is not aligned with the Principle of Logit
Separation. In contrast, there are known loss functions, as well as novel batch
loss functions that we propose, which are aligned with this principle. In
total, we study seven loss functions. Our experiments show that indeed in
almost all cases, losses that are aligned with the Principle of Logit
Separation obtain at least 20% relative accuracy improvement in the SLC task
compared to losses that are not aligned with it, and sometimes considerably
more. Furthermore, we show that fast SLC does not cause any drop in binary
classification accuracy, compared to standard classification in which all
logits are computed, and yields a speedup which grows with the number of
classes. For instance, we demonstrate a 10x speedup when the number of classes
is 400,000. Tensorflow code for optimizing the new batch losses is publicly
available at https://github.com/cruvadom/Logit_Separation. Comment: Published as a conference paper in ICDM 201
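A minimal sketch of the single-logit idea at test time (the names and shapes here are hypothetical, not taken from the released code): compute only the requested class's logit as one dot product with that class's output-layer row, and threshold it, instead of evaluating all logits.

```python
import numpy as np

def single_logit(features, W, b, class_id):
    """Compute one output logit: a single dot product with that class's
    weight row, rather than the full (num_classes x dim) output layer."""
    return features @ W[class_id] + b[class_id]

def is_member(features, W, b, class_id, threshold=0.0):
    # Thresholding a single logit is only meaningful when training has
    # aligned the logits (the Principle of Logit Separation); plain
    # softmax cross-entropy logits are not individually calibrated, so
    # this rule would not transfer to them.
    return single_logit(features, W, b, class_id) > threshold
```

The cost of this test-time check is O(dim) per query instead of O(num_classes * dim), which is the source of the speedup that grows with the number of classes.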