Extension of TSVM to Multi-Class and Hierarchical Text Classification Problems With General Losses
Transductive SVM (TSVM) is a well-known semi-supervised large margin learning
method for binary text classification. In this paper we extend this method to
multi-class and hierarchical classification problems. We point out that the
determination of labels of unlabeled examples with fixed classifier weights is
a linear programming problem. We devise an efficient technique for solving it.
The method is applicable to general loss functions. We demonstrate the value of
the new method using large margin loss on a number of multi-class and
hierarchical classification datasets. For maxent loss we show empirically that
our method is better than expectation regularization/constraint and posterior
regularization methods, and competitive with the version of the entropy
regularization method that uses label constraints.
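To make the linear-programming remark concrete, one possible formulation can be sketched as follows. This is an illustrative reading, not the formulation stated in the paper: the symbols $z_{ik}$, $\ell_{ik}$, $U$, $K$, and $n_k$ are introduced here, and the per-class count constraint is an assumption modeled on the class-balance constraint commonly used with TSVM.

```latex
% Assumed notation (not taken from the abstract):
%   U         number of unlabeled examples
%   K         number of classes
%   \ell_{ik} loss of assigning class k to example i under the fixed weights
%   z_{ik}    relaxed assignment of example i to class k
%   n_k       assumed target count for class k, with \sum_k n_k = U
\begin{aligned}
\min_{z}\quad & \sum_{i=1}^{U}\sum_{k=1}^{K} z_{ik}\,\ell_{ik} \\
\text{s.t.}\quad & \sum_{k=1}^{K} z_{ik} = 1, \qquad i = 1,\dots,U, \\
& \sum_{i=1}^{U} z_{ik} = n_k, \qquad k = 1,\dots,K, \\
& z_{ik} \ge 0 .
\end{aligned}
```

Under these assumptions the constraints have the structure of a transportation problem, so an integral optimal solution exists whenever the $n_k$ are integers, and each unlabeled example receives exactly one label.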
apk2vec: Semi-supervised multi-view representation learning for profiling Android applications
Building behavior profiles of Android applications (apps) from holistic, rich,
multi-view information (e.g., several semantic views of an app such as API
sequences, system calls, etc.) would significantly improve downstream analytics
tasks such as app categorization, recommendation, and malware analysis. Towards
this goal, we design a semi-supervised
Representation Learning (RL) framework named apk2vec to automatically generate
a compact representation (aka profile/embedding) for a given app. More
specifically, apk2vec has the following three unique characteristics, which make
it an excellent choice for large-scale app profiling: (1) it encompasses
information from multiple semantic views such as API sequences, permissions,
etc., (2) being a semi-supervised embedding technique, it can make use of
labels associated with apps (e.g., malware family or app category labels) to
build high quality app profiles, and (3) it combines RL and feature hashing
which allows it to efficiently build profiles of apps that stream over time
(i.e., online learning). The resulting semi-supervised multi-view hash
embeddings of apps could then be used for a wide variety of downstream tasks
such as the ones mentioned above. Our extensive evaluations with more than
42,000 apps demonstrate that apk2vec's app profiles could significantly
outperform state-of-the-art techniques in four app analytics tasks, namely
malware detection, familial clustering, app clone detection, and app
recommendation.
Comment: International Conference on Data Mining, 201
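The feature-hashing idea mentioned in the abstract can be illustrated with a minimal sketch of the generic hashing trick. This is not apk2vec itself: there is no representation learning or label supervision here, and the function name `hashed_profile`, the view names, the sample tokens, and the dimension of 16 are all hypothetical choices made for illustration.

```python
import hashlib


def hashed_profile(views, dim=16):
    """Fold tokens from several views into one fixed-length vector.

    `views` maps a view name to a list of string tokens. The output length
    depends only on `dim`, so previously unseen tokens never require growing
    a vocabulary, which is what makes the trick suit streaming data.
    """
    vec = [0.0] * dim
    for view_name, tokens in views.items():
        for token in tokens:
            # Prefix the view name so the same token seen in two views
            # is hashed as two distinct features.
            key = f"{view_name}:{token}".encode("utf-8")
            digest = hashlib.md5(key).digest()
            index = int.from_bytes(digest[:8], "big") % dim
            # A separate byte of the digest picks the sign, so colliding
            # tokens tend to cancel rather than always accumulate.
            sign = 1.0 if digest[8] & 1 == 0 else -1.0
            vec[index] += sign
    return vec


# Hypothetical app described by two views (made-up sample data).
app = {
    "api_calls": ["getDeviceId", "sendTextMessage", "getDeviceId"],
    "permissions": ["SEND_SMS", "READ_PHONE_STATE"],
}
profile = hashed_profile(app, dim=16)
```

An app that arrives later with tokens never seen before still maps to a vector of the same length, so no stored vocabulary has to be rebuilt as apps stream in.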