Using Word Embeddings in Twitter Election Classification
Word embeddings and convolutional neural networks (CNN)
have attracted extensive attention in various classification
tasks for Twitter, e.g. sentiment classification. However,
the effect of the configuration used to train and generate
the word embeddings on the classification performance has
not been studied in the existing literature. In this paper,
using a Twitter election classification task that aims to detect
election-related tweets, we investigate the impact of
the background dataset used to train the embedding models,
the context window size and the dimensionality of word
embeddings on the classification performance. By comparing
the classification results of two word embedding models,
which are trained using different background corpora
(e.g. Wikipedia articles and Twitter microposts), we show
that the background data type should align with the Twitter
classification dataset to achieve a better performance. Moreover,
by evaluating the results of word embedding models
trained using various context window sizes and dimensionalities,
we find that large context windows and high dimensionalities
improve the performance. Our experimental
results also show that using word embeddings and
CNN leads to statistically significant improvements over various
baselines such as random, SVM with TF-IDF and SVM
with word embeddings.
The Devil is in the Tails: Fine-grained Classification in the Wild
The world is long-tailed. What does this mean for computer vision and visual
recognition? The two main implications are (1) the number of categories we need
to consider in applications can be very large, and (2) the number of training
examples for most categories can be very small. Current visual recognition
algorithms have achieved excellent classification accuracy. However, they
require many training examples to reach peak performance, which suggests that
long-tailed distributions will not be dealt with well. We analyze this question
in the context of eBird, a large fine-grained classification dataset, and a
state-of-the-art deep network classification algorithm. We find that (a) peak
classification performance on well-represented categories is excellent, (b)
given enough data, classification performance suffers only minimally from an
increase in the number of classes, (c) classification performance decays
precipitously as the number of training examples decreases, (d) surprisingly,
transfer learning is virtually absent in current methods. Our findings suggest
that our community should come to grips with the question of long tails.
Learning distance to subspace for the nearest subspace methods in high-dimensional data classification
The nearest subspace methods (NSM) are a category of classification methods widely applied to classify high-dimensional data. In this paper, we propose to improve the classification performance of NSM by learning tailored distance metrics from samples to class subspaces. The learned distance metric is termed ‘learned distance to subspace’ (LD2S). Using LD2S in the classification rule of NSM brings samples closer to their correct class subspaces and moves them farther away from the wrong class subspaces. In this way, the classification task becomes easier and the classification performance of NSM can be improved. The superior classification performance of using LD2S for NSM is demonstrated on three real-world high-dimensional spectral datasets.
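As a rough illustration of the nearest subspace rule that this abstract builds on, the sketch below assigns a sample to the class whose subspace leaves the smallest projection residual. It uses the plain Euclidean distance, not the learned LD2S metric, and it assumes each class subspace is simply the span of that class's training samples. The function names and toy data are hypothetical.

```python
import math


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis


def distance_to_subspace(x, basis):
    """Euclidean norm of what is left of x after projecting onto the basis."""
    residual = list(x)
    for b in basis:
        coeff = sum(xi * bi for xi, bi in zip(x, b))
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return math.sqrt(sum(ri * ri for ri in residual))


def nearest_subspace_predict(x, class_bases):
    """Return the label whose class subspace is closest to x."""
    return min(class_bases,
               key=lambda label: distance_to_subspace(x, class_bases[label]))


# Toy training data (assumed): class "a" spans the x-y plane, "b" the y-z plane.
train = {
    "a": [[1.0, 0.0, 0.0], [2.0, 0.1, 0.0]],
    "b": [[0.0, 0.0, 1.0], [0.0, 0.1, 3.0]],
}
class_bases = {label: orthonormal_basis(s) for label, s in train.items()}
sample = [3.0, 0.05, 0.2]
prediction = nearest_subspace_predict(sample, class_bases)
```

A learned metric such as LD2S would replace `distance_to_subspace` here; the abstract does not specify its form, so it is not reproduced.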
Evaluation methods and decision theory for classification of streaming data with temporal dependence
Predictive modeling on data streams plays an important role in modern data analysis, where data arrives continuously and needs to be mined in real time. In the stream setting the data distribution is often evolving over time, and models that update themselves during operation are becoming the state-of-the-art. This paper formalizes a learning and evaluation scheme of such predictive models. We theoretically analyze evaluation of classifiers on streaming data with temporal dependence. Our findings suggest that the commonly accepted data stream classification measures, such as classification accuracy and Kappa statistic, fail to diagnose cases of poor performance when temporal dependence is present; therefore, they should not be used as sole performance indicators. Moreover, classification accuracy can be misleading if used as a proxy for evaluating change detectors with datasets that have temporal dependence. We formulate the decision theory for streaming data classification with temporal dependence and develop a new evaluation methodology for data stream classification that takes temporal dependence into account. We propose a combined measure for classification performance that takes into account temporal dependence, and we recommend using it as the main performance measure in classification of streaming data.
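To make the concern about accuracy and the Kappa statistic concrete, the sketch below scores a "no-change" baseline that repeats the previous true label on a toy stream with long runs of identical labels. The stream and the function names are assumptions for illustration, and the combined measure proposed in the paper is not reproduced because the abstract does not define it.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def kappa(y_true, y_pred):
    """Cohen's kappa: accuracy corrected for chance agreement.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(y_true)
    p0 = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    pe = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p0 - pe) / (1 - pe)


def persistent_predictions(y_true, first_guess):
    """No-change baseline: predict the label observed at the previous step."""
    return [first_guess] + list(y_true[:-1])


# Toy stream with strong temporal dependence: four runs of ten equal labels.
stream = [0] * 10 + [1] * 10 + [0] * 10 + [1] * 10
persistent = persistent_predictions(stream, first_guess=0)

acc_persistent = accuracy(stream, persistent)   # 37 of 40 correct
kappa_persistent = kappa(stream, persistent)    # chance agreement is 0.5
```

On this toy stream the baseline reaches an accuracy of 0.925 and a Kappa of 0.85 without looking at any input features, which is the kind of case where these measures alone give a misleadingly good picture.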
Validation of Soft Classification Models using Partial Class Memberships: An Extended Concept of Sensitivity & Co. applied to the Grading of Astrocytoma Tissues
We use partial class memberships in soft classification to model uncertain
labelling and mixtures of classes. Partial class memberships are not restricted
to predictions, but may also occur in reference labels (ground truth, gold
standard diagnosis) for training and validation data.
Classifier performance is usually expressed as fractions of the confusion
matrix, such as sensitivity, specificity, negative and positive predictive
values. We extend this concept to soft classification and discuss the bias and
variance properties of the extended performance measures. Ambiguity in
reference labels translates to differences between best-case, expected and
worst-case performance. We show a second set of measures comparing expected and
ideal performance which is closely related to regression performance, namely
the root mean squared error RMSE and the mean absolute error MAE.
All calculations apply to classical crisp classification as well as to soft
classification (partial class memberships and/or one-class classifiers). The
proposed performance measures make it possible to test classifiers with actual borderline
cases. In addition, hardening of e.g. posterior probabilities into class labels
is not necessary, avoiding the corresponding information loss and increase in
variance.
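As a minimal sketch of the regression-style measures mentioned above, the following computes MAE and RMSE directly on class memberships in [0, 1], so a borderline reference label such as 0.5 is used as-is and predictions are not hardened. This is not the softclassval API; the function names and the single-class membership values are assumed for illustration.

```python
import math


def mae(reference, predicted):
    """Mean absolute error between reference and predicted memberships."""
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def rmse(reference, predicted):
    """Root mean squared error between reference and predicted memberships."""
    squared = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


# Memberships for one class; 0.5 marks a borderline reference case (assumed data).
reference = [1.0, 0.0, 0.5, 1.0]
predicted = [0.9, 0.2, 0.5, 0.6]

soft_mae = mae(reference, predicted)
soft_rmse = rmse(reference, predicted)

# Crisp labels are the special case where every membership is 0 or 1.
crisp_mae = mae([1.0, 0.0, 1.0], [1.0, 1.0, 1.0])
```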
We implement the proposed performance measures in the R package
"softclassval", which is available from CRAN and at
http://softclassval.r-forge.r-project.org.
Our reasoning as well as the importance of partial memberships for
chemometric classification is illustrated by a real-world application:
astrocytoma brain tumor tissue grading (80 patients, 37000 spectra) for finding
surgical excision borders. As borderline cases are the actual target of the
analytical technique, samples which are diagnosed to be borderline cases must
be included in the validation.
Comment: The manuscript is accepted for publication in Chemometrics and
Intelligent Laboratory Systems. Supplementary figures and tables are at the
end of the pd
A study of hierarchical and flat classification of proteins
Automatic classification of proteins using machine learning is an important problem that has received significant attention in the literature. One feature of this problem is that expert-defined hierarchies of protein classes exist and can potentially be exploited to improve classification performance. In this article we investigate empirically whether this is the case for two such hierarchies. We compare multi-class classification techniques that exploit the information in those class hierarchies and those that do not, using logistic regression, decision trees, bagged decision trees, and support vector machines as the underlying base learners. In particular, we compare hierarchical and flat variants of ensembles of nested dichotomies. The latter have been shown to deliver strong classification performance in multi-class settings. We present experimental results for synthetic, fold recognition, enzyme classification, and remote homology detection data. Our results show that exploiting the class hierarchy improves performance on the synthetic data, but not in the case of the protein classification problems. Based on this we recommend that strong flat multi-class methods be used as a baseline to establish the benefit of exploiting class hierarchies in this area.
Beat histogram features from NMF-based novelty functions for music classification
In this paper we present novel rhythm features derived from drum tracks extracted from polyphonic music and evaluate them in a genre classification task. Musical excerpts are analyzed using an optimized, partially fixed Non-Negative Matrix Factorization (NMF) method, and beat histogram features are calculated on the basis of the resulting activation functions for each of the three extracted drum tracks (Hi-Hat, Snare Drum and Bass Drum). The features are evaluated on two widely used genre datasets (GTZAN and Ballroom) using standard classification methods, with respect to the overall classification accuracy achieved. Furthermore, their suitability for distinguishing between rhythmically similar genres and the performance of the features resulting from individual activation functions are discussed. Results show that the presented NMF-based beat histogram features can provide performance comparable to other classification systems while considering strictly drum patterns.
