Time-Series Contrastive Learning against False Negatives and Class Imbalance
As an exemplary self-supervised approach for representation learning,
time-series contrastive learning has exhibited remarkable advancements in
contemporary research. While recent contrastive learning strategies have
focused on how to construct appropriate positives and negatives, our
theoretical analysis finds that they overlook two fundamental
issues: false negatives and class imbalance inherent in the InfoNCE loss-based
framework. Therefore, we introduce a straightforward modification grounded in
the SimCLR framework, universally adaptable to models engaged in the instance
discrimination task. By constructing instance graphs to facilitate interactive
learning among instances, we emulate supervised contrastive learning via the
multiple-instances discrimination task, mitigating the harmful impact of false
negatives. Moreover, leveraging the graph structure and a small amount of
labeled data, we perform semi-supervised consistency classification and enhance
the representational ability of minority classes. We compared our method with
the most popular time-series contrastive learning methods on four real-world
time-series datasets and demonstrated significant advantages in overall
performance.
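To make the false-negative issue concrete, here is a minimal NumPy sketch of the standard SimCLR-style InfoNCE (NT-Xent) loss that the abstract builds on. It is not the authors' modified method: the function name `nt_xent_loss` and the default `temperature` are assumptions chosen for illustration. Note that every other sample in the batch is treated as a negative regardless of its class, which is where false negatives can arise.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Standard InfoNCE (NT-Xent) loss for two augmented views.

    z1, z2: arrays of shape (n, d); row i of z1 and row i of z2 form a
    positive pair. All other rows in the batch act as negatives, even if
    they share a class with the anchor (potential false negatives).
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit norm -> cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # an instance is never its own negative

    # Row i in the first view is paired with row i + n, and vice versa.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-sum-exp over each row (the InfoNCE denominator).
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))

    log_prob = sim[np.arange(2 * n), pos] - log_denom
    return -log_prob.mean()
```

With correctly aligned views the loss should be lower than when the positive pairs are deliberately mismatched, which is a quick sanity check on the implementation.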
Time series transductive classification on imbalanced data sets: an experimental study
Graph-based semi-supervised learning (SSL) algorithms perform well on a variety of domains, such as digit recognition and text classification, when the data lie on a low-dimensional manifold. However, it is surprising that these methods have not been effectively applied to time series classification tasks. In this paper, we provide a comprehensive empirical comparison of state-of-the-art graph-based SSL algorithms with respect to graph construction and parameter selection. Specifically, we focus on the problem of time series transductive classification on imbalanced data sets. Through a comprehensive analysis using recently proposed empirical evaluation models, we confirm some of the hypotheses raised in previous work and show that some of them may not hold in the time series domain. Based on our results, we suggest using the Gaussian Fields and Harmonic Functions algorithm with the mutual k-nearest neighbors graph weighted by the RBF kernel, setting k = 20, for general tasks of time series transductive classification on imbalanced data sets.
São Paulo Research Foundation (FAPESP) (grants 2011/17698-5 and 2012/50714-7
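The recommended configuration can be sketched in NumPy as follows. This is an illustrative sketch, not the paper's code: the function names, the Euclidean distance, the `sigma` bandwidth, and the small ridge term are all assumptions (the ridge is added because a mutual k-NN graph can leave nodes isolated, which would make the linear system singular). The label propagation step uses the standard harmonic solution f_u = (D_uu - W_uu)^{-1} W_ul f_l.

```python
import numpy as np

def mutual_knn_rbf_graph(X, k=20, sigma=1.0):
    """Mutual k-NN graph with RBF weights (Euclidean distance assumed)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor

    idx = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors of each point
    knn = np.zeros((n, n), dtype=bool)
    knn[np.arange(n)[:, None], idx] = True
    mutual = knn & knn.T  # keep an edge only if both endpoints agree

    return np.where(mutual, np.exp(-d2 / (2.0 * sigma ** 2)), 0.0)

def gfhf(W, labeled_idx, y_labeled, n_classes, ridge=1e-9):
    """Gaussian Fields and Harmonic Functions on a weighted graph W."""
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    y_labeled = np.asarray(y_labeled)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    laplacian = np.diag(W.sum(axis=1)) - W
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    F_l = np.eye(n_classes)[y_labeled]  # one-hot labels

    # Harmonic solution; the ridge (an assumption) keeps L_uu invertible
    # when some unlabeled nodes are isolated or cut off from all labels.
    F_u = np.linalg.solve(L_uu + ridge * np.eye(len(unlabeled_idx)), W_ul @ F_l)

    pred = np.empty(n, dtype=int)
    pred[labeled_idx] = y_labeled
    pred[unlabeled_idx] = F_u.argmax(axis=1)
    return pred
```

For time series, the Euclidean distance used here could be swapped for another measure by replacing the computation of `d2`; the abstract does not state which distance the study used.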
Tripartite Graph Clustering for Dynamic Sentiment Analysis on Social Media
The growing popularity of social media (e.g., Twitter) allows users to easily
share information with each other and influence others by expressing their own
sentiments on various subjects. In this work, we propose an unsupervised
tri-clustering framework, which analyzes both user-level and tweet-level
sentiments through co-clustering of a tripartite graph. A compelling feature of
the proposed framework is that the quality of sentiment clustering of tweets,
users, and features can be mutually improved by joint clustering. We further
investigate the evolution of user-level sentiments and latent feature vectors
in an online framework and devise an efficient online algorithm to sequentially
update the clustering of tweets, users and features with newly arrived data.
The online framework not only provides better quality of both dynamic
user-level and tweet-level sentiment analysis, but also improves the
computational and storage efficiency. We verified the effectiveness and
efficiency of the proposed approaches on the November 2012 California ballot
Twitter data.
Comment: A short version is in Proceedings of the 2014 ACM SIGMOD International
Conference on Management of dat