Search CORE

56,499 research outputs found

Time-Series Contrastive Learning against False Negatives and Class Imbalance

Author: Jin Xiyuan
Lin Youfang
Liu Lei
Wang Jing
Publication venue
Publication date: 19/12/2023
Field of study

As an exemplary self-supervised approach for representation learning, time-series contrastive learning has exhibited remarkable advancements in contemporary research. While recent contrastive learning strategies have focused on how to construct appropriate positives and negatives, in this study, we conduct theoretical analysis and find they have overlooked the fundamental issues: false negatives and class imbalance inherent in the InfoNCE loss-based framework. Therefore, we introduce a straightforward modification grounded in the SimCLR framework, universally adaptable to models engaged in the instance discrimination task. By constructing instance graphs to facilitate interactive learning among instances, we emulate supervised contrastive learning via the multiple-instances discrimination task, mitigating the harmful impact of false negatives. Moreover, leveraging the graph structure and few-labeled data, we perform semi-supervised consistency classification and enhance the representative ability of minority classes. We compared our method with the most popular time-series contrastive learning methods on four real-world time-series datasets and demonstrated our significant advantages in overall performance

arXiv.org e-Print Archive

Time series transductive classification on imbalanced data sets: an experimental study

Author: Batista Gustavo Enrique de Almeida Prado Alves
Sousa Celso Andre Rodrigues de
Souza Vinícius Mourão Alves de
Publication venue: Stockholm
Publication date
Field of study

Graph-based semi-supervised learning (SSL) algorithms perform well on a variety of domains, such as digit recognition and text classification, when the data lie on a low-dimensional manifold. However, it is surprising that these methods have not been effectively applied on time series classification tasks. In this paper, we provide a comprehensive empirical comparison of state-of-the-art graph-based SSL algorithms with respect to graph construction and parameter selection. Specifically, we focus in this paper on the problem of time series transductive classification on imbalanced data sets. Through a comprehensive analysis using recently proposed empirical evaluation models, we confirm some of the hypotheses raised on previous work and show that some of them may not hold in the time series domain. From our results, we suggest the use of the Gaussian Fields and Harmonic Functions algorithm with the mutual k-nearest neighbors graph weighted by the RBF kernel, setting k = 20 on general tasks of time series transductive classification on imbalanced data sets.São Paulo Research Foundation (FAPESP) (grants 2011/17698-5 and 2012/50714-7

Tripartite Graph Clustering for Dynamic Sentiment Analysis on Social Media

Author: Barbosa L.
Cao B.
Davidov D.
Deng H.
Kim J.
Kuhn H. W.
Lee D. D.
Long M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

The growing popularity of social media (e.g, Twitter) allows users to easily share information with each other and influence others by expressing their own sentiments on various subjects. In this work, we propose an unsupervised \emph{tri-clustering} framework, which analyzes both user-level and tweet-level sentiments through co-clustering of a tripartite graph. A compelling feature of the proposed framework is that the quality of sentiment clustering of tweets, users, and features can be mutually improved by joint clustering. We further investigate the evolution of user-level sentiments and latent feature vectors in an online framework and devise an efficient online algorithm to sequentially update the clustering of tweets, users and features with newly arrived data. The online framework not only provides better quality of both dynamic user-level and tweet-level sentiment analysis, but also improves the computational and storage efficiency. We verified the effectiveness and efficiency of the proposed approaches on the November 2012 California ballot Twitter data.Comment: A short version is in Proceeding of the 2014 ACM SIGMOD International Conference on Management of dat

arXiv.org e-Print Archive

CiteSeerX

Crossref