Top Rank Optimization in Linear Time
Bipartite ranking aims to learn a real-valued ranking function that orders positive instances before negative instances. Recent efforts in bipartite ranking have focused on optimizing ranking accuracy at the top of the ranked list. Most existing approaches either optimize task-specific metrics or extend the ranking loss to place more emphasis on the errors associated with the top-ranked instances, leading to a high computational cost that is super-linear in the number of training instances.
in the number of training instances. We propose a highly efficient approach,
titled TopPush, for optimizing accuracy at the top that has computational
complexity linear in the number of training instances. We present a novel
analysis that bounds the generalization error for the top ranked instances for
the proposed approach. Empirical study shows that the proposed approach is
highly competitive to the state-of-the-art approaches and is 10-100 times
faster
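A minimal sketch of the core idea, in NumPy, under stated assumptions: the squared-hinge surrogate and the function name are illustrative, and the paper itself optimizes a dual formulation rather than this direct loss. The point is that each positive instance is compared only against the single highest-scoring negative, so one pass over the data suffices.

```python
import numpy as np

def top_push_loss(w, X_pos, X_neg):
    """Top-push style loss: penalize each positive instance only against
    the highest-scoring negative. The max over negatives is computed once,
    so evaluating the loss is linear in the number of instances, unlike
    the quadratic cost of summing over all positive-negative pairs."""
    s_pos = X_pos @ w                  # scores of positive instances
    s_neg_max = np.max(X_neg @ w)      # one pass over the negatives
    # squared hinge on the margin between each positive and the top negative
    margins = np.maximum(0.0, 1.0 - (s_pos - s_neg_max))
    return np.mean(margins ** 2)
```

Because the maximum over negative scores is shared by all positives, one evaluation costs O(m + n) score computations for m positives and n negatives, instead of the O(mn) of pairwise ranking losses.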
CUR Algorithm for Partially Observed Matrices
CUR matrix decomposition computes the low rank approximation of a given
matrix by using the actual rows and columns of the matrix. It has been a very
useful tool for handling large matrices. One limitation of the existing algorithms for CUR matrix decomposition is that they need access to the full matrix, a requirement that can be difficult to fulfill in many real-world applications. In this work, we alleviate this limitation by developing a CUR
decomposition algorithm for partially observed matrices. In particular, the
proposed algorithm computes the low rank approximation of the target matrix
based on (i) the randomly sampled rows and columns, and (ii) a subset of
observed entries that are randomly sampled from the matrix. Our analysis shows
the relative error bound, measured by spectral norm, for the proposed algorithm
when the target matrix is of full rank. We also show that the number of observed entries the proposed algorithm needs to perfectly recover a low-rank matrix is smaller than the sample complexity of the existing algorithms for matrix completion. Empirical studies on both
synthetic and real-world datasets verify our theoretical claims and demonstrate
the effectiveness of the proposed algorithm.
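For orientation, here is a minimal sketch of a standard CUR decomposition with uniform sampling on a fully observed matrix. The function name and sampling scheme are assumptions; the paper's actual contribution, estimating the linking matrix from a random subset of observed entries rather than from the full intersection, is deliberately not implemented here.

```python
import numpy as np

def cur_decomposition(A, c, r, seed=None):
    """Approximate A by C @ U @ R, where C and R are actual columns and
    rows of A and U is the pseudo-inverse of their intersection W.
    Uniform sampling is used for simplicity."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    cols = rng.choice(n, size=c, replace=False)
    rows = rng.choice(m, size=r, replace=False)
    C = A[:, cols]                # sampled columns
    R = A[rows, :]                # sampled rows
    W = A[np.ix_(rows, cols)]     # intersection of sampled rows and columns
    U = np.linalg.pinv(W)         # linking matrix
    return C, U, R

# Usage: a rank-5 matrix is recovered almost exactly from 20 rows and columns.
A = np.random.randn(200, 5) @ np.random.randn(5, 150)
C, U, R = cur_decomposition(A, c=20, r=20, seed=0)
print(np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))  # ~1e-14
```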
One Fits All: Power General Time Series Analysis by Pretrained LM
Although we have witnessed great success of pre-trained models in natural
language processing (NLP) and computer vision (CV), limited progress has been
made for general time series analysis. Unlike NLP and CV, where a unified model can be used to perform different tasks, specially designed approaches still dominate each time series analysis task, such as classification, anomaly detection, forecasting, and few-shot learning. The main challenge that blocks the development of pre-trained models for time series analysis is the lack of a large amount of training data. In this work, we address this challenge by
leveraging language or CV models, pre-trained from billions of tokens, for time
series analysis. Specifically, we refrain from altering the self-attention and
feedforward layers of the residual blocks in the pre-trained language or image
model. This model, known as the Frozen Pretrained Transformer (FPT), is
evaluated through fine-tuning on all major types of tasks involving time
series. Our results demonstrate that models pre-trained on natural language or images can achieve comparable or state-of-the-art performance on all main time series analysis tasks, as illustrated in Figure 1. We also find, both theoretically and empirically, that the self-attention module behaves similarly to principal component analysis (PCA), an observation that helps explain how the transformer bridges the domain gap and is a crucial step towards understanding the universality of a pre-trained transformer. The code is publicly available at https://github.com/DAMO-DI-ML/One_Fits_All.
Comment: NeurIPS 2023 Spotlight
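A minimal sketch of the frozen-backbone recipe, assuming the Hugging Face transformers GPT-2 implementation. The layer count, patch length, and the FPTForecaster wrapper below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from transformers import GPT2Model

backbone = GPT2Model.from_pretrained("gpt2")
backbone.h = backbone.h[:6]  # keep only the first 6 transformer blocks

# Freeze the self-attention and feed-forward sublayers; leave the layer
# norms and positional embeddings trainable, in the spirit of FPT.
for name, param in backbone.named_parameters():
    param.requires_grad = ("ln" in name) or ("wpe" in name)

class FPTForecaster(nn.Module):
    """Wrap the mostly-frozen backbone with task-specific input/output
    layers (names and sizes here are hypothetical)."""
    def __init__(self, backbone, patch_len=16, horizon=96, d_model=768):
        super().__init__()
        self.backbone = backbone
        self.in_proj = nn.Linear(patch_len, d_model)  # embed each patch
        self.out_proj = nn.Linear(d_model, horizon)   # forecasting head

    def forward(self, patches):  # patches: (batch, n_patches, patch_len)
        h = self.backbone(inputs_embeds=self.in_proj(patches)).last_hidden_state
        return self.out_proj(h[:, -1])  # forecast from the last position

model = FPTForecaster(backbone)
out = model(torch.randn(4, 32, 16))  # -> shape (4, 96)
```

Only the layer norms, positional embeddings, and the newly added projections receive gradients; the self-attention and feedforward weights keep their language-pretrained values.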