Robust Regression via Online Feature Selection under Adversarial Data Corruption
The presence of data corruption in user-generated streaming data, such as
social media, motivates a new fundamental problem: learning reliable
regression coefficients when the features are not all accessible at one time.
Until now, several important challenges still cannot be handled concurrently:
1) corrupted data estimation when only partial features are accessible; 2)
online feature selection when data contains adversarial corruption; and 3)
scaling to a massive dataset. This paper proposes a novel RObust regression
algorithm via Online Feature Selection (RoOFS) that concurrently
addresses all the above challenges. Specifically, the algorithm iteratively
updates the regression coefficients and the uncorrupted set via a robust online
feature substitution method. We also prove that our algorithm has a restricted
error bound compared to the optimal solution. Extensive experiments
on both synthetic and real-world datasets demonstrate that our method
outperforms existing methods in recovering both the selected features
and the regression coefficients, with very competitive efficiency.
Comment: 10 pages, 3 figures
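
The alternating idea described in the abstract (update the coefficients, then update the uncorrupted set) can be illustrated with a minimal sketch. This is not the RoOFS algorithm: the abstract does not specify its update rules, so the sketch swaps in a generic batch scheme that assumes all features are available at once, fits least squares on the presumed-clean samples, keeps the largest coefficients, and re-selects the samples with the smallest residuals. The function name and the parameters `n_clean`, `n_features`, and `n_iters` are hypothetical.

```python
import numpy as np

def alternating_robust_fit(X, y, n_clean, n_features, n_iters=20):
    """Illustrative batch sketch (NOT RoOFS): alternate between fitting
    sparse coefficients and re-estimating the uncorrupted sample set."""
    n, p = X.shape
    clean = np.arange(n)      # assumption: start by trusting every sample
    support = np.arange(p)
    beta = np.zeros(p)
    for _ in range(n_iters):
        # 1) Least-squares fit on the samples currently presumed clean.
        coef, *_ = np.linalg.lstsq(X[clean], y[clean], rcond=None)
        # 2) Feature selection by hard thresholding: keep the
        #    n_features coefficients with the largest magnitude.
        support = np.sort(np.argsort(np.abs(coef))[-n_features:])
        # 3) Refit using only the selected features.
        sub_coef, *_ = np.linalg.lstsq(
            X[np.ix_(clean, support)], y[clean], rcond=None
        )
        beta = np.zeros(p)
        beta[support] = sub_coef
        # 4) Re-estimate the uncorrupted set: the n_clean samples
        #    with the smallest absolute residuals.
        residuals = np.abs(y - X @ beta)
        new_clean = np.sort(np.argsort(residuals)[:n_clean])
        if np.array_equal(new_clean, clean):
            break             # the sample set stopped changing
        clean = new_clean
    return beta, support, clean
```

Both `n_clean` and `n_features` are assumed to be known in advance here; in practice they would have to be estimated or tuned, and an online variant would also have to cope with features arriving a few at a time.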