Quadruply Stochastic Gradient Method for Large Scale Nonlinear Semi-Supervised Ordinal Regression AUC Optimization
Semi-supervised ordinal regression (SOR) problems are ubiquitous in
real-world applications, where only a few ordered instances are labeled and
massive instances remain unlabeled. Recent research has shown that directly
optimizing the concordance index or AUC can impose a better ranking on the data
than optimizing the traditional error rate in ordinal regression (OR) problems.
In this paper, we propose an unbiased objective function for SOR AUC
optimization based on the ordinal binary decomposition approach. Moreover, to
handle large-scale kernelized learning problems, we propose a scalable algorithm
called QSORAO using the doubly stochastic gradients (DSG) framework for
functional optimization. Theoretically, we prove that our method converges
to the optimal solution at a rate of O(1/t), where t is the number of
iterations for stochastic data sampling. Extensive experimental results on
various benchmark and real-world datasets also demonstrate that our method is
efficient and effective while retaining similar generalization performance.Comment: 12 pages, 9 figures, conferenc
AUC Optimization from Multiple Unlabeled Datasets
Weakly supervised learning aims to make machine learning effective when perfect
supervision is unavailable, and it has drawn great attention from researchers.
Among various types of weak supervision, one of the most challenging cases is
to learn from multiple unlabeled (U) datasets with only limited knowledge of
the class priors, or U^m learning for short. In this paper, we study the
problem of building an AUC (area under ROC curve) optimization model from
multiple unlabeled datasets, which maximizes the pairwise ranking ability of
the classifier. We propose U^m-AUC, an AUC optimization approach that
converts the U^m data into a multi-label AUC optimization problem and can be
trained efficiently. We show that the proposed U^m-AUC is effective both
theoretically and empirically.
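One way to see why AUC is recoverable from unlabeled data alone is that, for two unlabeled samples drawn with different known class priors, the observed pairwise ranking statistic is a fixed linear function of the true AUC, which can then be inverted. The simulation below is a sketch of that identity for two sets with assumed priors theta1 and theta2, not the paper's U^m-AUC algorithm; the score distributions are chosen so the true AUC is 0.875.

```python
import random

def mixture_scores(n, prior, rng):
    """Scores from a prior-weighted mixture: positives ~ U(0.5, 1.5),
    negatives ~ U(0, 1), so the true AUC is 0.875."""
    return [0.5 + rng.random() if rng.random() < prior else rng.random()
            for _ in range(n)]

def pairwise_rank(a, b):
    """Empirical P(score in a > score in b), ties counted as 1/2."""
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return wins / (len(a) * len(b))

def recover_auc(u1, u2, theta1, theta2):
    """Invert the mixture decomposition of the observed ranking statistic:
    auc_u = t1(1-t2)*AUC + t2(1-t1)*(1-AUC) + (t1*t2 + (1-t1)(1-t2))/2."""
    auc_u = pairwise_rank(u1, u2)
    a = theta1 * (1 - theta2)          # weight of (pos, neg) pairs
    b = theta2 * (1 - theta1)          # weight of (neg, pos) pairs
    half = 0.5 * (theta1 * theta2 + (1 - theta1) * (1 - theta2))
    return (auc_u - b - half) / (a - b)
```

With enough samples the corrected estimate concentrates around the true AUC even though no instance is ever labeled.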
A Symmetric Loss Perspective of Reliable Machine Learning
When minimizing the empirical risk in binary classification, it is a common
practice to replace the zero-one loss with a surrogate loss to make the
learning objective feasible to optimize. Examples of well-known surrogate
losses for binary classification include the logistic loss, hinge loss, and
sigmoid loss. The choice of surrogate loss can strongly influence the
performance of the trained classifier and should therefore be made carefully.
Recently, surrogate losses that satisfy a certain symmetry condition
(a.k.a. symmetric losses) have demonstrated their usefulness in
learning from corrupted labels. In this article, we provide an overview of
symmetric losses and their applications. First, we review how a symmetric loss
can yield robust classification from corrupted labels in balanced error rate
(BER) minimization and area under the receiver operating characteristic curve
(AUC) maximization. Then, we demonstrate how the robust AUC maximization method
can benefit natural language processing in the problem where we want to learn
only from relevant keywords and unlabeled documents. Finally, we conclude this
article by discussing future directions, including potential applications of
symmetric losses for reliable machine learning and the design of non-symmetric
losses that can benefit from the symmetric condition.
Comment: Preprint of an Invited Review Article
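The symmetry condition discussed above requires that l(z) + l(-z) be a constant for every margin z. A quick numerical check (a sketch with my own function names) confirms that the sigmoid loss satisfies it with constant 1, while the hinge loss does not:

```python
import math

def sigmoid_loss(z):
    """Sigmoid loss: l(z) = 1 / (1 + e^z)."""
    return 1.0 / (1.0 + math.exp(z))

def hinge_loss(z):
    """Hinge loss: l(z) = max(0, 1 - z)."""
    return max(0.0, 1.0 - z)

def is_symmetric(loss, const, margins):
    """Check l(z) + l(-z) == const on a grid of margins."""
    return all(abs(loss(z) + loss(-z) - const) < 1e-9 for z in margins)

margins = [-3.0, -0.7, 0.0, 0.4, 2.5]
```

The hinge loss fails because for |z| > 1 the sum deviates from 2 (e.g. z = 2.5 gives 0 + 3.5), which is one reason it lacks the corrupted-label robustness of symmetric losses.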
Semi-supervised novelty detection
A common setting for novelty detection assumes that labeled examples
from the nominal class are available, but that labeled examples of novelties
are unavailable. The standard (inductive) approach is to declare novelties
where the nominal density is low, which reduces the problem to density level
set estimation. In this paper, we consider the setting where an unlabeled and
possibly contaminated sample is also available at learning time. We argue
that novelty detection in this semi-supervised setting is naturally solved by
a general reduction to a binary classification problem. In particular, a
detector with a desired false positive rate can be achieved through a
reduction to Neyman-Pearson classification. Unlike the inductive approach,
semi-supervised novelty detection (SSND) yields detectors that are optimal
(e.g., statistically consistent) regardless of the distribution on novelties.
Therefore, in novelty detection, unlabeled data have a substantial impact on
the theoretical properties of the decision rule. We validate the practical
utility of SSND with an extensive experimental study. We also show that SSND
provides distribution-free, learning-theoretic solutions to two well known
problems in hypothesis testing. First, our results provide a general solution
to the two-sample problem, that is, the problem of determining
whether two random samples arise from the same distribution. Second, a
specialization of SSND coincides with the standard p-value approach to
multiple testing under the so-called random effects model. Unlike standard
rejection regions based on thresholded p-values, the general SSND framework
allows for adaptation to arbitrary alternative distributions.
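The false-positive-rate guarantee from the Neyman-Pearson reduction can be illustrated with a minimal thresholding sketch: given scores for examples (in SSND the score would come from a nominal-vs-unlabeled classifier; here the scores are taken as given), set the rejection threshold at an empirical quantile of the nominal scores. The quantile rule and names below are illustrative, not the paper's construction.

```python
import math
import random

def novelty_threshold(nominal_scores, alpha):
    """Threshold so at most an alpha fraction of nominal data is flagged.

    Uses the empirical (1 - alpha)-quantile of the nominal scores.
    """
    s = sorted(nominal_scores)
    k = min(len(s) - 1, math.ceil((1 - alpha) * len(s)))
    return s[k]

def detect(scores, threshold):
    """Flag everything scoring strictly above the threshold as a novelty."""
    return [x > threshold for x in scores]
```

On a held-out nominal sample the empirical false positive rate stays at or below alpha, while clearly anomalous scores are flagged.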
Learning on Graphs with Out-of-Distribution Nodes
Graph Neural Networks (GNNs) are state-of-the-art models for performing
prediction tasks on graphs. While existing GNNs have shown great performance on
various tasks related to graphs, little attention has been paid to the scenario
where out-of-distribution (OOD) nodes exist in the graph during training and
inference. Borrowing the concept from CV and NLP, we define OOD nodes as nodes
with labels unseen in the training set. Since many real-world graphs are
constructed automatically by programs, they are often noisy and may contain
nodes drawn from unknown distributions. In this work, we define the
problem of graph learning with out-of-distribution nodes. Specifically, we aim
to accomplish two tasks: 1) detect nodes which do not belong to the known
distribution and 2) classify the remaining nodes to be one of the known
classes. We demonstrate that the connection patterns in graphs are informative
for outlier detection, and propose Out-of-Distribution Graph Attention Network
(OODGAT), a novel GNN model which explicitly models the interaction between
different kinds of nodes and separates inliers from outliers during feature
propagation. Extensive experiments show that OODGAT outperforms existing
outlier detection methods by a large margin, while being better or comparable
in terms of in-distribution classification.
Comment: Accepted by KDD'22
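The two tasks the abstract lists, flagging OOD nodes and classifying the rest, can be sketched with a simple entropy score over per-node class probabilities. This is an illustrative baseline only, not OODGAT itself (which learns the inlier/outlier separation jointly with attention during feature propagation); the threshold tau and the function names are assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def split_and_classify(node_probs, tau):
    """Task 1: flag nodes whose predictive entropy exceeds tau as OOD.
    Task 2: assign every remaining (in-distribution) node its argmax class."""
    ood, labels = [], {}
    for node, probs in node_probs.items():
        if entropy(probs) > tau:
            ood.append(node)
        else:
            labels[node] = max(range(len(probs)), key=probs.__getitem__)
    return ood, labels
```

A near-uniform prediction has entropy close to log(C) and gets flagged, while confident predictions are kept and labeled.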