Provably efficient methods for large-scale learning
The scale of machine learning problems has grown rapidly in recent years, calling for efficient methods. In this dissertation, we propose simple and efficient methods for various large-scale learning problems. We start with the standard supervised problem of quadratic regression. In Chapter 2, we show that by exploiting the quadratic structure and a novel gradient estimation algorithm, we can solve sparse quadratic regression with sub-quadratic time complexity and near-optimal sample complexity. We then move to online learning problems. In Chapter 3, we identify a weak assumption under which the standard UCB algorithm provably and efficiently learns from inconsistent human preferences with nearly optimal regret; in Chapter 4, we propose an approximate maximum inner product search data structure for adaptive queries and present two efficient algorithms that achieve sublinear time complexity for linear bandits, which is especially desirable for extremely large and slowly changing action sets. In Chapter 5, we study how to efficiently use privileged features with deep learning models. We present an efficient learning algorithm that exploits privileged features which are not available at test time. We conduct comprehensive empirical evaluations and present a rigorous analysis for linear models to build theoretical insight. The approach provides a general algorithmic paradigm that can be integrated with many other machine learning methods.
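As a point of reference for the bandit results above, here is a minimal sketch of the classic UCB1 rule for multi-armed bandits: play the arm maximizing empirical mean plus a confidence bonus. This is the textbook algorithm only, not the dissertation's preference-based variant; the reward function, arm means, and constant `c` are toy assumptions.

```python
import math
import random

def ucb1(pull, n_arms, horizon, c=2.0):
    """UCB1: try each arm once, then always pick the arm maximizing
    empirical mean + sqrt(c * ln(t) / n_i)."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    history = []
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: pull every arm once
        else:
            arm = max(range(n_arms),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(c * math.log(t) / counts[i]))
        r = pull(arm)
        counts[arm] += 1
        sums[arm] += r
        history.append(arm)
    return history

# toy usage: two Bernoulli arms with means 0.2 and 0.8
random.seed(0)
means = [0.2, 0.8]
plays = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0,
             n_arms=2, horizon=500)
```

Over 500 rounds the better arm ends up being played far more often, which is the behavior the regret bounds formalize.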
Multiclass latent locally linear support vector machines
Kernelized Support Vector Machines (SVMs) have gained the status of off-the-shelf classifiers, able to deliver state-of-the-art performance on almost any problem. Still, their practical use is constrained by their computational and memory complexity, which grows super-linearly with the number of training samples. To retain the low training and testing complexity of linear classifiers and the flexibility of non-linear ones, a growing and promising alternative is represented by methods that learn non-linear classifiers through local combinations of linear ones. In this paper we propose a new multi-class local classifier based on a latent SVM formulation. The proposed classifier makes use of a set of linear models that are linearly combined using sample- and class-specific weights. Thanks to the latent formulation, the combination coefficients are modeled as latent variables. We allow soft combinations and provide a closed-form solution for their estimation, resulting in an efficient prediction rule. This novel formulation allows learning, in a principled way, the sample-specific weights and the linear classifiers in a single optimization problem, using a CCCP optimization procedure. Extensive experiments on ten standard UCI machine learning datasets, one large binary dataset, three character and digit recognition databases, and a visual place categorization dataset show the power of the proposed approach.
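The prediction rule described above can be sketched roughly as follows: local linear models are evaluated on each sample and combined with sample-specific weights into per-class scores. The soft-assignment weights and the way responses are mixed here are illustrative stand-ins for the paper's latent variables, not the exact formulation.

```python
import numpy as np

def llsvm_predict(X, W, V):
    """Locally linear multi-class prediction (sketch).
    X: (n, d) samples; W: (m, d) local linear models;
    V: (c, m) class-specific combination weights.
    Sample-specific weights are approximated by a softmax over the
    local models' responses (an assumption for illustration)."""
    R = X @ W.T                               # (n, m) local model responses
    A = np.exp(R - R.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)         # soft sample-specific weights
    scores = (A * R) @ V.T                    # (n, c) per-class scores
    return scores.argmax(axis=1)              # predicted class per sample

# toy usage with random models
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
W = rng.normal(size=(3, 4))   # 3 local linear models
V = rng.normal(size=(2, 3))   # 2 classes
labels = llsvm_predict(X, W, V)
```

Because prediction reduces to two matrix products, testing cost stays close to that of a plain linear classifier, which is the efficiency argument the abstract makes.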
Discriminative models for multi-instance problems with tree-structure
Modeling network traffic is gaining importance as a means to counter modern threats of ever-increasing sophistication. It is, though, surprisingly difficult and costly to construct reliable classifiers on top of telemetry data, due to the variety and complexity of signals that no human can manage to interpret in full. Obtaining training data with a sufficiently large and variable body of labels can thus be seen as a prohibitive problem. The goal of this work is to detect infected computers by observing their HTTP(S) traffic collected from network sensors, which are typically proxy servers or network firewalls, while relying on only minimal human input in the model training phase. We propose a discriminative model that makes decisions based on all of a computer's traffic observed during a predefined time window (5 minutes in our case). The model is trained on traffic samples collected over equally sized time windows for a large number of computers, where the only labels needed are human verdicts about each computer as a whole (presumed infected vs. presumed clean). As part of training, the model itself recognizes discriminative patterns in traffic targeted at individual servers and constructs the final high-level classifier on top of them. We show the classifier to perform with very high precision, while the learned traffic patterns can be interpreted as Indicators of Compromise. We implement the discriminative model as a neural network with a special structure reflecting two stacked multi-instance problems. The main advantages of the proposed configuration include not only improved accuracy and the ability to learn from gross labels, but also automatic learning of the server types (together with their detectors) that are typically visited by infected computers.
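The two stacked multi-instance levels can be sketched as: flows to one server form an inner bag that is pooled into a server embedding, and the server embeddings form an outer bag that is pooled into a single computer-level verdict. The weights, layer sizes, and max/mean pooling choices below are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def classify_computer(servers, W1, W2, w_out):
    """Two stacked multi-instance levels (sketch).
    servers: list of (k_i, d) arrays, one per contacted server,
    each row a flow feature vector observed in the time window."""
    server_embs = []
    for flows in servers:
        h = relu(flows @ W1)               # instance (flow) level transform
        server_embs.append(h.max(axis=0))  # pool flows -> server embedding
    H = relu(np.stack(server_embs) @ W2)   # server-level transform
    bag = H.mean(axis=0)                   # pool servers -> computer embedding
    return float(bag @ w_out)              # single infected/clean score

# toy usage: two servers with 3 and 5 flows of 6 features each
rng = np.random.default_rng(1)
servers = [rng.normal(size=(k, 6)) for k in (3, 5)]
W1 = rng.normal(size=(6, 8))
W2 = rng.normal(size=(8, 4))
w_out = rng.normal(size=4)
score = classify_computer(servers, W1, W2, w_out)
```

Only the final score needs a label, which is how the model learns from a single per-computer verdict while the inner level discovers per-server patterns on its own.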