8,304 research outputs found

Model Selection for Support Vector Machine Classification

Author: Burges
Carl Gold
Chapelle
Cristianini
Jaakkola
Krauth
Kwok
Kwok
MacKay
MacKay
Neal
Opper
Opper
Peter Sollich
Press
Smola
Sollich
Vapnik
Vapnik
Vapnik
Wahba
Williams
Williams
Publication venue: 'Elsevier BV'
Publication date: 01/01/2002

We address the problem of model selection for Support Vector Machine (SVM) classification. For fixed functional form of the kernel, model selection amounts to tuning kernel parameters and the slack penalty coefficient C. We begin by reviewing a recently developed probabilistic framework for SVM classification. An extension to the case of SVMs with quadratic slack penalties is given and a simple approximation for the evidence is derived, which can be used as a criterion for model selection. We also derive the exact gradients of the evidence in terms of posterior averages and describe how they can be estimated numerically using Hybrid Monte Carlo techniques. Though computationally demanding, the resulting gradient ascent algorithm is a useful baseline tool for probabilistic SVM model selection, since it can locate maxima of the exact (unapproximated) evidence. We then perform extensive experiments on several benchmark data sets. The aim of these experiments is to compare the performance of probabilistic model selection criteria with alternatives based on estimates of the test error, namely the so-called "span estimate" and Wahba's Generalized Approximate Cross-Validation (GACV) error. We find that all the "simple" model criteria (Laplace evidence approximations, and the Span and GACV error estimates) exhibit multiple local optima with respect to the hyperparameters. While some of these give performance that is competitive with results from other approaches in the literature, a significant fraction lead to rather higher test errors. The results for the evidence gradient ascent method show that also the exact evidence exhibits local optima, but these give test errors which are much less variable and also consistently lower than for the simpler model selection criteria.
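
As a rough illustration of what "tuning kernel parameters and the slack penalty coefficient C" can look like in practice, the sketch below scans a logarithmic grid and keeps the pair with the lowest criterion value. It is not the evidence-based method of the paper: `log_grid`, `select_hyperparameters`, and `toy_criterion` are hypothetical names, and `toy_criterion` is a made-up stand-in for whatever model selection criterion (span estimate, GACV, negative evidence) one would actually compute.

```python
from itertools import product
from math import log10


def log_grid(low_exp, high_exp, base=10.0):
    """Return base**k for every integer k in [low_exp, high_exp]."""
    return [base ** k for k in range(low_exp, high_exp + 1)]


def select_hyperparameters(criterion, c_values, kernel_values):
    """Return the (C, kernel_param) pair with the smallest criterion value.

    `criterion` is any callable taking (C, kernel_param) and returning a
    number to minimize, e.g. an estimate of the test error.
    """
    return min(product(c_values, kernel_values), key=lambda p: criterion(*p))


def toy_criterion(C, gamma):
    """Hypothetical criterion with a single minimum at C=10, gamma=0.01."""
    return (log10(C) - 1) ** 2 + (log10(gamma) + 2) ** 2


best_C, best_gamma = select_hyperparameters(
    toy_criterion, log_grid(-2, 3), log_grid(-4, 1)
)
```

An exhaustive scan like this is one simple way to avoid being trapped by the multiple local optima the abstract reports for the simpler criteria, at the price of evaluating the criterion at every grid point.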

arXiv.org e-Print Archive

CiteSeerX

Crossref

Caltech Authors

King's Research Portal

PhysicsGP: A Genetic Programming Approach to Event Selection

Author: Cousins
Cranmer
Cranmer
Field
Kishore
Koza
Kyle Cranmer
Luke
R. Sean Bowman
Rumelhart
Scott
Sontag
Vaiciulis
Vapnik
Vapnik
Werbos
Publication venue: 'Elsevier BV'
Publication date: 05/02/2004

We present a novel multivariate classification technique based on Genetic Programming. The technique is distinct from Genetic Algorithms and offers several advantages compared to Neural Networks and Support Vector Machines. The technique optimizes a set of human-readable classifiers with respect to some user-defined performance measure. We calculate the Vapnik-Chervonenkis dimension of this class of learning machines and consider a practical example: the search for the Standard Model Higgs Boson at the LHC. The resulting classifier is very fast to evaluate, human-readable, and easily portable. The software may be downloaded at: http://cern.ch/~cranmer/PhysicsGP.html

Comment: 16 pages, 9 figures, 1 table. Submitted to Comput. Phys. Commu

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

From average case complexity to improper learning complexity

Author: Berthet Q.
Daniely A.
Feige U.
Vapnik V. N.
Publication date: 01/01/2014

The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms. The biggest challenge in proving complexity results is to establish hardness of improper learning (a.k.a. representation independent learning). The difficulty in proving lower bounds for improper learning is that the standard reductions from NP-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning. It was initiated in (Kearns and Valiant 89) and relies on cryptographic assumptions. We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption (Feige 02) about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far-reaching implications. In particular, 1. Learning DNFs is hard. 2. Agnostically learning halfspaces with a constant approximation ratio is hard. 3. Learning an intersection of $\omega(1)$ halfspaces is hard.

Comment: 34 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

Fake View Analytics in Online Video Services

Author: Bolton R. J.
Cao Q.
Joachims T.
Vapnik V. N.
Publication date: 18/12/2013

Online video-on-demand (VoD) services invariably maintain a view count for each video they serve, and it has become an important currency for various stakeholders, from viewers to content owners, advertisers, and the online service providers themselves. There is often significant financial incentive to use a robot (or a botnet) to artificially create fake views. How can we detect the fake views? Can we detect them (and stop them) using online algorithms as they occur? What is the extent of fake views with current VoD service providers? These are the questions we study in the paper. We develop some algorithms and show that they are quite effective for this problem.

Comment: 25 pages, 15 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Anatomy determines etiology in thoracic aortic aneurysm

Author: Vapnik Joshua
Publication date: 08/04/2016

BACKGROUND: It is well established that thoracic aortic aneurysms (TAA) and abdominal aortic aneurysms (AAA) have different risk factors, clinical features, and genetic influences. Differences between and amongst subtypes of TAAs have received less attention. Despite observations of divergent clinical outcomes between ascending thoracic aortic aneurysms (ATAAs) and descending thoracic aortic aneurysms (DTAAs), etiologic factors determining the anatomic distribution of these aneurysms are not well understood. METHODS: From 3,247 patients registered in an institutional Thoracic Aortic Center Database from July 1992 through August 2013, we identified 921 patients with full aortic dimensional imaging by CT or MRI scan with TAA > 3.5 cm and without evidence of aortic dissection (AoD). Patients were analyzed in three groups: isolated ATAA (n=677), isolated DTAA (n=97), and combined ATAA and DTAA (n=146). RESULTS: Patients with a DTAA, alone or with coexistent ATAA, had significantly more hypertension (80.6% vs. 61.8%, p<.001) and a higher burden of atherosclerotic disease (86.7% vs. 7.5%, p<.001) and were more likely to be female (59.3% vs. 29.5%, p<.001). Conversely, patients with isolated ATAA were significantly younger (average age 59.5 vs. 71, p<.001), and this group contained almost every case of overt genetically-triggered TAA. Patients with isolated DTAA were demographically indistinguishable from patients with combined ATAA and DTAA. In follow-up, patients with isolated DTAA, or with ATAA and DTAA, experienced significantly more aortic events (aortic dissection/rupture) and had higher mortality than patients with isolated ATAA. CONCLUSIONS: Based on patient characteristics and outcomes, subtypes of TAA emerge. DTAA with or without associated ATAA or AAA appears to be a disease more highly associated with atherosclerosis, hypertension, and advanced age. In contrast, isolated ATAA appears to be a clinically distinct entity with a higher burden of genetically triggered disease. These data have important implications for familial screening recommendations for TAA.

Boston University Institutional Repository (OpenBU)

1,4-Diazabicyclo[2.2.2]octane (DABCO) as a useful catalyst in organic synthesis

Author: Friston
Friston
Mourão-Miranda
Schölkopf
Talairach
Vapnik
Vapnik
Publication venue: 'European Journal of Chemistry'
Publication date: 31/03/2010

1,4-diazabicyclo[2.2.2]octane (DABCO) has been used in many organic preparations as a good solid catalyst. DABCO has received considerable attention as an inexpensive, eco-friendly, highly reactive, easy-to-handle and non-toxic base catalyst for various organic transformations, affording the corresponding products in excellent yields with high selectivity. In this review, some applications of this catalyst in organic reactions are discussed.

Crossref

European Journal of Chemistry

Second-Generation Objects in the Universe: Radiative Cooling and Collapse of Halos with Virial Temperatures Above 10^4 Kelvin

Author: C Cortes
JC Platt
N Cristianini
VN Vapnik
VN Vapnik
Publication date: 01/01/2001

The first generation of protogalaxies likely formed out of primordial gas via H2-cooling in cosmological minihalos with virial temperatures of a few 1000K. However, their abundance is likely to have been severely limited by feedback processes which suppressed H2 formation. The formation of the protogalaxies responsible for reionization and metal-enrichment of the intergalactic medium then had to await the collapse of larger halos. Here we investigate the radiative cooling and collapse of gas in halos with virial temperatures Tvir > 10^4K. In these halos, efficient atomic line radiation allows rapid cooling of the gas to 8000 K; subsequently the gas can contract nearly isothermally at this temperature. Without an additional coolant, the gas would likely settle into a locally gravitationally stable disk; only disks with unusually low spin would be unstable. However, we find that the initial atomic line cooling leaves a large, out-of-equilibrium residual free electron fraction. This allows the molecular fraction to build up to a universal value of about x(H2) = 10^-3, almost independently of initial density and temperature. We show that this is a non-equilibrium freezeout value that can be understood in terms of timescale arguments. Furthermore, unlike in less massive halos, H2 formation is largely impervious to feedback from external UV fields, due to the high initial densities achieved by atomic cooling. The H2 molecules cool the gas further to about 100K, and allow the gas to fragment on scales of a few 100 Msun. We investigate the importance of various feedback effects such as H2-photodissociation from internal UV fields and radiation pressure due to Ly-alpha photon trapping, which are likely to regulate the efficiency of star formation.

Comment: Revised version accepted by ApJ; some reorganization for clarity

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of East Anglia digital repository

On the Chromatic Thresholds of Hypergraphs

Author: DHRUV MUBAYI
Frankl
JANE BUTTERFIELD
JOHN LENZ
JÓZSEF BALOGH
PING HU
Vapnik
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 14/10/2013

Let F be a family of r-uniform hypergraphs. The chromatic threshold of F is the infimum of all non-negative reals c such that the subfamily of F comprising hypergraphs H with minimum degree at least $c \binom{|V(H)|}{r-1}$ has bounded chromatic number. This parameter has a long history for graphs (r=2), and in this paper we begin its systematic study for hypergraphs. Łuczak and Thomassé recently proved that the chromatic threshold of the so-called near bipartite graphs is zero, and our main contribution is to generalize this result to r-uniform hypergraphs. For this class of hypergraphs, we also show that the exact Turán number is achieved uniquely by the complete (r+1)-partite hypergraph with nearly equal part sizes. This is one of very few infinite families of nondegenerate hypergraphs whose Turán number is determined exactly. In an attempt to generalize Thomassen's result that the chromatic threshold of triangle-free graphs is 1/3, we prove bounds for the chromatic threshold of the family of 3-uniform hypergraphs not containing {abc, abd, cde}, the so-called generalized triangle. In order to prove upper bounds we introduce the concept of fiber bundles, which can be thought of as a hypergraph analogue of directed graphs. This leads to the notion of fiber bundle dimension, a structural property of fiber bundles that is based on the idea of Vapnik-Chervonenkis dimension in hypergraphs. Our lower bounds follow from explicit constructions, many of which use a hypergraph analogue of the Kneser graph. Using methods from extremal set theory, we prove that these Kneser hypergraphs have unbounded chromatic number. This generalizes a result of Szemerédi for graphs and might be of independent interest. Many open problems remain.

Comment: 37 pages, 4 figures
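
The minimum-degree condition in the definition above can be made concrete with a small sketch. It assumes that the degree of a vertex means the number of edges containing it, which the abstract does not spell out, and the helper names `min_degree` and `meets_degree_condition` are hypothetical. The example checks the complete 3-uniform hypergraph on 5 vertices against the bound c times the binomial coefficient of |V(H)| over r-1.

```python
from itertools import combinations
from math import comb


def min_degree(vertices, edges):
    """Smallest number of edges containing any single vertex (assumed notion of degree)."""
    counts = {v: 0 for v in vertices}
    for edge in edges:
        for v in edge:
            counts[v] += 1
    return min(counts.values())


def meets_degree_condition(vertices, edges, r, c):
    """True if the minimum degree is at least c * C(|V|, r-1)."""
    return min_degree(vertices, edges) >= c * comb(len(vertices), r - 1)


# Complete 3-uniform hypergraph on 5 vertices: every 3-subset is an edge,
# so each vertex lies in C(4, 2) = 6 edges.
vertices = list(range(5))
edges = [frozenset(e) for e in combinations(vertices, 3)]

ok_half = meets_degree_condition(vertices, edges, r=3, c=0.5)      # bound is 5
ok_seven_tenths = meets_degree_condition(vertices, edges, r=3, c=0.7)  # bound is 7
```

Here the bound for c = 0.5 is 0.5 * C(5, 2) = 5, which the minimum degree of 6 satisfies, while c = 0.7 gives a bound of 7, which it does not.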

arXiv.org e-Print Archive

Crossref

Classification of partial discharge signals by combining adaptive local iterative filtering and entropy features

Author: Alan Nesbitt
Bishop
Brian Stewart
Gordon Morison
Imene Mitiche
Michael Hughes-Narborough
Philip Boreham
Rathie
Vapnik
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2017

Electro-Magnetic Interference (EMI) is a measurement technique for Partial Discharge (PD) signals which arise in operating electrical machines, generators and other auxiliary equipment due to insulation degradation. Assessment of PD can help to reduce machine downtime and circumvent high replacement and maintenance costs. EMI signals can be complex to analyze due to their nonstationary nature. In this paper, a software condition-monitoring model is presented and a novel feature extraction technique, suitable for nonstationary EMI signals, is developed. This method maps signals from multiple discharge sources, including PD, from the time domain to a feature space which aids interpretation of subsequent fault information. Results show excellent performance in classifying the different discharge sources.

Multidisciplinary Digital Publishing Institute

Crossref

University of Strathclyde Institutional Repository

Directory of Open Access Journals

ResearchOnline@GCU