Robust Learning from Bites
Many robust statistical procedures have two drawbacks. First, they are so computationally intensive that they can hardly be used for massive data sets. Second, robust confidence intervals for the estimated parameters or robust predictions according to the fitted models are often unknown. Here, we propose a general method to overcome these problems of robust estimation in the context of huge data sets. The method is scalable to the memory of the computer, can be distributed on several processors if available, and can help to reduce the computation time substantially. The method additionally offers distribution-free confidence intervals for the median of the predictions. The method is illustrated for two situations: robust estimation in linear regression and kernel logistic regression from statistical machine learning.
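The abstract does not spell out the procedure, so the following is only a minimal sketch of one way such a scheme could look, under stated assumptions: the data are split at random into disjoint subsets ("bites"), a robust fit is computed on each bite, and the per-bite predictions are combined by their median. The Theil-Sen line fit is a stand-in robust estimator chosen for brevity, and the order-statistic interval is a standard distribution-free interval for a median; neither is confirmed to be the authors' choice. All function names (`theil_sen`, `median_ci_ranks`, `predict_from_bites`) and the simulated data are hypothetical.

```python
import math
import numpy as np

def theil_sen(x, y):
    """Stand-in robust line fit: median of pairwise slopes, median-residual intercept."""
    i, j = np.triu_indices(len(x), k=1)
    dx = x[j] - x[i]
    keep = dx != 0
    slope = np.median((y[j] - y[i])[keep] / dx[keep])
    intercept = np.median(y - slope * x)
    return slope, intercept

def median_ci_ranks(b, level=0.95):
    """1-based ranks (lo, hi) such that [X_(lo), X_(hi)] covers the median of
    b i.i.d. continuous values with probability >= level; None if impossible."""
    pmf = [math.comb(b, k) * 0.5 ** b for k in range(b + 1)]
    for lo in range(b // 2, 0, -1):      # widen a symmetric interval step by step
        hi = b + 1 - lo
        coverage = sum(pmf[lo:hi])       # P(lo <= Binomial(b, 1/2) <= hi - 1)
        if coverage >= level:
            return lo, hi
    return None

def predict_from_bites(x, y, x_new, n_bites=20, level=0.95, seed=0):
    """Fit each bite separately, then aggregate the predictions at x_new."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    preds = []
    for idx in np.array_split(perm, n_bites):   # disjoint bites
        slope, intercept = theil_sen(x[idx], y[idx])
        preds.append(intercept + slope * x_new)
    preds = np.sort(np.array(preds))
    ranks = median_ci_ranks(n_bites, level)
    interval = None
    if ranks is not None:
        interval = (float(preds[ranks[0] - 1]), float(preds[ranks[1] - 1]))
    return float(np.median(preds)), interval

# Simulated data (assumption): y = 1 + 2x + noise, with 5% gross outliers.
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=2000)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=2000)
y[:100] += 50.0
estimate, interval = predict_from_bites(x, y, x_new=1.0)
```

Each bite is processed independently, so the loop could be distributed across processors, and only one bite needs to be held in memory at a time.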
On a strategy to develop robust and simple tariffs from motor vehicle insurance data
The goals of this paper are twofold: we describe common features in data sets from motor vehicle insurance companies and we investigate a general strategy which exploits the knowledge of such features. The results of the strategy are a basis to develop insurance tariffs. The strategy is applied to a data set from motor vehicle insurance companies. We use a nonparametric approach based on a combination of kernel logistic regression and support vector regression. Keywords: Classification, Data Mining, Insurance tariffs, Kernel logistic regression, Machine learning, Regression, Robustness, Simplicity, Support Vector Machine, Support Vector Regression
Regression depth and support vector machine
The regression depth method (RDM) proposed by Rousseeuw and Hubert [RH99] plays an important role in the area of robust regression for a continuous response variable. Christmann and Rousseeuw [CR01] showed that RDM is also useful for the case of binary regression. Vapnik's convex risk minimization principle [Vap98] has a dominating role in statistical machine learning theory. Important special cases are the support vector machine (SVM), ε-support vector regression and kernel logistic regression. In this paper connections between these methods from different disciplines are investigated for the case of pattern recognition. Some results concerning the robustness of the SVM and other kernel based methods are given.
No judge, no job!: Judicial discretion and incomplete labor contracts
The decision making of judges is prone to error and misapprehension. Consequently, the prevailing literature ties the economic function of courts to dispute resolution and minimization of rule making costs. In contrast to previous research, this analysis applies a contract theoretic perspective to the ruling of courts and keeps the focus on the implemented market transactions. Using labor contracts as institutional setting, performance and limitations of judicial law making are formally investigated and compared to the effects of specific legislation. It is shown that the efficient relation of legislative law making and judicial discretion is defined by the characteristics of the particular field of law and the actual market structure. The model also suggests a mutual dependency between legislation and adjudication to establish efficiency in law, contradicting the traditional legal doctrines of exclusive legislation or sole case-law. Keywords: incomplete contracts, judicial law making, legislation
Getting paid for sex is my kick: a qualitative study of male sex workers
As with its female counterpart, male sex work (MSW) has generally been regarded as deeply problematic, either because of negative societal attitudes to the selling of sex or the prevalence of psychosocial and economic problems amongst those attracted to MSW and the attendant health risks and dangers encountered whilst engaged in it. While the phenomenon of female sex work has received a great deal of criminological scrutiny, there has been comparatively less attention paid to male sex workers (MSWs). The research which we report on in this chapter aimed to further our understanding of the motivations of MSWs, the risks they face, their engagement with support agencies and their intentions for the future.
Qualitative Robustness of Support Vector Machines
Support vector machines have attracted much attention in theoretical and in
applied statistics. Main topics of recent interest are consistency, learning
rates and robustness. In this article, it is shown that support vector machines
are qualitatively robust. Since support vector machines can be represented by a
functional on the set of all probability measures, qualitative robustness is
proven by showing that this functional is continuous with respect to the
topology generated by weak convergence of probability measures. Combined with
the existence and uniqueness of support vector machines, our results show that
support vector machines are the solutions of a well-posed mathematical problem
in Hadamard's sense.
Browsing through 3D representations of unstructured picture collections: an empirical study
The paper presents a 3D interactive representation of fairly large picture
collections which facilitates browsing through unstructured sets of icons or
pictures. Implementation of this representation implies choosing between two
visualization strategies: users may either manipulate the view (OV) or be
immersed in it (IV). The paper first presents this representation, then
describes an empirical study (17 participants) aimed at assessing the utility
and usability of each view. Subjective judgements in questionnaires and
debriefings were varied: 7 participants preferred the IV view, 4 the OV one,
and 6 could not choose between the two. Visual acuity and visual exploration
strategies seem to have exerted a greater influence on participants'
preferences than task performance or feeling of immersion.
Estimating conditional quantiles with the help of the pinball loss
The so-called pinball loss for estimating conditional quantiles is a
well-known tool in both statistics and machine learning. So far, however, only
little work has been done to quantify the efficiency of this tool for
nonparametric approaches. We fill this gap by establishing inequalities that
describe how close approximate pinball risk minimizers are to the corresponding
conditional quantile. These inequalities, which hold under mild assumptions on
the data-generating distribution, are then used to establish so-called variance
bounds, which recently turned out to play an important role in the statistical
analysis of (regularized) empirical risk minimization approaches. Finally, we
use both types of inequalities to establish an oracle inequality for support
vector machines that use the pinball loss. The resulting learning rates are
min-max optimal under some standard regularity assumptions on the conditional
quantile. Comment: Published at http://dx.doi.org/10.3150/10-BEJ267 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
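For readers unfamiliar with the pinball loss named in this abstract, a small illustration may help. For a quantile level τ in (0, 1), the loss of predicting t when y is observed is τ(y − t) if y ≥ t and (1 − τ)(t − y) otherwise. The sketch below simplifies to the unconditional case (an assumption made here for brevity; the paper treats conditional quantiles and support vector machines) and checks numerically that the constant minimizing the empirical pinball risk sits at the empirical τ-quantile. The function names, the grid search, and the simulated sample are illustrative only.

```python
import numpy as np

def pinball_loss(y, t, tau):
    """Pinball loss: tau * (y - t) if y >= t, else (1 - tau) * (t - y)."""
    r = np.asarray(y, dtype=float) - t
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)

def empirical_pinball_risk(y, t, tau):
    """Average pinball loss of the constant prediction t over the sample y."""
    return float(np.mean(pinball_loss(y, t, tau)))

# Simulated sample (assumption): standard normal observations.
rng = np.random.default_rng(0)
y = rng.normal(size=5000)
tau = 0.9

# Brute-force search over constant predictions on a grid with spacing 0.01.
grid = np.linspace(-3.0, 3.0, 601)
risks = [empirical_pinball_risk(y, t, tau) for t in grid]
t_best = float(grid[int(np.argmin(risks))])

# The grid minimizer should agree with the empirical quantile up to grid resolution.
empirical_quantile = float(np.quantile(y, tau))
```

The asymmetry of the loss is what steers the minimizer: with τ = 0.9, under-predicting costs nine times as much as over-predicting by the same amount, which pushes the optimal constant up to the 90% quantile.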