
    Robust Learning from Bites

    Many robust statistical procedures have two drawbacks. Firstly, they are computer-intensive such that they can hardly be used for massive data sets. Secondly, robust confidence intervals for the estimated parameters or robust predictions according to the fitted models are often unknown. Here, we propose a general method to overcome these problems of robust estimation in the context of huge data sets. The method is scalable to the memory of the computer, can be distributed on several processors if available, and can help to reduce the computation time substantially. The method additionally offers distribution-free confidence intervals for the median of the predictions. The method is illustrated for two situations: robust estimation in linear regression and kernel logistic regression from statistical machine learning.
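    The abstract does not spell out the procedure, so the following is only a minimal sketch of one way such a scheme could look, assuming the data are split into disjoint "bites", a robust fit is computed on each bite, and the per-bite predictions are summarized by their median together with the classical order-statistic confidence interval for a median. The function names, the Theil-Sen stand-in estimator, and the default parameters are all hypothetical and not taken from the paper.

```python
import math
import numpy as np

def theil_sen(x, y):
    """Stand-in robust line fit (an assumption, not the paper's estimator):
    median of pairwise slopes, then median of the residual intercepts."""
    i, j = np.triu_indices(len(x), k=1)
    dx = x[j] - x[i]
    keep = dx != 0
    slope = np.median((y[j] - y[i])[keep] / dx[keep])
    intercept = np.median(y - slope * x)
    return slope, intercept

def median_ci_ranks(b, alpha=0.05):
    """Largest k with P(Bin(b, 1/2) <= k - 1) <= alpha / 2.
    The interval [v_(k), v_(b+1-k)] of order statistics then covers the
    median with probability at least 1 - alpha, for any continuous
    distribution. Returns None if b is too small for that level."""
    cdf, k = 0.0, 0
    for m in range(b):
        cdf += math.comb(b, m) / 2**b
        if cdf > alpha / 2:
            break
        k = m + 1
    return k if k >= 1 else None

def predict_from_bites(x, y, x_new, n_bites=10, alpha=0.05, seed=0):
    """Fit each bite separately, then aggregate the predictions at x_new."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(x))
    preds = []
    for bite in np.array_split(shuffled, n_bites):
        # Each bite fits in memory on its own and could run on its own processor.
        slope, intercept = theil_sen(x[bite], y[bite])
        preds.append(intercept + slope * x_new)
    preds = np.sort(np.array(preds))
    k = median_ci_ranks(n_bites, alpha)
    if k is None:
        raise ValueError("too few bites for the requested confidence level")
    # Median prediction plus the order-statistic interval [v_(k), v_(b+1-k)].
    return float(np.median(preds)), float(preds[k - 1]), float(preds[n_bites - k])
```

    Calling `predict_from_bites(x, y, x_new)` on one-dimensional NumPy arrays `x` and `y` returns the median prediction and the two interval endpoints. Because the interval only uses ranks of the per-bite predictions, it needs no distributional assumption beyond continuity, which is the sense of "distribution-free" assumed here.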

    On a strategy to develop robust and simple tariffs from motor vehicle insurance data

    The goals of this paper are twofold: we describe common features in data sets from motor vehicle insurance companies and we investigate a general strategy which exploits the knowledge of such features. The results of the strategy are a basis to develop insurance tariffs. The strategy is applied to a data set from motor vehicle insurance companies. We use a nonparametric approach based on a combination of kernel logistic regression and support vector regression. Keywords: Classification, Data Mining, Insurance tariffs, Kernel logistic regression, Machine learning, Regression, Robustness, Simplicity, Support Vector Machine, Support Vector Regression

    Regression depth and support vector machine

    The regression depth method (RDM) proposed by Rousseeuw and Hubert [RH99] plays an important role in the area of robust regression for a continuous response variable. Christmann and Rousseeuw [CR01] showed that RDM is also useful for the case of binary regression. Vapnik's convex risk minimization principle [Vap98] has a dominating role in statistical machine learning theory. Important special cases are the support vector machine (SVM), ε-support vector regression and kernel logistic regression. In this paper connections between these methods from different disciplines are investigated for the case of pattern recognition. Some results concerning the robustness of the SVM and other kernel based methods are given.
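    For readers unfamiliar with the three special cases named above, the sketch below writes out the convex loss functions usually associated with them (hinge, ε-insensitive, and logistic) and a regularized empirical risk built from any of them. It is an illustration under the standard textbook definitions, not code from the paper; the function names, the default `eps`, and the `rkhs_norm_sq` argument (standing in for the squared norm of the decision function) are assumptions.

```python
import numpy as np

def hinge_loss(y, f):
    """Loss usually paired with the SVM classifier; labels y are in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * f)

def logistic_loss(y, f):
    """Loss used by kernel logistic regression; labels y are in {-1, +1}.
    log(1 + exp(-y f)) is computed stably via logaddexp."""
    return np.logaddexp(0.0, -y * f)

def eps_insensitive_loss(y, f, eps=0.1):
    """Loss used by epsilon-support vector regression: residuals smaller
    than eps in absolute value cost nothing."""
    return np.maximum(0.0, np.abs(y - f) - eps)

def regularized_risk(loss, y, f, rkhs_norm_sq, lam):
    """Empirical risk plus lam times the squared norm of the decision
    function, which is supplied by the caller in this sketch."""
    return float(np.mean(loss(y, f)) + lam * rkhs_norm_sq)
```

    All three losses are convex in the prediction `f`, which is what places the corresponding methods under a common convex risk minimization framework.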

    No judge, no job!: Judicial discretion and incomplete labor contracts

    The decision making of judges is prone to error and misapprehension. Consequently, the prevailing literature ties the economic function of courts to dispute resolution and minimization of rule making costs. In contrast to previous research, this analysis applies a contract theoretic perspective to the ruling of courts and keeps the focus on the implemented market transactions. Using labor contracts as institutional setting, performance and limitations of judicial law making are formally investigated and compared to the effects of specific legislation. It is shown that the efficient relation of legislative law making and judicial discretion is defined by the characteristics of the particular field of law and the actual market structure. The model also suggests a mutual dependency between legislation and adjudication to establish efficiency in law, contradicting the traditional legal doctrines of exclusive legislation or sole case-law. Keywords: incomplete contracts, judicial law making, legislation

    Getting paid for sex is my kick: a qualitative study of male sex workers

    As with its female counterpart, male sex work (MSW) has generally been regarded as deeply problematic, either because of negative societal attitudes to the selling of sex or the prevalence of psychosocial and economic problems amongst those attracted to MSW and the attendant health risks and dangers encountered whilst engaged in it. While the phenomenon of female sex work has received a great deal of criminological scrutiny, there has been comparatively less attention paid to male sex workers (MSWs). The research which we report on in this chapter aimed to further our understanding of the motivations of MSWs, the risks they face, their engagement with support agencies and their intentions for the future.

    Qualitative Robustness of Support Vector Machines

    Support vector machines have attracted much attention in theoretical and in applied statistics. Main topics of recent interest are consistency, learning rates and robustness. In this article, it is shown that support vector machines are qualitatively robust. Since support vector machines can be represented by a functional on the set of all probability measures, qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of support vector machines, our results show that support vector machines are the solutions of a well-posed mathematical problem in Hadamard's sense.

    Browsing through 3D representations of unstructured picture collections: an empirical study

    The paper presents a 3D interactive representation of fairly large picture collections which facilitates browsing through unstructured sets of icons or pictures. Implementation of this representation implies choosing between two visualization strategies: users may either manipulate the view (OV) or be immersed in it (IV). The paper first presents this representation, then describes an empirical study (17 participants) aimed at assessing the utility and usability of each view. Subjective judgements in questionnaires and debriefings were varied: 7 participants preferred the IV view, 4 the OV one, and 6 could not choose between the two. Visual acuity and visual exploration strategies seem to have exerted a greater influence on participants' preferences than task performance or feeling of immersion. Comment: 4 pages

    Estimating conditional quantiles with the help of the pinball loss

    The so-called pinball loss for estimating conditional quantiles is a well-known tool in both statistics and machine learning. So far, however, only little work has been done to quantify the efficiency of this tool for nonparametric approaches. We fill this gap by establishing inequalities that describe how close approximate pinball risk minimizers are to the corresponding conditional quantile. These inequalities, which hold under mild assumptions on the data-generating distribution, are then used to establish so-called variance bounds, which recently turned out to play an important role in the statistical analysis of (regularized) empirical risk minimization approaches. Finally, we use both types of inequalities to establish an oracle inequality for support vector machines that use the pinball loss. The resulting learning rates are min-max optimal under some standard regularity assumptions on the conditional quantile. Comment: Published at http://dx.doi.org/10.3150/10-BEJ267 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
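    As a small illustration of why pinball risk minimizers relate to quantiles, the sketch below implements the pinball loss in its usual textbook form and finds the constant that minimizes the empirical pinball risk on a sample. The function names and the brute-force search over sample points are assumptions made for illustration; the paper itself treats conditional quantiles and support vector machines, which this unconditional toy example does not cover.

```python
import numpy as np

def pinball_loss(y, t, tau):
    """Pinball loss at level tau in (0, 1): tau * (y - t) when y >= t,
    and (1 - tau) * (t - y) when y < t."""
    r = np.asarray(y, dtype=float) - t
    return np.where(r >= 0, tau * r, (tau - 1.0) * r)

def empirical_pinball_risk(y, t, tau):
    """Average pinball loss of the constant prediction t on the sample y."""
    return float(np.mean(pinball_loss(y, t, tau)))

def best_constant(y, tau):
    """Brute-force minimizer of the empirical pinball risk over the sample
    points themselves; a minimizer always exists among them because the
    risk is piecewise linear with kinks only at the sample values."""
    y = np.asarray(y, dtype=float)
    risks = [empirical_pinball_risk(y, t, tau) for t in y]
    return float(y[int(np.argmin(risks))])
```

    On a sample `y`, `best_constant(y, tau)` should land at, or next to, the empirical `tau`-quantile, for example the value returned by `np.quantile(y, tau)`. Underestimates are penalized with weight `tau` and overestimates with weight `1 - tau`, which is what pushes the minimizer toward the `tau`-quantile rather than the mean.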