Statistical Properties of the log-cosh Loss Function Used in Machine Learning
This paper analyzes a popular loss function used in machine learning called
the log-cosh loss function. A number of papers have been published using this
loss function but, to date, no statistical analysis has been presented in the
literature. In this paper, we present the distribution function from which the
log-cosh loss arises. We compare it to a similar distribution, called the
Cauchy distribution, and carry out various statistical procedures that
characterize its properties. In particular, we examine its associated pdf, cdf,
likelihood function and Fisher information. Side-by-side we consider the Cauchy
and Cosh distributions as well as the MLE of the location parameter with
asymptotic bias, asymptotic variance, and confidence intervals. We also provide
a comparison of robust estimators from several other loss functions, including
the Huber loss function and the rank dispersion function. Further, we examine
the use of the log-cosh function for quantile regression. In particular, we
identify a quantile distribution function from which a maximum likelihood
estimator for quantile regression can be derived. Finally, we compare a
quantile M-estimator based on log-cosh with robust monotonicity against another
approach to quantile regression based on convolution smoothing.
Comment: 10 pages, 17 figures
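The connection between the log-cosh loss and its generating distribution can be sketched directly. The numerically stable identity log(cosh(r)) = |r| + log1p(exp(-2|r|)) - log(2) and the 1/π normalizing constant of the sech density are standard facts; the function names below are illustrative choices, not the paper's notation:

```python
import math

def log_cosh_loss(r):
    """Numerically stable log(cosh(r)).

    Evaluates |r| + log1p(exp(-2|r|)) - log(2), which equals
    log(cosh(r)) but avoids overflow in cosh for large residuals.
    """
    a = abs(r)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def cosh_pdf(x, mu=0.0):
    """Density whose negative log-likelihood is the log-cosh loss:
    f(x; mu) = 1 / (pi * cosh(x - mu)), the sech-type distribution
    referred to in the abstract as the Cosh distribution."""
    return 1.0 / (math.pi * math.cosh(x - mu))
```

Minimizing the sum of `log_cosh_loss(y_i - mu)` over `mu` is then exactly maximum likelihood for the location parameter of `cosh_pdf`.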
Estimation of the slope parameter for linear regression model with uncertain prior information
The estimation of the
slope parameter of the linear regression model with normal error
is considered in this paper when uncertain prior information on
the value of the slope is available. Several alternative
estimators are defined to incorporate both the sample as well as
the non-sample information in the estimation process. Some
important statistical properties of the restricted, preliminary
test, and shrinkage estimators are investigated. The performances
of the estimators are compared based on the criteria of
unbiasedness and mean square error. Both analytical and graphical
methods are explored. None of the estimators is found to be
uniformly superior to the others. However, if the non-sample
information regarding the value of the slope is close to its true
value, the shrinkage estimator outperforms the rest of the
estimators.
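How the sample and non-sample information combine can be sketched as follows. The shrinkage factor `1 - 1/max(|t|, 1)` and the function names are our own illustrative choices under normal errors, not the paper's exact definitions:

```python
import random

def slope_estimators(x, y, beta0, t_crit=1.96):
    """Unrestricted, restricted, preliminary-test and Stein-type
    shrinkage estimators of the regression slope, where beta0 is the
    uncertain prior (non-sample) value of the slope."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    beta_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    resid = [yi - ybar - beta_hat * (xi - xbar) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    se = (s2 / sxx) ** 0.5                     # standard error of the slope
    t = (beta_hat - beta0) / se                # t-test of H0: beta = beta0
    # preliminary-test: keep beta0 unless the test rejects it
    pre_test = beta0 if abs(t) < t_crit else beta_hat
    # shrinkage: pull beta_hat toward beta0, more strongly when |t| is small
    shrink = beta0 + (1.0 - 1.0 / max(abs(t), 1.0)) * (beta_hat - beta0)
    return {"unrestricted": beta_hat, "restricted": beta0,
            "pre-test": pre_test, "shrinkage": shrink}
```

A small Monte Carlo with the prior value equal to the true slope reproduces the abstract's conclusion: the shrinkage estimator has lower mean square error than the unrestricted one when the non-sample information is accurate.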
Comparison of estimators of means based on p-samples from multivariate Student-t population
Different strategies for the estimation of the mean for samples from p multivariate Student-t populations in the presence of uncertain prior information on the value of the mean, expressed in the form of a null hypothesis, are investigated. Based on the likelihood function and the uncertain prior information, four different estimators, namely the unrestricted, restricted, pre-test and shrinkage estimators of the location parameter for a location-scale model, are defined. The expressions for the bias, mean square error and risk under a quadratic loss function are obtained for each of the estimators. The performances of the estimators are compared with respect to mean square error, relative efficiency and quadratic risk under the null as well as the alternative hypotheses. Conclusions regarding the relative performance, dominance picture and inadmissibility of the estimators are also provided.
Robustified version of Stein's multivariate location estimation
We study a subclass of the Stein class of estimators for the multivariate normal mean that is minimax with respect to quadratic loss. Besides being minimax for the normal distribution, these estimators uniformly dominate the naive estimator X also for a distribution that is normal inside an interval and exponential in the tails.
Keywords: minimax estimator, Bayes estimator, superharmonic function, quadratic loss
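The dominance over the naive estimator X can be illustrated by Monte Carlo with the classical positive-part James-Stein member of the Stein class; normal errors are used here for simplicity, whereas the paper's point is that the dominance extends to the normal-center/exponential-tails family:

```python
import random

def james_stein(x):
    """Positive-part James-Stein estimator shrinking the naive
    estimate X toward the origin; dominates X under quadratic loss
    for dimension p >= 3."""
    p = len(x)
    s = sum(v * v for v in x)
    c = max(1.0 - (p - 2) / s, 0.0)
    return [c * v for v in x]
```

Simulating quadratic risk at a true mean near the origin shows the gap most clearly: the naive estimator's risk is p, while the James-Stein risk is far smaller.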