    Monotonicity preserving approximation of multivariate scattered data

    This paper describes a new method for monotone interpolation and smoothing of multivariate scattered data. It is based on the assumption that the function to be approximated is Lipschitz continuous. The method provides the optimal approximation in the worst-case scenario, together with tight error bounds. Smoothing of noisy data subject to monotonicity constraints is converted into a quadratic programming problem. Estimation of the unknown Lipschitz constant from the data by sample splitting and cross-validation is described, and an extension of the method to locally Lipschitz functions is presented.
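
    A minimal sketch of the core idea, in Python with cvxpy: on a one-dimensional grid (the paper handles multivariate scattered data), monotone smoothing under an assumed Lipschitz constant M becomes a quadratic program. The data, the value of M, and all variable names below are illustrative, not taken from the paper.

        import numpy as np
        import cvxpy as cp

        rng = np.random.default_rng(0)
        x = np.sort(rng.uniform(0, 1, 30))              # sample points
        y = x**2 + 0.1 * rng.normal(size=30)            # noisy monotone signal
        M = 3.0   # assumed Lipschitz constant; the paper estimates it by
                  # sample splitting and cross-validation

        f = cp.Variable(30)                  # fitted values at the sample points
        steps = cp.diff(f)                   # f[i+1] - f[i]
        gaps = np.diff(x)
        constraints = [steps >= 0,           # monotonicity
                       steps <= M * gaps]    # Lipschitz bound
        cp.Problem(cp.Minimize(cp.sum_squares(f - y)), constraints).solve()
        print(f.value[:5])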

    Functional Data Analysis in Electronic Commerce Research

    This paper describes opportunities and challenges of using functional data analysis (FDA) for the exploration and analysis of data originating from electronic commerce (eCommerce). We discuss the special data structures that arise in the online environment and why FDA is a natural approach for representing and analyzing such data. The paper reviews several FDA methods and motivates their usefulness in eCommerce research by providing a glimpse into new domain insights that they allow. We argue that the wedding of eCommerce with FDA leads to innovations both in statistical methodology, due to the challenges and complications that arise in eCommerce data, and in online research, by being able to ask (and subsequently answer) new research questions that classical statistical methods are not able to address, and also by expanding on research questions beyond the ones traditionally asked in the offline environment. We describe several applications originating from online transactions which are new to the statistics literature, and point out statistical challenges accompanied by some solutions. We also discuss some promising future directions for joint research efforts between researchers in eCommerce and statistics.
    Comment: Published at http://dx.doi.org/10.1214/088342306000000132 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Smoothing sparse and unevenly sampled curves using semiparametric mixed models: An application to online auctions

    Functional data analysis can be challenging when the functional objects are sampled only very sparsely and unevenly. Most approaches rely on smoothing to recover the underlying functional object from the data, which can be difficult when the data are irregularly distributed. In this paper we present a new approach that can overcome this challenge. The approach is based on the ideas of mixed models. Specifically, we propose a semiparametric mixed model with boosting to recover the functional object. While the model can handle sparse and unevenly distributed data, it also results in conceptually more meaningful functional objects. In particular, we motivate our method within the framework of eBay's online auctions. Online auctions produce monotonically increasing price curves that are often correlated across auctions. The semiparametric mixed model accounts for this correlation in a parsimonious way. It also estimates the underlying increasing trend from the data without imposing model constraints. Our application shows that the resulting functional objects are conceptually more appealing. Moreover, when used to forecast the outcome of an online auction, our approach also results in more accurate price predictions than standard approaches. We illustrate our model on a set of 183 closed auctions for Palm M515 personal digital assistants.
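
    The mixed-model view of penalized spline smoothing that the paper builds on can be sketched in a few lines of Python with numpy. The boosting step and the monotonicity of price curves are not shown; the bid times, log-prices, knot positions, and the smoothing ratio lam are all invented for illustration.

        import numpy as np

        rng = np.random.default_rng(1)
        t = np.sort(rng.uniform(0, 1, 15))          # sparse, uneven bid times
        y = 2 + 3 * t + 0.3 * rng.normal(size=15)   # noisy log-prices

        knots = np.linspace(0.1, 0.9, 8)
        X = np.column_stack([np.ones_like(t), t])           # fixed effects
        Z = np.maximum(t[:, None] - knots[None, :], 0.0)    # spline terms as random effects
        C = np.hstack([X, Z])
        lam = 1.0   # variance ratio sigma_eps^2 / sigma_u^2; REML would estimate it
        P = np.diag([0.0, 0.0] + [1.0] * len(knots))        # penalize only the random part
        coef = np.linalg.solve(C.T @ C + lam * P, C.T @ y)  # ridge / BLUP solution

        grid = np.linspace(0, 1, 5)
        Cg = np.hstack([np.column_stack([np.ones_like(grid), grid]),
                        np.maximum(grid[:, None] - knots[None, :], 0.0)])
        print(Cg @ coef)    # smoothed curve evaluated on a coarse grid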

    L1 Control Theoretic Smoothing Splines

    In this paper, we propose control theoretic smoothing splines with L1 optimality for reducing the number of parameters that describe the fitted curve as well as for removing outlier data. A control theoretic spline is a smoothing spline that is generated as an output of a given linear dynamical system. Conventional design requires exactly as many basis functions as there are data points, and the result is not robust against outliers. To solve these problems, we propose to use L1 optimality; that is, we use the L1 norm for the regularization term and/or the empirical risk term. The resulting problem is a convex optimization, which can be solved efficiently with numerical optimization software. A numerical example shows the effectiveness of the proposed method.
    Comment: Accepted for publication in IEEE Signal Processing Letters. 4 pages (two-column), 5 figures.
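
    A close relative of the proposal that fits in a few lines is L1 trend filtering: the L1 norm on the empirical risk makes the fit robust to outliers, and the L1 norm on second differences keeps the number of active parameters (kinks) small. This Python/cvxpy sketch is a stand-in for those two ideas, not the paper's control theoretic formulation; the data and lam are invented.

        import numpy as np
        import cvxpy as cp

        rng = np.random.default_rng(2)
        n = 100
        t = np.linspace(0, 1, n)
        y = np.sin(3 * t) + 0.1 * rng.normal(size=n)
        y[::17] += 2.0                      # inject a few outliers

        f = cp.Variable(n)
        lam = 1.0
        obj = cp.norm1(y - f) + lam * cp.norm1(cp.diff(f, 2))  # L1 risk + L1 regularizer
        cp.Problem(cp.Minimize(obj)).solve()
        print(f.value[:5])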

    Shape preserving approximation using least squares splines

    Least squares polynomial splines are an effective tool for data fitting, but they may fail to preserve essential properties of the underlying function, such as monotonicity or convexity. The shape restrictions are translated into linear inequality conditions on the spline coefficients. The basis functions are selected in such a way that these conditions take a simple form, and the problem becomes a non-negative least squares problem, for which effective and robust methods of solution exist. Multidimensional monotone approximation is achieved by using tensor-product splines with the appropriate restrictions. Additional interpolation conditions can also be introduced. Conversion formulas to the traditional B-spline representation are provided.
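
    The reparametrization idea can be sketched with a piecewise-linear basis standing in for the paper's B-splines: non-negative slopes on ramp functions give a nondecreasing fit, and the free intercept is encoded as a +/- column pair so that scipy's nnls (which forces every coefficient to be non-negative) still applies. The data and knot choices below are illustrative.

        import numpy as np
        from scipy.optimize import nnls

        rng = np.random.default_rng(3)
        x = np.sort(rng.uniform(0, 1, 40))
        y = x**2 + 0.05 * rng.normal(size=40)       # noisy increasing function

        knots = np.linspace(0.0, 0.9, 10)
        ramps = np.maximum(x[:, None] - knots[None, :], 0.0)  # nonneg slopes => monotone fit
        ones = np.ones((len(x), 1))
        A = np.hstack([ones, -ones, ramps])   # +/- pair encodes a free intercept
        coef, _ = nnls(A, y)
        print(A @ coef - y)                   # residuals of the monotone fit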

    Conditional Transformation Models

    The ultimate goal of regression analysis is to obtain information about the conditional distribution of a response given a set of explanatory variables. This goal is, however, seldom achieved because most established regression models estimate only the conditional mean as a function of the explanatory variables and assume that higher moments are not affected by the regressors. The underlying reason for such a restriction is the assumption of additivity of signal and noise. We propose to relax this common assumption in the framework of transformation models. The novel class of semiparametric regression models proposed herein allows transformation functions to depend on explanatory variables. These transformation functions are estimated by regularised optimisation of scoring rules for probabilistic forecasts, e.g. the continuous ranked probability score. The corresponding estimated conditional distribution functions are consistent. Conditional transformation models are potentially useful for describing possible heteroscedasticity, comparing spatially varying distributions, identifying extreme events, deriving prediction intervals and selecting variables beyond mean regression effects. An empirical investigation based on a heteroscedastic varying coefficient simulation model demonstrates that semiparametric estimation of conditional distribution functions can be more beneficial than kernel-based non-parametric approaches or parametric generalised additive models for location, scale and shape.
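
    For a forecast represented by samples, the continuous ranked probability score that drives the estimation can be computed from the standard identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'|. A minimal Python sketch of the score only, not the paper's estimator; the forecast samples are invented.

        import numpy as np

        def crps_sample(samples, y):
            """Empirical CRPS: E|X - y| - 0.5 * E|X - X'| over forecast samples X."""
            s = np.asarray(samples, dtype=float)
            term1 = np.mean(np.abs(s - y))
            term2 = 0.5 * np.mean(np.abs(s[:, None] - s[None, :]))
            return term1 - term2

        rng = np.random.default_rng(4)
        print(crps_sample(rng.normal(0.0, 1.0, 500), 0.3))  # lower = sharper, better calibrated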

    Monotone approximation of aggregation operators using least squares splines

    The need for monotone approximation of scattered data often arises in many problems of regression when monotonicity is semantically important. One such domain is fuzzy set theory, where membership functions and aggregation operators are order preserving. Least squares polynomial splines provide great flexibility when modeling non-linear functions, but may fail to be monotone. Linear restrictions on the spline coefficients provide necessary and sufficient conditions for spline monotonicity. The spline basis is selected in such a way that these restrictions take an especially simple form. The resulting non-negative least squares problem can be solved by a variety of standard, proven techniques. Additional interpolation requirements can also be imposed in the same framework. The method is applied to fuzzy systems, where membership functions and aggregation operators are constructed from empirical data.
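
    As a simpler stand-in for the spline machinery, isotonic regression already illustrates what order-preserving estimation of a membership function looks like; scikit-learn's IsotonicRegression can additionally clip the fit to [0, 1], as membership grades require. The noisy sigmoid data below are invented.

        import numpy as np
        from sklearn.isotonic import IsotonicRegression

        rng = np.random.default_rng(5)
        x = np.sort(rng.uniform(0, 10, 50))
        mu = 1 / (1 + np.exp(-(x - 5))) + 0.08 * rng.normal(size=50)  # noisy membership grades

        iso = IsotonicRegression(y_min=0.0, y_max=1.0, increasing=True)
        print(iso.fit_transform(x, mu)[:5])   # monotone fit clipped to [0, 1]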