
    Better estimates from binned income data: Interpolated CDFs and mean-matching

    Researchers often estimate income statistics from summaries that report the number of incomes in bins such as $0-10,000, $10,001-20,000, ..., $200,000+. Some analysts assign incomes to bin midpoints, but this treats income as discrete. Other analysts fit a continuous parametric distribution, but the distribution may not fit well. We fit nonparametric continuous distributions that reproduce the bin counts perfectly by interpolating the cumulative distribution function (CDF). We also show how both midpoints and interpolated CDFs can be constrained to reproduce the mean of income when it is known. We compare the methods' accuracy in estimating the Gini coefficients of all 3,221 US counties. Fitting parametric distributions is very slow. Fitting interpolated CDFs is much faster and slightly more accurate. Both interpolated CDFs and midpoints give dramatically better estimates if constrained to match a known mean. We have implemented interpolated CDFs in the binsmooth package for R. We have implemented the midpoint method in the rpme command for Stata. Both implementations can be constrained to match a known mean.
    Comment: 20 pages (including Appendix), 3 tables, 2 figures (+2 in Appendix)
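
    The interpolated-CDF idea lends itself to a compact illustration. Below is a minimal Python sketch, not the binsmooth implementation itself: it builds a piecewise-linear CDF that reproduces hypothetical bin counts exactly, draws pseudo-incomes by inverse-CDF sampling, and computes a Gini coefficient from the sample. The bin edges, counts, and the cap on the open-ended top bin are all assumptions for illustration; mean-matching is omitted.

        # Minimal sketch of Gini estimation from binned incomes via an
        # interpolated CDF (illustrative data; binsmooth adds mean-matching
        # and smoother interpolants).
        import numpy as np

        # Hypothetical bin edges (USD) and counts; the open-ended top bin
        # ($200,000+) needs an assumed upper cap to close the distribution.
        edges  = np.array([0, 10_000, 20_000, 50_000, 100_000, 200_000, 500_000])
        counts = np.array([120, 310, 520, 380, 140, 30])

        # A piecewise-linear CDF through the bin edges reproduces every bin
        # count exactly.
        cdf_at_edges = np.insert(np.cumsum(counts), 0, 0) / counts.sum()

        # Inverse-CDF sampling: invert the piecewise-linear CDF with interp.
        u = np.random.default_rng(0).uniform(size=100_000)
        incomes = np.sort(np.interp(u, cdf_at_edges, edges))

        # Standard sample formula: Gini = sum((2i - n - 1) x_i) / (n * sum(x)).
        n = incomes.size
        gini = (2 * np.arange(1, n + 1) - n - 1) @ incomes / (n * incomes.sum())
        print(f"Estimated Gini: {gini:.3f}")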

    Short and long-term wind turbine power output prediction

    In the wind energy industry, it is of great importance to develop models that accurately forecast the power output of a wind turbine, as such predictions are used for wind farm location assessment, power pricing and bidding, monitoring, and preventive maintenance. As a first step, and following the guidelines of the existing literature, we use supervisory control and data acquisition (SCADA) data to model the wind turbine power curve (WTPC). We explore various parametric and non-parametric approaches to modeling the WTPC, such as parametric logistic functions, and non-parametric piecewise linear, polynomial, or cubic spline interpolation functions. We demonstrate that all the aforementioned classes of models are rich enough (with respect to their relative complexity) to accurately model the WTPC, as their mean squared error (MSE) is close to the MSE lower bound calculated from the historical data. We further enhance the accuracy of our proposed model by incorporating additional environmental factors that affect the power output, such as the ambient temperature and the wind direction. However, when it comes to forecasting, all the aforementioned models share an intrinsic limitation: they are unable to capture the inherent auto-correlation of the data. To overcome this limitation, we show that adding a properly scaled ARMA modeling layer increases short-term prediction performance, while keeping the long-term prediction capability of the model.
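
    As a rough illustration of the two-layer structure described above, here is a Python sketch under synthetic data, not the authors' exact pipeline: a four-parameter logistic WTPC is fitted by nonlinear least squares, and an ARMA model is then fitted to the residuals to capture their auto-correlation. The data, parameter values, and ARMA order are all assumptions.

        # Sketch: logistic power curve + ARMA residual layer (synthetic data;
        # the model orders and parameters are illustrative assumptions).
        import numpy as np
        from scipy.optimize import curve_fit
        from statsmodels.tsa.arima.model import ARIMA

        def logistic_wtpc(v, p_max, k, v_mid, p_min):
            """Four-parameter logistic power curve: power vs. wind speed v."""
            return p_min + (p_max - p_min) / (1.0 + np.exp(-k * (v - v_mid)))

        # Synthetic stand-in for SCADA data: wind speed (m/s), power (kW).
        rng = np.random.default_rng(1)
        v = rng.uniform(3.0, 20.0, 500)
        p = logistic_wtpc(v, 2000.0, 0.8, 9.0, 0.0) + rng.normal(0.0, 50.0, v.size)

        # Layer 1: fit the static power curve by nonlinear least squares.
        params, _ = curve_fit(logistic_wtpc, v, p,
                              p0=[p.max(), 1.0, np.median(v), 0.0])

        # Layer 2: model the auto-correlation left in the residual series
        # with an ARMA(2,1) process (the order is an assumption).
        residuals = p - logistic_wtpc(v, *params)
        arma = ARIMA(residuals, order=(2, 0, 1)).fit()
        print(params, arma.forecast(steps=5))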

    B-spline techniques for volatility modeling

    This paper is devoted to the application of B-splines to volatility modeling, specifically the calibration of the leverage function in stochastic local volatility models and the parameterization of an arbitrage-free implied volatility surface calibrated to sparse option data. We use an extension of classical B-splines obtained by including basis functions with infinite support. We first revisit the application of shape-constrained B-splines to the estimation of conditional expectations, not merely from a scatter plot but also from the given marginal distributions. An application is the Monte Carlo calibration of stochastic local volatility models by Markov projection. We then present a new technique for the calibration of an implied volatility surface to sparse option data. We use a B-spline parameterization of the Radon-Nikodym derivative of the underlying's risk-neutral probability density with respect to a roughly calibrated base model. We show that this method provides smooth arbitrage-free implied volatility surfaces. Finally, we sketch a Galerkin method with B-spline finite elements for solving the partial differential equation satisfied by the Radon-Nikodym derivative.
    Comment: 25 pages
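
    For intuition about the basic building block, a minimal Python sketch follows: an unconstrained least-squares B-spline fit of a conditional expectation from scattered data, using a clamped (finite-support) cubic basis. The shape constraints and infinite-support basis functions of the paper are not reproduced here; the data and knot choices are assumptions.

        # Sketch: least-squares cubic B-spline fit of E[Y | X = x] from a
        # scatter plot (clamped finite-support basis; illustrative data).
        import numpy as np
        from scipy.interpolate import make_lsq_spline

        rng = np.random.default_rng(2)
        x = np.sort(rng.uniform(0.0, 1.0, 400))
        y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, x.size)

        k = 3                                   # cubic B-splines
        interior = np.linspace(0.0, 1.0, 8)[1:-1]
        t = np.r_[[0.0] * (k + 1), interior, [1.0] * (k + 1)]  # clamped knots

        # The spline coefficients are obtained by linear least squares on
        # the B-spline design matrix.
        spline = make_lsq_spline(x, y, t, k)
        print(spline(np.array([0.25, 0.50, 0.75])))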

    Sparse implicitization by interpolation: Characterizing non-exactness and an application to computing discriminants

    We revisit implicitization by interpolation in order to examine its properties in the context of sparse elimination theory. Based on the computation of a superset of the implicit support, implicitization is reduced to computing the nullspace of a numeric matrix. The approach is applicable to polynomial and rational parameterizations of curves and (hyper)surfaces of any dimension, including the case of parameterizations with base points. Our support prediction is based on sparse (or toric) resultant theory, in order to exploit the sparsity of the input and the output. Our method may yield a multiple of the implicit equation: we characterize and quantify this situation by relating the nullspace dimension to the predicted support and its geometry. In this case, we obtain more than one multiple of the implicit equation; the implicit equation itself can then be recovered via multivariate polynomial gcd (or factoring). All of the above techniques extend to the case of approximate computation, thus yielding a method of sparse approximate implicitization, which is important in tackling larger problems. We discuss our publicly available Maple implementation through several examples, including the benchmark bicubic surface. As a novel application, we focus on computing the discriminant of a multivariate polynomial, which characterizes the existence of multiple roots and generalizes the resultant of a polynomial system. This yields an efficient, output-sensitive algorithm for computing the discriminant polynomial.
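
    The core linear-algebra step, interpolation through a predicted support followed by a nullspace computation, can be illustrated in a few lines of Python. The sketch below implicitizes the circle x = cos t, y = sin t using a naive total-degree support rather than the sparse-resultant support prediction of the paper; the sampling and tolerance are assumptions.

        # Sketch: implicitization by interpolation for x = cos t, y = sin t.
        # Evaluate candidate monomials at samples and read the implicit
        # polynomial off the nullspace of the resulting matrix.
        import numpy as np

        support = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]  # x^i y^j

        t = np.linspace(0.1, 3.0, 20)            # sample parameter values
        x, y = np.cos(t), np.sin(t)
        M = np.column_stack([x**i * y**j for (i, j) in support])

        # A (near-)zero singular value signals a polynomial vanishing on the
        # curve; its right singular vector holds the implicit coefficients.
        # A nullspace of dimension > 1 would indicate multiples of the
        # implicit equation, as discussed in the abstract.
        _, s, Vt = np.linalg.svd(M)
        coeffs = Vt[-1] / Vt[-1, np.argmax(np.abs(Vt[-1]))]
        for c, (i, j) in zip(coeffs, support):
            if abs(c) > 1e-8:
                print(f"{c:+.3f} * x^{i} * y^{j}")  # x^2 + y^2 - 1 up to scale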

    Probing spatial homogeneity with LTB models: a detailed discussion

    Do current observational data confirm the assumptions of the cosmological principle, or is there statistical evidence for deviations from spatial homogeneity on large scales? To address these questions, we developed a flexible framework based on spherically symmetric, but radially inhomogeneous, Lemaitre-Tolman-Bondi (LTB) models with a synchronous Big Bang. We expanded the (local) matter density profile in terms of flexible interpolation schemes and orthonormal polynomials. A Monte Carlo technique in combination with recent observational data was used to systematically vary the shape of these profiles. In the first part of this article, we reconsider giant LTB voids without dark energy to investigate whether extremely fine-tuned mass profiles can reconcile these models with current data. While the local Hubble rate and supernovae can easily be fitted without dark energy, model-independent constraints from the Planck 2013 data require an unrealistically low local Hubble rate, which is strongly inconsistent with the observed value; this result agrees well with previous studies. In the second part, we explain why it seems natural to extend our framework by a non-zero cosmological constant, which then allows us to perform general tests of the cosmological principle. Moreover, these extended models facilitate exploring whether fluctuations in the local matter density profile might alleviate the tension between local and global measurements of the Hubble rate, as derived from Cepheid-calibrated type Ia supernovae and CMB experiments, respectively. We show that current data provide no evidence for deviations from spatial homogeneity on large scales. More accurate constraints are required, however, to ultimately confirm the validity of the cosmological principle.
    Comment: 18 pages, 12 figures, 2 tables; accepted for publication in A&A
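
    The profile parameterization at the heart of the Monte Carlo step can be illustrated compactly. The Python toy below draws one random radial density-contrast profile as a truncated Legendre expansion; the radius, the number of coefficients, and the prior width are assumptions, and the actual framework embeds such profiles in an LTB metric before scoring them against data.

        # Toy sketch of the profile parameterization only: a radial density
        # contrast expanded in orthonormal (Legendre) polynomials, with one
        # random draw of the coefficients as a Monte Carlo proposal.
        import numpy as np
        from numpy.polynomial import legendre

        rng = np.random.default_rng(3)
        r_max = 3000.0                      # assumed profile radius in Mpc
        r = np.linspace(0.0, r_max, 200)
        u = 2.0 * r / r_max - 1.0           # map [0, r_max] onto [-1, 1]

        n_coeff = 6                         # truncation order (assumption)
        coeffs = rng.normal(0.0, 0.3, n_coeff)
        delta = legendre.legval(u, coeffs)  # density contrast delta(r)
        delta -= delta[-1]                  # assume background value at r_max
        print(delta[:5])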

    GRID2D/3D: A computer program for generating grid systems in complex-shaped two- and three-dimensional spatial domains. Part 1: Theory and method

    An efficient computer program, called GRID2D/3D, was developed to generate single and composite grid systems within geometrically complex two- and three-dimensional (2- and 3-D) spatial domains that can deform with time. GRID2D/3D generates single grid systems by using algebraic grid generation methods based on transfinite interpolation, in which the distribution of grid points within the spatial domain is controlled by stretching functions. All single grid systems generated by GRID2D/3D can have grid lines that are continuous and differentiable everywhere up to second order. Also, grid lines can intersect boundaries of the spatial domain orthogonally. GRID2D/3D generates composite grid systems by patching together two or more single grid systems. The patching can be discontinuous or continuous. For continuous composite grid systems, the grid lines are continuous and differentiable everywhere up to second order, except at interfaces where different single grid systems meet; there the grid lines are only differentiable up to first order. For 2-D spatial domains, the boundary curves are described by using either cubic or tension spline interpolation. For 3-D spatial domains, the boundary surfaces are described by using linear Coons interpolation, bi-hyperbolic spline interpolation, or a new technique referred to as 3-D bi-directional Hermite interpolation. Since grid systems generated by algebraic methods can have grid lines that overlap one another, GRID2D/3D contains a graphics package for evaluating the grid systems generated. With the graphics package, the user can generate grid systems in an interactive manner with the grid generation part of GRID2D/3D. GRID2D/3D is written in FORTRAN 77 and can be run on any IBM PC, XT, or AT compatible computer. In order to use GRID2D/3D on workstations or mainframe computers, some minor modifications must be made in the graphics part of the program; no modifications are needed in the grid generation part. This technical memorandum describes the theory and method used in GRID2D/3D.
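
    The transfinite-interpolation step at the core of the algebraic method can be sketched compactly. The Python snippet below (GRID2D/3D itself is FORTRAN 77) blends four boundary curves into a 2-D structured grid with the standard Coons formula; the boundary curves and grid dimensions are illustrative, and stretching functions and orthogonality control are omitted.

        # Sketch: 2-D transfinite (Coons) interpolation of a structured grid
        # from four boundary curves (illustrative; GRID2D/3D adds stretching
        # functions, boundary orthogonality, and 3-D surfaces).
        import numpy as np

        def curve(fx, fy):
            """Wrap coordinate functions into a curve returning (x, y) points."""
            def f(s):
                s = np.asarray(s, dtype=float)
                return np.stack([fx(s), fy(s)], axis=-1)
            return f

        bottom = curve(lambda s: s, lambda s: np.zeros_like(s))
        top    = curve(lambda s: s, lambda s: 1.0 + 0.2 * np.sin(np.pi * s))
        left   = curve(lambda t: np.zeros_like(t), lambda t: t)
        right  = curve(lambda t: np.ones_like(t), lambda t: t)

        def transfinite_grid(bottom, top, left, right, n, m):
            """Blend the four boundaries into an (n, m, 2) array of points."""
            s = np.linspace(0.0, 1.0, n).reshape(n, 1)
            t = np.linspace(0.0, 1.0, m).reshape(1, m)
            blend = ((1 - t)[..., None] * bottom(s) + t[..., None] * top(s)
                     + (1 - s)[..., None] * left(t) + s[..., None] * right(t))
            corners = (((1 - s) * (1 - t))[..., None] * bottom(0.0)
                       + (s * (1 - t))[..., None] * bottom(1.0)
                       + ((1 - s) * t)[..., None] * top(0.0)
                       + (s * t)[..., None] * top(1.0))
            return blend - corners

        grid = transfinite_grid(bottom, top, left, right, 21, 11)
        print(grid.shape, grid[10, 5])       # an interior grid point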