
    Use of the geometric mean as a statistic for the scale of the coupled Gaussian distributions

    The geometric mean is shown to be an appropriate statistic for the scale of a heavy-tailed coupled Gaussian distribution, or equivalently the Student's t distribution. The coupled Gaussian is a member of a family of distributions parameterized by the nonlinear statistical coupling, which is the reciprocal of the degrees of freedom and is proportional to fluctuations in the inverse scale of the Gaussian. Existing estimators of the scale of the coupled Gaussian have relied on estimates of the full distribution and suffer from problems related to outliers in heavy-tailed distributions. In this paper, the scale of a coupled Gaussian is proven to be equal to the product of the generalized mean and the square root of the coupling. Numerical computations of the scale using the generalized mean of random samples indicate that only samples from a Cauchy distribution (coupling parameter one) form an unbiased estimate with diminishing variance for large samples. Nevertheless, we also prove that the scale is a function of the geometric mean, the coupling term, and a harmonic number. Numerical experiments show that this estimator is unbiased with diminishing variance for large samples over a broad range of coupling values. Comment: 17 pages, 5 figures
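
    A minimal sketch of the Cauchy special case (coupling parameter one), where it is a standard fact that E[ln|X|] = ln(scale), so the geometric mean of the absolute samples estimates the scale directly; the sample size and seed below are illustrative:

        import numpy as np

        rng = np.random.default_rng(0)
        gamma = 2.0                                  # true Cauchy scale
        x = gamma * rng.standard_cauchy(100_000)

        # For Cauchy(0, gamma), E[ln|X|] = ln(gamma), so the geometric mean
        # of |x| is a consistent, outlier-tolerant estimate of the scale.
        scale_hat = np.exp(np.mean(np.log(np.abs(x))))
        print(scale_hat)                             # close to 2.0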

    Deformed SPDE models with an application to spatial modeling of significant wave height

    A non-stationary Gaussian random field model is developed based on a combination of the stochastic partial differential equation (SPDE) approach and the classical deformation method. With the deformation method, a stationary field is defined on a domain which is deformed so that the field becomes non-stationary. We show that if the stationary field is a Matérn field defined as a solution to a fractional SPDE, the resulting non-stationary model can be represented as the solution to another fractional SPDE on the deformed domain. Defining the model in this way combines the computational advantages of the SPDE approach with the deformation method's more intuitive parameterisation of non-stationarity. In particular, it allows for independent control over the non-stationary practical correlation range and the variance, which has not been possible with previously proposed non-stationary SPDE models. The model is tested on spatial data of significant wave height, a characteristic of ocean surface conditions which is important when estimating the wear and risks associated with a planned ship journey. The model parameters are estimated from data for the north Atlantic using a maximum likelihood approach. The fitted model is used to compute wave height exceedance probabilities and the distribution of accumulated fatigue damage for ships traveling a popular shipping route. The model results agree well with the data, indicating that the model could be used for route optimization in naval logistics. Comment: 22 pages, 12 figures
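
    A hedged sketch of the classical deformation idea the abstract builds on: a stationary Matérn covariance evaluated on deformed coordinates yields a non-stationary covariance on the original domain. The deformation map, smoothness, and range below are hypothetical choices, not the paper's fitted model:

        import numpy as np

        def matern32(d, rho, sigma2=1.0):
            # Stationary Matérn covariance with smoothness nu = 3/2.
            a = np.sqrt(3.0) * d / rho
            return sigma2 * (1.0 + a) * np.exp(-a)

        def deformed_cov(s1, s2, deform, rho=1.0):
            # Non-stationary covariance: the stationary model applied to
            # deformed coordinates.
            d = np.linalg.norm(deform(s1) - deform(s2))
            return matern32(d, rho)

        # Hypothetical deformation stretching the first axis, which shortens
        # the practical correlation range in that direction.
        deform = lambda s: np.array([2.0 * s[0], s[1]])
        print(deformed_cov(np.array([0.0, 0.0]), np.array([1.0, 0.0]), deform))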

    Noise Variance Estimation In Signal Processing

    We present a new method of estimating noise variance. The method is applicable to 1D and 2D signal processing. Its essence is the estimation of the scatter of normally distributed data with a high level of outliers; it applies to data in which the majority of the data points contain no signal. The method is based on the shortest half sample. The mean of the shortest half sample (the shorth) and the location of the least median of squares are among the most robust measures of the location of the mode. The length of the shortest half sample has been used as a measure of the scatter of uncontaminated data. We show that computing the lengths of several sub-samples of varying sizes provides the information needed to estimate both the scatter and the number of uncontaminated data points in a sample. We derive the system of equations to solve for the data scatter and the number of uncontaminated data points for the Gaussian distribution; the data scatter serves as the measure of the noise variance. The method can be extended to other distributions.
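
    A sketch of the shortest-half-sample (shorth) computation the method builds on; for Gaussian data the shortest half has length of about 1.349 sigma (twice the 0.75 quantile of the standard normal), which gives a robust scatter estimate even under heavy contamination. The contamination fraction below is illustrative:

        import numpy as np

        def shorth(x):
            # Length and midpoint of the shortest contiguous half of the
            # sorted sample.
            x = np.sort(np.asarray(x))
            h = len(x) // 2 + 1                       # size of a half sample
            lengths = x[h - 1:] - x[:len(x) - h + 1]
            i = int(np.argmin(lengths))
            return lengths[i], 0.5 * (x[i] + x[i + h - 1])

        rng = np.random.default_rng(1)
        data = np.concatenate([rng.normal(0.0, 1.0, 800),
                               rng.uniform(-20.0, 20.0, 200)])  # 20% outliers
        length, mode = shorth(data)
        print(length / 1.349)                         # robust sigma estimate, near 1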

    Nearshore wave forecasting and hindcasting by dynamical and statistical downscaling

    A high-resolution nested WAM/SWAN wave model suite, aimed at rapidly establishing nearshore wave forecasts as well as a climatology and return values of local wave conditions with Rapid Environmental Assessment (REA) in mind, is described. The system is targeted at regions where local wave growth and partial exposure to complex open-ocean wave conditions make diagnostic wave modelling difficult. SWAN is set up at 500 m resolution and is nested in a 10 km version of WAM. A model integration of more than one year is carried out to map the spatial distribution of the wave field. The model correlates well with wave buoy observations (0.96) but somewhat overestimates the wave height (18%, bias 0.29 m). Estimating wave height return values requires a much longer time series, and running SWAN for such a period is unrealistic in an REA setting. Instead, we establish a direction-dependent transfer function between an existing coarse open-ocean hindcast dataset and the high-resolution nested SWAN model. Return values are estimated using ensemble estimates from two different extreme-value distributions fitted to the full 52 years of statistically downscaled hindcast data. We find good agreement between the downscaled wave height and wave buoy observations. The cost of generating the statistically downscaled hindcast time series is negligible, and the procedure can be redone for arbitrary locations within the SWAN domain, although the sectors must be carefully chosen for each new location. The method is found to be well suited to rapidly providing detailed wave forecasts as well as hindcasts and return value estimates for partly sheltered coastal regions. Comment: 20 pages, 7 figures and 2 tables, MREA07 special issue on Marine rapid environmental assessment
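
    A hedged sketch of a direction-dependent transfer function of the kind described: a per-sector regression from the coarse open-ocean wave height to the high-resolution model, fitted on the overlap period and then applied to the long hindcast. The function names, the linear form, and the sector handling are assumptions, not the paper's exact scheme:

        import numpy as np

        def fit_sector_transfer(dir_coarse, hs_coarse, hs_fine, sector_edges):
            # One linear map hs_fine ~ a + b * hs_coarse per direction sector.
            # sector_edges must cover [0, 360) and be chosen per location.
            coef = {}
            for lo, hi in zip(sector_edges[:-1], sector_edges[1:]):
                m = (dir_coarse >= lo) & (dir_coarse < hi)
                b, a = np.polyfit(hs_coarse[m], hs_fine[m], 1)
                coef[(lo, hi)] = (a, b)
            return coef

        def apply_transfer(dir_coarse, hs_coarse, coef):
            # Statistically downscale the long coarse hindcast sector by sector.
            hs = np.empty_like(hs_coarse)
            for (lo, hi), (a, b) in coef.items():
                m = (dir_coarse >= lo) & (dir_coarse < hi)
                hs[m] = a + b * hs_coarse[m]
            return hs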

    Characterization of the frequency of extreme events by the Generalized Pareto Distribution

    Based on recent results in extreme value theory, we use a new technique for the statistical estimation of distribution tails. Specifically, we use the Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for peak-over-threshold values in the form of the Generalized Pareto Distribution (GPD). While such methods are widely used in finance, insurance, and hydrology, here we investigate the earthquake energy distribution described by the Gutenberg-Richter seismic moment-frequency law and analyze shallow earthquakes (depth h < 70 km) in the Harvard catalog over the period 1977-2000 in 18 seismic zones. The GPD is found to approximate the tails of the seismic moment distributions quite well for moment magnitudes larger than mW = 5.3, and no statistically significant regional difference is found between subduction and transform seismic zones. We confirm, with very high statistical confidence, that the b-value in mid-ocean ridges differs markedly from that in other zones (b = 1.50 ± 0.09 versus b = 1.00 ± 0.05, corresponding to a power-law exponent close to 1 versus 2/3). We propose a physical mechanism for this, contrasting slow healing ruptures in mid-ocean ridges with fast healing ruptures in other zones. Deviations from the GPD at the very end of the tail are detected in the sample containing earthquakes from all major subduction zones (sample size of 4985 events). We propose a new statistical test of the significance of such deviations based on the bootstrap method. The number of events deviating from the GPD tails in the studied data sets (15-20 at most) is not sufficient for determining the functional form of those deviations. Thus, it is practically impossible to give preference to one of the previously suggested parametric families describing the ends of the tails of seismic moment distributions. Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one file giving the regionalization
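
    A minimal sketch of the peak-over-threshold fit underlying the analysis, using SciPy's generalized Pareto distribution; the synthetic heavy-tailed data and the 95% threshold are stand-ins for the seismic-moment exceedances:

        import numpy as np
        from scipy.stats import genpareto

        rng = np.random.default_rng(2)
        x = rng.pareto(1.5, 50_000)            # heavy-tailed stand-in for seismic moments
        u = np.quantile(x, 0.95)               # peak-over-threshold level
        exceed = x[x > u] - u

        # Fit the GPD to the exceedances, location fixed at zero as usual for POT.
        xi, loc, beta = genpareto.fit(exceed, floc=0)
        print(xi, beta)                        # shape xi > 0 indicates a power-law tail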

    Uncertainty Propagation and Feature Selection for Loss Estimation in Performance-based Earthquake Engineering

    This report presents a new methodology, called moment matching, for propagating the uncertainties in estimating the repair costs of a building due to future earthquake excitation, as required, for example, when assessing a design in performance-based earthquake engineering. Besides excitation uncertainties, other uncertain model variables are considered, including uncertainties in the structural model parameters and in the capacity and repair costs of structural and non-structural components. Using the first few moments of these uncertain variables, moment matching requires only a few well-chosen point estimates to propagate the uncertainties and estimate the first few moments of the repair costs with high accuracy. The use of moment matching to estimate the exceedance probability of the repair costs is also addressed. The approach is quite general; it can be applied to any decision variable in performance-based earthquake engineering. Two buildings are chosen as illustrative examples to demonstrate the use of moment matching: a hypothetical three-story shear building and a real seven-story hotel building. For these two examples, the assembly-based vulnerability approach is employed when calculating repair costs. It is shown that the moment-matching technique is much more accurate than the well-known First-Order Second-Moment approach when propagating the first two moments, while the resulting computational cost is of the same order. The repair-cost moments and exceedance probabilities estimated by the moment-matching technique are also compared with those from Monte Carlo simulation; as long as the order of the moment matching is sufficient, the agreement is satisfactory. Furthermore, the amount of computation for moment matching scales only linearly with the number of uncertain input variables. Last but not least, a procedure for feature selection is presented and illustrated for the second example. The conclusion is that the most important uncertain input variables, among the many influencing the uncertainty in future repair costs, are, in order of importance: ground-motion spectral acceleration, component capacity, ground-motion details, and unit repair costs.
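
    As a rough illustration of propagating moments through a model with a few well-chosen points, the sketch below uses a three-point rule per uncertain input (matching the first five moments of a Gaussian) with a dimension-wise correction, so the cost scales linearly with the number of inputs. This is a generic scheme under assumed Gaussian inputs and a hypothetical repair-cost model, not the report's exact method:

        import numpy as np

        # Three-point rule for a standard Gaussian: matches its first five moments.
        PTS = np.array([0.0, np.sqrt(3.0), -np.sqrt(3.0)])
        WTS = np.array([2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0])

        def moment_match(f, mu, sigma):
            # Approximate mean and variance of f(X), X ~ N(mu, diag(sigma^2)),
            # varying one input at a time (interaction terms are neglected).
            mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
            base = f(mu)
            mean, var = base, 0.0
            for i in range(len(mu)):
                pts = np.tile(mu, (3, 1))
                pts[:, i] = mu[i] + PTS * sigma[i]
                fx = np.array([f(p) for p in pts])
                m_i = WTS @ fx
                mean += m_i - base
                var += WTS @ (fx - m_i) ** 2
            return mean, var

        # Hypothetical repair-cost model with two uncertain inputs.
        cost = lambda z: 1e5 * np.exp(0.5 * z[0]) + 2e4 * z[1] ** 2
        print(moment_match(cost, mu=[0.0, 1.0], sigma=[0.3, 0.2]))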

    Convergence of large deviation estimators

    We study the convergence of statistical estimators used in the estimation of large deviation functions describing the fluctuations of equilibrium, nonequilibrium, and man-made stochastic systems. We give conditions for the convergence of these estimators with sample size, based on the boundedness or unboundedness of the quantity sampled, and discuss how statistical errors should be defined in different parts of the convergence region. Our results shed light on previous reports of 'phase transitions' in the statistics of free energy estimators and establish a general framework for reliably estimating large deviation functions from simulation and experimental data, as well as for identifying parameter regions where this estimation converges. Comment: 13 pages, 6 figures. v2: corrections focusing the paper on large deviations; v3: minor corrections, close to published version
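
    A small sketch of the kind of estimator whose convergence is at issue: the empirical scaled cumulant generating function (1/n) ln E[exp(k S_n)]. For unbounded observables and large k, the empirical mean is dominated by a few rare samples and the estimator degrades, which is the effect analyzed in the paper; the data and parameters below are illustrative:

        import numpy as np

        def scgf_estimator(samples, k):
            # Empirical (1/n) ln mean(exp(k * S_n)) over independent trajectories.
            n = samples.shape[1]
            S = samples.sum(axis=1)
            return np.log(np.mean(np.exp(k * S))) / n

        rng = np.random.default_rng(3)
        samples = rng.normal(0.0, 1.0, size=(10_000, 50))   # true SCGF is k**2 / 2
        for k in (0.2, 1.0, 2.0):
            # At large k a few samples dominate and the estimate is biased low.
            print(k, scgf_estimator(samples, k))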

    Ordinary kriging for on-demand average wind interpolation of in-situ wind sensor data

    We have developed a domain-agnostic ordinary kriging algorithm accessible via a standards-based service-oriented architecture for sensor networks, exploiting the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) standards. We need on-demand interpolation maps, so runtime performance is a major priority. Our sensor data come from in-situ wind observation stations in an area of approximately 200 km by 125 km. We provide on-demand average wind interpolation maps. These spatial estimates can then be compared with the results of other estimation models in order to identify spurious results that sometimes occur in wind estimation. Our processing is based on ordinary kriging with automated variogram model selection (AVMS). This procedure can smooth point-in-time wind measurements to obtain the average wind by using a variogram model that reflects the characteristics of the wind phenomenon. Kriging is enabled for wind direction estimation by a simple but effective solution to the problem of estimating periodic variables, based on vector rotation and stochastic simulation. In cases where all wind directions in the region of interest span no more than 180 degrees, we rotate them so they lie between 90 and 270 degrees and apply ordinary kriging with AVMS directly to the meteorological angle. Otherwise, we transform the meteorological angle to Cartesian space, apply ordinary kriging with AVMS, and use simulation to transform the kriging estimates back to the meteorological angle. Tests on a 50 by 50 grid using standard hardware take about 5 minutes to execute the backward transformation with a sample size of 100,000, which is acceptable for our on-demand processing service requirements.
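
    A hedged sketch of the angle-handling step described above; the helper name is hypothetical and the span test is the simple non-wrapping one:

        import numpy as np

        def prepare_directions(theta_deg):
            # If all directions fit in a 180-degree sector, rotate them into
            # (90, 270) so ordinary kriging can treat the angle as a
            # non-periodic variable; undo the shift after kriging.
            theta = np.asarray(theta_deg) % 360.0
            if theta.max() - theta.min() <= 180.0:
                shift = 180.0 - 0.5 * (theta.min() + theta.max())
                return theta + shift, shift
            # Otherwise krige Cartesian components separately and recover the
            # angle from the estimates by simulation, as in the text.
            rad = np.deg2rad(theta)
            return np.column_stack([np.sin(rad), np.cos(rad)]), None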