Use of the geometric mean as a statistic for the scale of the coupled Gaussian distributions
The geometric mean is shown to be an appropriate statistic for the scale of a
heavy-tailed coupled Gaussian distribution or equivalently the Student's t
distribution. The coupled Gaussian is a member of a family of distributions
parameterized by the nonlinear statistical coupling, which is the reciprocal of
the degree of freedom and is proportional to fluctuations in the inverse scale
of the Gaussian. Existing estimators of the scale of the coupled Gaussian have
relied on estimates of the full distribution, and they suffer from problems
related to outliers in heavy-tailed distributions. In this paper, the scale of
a coupled Gaussian is proven to be equal to the product of the generalized mean
and the square root of the coupling. Numerical computations of the scales of
coupled Gaussians using the generalized mean of random samples indicate that
only samples from a Cauchy distribution (coupling parameter equal to one) yield
an unbiased estimate with diminishing variance for large samples.
Nevertheless, we also prove that the scale is a function of the geometric mean,
the coupling term and a harmonic number. Numerical experiments show that this
estimator is unbiased with diminishing variance for large samples for a broad
range of coupling values.
Comment: 17 pages, 5 figures
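The Cauchy case (coupling equal to one) gives the cleanest illustration of why the geometric mean works: for X ~ Cauchy(0, gamma), E[log|X|] = log(gamma). The sketch below uses that identity only; the paper's general estimator also involves the coupling term and a harmonic number.

```python
import numpy as np

rng = np.random.default_rng(0)

def cauchy_scale_geometric(samples):
    """Geometric mean of |x| as a scale estimator.

    For X ~ Cauchy(0, gamma), E[log|X|] = log(gamma), so
    exp(mean(log|x|)) is a consistent estimator of gamma."""
    return np.exp(np.mean(np.log(np.abs(samples))))

gamma = 2.0
x = gamma * rng.standard_cauchy(100_000)
print(cauchy_scale_geometric(x))  # prints a value close to gamma = 2.0
```

Unlike the sample standard deviation, which diverges for Cauchy data, this estimator concentrates around the true scale as the sample grows.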
Deformed SPDE models with an application to spatial modeling of significant wave height
A non-stationary Gaussian random field model is developed based on a
combination of the stochastic partial differential equation (SPDE) approach and
the classical deformation method. With the deformation method, a stationary
field is defined on a domain which is deformed so that the field becomes
non-stationary. We show that if the stationary field is a Matérn field defined
as a solution to a fractional SPDE, the resulting non-stationary model can be
represented as the solution to another fractional SPDE on the deformed domain.
By defining the model in this way, the computational advantages of the SPDE
approach can be combined with the deformation method's more intuitive
parameterisation of non-stationarity. In particular it allows for independent
control over the non-stationary practical correlation range and the variance,
which has not been possible with previously proposed non-stationary SPDE
models.
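The deformation idea can be sketched directly at the covariance level: a stationary covariance evaluated between deformed coordinates yields a non-stationary field. The deformation f below is made up purely for illustration, and an exponential covariance (Matérn with nu = 1/2) stands in for the general fractional-SPDE Matérn field.

```python
import numpy as np

def deformation(s):
    """A made-up deformation f: R^2 -> R^2 that stretches the
    x-coordinate nonlinearly (illustration only)."""
    x, y = s[..., 0], s[..., 1]
    return np.stack([x + 0.5 * x ** 2, y], axis=-1)

def nonstationary_cov(s1, s2, sigma2=1.0, rho=1.0):
    """Stationary exponential (Matern, nu = 1/2) covariance evaluated
    between deformed coordinates, giving a non-stationary model."""
    d = np.linalg.norm(deformation(s1) - deformation(s2), axis=-1)
    return sigma2 * np.exp(-d / rho)

# Two pairs of points with the same Euclidean separation (0.2) get
# different covariances: the field is non-stationary.
a = nonstationary_cov(np.array([0.2, 0.0]), np.array([0.4, 0.0]))
b = nonstationary_cov(np.array([1.2, 0.0]), np.array([1.4, 0.0]))
print(a, b)
```

The SPDE representation described in the abstract avoids forming such dense covariances explicitly, which is where the computational advantage comes from; the snippet only shows what the deformation does to the correlation structure.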
The model is tested on spatial data of significant wave height, a
characteristic of ocean surface conditions which is important when estimating
the wear and risks associated with a planned journey of a ship. The model
parameters are estimated from data from the North Atlantic using a maximum
likelihood approach. The fitted model is used to compute wave height exceedance
probabilities and the distribution of accumulated fatigue damage for ships
traveling a popular shipping route. The model results agree well with the data,
indicating that the model could be used for route optimization in naval
logistics.
Comment: 22 pages, 12 figures
Noise Variance Estimation In Signal Processing
We present a new method of estimating noise variance. The method is applicable
to 1D and 2D signal processing. Its essence is the estimation of the scatter of
normally distributed data with a high level of outliers, and it applies to data
in which the majority of the data points carry no signal. The method is based
on the shortest half sample: the mean of the shortest half sample (the shorth)
and the location of the least median of squares are among the most robust
measures of the location of the mode, and the length of the shortest half
sample has been used as a measure of the scatter of uncontaminated data. We
show that computing the lengths of several sub-samples of varying sizes
provides the necessary information to estimate both the scatter and the number
of uncontaminated data points in a sample, and we derive the system of
equations to solve for these two quantities under the Gaussian distribution.
The data scatter serves as the measure of the noise variance. The method can be
extended to other distributions.
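A minimal sketch of the shorth-based scatter estimate (the basic ingredient only, not the paper's system of equations over sub-samples of varying sizes): for N(mu, sigma^2), the shortest interval containing half the probability mass is [mu - 0.6745 sigma, mu + 0.6745 sigma], so dividing the empirical shortest-half length by 2 × 0.6745 estimates sigma.

```python
import numpy as np

def shortest_half_length(x):
    """Length of the shortest interval containing half of the sample."""
    x = np.sort(np.asarray(x))
    h = len(x) // 2 + 1                      # size of a "half" sample
    return (x[h - 1:] - x[:len(x) - h + 1]).min()

def robust_sigma(x):
    """Scatter estimate for Gaussian data: the shortest half of
    N(mu, sigma^2) has width 2 * 0.6745 * sigma."""
    return shortest_half_length(x) / (2 * 0.6745)

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, 10_000)
contaminated = np.concatenate([clean, rng.uniform(-50.0, 50.0, 3_000)])
# The shorth-based estimate stays the right order of magnitude under
# heavy contamination (the residual upward bias is what the paper's
# system of equations corrects), while the raw standard deviation is
# destroyed by the outliers.
print(robust_sigma(clean), robust_sigma(contaminated), np.std(contaminated))
```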
Nearshore wave forecasting and hindcasting by dynamical and statistical downscaling
A high-resolution nested WAM/SWAN wave model suite aimed at rapidly
establishing nearshore wave forecasts as well as a climatology and return
values of the local wave conditions with Rapid Environmental Assessment (REA) in
mind is described. The system is targeted at regions where local wave growth
and partial exposure to complex open-ocean wave conditions make diagnostic
wave modelling difficult.
SWAN is set up at 500 m resolution and is nested in a 10 km version of WAM. A
model integration of more than one year is carried out to map the spatial
distribution of the wave field. The model correlates well with wave buoy
observations (0.96) but overestimates the wave height somewhat (18%, bias 0.29
m).
To estimate wave height return values a much longer time series is required
and running SWAN for such a period is unrealistic in a REA setting. Instead we
establish a direction-dependent transfer function between an already existing
coarse open-ocean hindcast dataset and the high-resolution nested SWAN model.
Return values are estimated using ensemble estimates of two different
extreme-value distributions based on the full 52 years of statistically
downscaled hindcast data. We find good agreement between downscaled wave height
and wave buoy observations. The cost of generating the statistically downscaled
hindcast time series is negligible and can be redone for arbitrary locations
within the SWAN domain, although the sectors must be carefully chosen for each
new location.
The method is found to be well suited to rapidly providing detailed wave
forecasts as well as hindcasts and return value estimates for partly sheltered
coastal regions.
Comment: 20 pages, 7 figures and 2 tables, MREA07 special issue on Marine
rapid environmental assessment
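The direction-dependent transfer function can be sketched as below. This is a deliberately simplified stand-in: one linear transfer per directional sector fitted by least squares, with synthetic data; the actual system fits the transfer between the coarse open-ocean hindcast and the nested SWAN output, and the sector choice matters per location.

```python
import numpy as np

def fit_sector_transfer(direction_deg, hs_coarse, hs_fine, n_sectors=8):
    """Fit one linear transfer hs_fine ~ a + b * hs_coarse per
    directional sector (simplified statistical-downscaling sketch)."""
    edges = np.linspace(0.0, 360.0, n_sectors + 1)
    coeffs = {}
    for k in range(n_sectors):
        m = (direction_deg >= edges[k]) & (direction_deg < edges[k + 1])
        if m.sum() >= 2:
            b, a = np.polyfit(hs_coarse[m], hs_fine[m], 1)
            coeffs[k] = (a, b)
    return edges, coeffs

def apply_transfer(direction_deg, hs_coarse, edges, coeffs):
    """Downscale coarse wave heights using the per-sector transfer."""
    k = np.clip(np.searchsorted(edges, direction_deg, side="right") - 1,
                0, len(edges) - 2)
    a = np.array([coeffs[i][0] for i in k])
    b = np.array([coeffs[i][1] for i in k])
    return a + b * hs_coarse

# Synthetic check: fine-scale height is 0.8x the coarse height for
# directions < 180 degrees and 1.2x otherwise.
rng = np.random.default_rng(2)
d = rng.uniform(0.0, 360.0, 1000)
hc = rng.uniform(1.0, 5.0, 1000)
hf = np.where(d < 180.0, 0.8, 1.2) * hc
edges, coeffs = fit_sector_transfer(d, hc, hf, n_sectors=2)
pred = apply_transfer(d, hc, edges, coeffs)
```

Once fitted, applying the transfer to a long hindcast is essentially free, which is the point made in the abstract about the negligible cost of the statistically downscaled time series.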
Characterization of the frequency of extreme events by the Generalized Pareto Distribution
Based on recent results in extreme value theory, we use a new technique for
the statistical estimation of distribution tails. Specifically, we use the
Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for
peak-over-threshold values in the form of the Generalized Pareto Distribution
(GPD). This approach has proved useful in finance, insurance, and hydrology;
here we investigate the
earthquake energy distribution described by the Gutenberg-Richter seismic
moment-frequency law and analyze shallow earthquakes (depth h < 70 km) in the
Harvard catalog over the period 1977-2000 in 18 seismic zones. The whole GPD is
found to approximate the tails of the seismic moment distributions quite well
for moment-magnitudes larger than mW=5.3, and no statistically significant
regional difference is found for subduction and transform seismic zones. We
confirm that the b-value is very different in mid-ocean ridges compared to
other zones (b=1.50±0.09 versus b=1.00±0.05, corresponding to a power law
exponent close to 1 versus 2/3) with a very high statistical confidence. We
propose a physical mechanism for this, contrasting slow healing ruptures in
mid-ocean ridges with fast healing ruptures in other zones. Deviations from the
GPD at the very end of the tail are detected in the sample containing
earthquakes from all major subduction zones (sample size of 4985 events). We
propose a new statistical test of significance of such deviations based on the
bootstrap method. The number of events deviating from the tails of GPD in the
studied data sets (15-20 at most) is not sufficient for determining the
functional form of those deviations. Thus, it is practically impossible to give
preference to one of the previously suggested parametric families describing
the ends of tails of seismic moment distributions.
Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one
file giving the regionalization
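The peaks-over-threshold construction can be sketched with synthetic data. The method-of-moments fit below is a simple stand-in for the paper's estimation procedure, and the sample is simulated, not the Harvard catalog; the key GPD property used is that excesses over a threshold are again GPD with the same shape.

```python
import numpy as np

rng = np.random.default_rng(3)

def gpd_sample(c, scale, size, rng):
    """Draw GPD samples by inverse CDF: x = scale/c * ((1-u)^-c - 1)."""
    u = rng.uniform(size=size)
    return scale / c * ((1.0 - u) ** (-c) - 1.0)

def gpd_fit_moments(excess):
    """Method-of-moments GPD fit (valid for shape c < 1/2):
    c = (1 - m^2/v) / 2,  scale = m * (1 + m^2/v) / 2."""
    m, v = excess.mean(), excess.var()
    r = m * m / v
    return (1.0 - r) / 2.0, m * (1.0 + r) / 2.0

# Heavy-tailed synthetic sample standing in for seismic moments.
x = gpd_sample(c=0.1, scale=1.0, size=100_000, rng=rng)
u = np.quantile(x, 0.95)               # peaks-over-threshold threshold
c_hat, scale_hat = gpd_fit_moments(x[x > u] - u)
print(c_hat, scale_hat)  # shape near 0.1; excess scale near 1 + 0.1*u
```

In practice a maximum-likelihood fit is usually preferred for GPD tails; the moment estimator is used here only to keep the sketch self-contained.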
Uncertainty Propagation and Feature Selection for Loss Estimation in Performance-based Earthquake Engineering
This report presents a new methodology, called moment matching, for propagating the uncertainties in estimating repair costs of a building due to future earthquake excitation, which is required, for example, when assessing a design in performance-based earthquake engineering. Besides excitation uncertainties, other uncertain model variables are considered, including uncertainties in the structural model parameters and in the capacity and repair costs of structural and non-structural components. Using the first few moments of these uncertain variables, moment matching requires only a few well-chosen point estimates to propagate the uncertainties and estimate the first few moments of the repair costs with high accuracy. Furthermore, the use of moment matching to estimate the exceedance probability of the repair costs is also addressed. The moment-matching approach is quite general: it can be applied to any decision variable in performance-based earthquake engineering.
Two buildings are chosen as illustrative examples to demonstrate the use of moment matching, a hypothetical three-story shear building and a real seven-story hotel building. For these two examples, the assembly-based vulnerability approach is employed when calculating repair costs. It is shown that the moment-matching technique is much more accurate than the well-known First-Order-Second-Moment approach when propagating the first two moments, while the resulting computational cost is of the same order. The repair-cost moments and exceedance probability estimated by the moment-matching technique are also compared with those by Monte Carlo simulation. It is concluded that as long as the order of the moment matching is sufficient, the comparison is satisfactory. Furthermore, the amount of computation for moment matching scales only linearly with the number of uncertain input variables.
Last but not least, a procedure for feature selection is presented and illustrated for the second example. The conclusion is that the most important uncertain input variables among the many influencing the uncertainty in future repair costs are, in order of importance, ground-motion spectral acceleration, component capacity, ground-motion details and unit repair costs.
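The contrast between moment matching and First-Order-Second-Moment (FOSM) propagation can be illustrated in one dimension, where "a few well-chosen point estimates" reduces to a three-point Gauss-Hermite rule. This is an illustrative instance of the principle, not the report's exact multi-variable construction, and the "repair cost" function is a toy model.

```python
import numpy as np

def gh3_moments(g, mu, sigma):
    """Mean/variance of Y = g(X), X ~ N(mu, sigma^2), via the 3-point
    Gauss-Hermite rule: nodes mu +/- sqrt(3)*sigma and mu, weights
    1/6, 2/3, 1/6 (matches Gaussian moments up to order 5)."""
    pts = mu + sigma * np.array([-np.sqrt(3.0), 0.0, np.sqrt(3.0)])
    w = np.array([1 / 6, 2 / 3, 1 / 6])
    y = g(pts)
    m = w @ y
    return m, w @ (y - m) ** 2

def fosm_moments(g, mu, sigma, h=1e-5):
    """First-Order-Second-Moment: linearize g at the mean."""
    dg = (g(mu + h) - g(mu - h)) / (2 * h)
    return g(mu), (dg * sigma) ** 2

g = lambda x: x ** 2 + x          # toy nonlinear "repair cost" model
mu, sigma = 1.0, 0.5
# Exact mean is mu^2 + sigma^2 + mu = 2.25; FOSM misses the sigma^2 term.
print(gh3_moments(g, mu, sigma), fosm_moments(g, mu, sigma))
```

For this quadratic toy model the three-point rule reproduces the first two moments of Y exactly at three function evaluations, while FOSM (also cheap) is biased; this mirrors the report's accuracy comparison at comparable computational cost.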
Convergence of large deviation estimators
We study the convergence of statistical estimators used in the estimation of
large deviation functions describing the fluctuations of equilibrium,
nonequilibrium, and manmade stochastic systems. We give conditions for the
convergence of these estimators with sample size, based on the boundedness or
unboundedness of the quantity sampled, and discuss how statistical errors
should be defined in different parts of the convergence region. Our results
shed light on previous reports of 'phase transitions' in the statistics of free
energy estimators and establish a general framework for reliably estimating
large deviation functions from simulation and experimental data and identifying
parameter regions where this estimation converges.
Comment: 13 pages, 6 figures. v2: corrections focusing the paper on large
deviations; v3: minor corrections, close to published version
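The role of boundedness can be seen in a small experiment with the empirical estimator of a cumulant generating function, lambda(k) = log E[exp(kX)]. For Exp(1) samples the generating function exists only for k < 1; well inside that region the estimator converges, while near the boundary the sample mean is dominated by the few largest observations. A hedged sketch:

```python
import numpy as np

rng = np.random.default_rng(4)

def scgf_estimator(samples, k):
    """Empirical estimate of lambda(k) = log E[exp(k X)]."""
    return np.log(np.mean(np.exp(k * samples)))

# Exponential(1): lambda(k) = -log(1 - k), defined only for k < 1.
x = rng.exponential(1.0, size=100_000)

# Far from the boundary the estimator converges well...
print(scgf_estimator(x, 0.3), -np.log(1.0 - 0.3))
# ...but near k = 1 the sum is dominated by the sample maximum and the
# estimator is typically a severe underestimate of -log(0.05) ~ 3.0.
print(scgf_estimator(x, 0.95))
```

This is the kind of sampled-quantity unboundedness the paper's convergence conditions address; the sketch is not the paper's analysis, only a numerical illustration of the phenomenon.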
Ordinary kriging for on-demand average wind interpolation of in-situ wind sensor data
We have developed a domain-agnostic ordinary kriging algorithm accessible via a standards-based service-oriented architecture for sensor networks. We exploit the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) standards. Because we need on-demand interpolation maps, runtime performance is a major priority.
Our sensor data comes from wind in-situ observation stations in an area approximately 200 km by 125 km. We provide on-demand average wind interpolation maps. These spatial estimates can then be compared with the results of other estimation models in order to identify spurious results that sometimes occur in wind estimation.
Our processing is based on ordinary kriging with automated variogram model selection (AVMS). This procedure can smooth time-point wind measurements to obtain average wind by using a variogram model that reflects the characteristics of the wind phenomenon. Kriging is enabled for wind direction estimation by a simple but effective solution to the problem of estimating periodic variables, based on vector rotation and stochastic simulation.
In cases where all wind directions in the region of interest span 180 degrees, we rotate them so they lie between 90 and 270 degrees and apply ordinary kriging with AVMS directly to the meteorological angle. Otherwise, we transform the meteorological angle to Cartesian space, apply ordinary kriging with AVMS, and use simulation to transform the kriging estimates back to the meteorological angle.
A test run on a 50 by 50 grid using standard hardware takes about 5 minutes to execute the backward transformation with a sample size of 100,000, which is acceptable for our on-demand processing service requirements.
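The Cartesian-space treatment of wind direction can be sketched as follows. Inverse-distance weighting stands in for ordinary kriging with AVMS, and the simulation-based back-transformation is reduced to a plain atan2; the point of the sketch is only the handling of the periodic variable across the 0/360 wrap-around.

```python
import numpy as np

def to_components(theta_deg):
    """Angle (degrees) -> unit-vector components (u, v)."""
    t = np.deg2rad(theta_deg)
    return np.cos(t), np.sin(t)

def to_angle(u, v):
    """Components -> angle in [0, 360)."""
    return np.rad2deg(np.arctan2(v, u)) % 360.0

def idw(xy_obs, z_obs, xy_new, p=2.0):
    """Inverse-distance weighting, a simple stand-in here for ordinary
    kriging of each component field."""
    d = np.linalg.norm(xy_obs - xy_new, axis=1)
    if np.any(d == 0.0):
        return z_obs[np.argmin(d)]
    w = 1.0 / d ** p
    return w @ z_obs / w.sum()

# Two stations reporting 350 and 10 degrees: naive averaging of the raw
# angles gives 180, the opposite direction. Interpolating the (u, v)
# components and transforming back avoids this.
xy = np.array([[0.0, 0.0], [1.0, 0.0]])
u, v = to_components(np.array([350.0, 10.0]))
target = np.array([0.5, 0.0])
est = to_angle(idw(xy, u, target), idw(xy, v, target))
print(est)  # close to 0 (equivalently 360), not 180
```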