Risk bounds in linear regression through PAC-Bayesian truncation
We consider the problem of predicting as well as the best linear combination
of d given functions in least squares regression, and variants of this problem
including constraints on the parameters of the linear combination. When the
input distribution is known, there already exists an algorithm having an
expected excess risk of order d/n, where n is the size of the training data.
Without this strong assumption, standard results often contain a multiplicative
log n factor, and require some additional assumptions like uniform boundedness
of the d-dimensional input representation and exponential moments of the
output. This work provides new risk bounds for the ridge estimator and the
ordinary least squares estimator, and their variants. It also provides
shrinkage procedures with convergence rate d/n (i.e., without the logarithmic
factor) in expectation and in deviations, under various assumptions. The key
surprising feature common to these results is the absence of any exponential
moment condition on the output distribution, even though exponential deviations
are achieved.
All risk bounds are obtained through a PAC-Bayesian analysis on truncated
differences of losses. Finally, we show that some of these results are not
particular to the least squares loss, but can be generalized to similar
strongly convex loss functions.
Comment: 78 pages
Characterization of the frequency of extreme events by the Generalized Pareto Distribution
Based on recent results in extreme value theory, we use a new technique for
the statistical estimation of distribution tails. Specifically, we use the
Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for
peak-over-threshold values in the form of the Generalized Pareto Distribution
(GPD). The approach is useful in finance, insurance, and hydrology; here we investigate the
earthquake energy distribution described by the Gutenberg-Richter seismic
moment-frequency law and analyze shallow earthquakes (depth h < 70 km) in the
Harvard catalog over the period 1977-2000 in 18 seismic zones. The whole GPD is
found to approximate the tails of the seismic moment distributions quite well
for moment magnitudes larger than mW=5.3, and no statistically significant
regional difference is found for subduction and transform seismic zones. We
confirm that the b-value is very different in mid-ocean ridges compared to
other zones (b=1.50±0.09 versus b=1.00±0.05 corresponding to a power law
exponent close to 1 versus 2/3) with a very high statistical confidence. We
propose a physical mechanism for this, contrasting slow healing ruptures in
mid-ocean ridges with fast healing ruptures in other zones. Deviations from the
GPD at the very end of the tail are detected in the sample containing
earthquakes from all major subduction zones (sample size of 4985 events). We
propose a new statistical test of significance of such deviations based on the
bootstrap method. The number of events deviating from the tails of GPD in the
studied data sets (15-20 at most) is not sufficient for determining the
functional form of those deviations. Thus, it is practically impossible to give
preference to one of the previously suggested parametric families describing
the ends of tails of seismic moment distributions.
Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one
file giving the regionalization
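The GPD form referred to in the abstract above can be illustrated with a minimal sketch. It assumes the standard two-parameter GPD for excesses over a threshold, with shape `xi` and scale `sigma` (these names are illustrative, not taken from the paper), and the usual moment-magnitude scaling under which a Gutenberg-Richter b-value maps to a power-law exponent of 2b/3 for seismic moments. That mapping is consistent with the b=1.50 versus 1 and b=1.00 versus 2/3 pairs quoted in the abstract, but it is not stated there explicitly.

```python
import math

def gpd_cdf(x, xi, sigma):
    """CDF of the Generalized Pareto Distribution for an excess x >= 0."""
    if x < 0.0:
        return 0.0
    if xi == 0.0:
        # Limiting case: exponential distribution with mean sigma.
        return 1.0 - math.exp(-x / sigma)
    base = 1.0 + xi * x / sigma
    if base <= 0.0:
        # For xi < 0 the support has a finite upper endpoint.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def gpd_quantile(p, xi, sigma):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if xi == 0.0:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)

def tail_exponent_from_b(b):
    """Power-law exponent of the seismic moment tail implied by a b-value
    (assumes magnitude = (2/3) * log10(moment) + constant)."""
    return 2.0 * b / 3.0

# A Pareto tail with exponent beta corresponds to GPD shape xi = 1 / beta.
beta_ridges = tail_exponent_from_b(1.5)   # mid-ocean ridges
beta_others = tail_exponent_from_b(1.0)   # other zones
xi_others = 1.0 / beta_others
x90 = gpd_quantile(0.9, xi_others, 2.0)   # scale 2.0 is an arbitrary choice
```

The scale value and the 0.9 probability level are arbitrary; only the relation between the b-value, the tail exponent, and the GPD shape is the point of the sketch.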
Long term memories of developed and emerging markets: using the scaling analysis to characterize their stage of development
Scaling properties capture, in a simple analysis, many of the volatility
characteristics of financial markets. We therefore use them to probe the
differing degrees of market development. We empirically study the scaling
properties of daily Foreign Exchange rates, Stock Market indices and fixed
income instruments by using the generalized Hurst approach. We show that the
scaling exponents are associated with characteristics of the specific markets
and can be used to differentiate markets in their stage of development. The
robustness of the results is tested by both Monte-Carlo studies and a
computation of the scaling in the frequency-domain.
Comment: 46 pages, 7 figures, accepted for publication in Journal of Banking &
Finance
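The abstract above does not spell out the generalized Hurst approach, so the following is only a minimal sketch under a common assumption: the q-th moment of absolute increments scales as K_q(tau) ~ tau^(q H(q)), and H(q) is read off a log-log regression over a range of lags. The function name, the lag range, and the test series are illustrative choices, not the authors' procedure.

```python
import math
import random

def generalized_hurst(series, q=1.0, max_lag=19):
    """Estimate H(q) from the scaling K_q(tau) ~ tau**(q * H(q))."""
    log_tau, log_kq = [], []
    for tau in range(1, max_lag + 1):
        # q-th moment of absolute increments at lag tau.
        moments = [abs(series[t + tau] - series[t]) ** q
                   for t in range(len(series) - tau)]
        kq = sum(moments) / len(moments)
        log_tau.append(math.log(tau))
        log_kq.append(math.log(kq))
    # Ordinary least squares slope of log K_q against log tau.
    n = len(log_tau)
    mean_x = sum(log_tau) / n
    mean_y = sum(log_kq) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_tau, log_kq))
    sxx = sum((x - mean_x) ** 2 for x in log_tau)
    return (sxy / sxx) / q

# Synthetic Gaussian random walk: H(q) should be close to 0.5.
random.seed(0)
walk = [0.0]
for _ in range(4000):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))
h1 = generalized_hurst(walk, q=1.0)
```

On real price series one would typically apply this to log-prices and compare H(q) across several values of q; a strictly linear series gives H(q) = 1 exactly, which is a convenient sanity check.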
Generalized vector risk functions
The paper introduces a new notion of vector-valued risk function. Both deviations and expectation bounded coherent risk measures are defined and analyzed. The relationships with both scalar and vector risk functions in the previous literature are discussed, and it is pointed out that this new approach seems to integrate several earlier points of view appropriately. The framework of the study is the general setting of Banach lattices and Bochner integrable vector-valued random variables. Sub-gradient linked representation theorems, as well as portfolio choice problems, are also addressed, and general optimization methods are presented. Finally, practical examples are provided.
Rigorous statistical detection and characterization of a deviation from the Gutenberg-Richter distribution above magnitude 8 in subduction zones
We present a quantitative statistical test for the presence of a crossover c0
in the Gutenberg-Richter distribution of earthquake seismic moments, separating
the usual power law regime for seismic moments less than c0 from another faster
decaying regime beyond c0. Our method is based on the transformation of the
ordered sample of seismic moments into a series with uniform distribution under
condition of no crossover. The bootstrap method allows us to estimate the
statistical significance of the null hypothesis H0 of an absence of crossover
(c0=infinity). When H0 is rejected, we estimate the crossover c0 using two
different competing models for the second regime beyond c0 and the bootstrap
method. For the catalog obtained by aggregating 14 subduction zones of the
Circum Pacific Seismic Belt, our estimate of the crossover point is
log(c0) = 28.14 ± 0.40 (c0 in dyne-cm), corresponding to a crossover magnitude
mW = 8.1 ± 0.3. For separate subduction zones, the corresponding estimates are much
more uncertain, so that the null hypothesis of an identical crossover for all
subduction zones cannot be rejected. Such a large value of the crossover
magnitude makes it difficult to associate it directly with a seismogenic
thickness as proposed by many different authors in the past. Our measure of c0
may substantiate the concept that the localization of strong shear deformation
could propagate significantly in the lower crust and upper mantle, thus
increasing the effective size beyond which one should expect a change of
regime.
Comment: pdf document of 40 pages including 5 tables and 19 figures
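The correspondence between the crossover moment and the crossover magnitude quoted above can be checked with a short calculation. This sketch assumes the widely used moment-magnitude relation mW = (2/3) log10(M0) - 10.7 with M0 in dyne-cm; the abstract does not state which constant the authors used, so the offset is an assumption.

```python
def moment_magnitude(log10_moment_dyne_cm):
    """Moment magnitude from log10 of the seismic moment in dyne-cm
    (assumed relation: mW = (2/3) * log10(M0) - 10.7)."""
    return (2.0 / 3.0) * log10_moment_dyne_cm - 10.7

# Values quoted in the abstract: log(c0) = 28.14 with uncertainty 0.40.
mw = moment_magnitude(28.14)
# The uncertainty scales by the same factor of 2/3.
mw_err = (2.0 / 3.0) * 0.40
```

Under this assumption the result is about 8.06 with an uncertainty of about 0.27, which rounds to the mW = 8.1 and 0.3 reported in the abstract.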
Conditional Spectrum-Based Ground Motion Selection. Part I: Hazard Consistency for Risk-Based Assessments
The conditional spectrum (CS, with mean and variability) is a target response spectrum that links nonlinear dynamic analysis back to probabilistic seismic hazard analysis for ground motion selection. The CS is computed on the basis of a specified conditioning period, whereas structures under consideration may be sensitive to response spectral amplitudes at multiple periods of excitation. Questions remain regarding the appropriate choice of conditioning period when utilizing the CS as the target spectrum. This paper focuses on risk-based assessments, which estimate the annual rate of exceeding a specified structural response amplitude. Seismic hazard analysis, ground motion selection, and nonlinear dynamic analysis are performed, using the conditional spectra with varying conditioning periods, to assess the performance of a 20-story reinforced concrete frame structure. It is shown here that risk-based assessments are relatively insensitive to the choice of conditioning period when the ground motions are carefully selected to ensure hazard consistency. This observed insensitivity to the conditioning period comes from the fact that, when CS-based ground motion selection is used, the distributions of response spectra of the selected ground motions are consistent with the site ground motion hazard curves at all relevant periods; this consistency with the site hazard curves is independent of the conditioning period. The importance of an exact CS (which incorporates multiple causal earthquakes and ground motion prediction models) to achieve the appropriate spectral variability at periods away from the conditioning period is also highlighted. The findings of this paper are expected theoretically but have not been empirically demonstrated previously.
Fused kernel-spline smoothing for repeatedly measured outcomes in a generalized partially linear model with functional single index
We propose a generalized partially linear functional single index risk score
model for repeatedly measured outcomes where the index itself is a function of
time. We fuse the nonparametric kernel method and regression spline method, and
modify the generalized estimating equation to facilitate estimation and
inference. We use a local smoothing kernel to estimate the unspecified
coefficient functions of time, and use B-splines to estimate the unspecified
function of the single index component. The covariance structure is taken into
account via a working model, which provides valid estimation and inference
procedures whether or not it captures the true covariance. The estimation method
is applicable to both continuous and discrete outcomes. We derive large sample
properties of the estimation procedure and show a different convergence rate
for each component of the model. The asymptotic properties when the kernel and
regression spline methods are combined in a nested fashion have not been studied
prior to this work, even in the independent data case.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1330 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
A longitudinal high-risk study of adolescent anxiety, depression and parent-severity on the developmental course of risk-adjustment
Background
Adolescence is associated with developments in the reward system and increased rates of emotional disorders. Familial risk for depression may be associated with disruptions in the reward system. However, it is unclear how symptoms of depression and anxiety influence the development of reward-processing over adolescence and whether variation in the severity of parental depression is associated with hyposensitivity to reward in a high-risk sample.
Methods
We focused on risk-adjustment (adjusting decisions about reward according to the probability of obtaining reward) as this was hypothesized to improve over adolescence. In a one-year longitudinal sample (N = 197) of adolescent offspring of depressed parents, we examined how symptoms of depression and anxiety (generalized anxiety and social anxiety) influenced the development of risk-adjustment. We also examined how parental depression severity influenced adolescent risk-adjustment.
Results
Risk-adjustment improved over the course of the study indicating improved adjustment of reward-seeking to shifting contingencies. Depressive symptoms were associated with decreases in risk-adjustment over time while social anxiety symptoms were associated with increases in risk-adjustment over time. Specifically, depression was associated with reductions in reward-seeking at favourable reward probabilities only, whereas social anxiety (but not generalized anxiety) led to reductions in reward-seeking at low reward probabilities only. Parent depression severity was associated with lowered risk-adjustment in offspring and also influenced the longitudinal relationship between risk-adjustment and offspring depression.
Conclusions
Anxiety and depression distinctly alter the pattern of longitudinal change in reward-processing. Severity of parent depression was associated with alterations in adolescent offspring reward-processing in a high-risk sample.