    Risk bounds in linear regression through PAC-Bayesian truncation

    We consider the problem of predicting as well as the best linear combination of d given functions in least squares regression, and variants of this problem including constraints on the parameters of the linear combination. When the input distribution is known, there already exists an algorithm having an expected excess risk of order d/n, where n is the size of the training data. Without this strong assumption, standard results often contain a multiplicative log n factor, and require some additional assumptions like uniform boundedness of the d-dimensional input representation and exponential moments of the output. This work provides new risk bounds for the ridge estimator and the ordinary least squares estimator, and their variants. It also provides shrinkage procedures with convergence rate d/n (i.e., without the logarithmic factor) in expectation and in deviations, under various assumptions. The key common surprising factor of these results is the absence of an exponential moment condition on the output distribution while achieving exponential deviations. All risk bounds are obtained through a PAC-Bayesian analysis on truncated differences of losses. Finally, we show that some of these results are not particular to the least squares loss, but can be generalized to similar strongly convex loss functions. Comment: 78 pages

    Characterization of the frequency of extreme events by the Generalized Pareto Distribution

    Based on recent results in extreme value theory, we use a new technique for the statistical estimation of distribution tails. Specifically, we use the Gnedenko-Pickands-Balkema-de Haan theorem, which gives a natural limit law for peak-over-threshold values in the form of the Generalized Pareto Distribution (GPD). This technique is useful in finance, insurance, and hydrology; here we investigate the earthquake energy distribution described by the Gutenberg-Richter seismic moment-frequency law and analyze shallow earthquakes (depth h < 70 km) in the Harvard catalog over the period 1977-2000 in 18 seismic zones. The whole GPD is found to approximate the tails of the seismic moment distributions quite well for moment magnitudes above mW=5.3, and no statistically significant regional difference is found for subduction and transform seismic zones. We confirm that the b-value is very different in mid-ocean ridges compared to other zones (b=1.50±0.09 versus b=1.00±0.05, corresponding to a power law exponent close to 1 versus 2/3) with a very high statistical confidence. We propose a physical mechanism for this, contrasting slow healing ruptures in mid-ocean ridges with fast healing ruptures in other zones. Deviations from the GPD at the very end of the tail are detected in the sample containing earthquakes from all major subduction zones (sample size of 4985 events). We propose a new statistical test of significance of such deviations based on the bootstrap method. The number of events deviating from the tails of GPD in the studied data sets (15-20 at most) is not sufficient for determining the functional form of those deviations. Thus, it is practically impossible to give preference to one of the previously suggested parametric families describing the ends of tails of seismic moment distributions. Comment: pdf document of 21 pages + 2 tables + 20 figures (ps format) + one file giving the regionalizatio
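
The peak-over-threshold idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the GPD parameters are estimated, so the method-of-moments estimator below is an assumption chosen for brevity (it is only valid for shape xi < 1/2), and the function names `exceedances`, `fit_gpd_moments`, and `gpd_survival` are hypothetical.

```python
import math


def exceedances(values, threshold):
    """Return the peak-over-threshold excesses y = x - u for every x > u."""
    return [x - threshold for x in values if x > threshold]


def fit_gpd_moments(excesses):
    """Method-of-moments estimate of the GPD shape (xi) and scale (sigma).

    Assumption: not necessarily the estimator used in the paper. It relies on
    mean = sigma / (1 - xi) and
    variance = sigma^2 / ((1 - xi)^2 * (1 - 2 * xi)),
    which only hold when xi < 1/2.
    """
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two excesses")
    mean = sum(excesses) / n
    var = sum((y - mean) ** 2 for y in excesses) / (n - 1)
    if var == 0.0:
        raise ValueError("excesses must not all be equal")
    ratio = mean * mean / var          # equals 1 - 2 * xi in the population
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (1.0 + ratio)
    return xi, sigma


def gpd_survival(y, xi, sigma):
    """P(Y > y) for an excess y >= 0 under a GPD with shape xi, scale sigma."""
    if abs(xi) < 1e-12:                # xi -> 0 is the exponential limit
        return math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    return base ** (-1.0 / xi) if base > 0 else 0.0
```

A typical use would be to pick a threshold, call `exceedances` on the sample, fit `xi` and `sigma`, and then read tail probabilities off `gpd_survival`. In practice the threshold choice matters a great deal and is not addressed by this sketch.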

    Long term memories of developed and emerging markets: using the scaling analysis to characterize their stage of development

    The scaling properties encompass in a simple analysis many of the volatility characteristics of financial markets. That is why we use them to probe the differing degrees of market development. We empirically study the scaling properties of daily Foreign Exchange rates, Stock Market indices and fixed income instruments by using the generalized Hurst approach. We show that the scaling exponents are associated with characteristics of the specific markets and can be used to differentiate markets in their stage of development. The robustness of the results is tested by both Monte-Carlo studies and a computation of the scaling in the frequency-domain. Comment: 46 pages, 7 figures, accepted for publication in Journal of Banking & Finance
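
As a rough illustration of the generalized Hurst approach mentioned in the abstract above, the sketch below assumes the common structure-function definition, in which the q-th absolute moment of increments scales as lag^(q·H(q)). The abstract does not spell out its estimator, so the function name `generalized_hurst`, the default `max_lag`, and the plain least-squares fit are all assumptions.

```python
import math


def generalized_hurst(series, q=1.0, max_lag=19):
    """Estimate H(q) from the scaling E|X(t + lag) - X(t)|^q ~ lag^(q * H(q)).

    Assumption: a plain log-log least-squares fit over lags 1..max_lag.
    """
    n = len(series)
    if max_lag < 2 or n <= max_lag + 1:
        raise ValueError("series too short for the requested max_lag")

    log_lags, log_moments = [], []
    for lag in range(1, max_lag + 1):
        # q-th absolute moment of the increments at this lag
        moment = sum(
            abs(series[t + lag] - series[t]) ** q for t in range(n - lag)
        ) / (n - lag)
        if moment <= 0.0:
            raise ValueError("series has no variation at lag %d" % lag)
        log_lags.append(math.log(lag))
        log_moments.append(math.log(moment))

    # Ordinary least-squares slope of log-moment against log-lag
    mean_x = sum(log_lags) / len(log_lags)
    mean_y = sum(log_moments) / len(log_moments)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_lags, log_moments))
    sxx = sum((x - mean_x) ** 2 for x in log_lags)
    return (sxy / sxx) / q
```

Under this definition a pure linear trend gives H(q) = 1 for every q, and an uncorrelated random walk gives values near 0.5. Departures of H(q) from 0.5, or a dependence on q, are the kind of signal the abstract associates with market characteristics.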

    GENERALIZED VECTOR RISK FUNCTIONS

    The paper introduces a new notion of vector-valued risk function. Both deviations and expectation bounded coherent risk measures are defined and analyzed. The relationships with both scalar and vector risk functions of previous literature are discussed, and it is pointed out that this new approach seems to appropriately integrate several preceding points of view. The framework of the study is the general setting of Banach lattices and Bochner integrable vector-valued random variables. Sub-gradient linked representation theorems, as well as portfolio choice problems, are also addressed, and general optimization methods are presented. Finally, practical examples are provided.

    Rigorous statistical detection and characterization of a deviation from the Gutenberg-Richter distribution above magnitude 8 in subduction zones

    We present a quantitative statistical test for the presence of a crossover c0 in the Gutenberg-Richter distribution of earthquake seismic moments, separating the usual power law regime for seismic moments less than c0 from another faster decaying regime beyond c0. Our method is based on the transformation of the ordered sample of seismic moments into a series with uniform distribution under the condition of no crossover. The bootstrap method allows us to estimate the statistical significance of the null hypothesis H0 of an absence of crossover (c0=infinity). When H0 is rejected, we estimate the crossover c0 using two different competing models for the second regime beyond c0 and the bootstrap method. For the catalog obtained by aggregating 14 subduction zones of the Circum Pacific Seismic Belt, our estimate of the crossover point is log(c0) = 28.14 ± 0.40 (c0 in dyne-cm), corresponding to a crossover magnitude mW = 8.1 ± 0.3. For separate subduction zones, the corresponding estimates are much more uncertain, so that the null hypothesis of an identical crossover for all subduction zones cannot be rejected. Such a large value of the crossover magnitude makes it difficult to associate it directly with a seismogenic thickness as proposed by many different authors in the past. Our measure of c0 may substantiate the concept that the localization of strong shear deformation could propagate significantly in the lower crust and upper mantle, thus increasing the effective size beyond which one should expect a change of regime. Comment: pdf document of 40 pages including 5 tables and 19 figures

    Conditional Spectrum-Based Ground Motion Selection. Part I: Hazard Consistency for Risk-Based Assessments

    The conditional spectrum (CS, with mean and variability) is a target response spectrum that links nonlinear dynamic analysis back to probabilistic seismic hazard analysis for ground motion selection. The CS is computed on the basis of a specified conditioning period, whereas structures under consideration may be sensitive to response spectral amplitudes at multiple periods of excitation. Questions remain regarding the appropriate choice of conditioning period when utilizing the CS as the target spectrum. This paper focuses on risk-based assessments, which estimate the annual rate of exceeding a specified structural response amplitude. Seismic hazard analysis, ground motion selection, and nonlinear dynamic analysis are performed, using the conditional spectra with varying conditioning periods, to assess the performance of a 20-story reinforced concrete frame structure. It is shown here that risk-based assessments are relatively insensitive to the choice of conditioning period when the ground motions are carefully selected to ensure hazard consistency. This observed insensitivity to the conditioning period comes from the fact that, when CS-based ground motion selection is used, the distributions of response spectra of the selected ground motions are consistent with the site ground motion hazard curves at all relevant periods; this consistency with the site hazard curves is independent of the conditioning period. The importance of an exact CS (which incorporates multiple causal earthquakes and ground motion prediction models) to achieve the appropriate spectral variability at periods away from the conditioning period is also highlighted. The findings of this paper are expected theoretically but have not been empirically demonstrated previously

    Fused kernel-spline smoothing for repeatedly measured outcomes in a generalized partially linear model with functional single index

    We propose a generalized partially linear functional single index risk score model for repeatedly measured outcomes where the index itself is a function of time. We fuse the nonparametric kernel method and regression spline method, and modify the generalized estimating equation to facilitate estimation and inference. We use a local smoothing kernel to estimate the unspecified coefficient functions of time, and use B-splines to estimate the unspecified function of the single index component. The covariance structure is taken into account via a working model, which provides valid estimation and inference procedures whether or not it captures the true covariance. The estimation method is applicable to both continuous and discrete outcomes. We derive large sample properties of the estimation procedure and show a different convergence rate for each component of the model. The asymptotic properties when the kernel and regression spline methods are combined in a nested fashion have not been studied prior to this work, even in the independent data case. Comment: Published at http://dx.doi.org/10.1214/15-AOS1330 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A longitudinal high-risk study of adolescent anxiety, depression and parent-severity on the developmental course of risk-adjustment

    Background: Adolescence is associated with developments in the reward system and increased rates of emotional disorders. Familial risk for depression may be associated with disruptions in the reward system. However, it is unclear how symptoms of depression and anxiety influence the development of reward-processing over adolescence and whether variation in the severity of parental depression is associated with hyposensitivity to reward in a high-risk sample. Methods: We focused on risk-adjustment (adjusting decisions about reward according to the probability of obtaining reward) as this was hypothesized to improve over adolescence. In a one-year longitudinal sample (N = 197) of adolescent offspring of depressed parents, we examined how symptoms of depression and anxiety (generalized anxiety and social anxiety) influenced the development of risk-adjustment. We also examined how parental depression severity influenced adolescent risk-adjustment. Results: Risk-adjustment improved over the course of the study, indicating improved adjustment of reward-seeking to shifting contingencies. Depressive symptoms were associated with decreases in risk-adjustment over time while social anxiety symptoms were associated with increases in risk-adjustment over time. Specifically, depression was associated with reductions in reward-seeking at favourable reward probabilities only, whereas social anxiety (but not generalized anxiety) led to reductions in reward-seeking at low reward probabilities only. Parent depression severity was associated with lowered risk-adjustment in offspring and also influenced the longitudinal relationship between risk-adjustment and offspring depression. Conclusions: Anxiety and depression distinctly alter the pattern of longitudinal change in reward-processing. Severity of parent depression was associated with alterations in adolescent offspring reward-processing in a high-risk sample.