    Effects of sampling skewness of the importance-weighted risk estimator on model selection

    Importance-weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can have a detrimental effect on an estimator as well. In this work, we empirically show that the sampling distribution of an importance-weighted estimator can be skewed. For sample selection bias settings, and for small sample sizes, the importance-weighted risk estimator produces overestimates for data sets in the body of the sampling distribution, i.e. the majority of cases, and large underestimates for data sets in the tail of the sampling distribution. These over- and underestimates of the risk lead to suboptimal regularization parameters when used for importance-weighted validation. Comment: Conference paper, 6 pages, 5 figures.
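
    The skewness described here is easy to reproduce in a small simulation. Below is a minimal sketch, not the paper's code, of the importance-weighted risk estimator under a toy 1-D covariate-shift setup; the Gaussian source/target densities, the labelling function, the predictor and the squared loss are all assumptions made for illustration.

```python
# Minimal sketch of an importance-weighted risk estimate under covariate shift.
import numpy as np
from scipy.stats import norm, skew

rng = np.random.default_rng(0)
p_src = norm(loc=0.0, scale=1.0)   # source (training) covariate distribution
p_tgt = norm(loc=0.5, scale=1.0)   # target (test) covariate distribution

def iw_risk(n, predictor=np.cos):
    """Importance-weighted empirical risk of `predictor` on n source samples."""
    x = p_src.rvs(size=n, random_state=rng)
    y = np.sin(x) + 0.1 * rng.standard_normal(n)     # assumed labelling function
    w = p_tgt.pdf(x) / p_src.pdf(x)                  # importance weights
    return np.mean(w * (predictor(x) - y) ** 2)      # weighted squared loss

# Sampling distribution of the estimator at a small sample size: its skewness
# (overestimates in the body, large underestimates in the tail) is the effect
# the abstract refers to.
estimates = np.array([iw_risk(n=20) for _ in range(2000)])
print("mean:", estimates.mean(), "median:", np.median(estimates), "skew:", skew(estimates))
```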

    Heavy-Tailed Features and Empirical Analysis of the Limit Order Book Volume Profiles in Futures Markets

    This paper poses a few fundamental questions regarding the attributes of the volume profile of a Limit Order Book's stochastic structure, taking into consideration aspects of intraday and interday statistical features, the impact of different exchange features and the impact of market participants in different asset sectors. This paper aims to address the following questions: 1. Is there statistical evidence that heavy-tailed sub-exponential volume profiles occur at different levels of the Limit Order Book on the bid and ask, and if so, does this happen on intraday or interday time scales? 2. In futures exchanges, are heavy-tail features exchange-dependent (CBOT, CME, EUREX, SGX and COMEX) or asset-class-dependent (government bonds, equities and precious metals), and do they appear in ultra-high-frequency (< 1 sec) or mid-range high-frequency (1 sec - 10 min) data? 3. Does the presence of stochastic heavy-tailed volume profile features evolve in a manner that would inform or be indicative of market participant behaviors, such as high-frequency algorithmic trading, quote stuffing and price discovery, intra-daily? 4. Is there statistical evidence for a need to consider dynamic behavior of the parameters of models for Limit Order Book volume profiles on an intra-daily time scale? Progress on aspects of each question is obtained via statistically rigorous results that verify the empirical findings for an unprecedentedly large set of futures market LOB data. The data comprises several exchanges, several futures asset classes and all trading days of 2010, using market depth (Type II) order book data to 5 levels on the bid and ask.
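
    As a rough illustration of what "heavy-tailed" means operationally for volume data, the sketch below applies a standard Hill estimator to synthetic level-1 volumes. This is a generic tail diagnostic, not the statistical methodology used in the paper, and the Pareto-distributed volumes are purely synthetic.

```python
# Hill estimator as a simple heavy-tail diagnostic for volume data.
import numpy as np

def hill_tail_index(sample, k):
    """Hill estimator of the tail index from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))[::-1]      # descending order
    return k / (np.log(x[:k]) - np.log(x[k])).sum()

# Synthetic stand-in for level-1 bid volumes: Pareto, hence heavy-tailed.
rng = np.random.default_rng(1)
volumes = (rng.pareto(a=1.5, size=50_000) + 1.0) * 100.0    # true tail index = 1.5

for k in (200, 500, 1000):
    print(f"k={k:5d}  tail index estimate = {hill_tail_index(volumes, k):.2f}")
```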

    On Regularization Parameter Estimation under Covariate Shift

    This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting. In such a setting, there are differences between the distributions generating the training data (source domain) and the test data (target domain). The usual cross-validation procedure requires validation data, which cannot be obtained from the unlabeled target data. The problem is that if one decides to use source validation data, the regularization parameter is underestimated. One possible solution is to scale the source validation data through importance weighting, but we show that this correction is not sufficient. We conclude the paper with an empirical analysis of the effect of several importance weight estimators on the estimation of the regularization parameter. Comment: 6 pages, 2 figures, 2 tables. Accepted to ICPR 201
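
    The sketch below illustrates the weighting step the abstract refers to: an L2 (ridge) regularization parameter is validated on source data, once unweighted and once importance-weighted. It is a toy setup with known densities and synthetic data, not the paper's experimental protocol.

```python
# Importance-weighted validation of a ridge regularization parameter (toy setup).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
p_src, p_tgt = norm(0.0, 1.0), norm(1.0, 1.0)       # source / target covariates

def make_data(dist, n):
    x = dist.rvs(size=n, random_state=rng)
    y = np.sin(x) + 0.1 * rng.standard_normal(n)    # assumed labelling function
    return np.vander(x, 6), y, x                    # degree-5 polynomial features

def ridge_fit(X, y, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

X_tr, y_tr, _ = make_data(p_src, 50)                # source training data
X_va, y_va, x_va = make_data(p_src, 50)             # source validation data
w = p_tgt.pdf(x_va) / p_src.pdf(x_va)               # importance weights

for lam in (1e-3, 1e-1, 1e1):
    err = (X_va @ ridge_fit(X_tr, y_tr, lam) - y_va) ** 2
    # unweighted vs importance-weighted validation risk for this lambda
    print(f"lambda={lam:g}  plain={err.mean():.4f}  weighted={np.mean(w * err):.4f}")
```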

    High-Frequency and Model-Free Volatility Estimators

    This paper focuses on the volatility of financial markets, one of the most important issues in finance, especially with regard to modeling high-frequency data. Risk management, asset pricing and option valuation are the areas where the concept of volatility estimators (consistent, unbiased and maximally efficient) is of crucial concern. Our intention was to find the best estimator of true volatility, taking into account the latest investigations in the finance literature. Based on the methodology presented in Parkinson (1980), Garman and Klass (1980), Rogers and Satchell (1991), Yang and Zhang (2000), Andersen et al. (1997, 1998, 1999a, 1999b), Hansen and Lunde (2005, 2006b) and Martens (2007), we computed various model-free volatility estimators and compared them with the classical volatility estimator most often used in financial models. In order to reveal the information set hidden in high-frequency data, we utilized the concepts of realized volatility and realized range. When calculating our estimators, we focused carefully on Δ (the interval used in calculation), n (the memory of the process) and q (the scaling factor for scaled estimators). Our results revealed that the appropriate selection of Δ and n plays a crucial role when we try to answer the question of estimator efficiency, as well as accuracy. Having nine estimators of volatility, we found that for optimal n (measured in days) and Δ (in minutes) we obtain the most efficient estimator. Our findings confirmed that the best estimator should include information contained not only in closing prices but in the price range as well (range estimators). More importantly, we focused on the properties of the formula itself, independently of the interval used, comparing estimators with the same Δ, n and q parameters. We observed that the formula of the volatility estimator is not as important as the selection of the optimal parameters n and Δ. Finally, we focused on the asymmetry between market turmoil and adjustments of volatility. Next, we put stress on the implications of our results for well-known financial models which utilize the classical volatility estimator as the main input variable. Keywords: financial market volatility, high-frequency financial data, realized volatility and correlation, volatility forecasting, microstructure bias, the opening jump effect, the bid-ask bounce, autocovariance bias, daily patterns of volatility, emerging markets.
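
    For reference, the range-based estimators named above have simple closed forms. The sketch below shows the Parkinson (1980) and Garman-Klass (1980) daily variance estimators and a basic realized variance; the OHLC arrays and the synthetic 1-minute price path are assumptions for illustration, and the scaling factor q and overnight-jump corrections are omitted.

```python
# Range-based and realized variance estimators (daily variance, no annualization).
import numpy as np

def parkinson_var(high, low):
    """Parkinson (1980): daily variance from the high-low range."""
    hl = np.log(np.asarray(high) / np.asarray(low))
    return (hl ** 2).mean() / (4.0 * np.log(2.0))

def garman_klass_var(open_, high, low, close):
    """Garman-Klass (1980): range combined with the open-close return."""
    hl = np.log(np.asarray(high) / np.asarray(low))
    co = np.log(np.asarray(close) / np.asarray(open_))
    return (0.5 * hl ** 2 - (2.0 * np.log(2.0) - 1.0) * co ** 2).mean()

def realized_var(intraday_prices):
    """Realized variance: sum of squared intraday log returns for one day."""
    r = np.diff(np.log(np.asarray(intraday_prices)))
    return (r ** 2).sum()

# Example on a synthetic 1-minute price path for one trading day (390 bars).
rng = np.random.default_rng(2)
prices = 100.0 * np.exp(np.cumsum(0.0005 * rng.standard_normal(390)))
print("realized variance:", realized_var(prices))
```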

    Semi-parametric estimation of joint large movements of risky assets

    The classical approach to modelling the occurrence of joint large movements of asset returns is to assume multivariate normality for the distribution of asset returns. This implies independence between large returns. However, it is now recognised by both academics and practitioners that large movements of asset returns do not occur independently. This fact encourages modelling joint large movements of asset returns as non-normal, a non-trivial task mainly due to the natural scarcity of such extreme events. This paper shows how to estimate the probability of joint large movements of asset prices using a semi-parametric approach borrowed from extreme value theory (EVT). It helps to understand the contribution of individual assets to large portfolio losses in terms of joint large movements. The advantages of this approach are that it does not require the assumption of a specific parametric form for the dependence structure of the joint large movements, avoiding model misspecification; it specifically addresses the scarcity of data, which is a problem for the reliable fitting of fully parametric models; and it is applicable to portfolios of many assets: there is no dimension explosion. The paper includes an empirical analysis of international equity data showing how to implement semi-parametric EVT modelling and how to exploit its strengths to help understand the probability of joint large movements. We estimate the probability of joint large losses in a portfolio composed of the FTSE 100, Nikkei 250 and S&P 500 indices. Each of the index returns is found to be heavy tailed. The S&P 500 index has a much stronger effect on large portfolio losses than the FTSE 100, despite having similar univariate tail heaviness.
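
    As a crude, purely nonparametric counterpart to the EVT approach described here, one can compare the empirical frequency of joint lower-tail exceedances with the frequency implied by independence. The sketch below does this on synthetic return series; it is not the paper's semi-parametric estimator.

```python
# Empirical check on joint large losses vs. the independence benchmark.
import numpy as np

def joint_tail_prob(x, y, q=0.05):
    """Empirical P(X below its q-quantile AND Y below its q-quantile)."""
    return np.mean((x <= np.quantile(x, q)) & (y <= np.quantile(y, q)))

# Synthetic stand-ins for two heavy-tailed index return series with a common shock.
rng = np.random.default_rng(3)
common = rng.standard_t(df=3, size=5000)
ret_a = 0.7 * common + 0.7 * rng.standard_t(df=3, size=5000)
ret_b = 0.7 * common + 0.7 * rng.standard_t(df=3, size=5000)

q = 0.05
print("observed joint tail prob:", joint_tail_prob(ret_a, ret_b, q),
      " under independence:", q * q)
```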

    Evaluating Yield Models for Crop Insurance Rating

    Generated crop insurance rates depend critically on the distributional assumptions of the underlying crop yield loss model. Using farm-level corn yield data from 1972-2008, we revisit the problem of examining in-sample goodness-of-fit measures across a set of flexible parametric, semi-parametric, and non-parametric distributions. Simulations are also conducted to investigate the out-of-sample efficiency properties of several competing distributions. The results indicate that more heavily parameterized distributional forms fit the data better in-sample simply because they have more parameters, but are generally less efficient out-of-sample, and in some cases more biased, than more parsimonious forms which also fit the data adequately, such as the Weibull. The results highlight the relative advantages of alternative distributions in terms of the bias-efficiency tradeoff in both in- and out-of-sample frameworks. Keywords: Yield Distributions, Crop Insurance, Weibull Distribution, Beta Distribution, Mixture Distribution, Out-of-Sample Efficiency, Goodness-of-Fit, Insurance Rating Efficiency, Farm Management, Financial Economics, Land Economics/Use.
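
    A minimal version of the in-sample comparison can be sketched with standard maximum-likelihood fits and AIC. The yield data below are synthetic and the candidate set is illustrative (a Beta fit would additionally require rescaling yields to (0, 1)); no out-of-sample rating simulation is attempted here.

```python
# Compare candidate yield distributions in-sample by AIC on synthetic yields.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
yields = rng.weibull(4.0, size=300) * 150.0        # synthetic corn yields (bu/acre)

candidates = {"weibull": stats.weibull_min, "gamma": stats.gamma, "normal": stats.norm}

for name, dist in candidates.items():
    params = dist.fit(yields)                      # maximum-likelihood fit
    loglik = dist.logpdf(yields, *params).sum()
    aic = 2 * len(params) - 2 * loglik             # lower is better
    print(f"{name:8s} AIC = {aic:.1f}")
```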

    Innovation Behaviour At Farm Level – Selection And Identification

    Using a sequential logit model and a mixed-effects logistic regression approach, this empirical study investigates factors in the adoption of automatic milking technology (AMS) at the farm level, accounting for problems of sequential sample selection and behaviour identification. The results suggest the importance of the farmer’s risk perception, significant effects of peer-group behaviour, and a positive impact of previous innovation experiences. Keywords: Technology Adoption, Mixed-Effects Regression, Risk, Agricultural and Food Policy, Farm Management, Land Economics/Use.
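
    As a simplified illustration of the sequential-selection idea, the sketch below fits a two-stage (sequential) logit on synthetic farm data: stage 1 for whether adoption is considered, stage 2 for adoption conditional on consideration. The variables are invented, and the mixed-effects (peer-group) component of the study's approach is omitted.

```python
# Two-stage sequential logit on synthetic adoption data (no random effects).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 1000
herd_size = rng.normal(60, 20, n)                  # hypothetical covariates
risk_tolerance = rng.normal(0, 1, n)
X = sm.add_constant(np.column_stack([herd_size, risk_tolerance]))

# Synthetic outcomes, for illustration only.
p_consider = 1 / (1 + np.exp(-(0.02 * herd_size - 1.0 + 0.5 * risk_tolerance)))
consider = rng.binomial(1, p_consider)
adopt = consider * rng.binomial(1, 1 / (1 + np.exp(-0.5 * risk_tolerance)))

stage1 = sm.Logit(consider, X).fit(disp=0)                             # P(consider)
stage2 = sm.Logit(adopt[consider == 1], X[consider == 1]).fit(disp=0)  # P(adopt | consider)
print(stage1.params)
print(stage2.params)
```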

    • …