
    Measuring measuring: Toward a theory of proficiency with the Constructing Measures framework

    This paper is relevant to measurement educators who are interested in the variability of understanding and use of the four building blocks in the Constructing Measures framework (Wilson, 2005). It proposes a unidimensional structure for understanding Wilson’s framework and explores the evidence for and against this conceptualization. Constructed-response and fixed-choice items were used to collect responses from 72 participants who range in experience and expertise with constructing measures. The data were scored by two raters and analyzed with the Rasch partial credit model using ConQuest (1998). Guided by the 1999 Testing Standards, analyses of validity and reliability evidence provide support for the construct theory and for limited uses of the instrument pending item design modifications.

    Backtesting Expected Shortfall: a simple recipe?

    We propose a new backtesting framework for Expected Shortfall that could be used by the regulator. Instead of looking at the estimated capital reserve and the realised cash-flow separately, one could bind them into the secured position, for which risk measurement is much easier. Using this simple concept, combined with the monotonicity of Expected Shortfall with respect to its target confidence level, we introduce a natural and efficient backtesting framework. Our test statistic is given by the biggest number of worst realisations for the secured position that add up to a negative total. Surprisingly, this simple quantity could be used to construct an efficient backtesting framework for unconditional coverage of Expected Shortfall in a natural extension of the regulatory traffic-light approach for Value-at-Risk. While being easy to calculate, the test statistic is based on the underlying duality between coherent risk measures and scale-invariant performance measures.
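The test statistic described in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the sign convention (the reserve is a positive amount added to the realised profit and loss), and the sample numbers are all assumptions made here for the example.

```python
from itertools import accumulate


def secured_positions(pnl, reserves):
    """Combine realised P&L with the capital reserve held against it.

    Assumption: reserves are positive amounts, so the secured position
    on each day is simply pnl + reserve.
    """
    if len(pnl) != len(reserves):
        raise ValueError("pnl and reserves must have the same length")
    return [x + r for x, r in zip(pnl, reserves)]


def worst_realisation_count(secured):
    """Largest number of worst secured-position outcomes whose sum is negative."""
    count = 0
    # Sort ascending so the worst outcomes come first, then walk the running total.
    for running_total in accumulate(sorted(secured)):
        if running_total < 0:
            count += 1
        else:
            # Once the running total is non-negative, every later term is
            # positive, so the total cannot become negative again.
            break
    return count


# Hypothetical data: five days of P&L against a constant reserve of 2.0.
example_pnl = [-3.0, 1.0, -0.5, 2.0, -4.0]
example_reserves = [2.0] * 5
example_secured = secured_positions(example_pnl, example_reserves)
example_count = worst_realisation_count(example_secured)
```

With these hypothetical inputs the secured positions are -1.0, 3.0, 1.5, 4.0 and -2.0, and the three worst of them still sum to a negative total, so the count is 3. How such a count maps to traffic-light zones is not specified in the abstract and is left out of the sketch.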

    Edge-weighting of gene expression graphs

    In recent years, considerable research effort has been directed to microarray technologies and their role in providing simultaneous information on expression profiles for thousands of genes. These data, when subjected to clustering and classification procedures, can assist in identifying patterns and providing insight into biological processes. To understand the properties of complex gene expression datasets, graphical representations can be used. Intuitively, the data can be represented as a bipartite graph, with weighted edges corresponding to gene-sample node couples in the dataset. Biologically meaningful subgraphs can be sought, but performance can be influenced both by the search algorithm and by the graph-weighting scheme, and both merit rigorous investigation. In this paper, we focus on edge-weighting schemes for bipartite graphical representation of gene expression. Two novel methods are presented: the first is based on empirical evidence; the second on a geometric distribution. The schemes are compared on several real datasets, assessing efficiency of performance based on four essential properties: robustness to noise and missing values, discrimination, parameter influence on scheme efficiency, and reusability. Recommendations and limitations are briefly discussed.

    Evolution of statistical analysis in empirical software engineering research: Current state and steps forward

    Software engineering research is evolving, and papers are increasingly based on empirical data from a multitude of sources, using statistical tests to determine whether and to what degree empirical evidence supports their hypotheses. To investigate the practices and trends of statistical analysis in empirical software engineering (ESE), this paper presents a review of a large pool of papers from top-ranked software engineering journals. First, we manually reviewed 161 papers; in the second phase of our method, we conducted a more extensive semi-automatic classification of 5,196 papers spanning the years 2001-2015. Results from both review steps were used to: i) identify and analyze the predominant practices in ESE (e.g., using t-test or ANOVA), as well as relevant trends in usage of specific statistical methods (e.g., nonparametric tests and effect size measures), and ii) develop a conceptual model for a statistical analysis workflow with suggestions on how to apply different statistical methods as well as guidelines to avoid pitfalls. Lastly, we confirm existing claims that current ESE practices lack a standard for reporting practical significance of results. We illustrate how practical significance can be discussed in terms of both the statistical analysis and the practitioner's context.
    Comment: journal submission, 34 pages, 8 figures
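As a small illustration of what an effect size measure adds to a significance test, the sketch below computes Cohen's d for two independent samples. Cohen's d is chosen here only as a familiar example; the abstract does not say which measures the authors recommend, and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(sample_a, sample_b):
    """Standardised mean difference between two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Pooled standard deviation built from the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var)


# Hypothetical measurements, e.g. task completion times under two tools.
tool_a = [2.0, 4.0, 6.0]
tool_b = [1.0, 3.0, 5.0]
effect = cohens_d(tool_a, tool_b)
```

For these made-up numbers the means differ by 1.0 and the pooled standard deviation is 2.0, giving d = 0.5. Whether a difference of that size matters in practice still depends on the practitioner's context, which is the point the abstract makes.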

    Are Smell-Based Metrics Actually Useful in Effort-Aware Structural Change-Proneness Prediction? An Empirical Study

    Bad code smells (also called code smells) are symptoms of poor design choices in implementation. Existing studies have empirically confirmed that the presence of code smells increases the likelihood of subsequent changes (i.e., change-proneness). However, to the best of our knowledge, no prior studies have leveraged smell-based metrics to predict a particular change type (i.e., structural changes). Moreover, when evaluating the effectiveness of smell-based metrics in structural change-proneness prediction, none of the existing studies takes into account the effort of inspecting change-prone source code. In this paper, we consider five smell-based metrics for effort-aware structural change-proneness prediction and compare these metrics with a baseline of well-known CK metrics in predicting particular categories of change types. Specifically, we first employ univariate logistic regression to analyze the correlation between each smell-based metric and structural change-proneness. Then, we build multivariate prediction models to examine the effectiveness of smell-based metrics in effort-aware structural change-proneness prediction when used alone and when used together with the baseline metrics. Our experiments are conducted on six Java open-source projects with up to 60 versions, and the results indicate that: (1) all smell-based metrics are significantly related to structural change-proneness, except metric ANS in hive and SCM in camel after removing the confounding effect of file size; (2) in most cases, smell-based metrics outperform the baseline metrics in predicting structural change-proneness; and (3) when used together with the baseline metrics, the smell-based metrics are more effective at predicting change-prone files when inspection effort is taken into account.

    GFC-Robust Risk Management Under the Basel Accord Using Extreme Value Methodologies

    In McAleer et al. (2010b), a risk management strategy that is robust to the Global Financial Crisis (GFC) was proposed under the Basel II Accord by selecting a Value-at-Risk (VaR) forecast that combines the forecasts of different VaR models. The robust forecast was based on the median of the point VaR forecasts of a set of conditional volatility models. In this paper we provide further evidence on the suitability of the median as a GFC-robust strategy by using an additional set of new extreme value forecasting models and by extending the sample period for comparison. These extreme value models include DPOT and Conditional EVT. Such models might be expected to be useful in explaining financial data, especially in the presence of extreme shocks that arise during a GFC. Our empirical results confirm that the median remains GFC-robust even in the presence of these new extreme value models. This is illustrated by using the S&P500 index before, during and after the 2008-09 GFC. We investigate the performance of a variety of single and combined VaR forecasts in terms of daily capital requirements and violation penalties under the Basel II Accord, as well as other criteria, including several tests for independence of the violations. The strategy based on the median, or more generally on combined forecasts of single models, is straightforward to incorporate into existing computer software packages that are used by banks and other financial institutions.
    Keywords: Value-at-Risk (VaR); DPOT; daily capital charges; robust forecasts; violation penalties; optimizing strategy; aggressive risk management; conservative risk management; Basel; global financial crisis
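The median-combination idea in this abstract is simple enough to sketch directly. The code below is an illustration under stated assumptions, not the authors' code: model names and numbers are hypothetical, VaR forecasts are expressed as negative return thresholds, and a violation is counted whenever the realised return falls below the combined forecast.

```python
from statistics import median


def median_var_forecast(forecasts_by_model):
    """Combine per-model daily VaR forecasts by taking the median on each day."""
    series = list(forecasts_by_model.values())
    if not series:
        raise ValueError("at least one model is required")
    n_days = len(series[0])
    if any(len(s) != n_days for s in series):
        raise ValueError("all models must cover the same days")
    # zip(*series) regroups the forecasts day by day across models.
    return [median(day) for day in zip(*series)]


def count_violations(returns, var_forecasts):
    """Number of days on which the realised return is worse than the forecast."""
    return sum(1 for r, v in zip(returns, var_forecasts) if r < v)


# Hypothetical forecasts from three unnamed models over three days.
example_forecasts = {
    "model_a": [-2.0, -2.5, -1.8],
    "model_b": [-1.5, -3.0, -2.2],
    "model_c": [-2.4, -2.0, -2.0],
}
combined = median_var_forecast(example_forecasts)
violations = count_violations([-2.1, 0.3, -1.0], combined)
```

With these made-up inputs the combined forecast is -2.0, -2.5, -2.0 and only the first day counts as a violation. Capital charges and violation penalties under the Basel II Accord depend on further rules that the abstract does not spell out, so they are not modelled here.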