Measuring measuring: Toward a theory of proficiency with the Constructing Measures framework
This paper is relevant to measurement educators who are interested in the variability of understanding and use of the four building blocks in the Constructing Measures framework (Wilson, 2005). It proposes a uni-dimensional structure for understanding Wilson's framework, and explores the evidence for and against this conceptualization. Constructed-response and fixed-choice items are used to collect responses from 72 participants who range in experience and expertise with constructing measures. The data were scored by two raters and analyzed with the Rasch partial credit model using ConQuest (1998). Guided by the 1999 Testing Standards, analyses of validity and reliability evidence provide support for the construct theory and for limited uses of the instrument pending item design modifications.
Backtesting Expected Shortfall: a simple recipe?
We propose a new backtesting framework for Expected Shortfall that could be
used by the regulator. Instead of looking at the estimated capital reserve and
the realised cash-flow separately, one could bind them into the secured
position, for which risk measurement is much easier. Using this simple concept
combined with monotonicity of Expected Shortfall with respect to its target
confidence level we introduce a natural and efficient backtesting framework.
Our test statistic is given by the largest number of worst realisations for
the secured position that add up to a negative total. Surprisingly, this simple
quantity could be used to construct an efficient backtesting framework for
unconditional coverage of Expected Shortfall in a natural extension of the
regulatory traffic-light approach for Value-at-Risk. While being easy to
calculate, the test statistic is based on the underlying duality between
coherent risk measures and scale-invariant performance measures.
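The test statistic described above can be sketched in a few lines. This is an illustrative reconstruction from the abstract's description, not the authors' code; `secured_positions` is a hypothetical input name for the realised daily outcomes of the secured position (capital reserve plus realised cash-flow):

```python
def es_backtest_statistic(secured_positions):
    """Largest number of worst (most negative) realisations of the
    secured position whose sum is still a negative total.

    A sketch of the statistic described in the abstract; larger values
    suggest the Expected Shortfall reserve was too small."""
    worst_first = sorted(secured_positions)  # most negative outcomes first
    total, k = 0.0, 0
    for i, x in enumerate(worst_first, start=1):
        total += x
        if total < 0:
            k = i  # the i worst realisations still sum to a negative total
    return k
```

For example, with outcomes `[-3, 1, 2, 5]` the two worst realisations sum to -2 while the three worst sum to 0, so the statistic is 2.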
Edge-weighting of gene expression graphs
In recent years, considerable research effort has been directed to micro-array technologies and their role in providing simultaneous information on expression profiles for thousands of genes. These data, when subjected to clustering and classification procedures, can assist in identifying patterns and provide insight into biological processes. To understand the properties of complex gene expression datasets, graphical representations can be used. Intuitively, the data can be represented as a bipartite graph, with weighted edges corresponding to gene-sample node couples in the dataset. Biologically meaningful subgraphs can then be sought, but performance is influenced both by the search algorithm and by the graph-weighting scheme, and both merit rigorous investigation. In this paper, we focus on edge-weighting schemes for bipartite graphical representations of gene expression. Two novel methods are presented: the first is based on empirical evidence; the second on a geometric distribution. The schemes are compared on several real datasets, and efficiency of performance is assessed on four essential properties: robustness to noise and missing values, discrimination, parameter influence on scheme efficiency, and reusability. Recommendations and limitations are briefly discussed.
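The bipartite representation described above can be sketched minimally as follows. The min-max normalised weight used here is only a placeholder, since the abstract does not specify the paper's empirical or geometric weighting schemes:

```python
def expression_bipartite_edges(expr, genes, samples):
    """Bipartite edge weights for a gene-expression matrix.

    expr[i][j] is the expression of genes[i] in samples[j]; every
    (gene, sample) couple becomes a weighted edge. The weight is a
    simple min-max normalisation of expression -- a placeholder for
    the empirical or geometric schemes studied in the paper."""
    flat = [v for row in expr for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # guard against a constant matrix
    return {(g, s): (expr[i][j] - lo) / span
            for i, g in enumerate(genes)
            for j, s in enumerate(samples)}
```

A subgraph-search procedure would then operate on this edge dictionary, so the weighting scheme can be swapped out independently of the search algorithm.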
Evolution of statistical analysis in empirical software engineering research: Current state and steps forward
Software engineering research is evolving and papers are increasingly based
on empirical data from a multitude of sources, using statistical tests to
determine if and to what degree empirical evidence supports their hypotheses.
To investigate the practices and trends of statistical analysis in empirical
software engineering (ESE), this paper presents a review of a large pool of
papers from top-ranked software engineering journals. First, we manually
reviewed 161 papers and in the second phase of our method, we conducted a more
extensive semi-automatic classification of papers spanning the years 2001--2015
and 5,196 papers. Results from both review steps were used to: i) identify and
analyze the predominant practices in ESE (e.g., using t-test or ANOVA), as well
as relevant trends in usage of specific statistical methods (e.g.,
nonparametric tests and effect size measures) and, ii) develop a conceptual
model for a statistical analysis workflow with suggestions on how to apply
different statistical methods as well as guidelines to avoid pitfalls. Lastly,
we confirm existing claims that current ESE practices lack a standard to report
practical significance of results. We illustrate how practical significance can
be discussed in terms of both the statistical analysis and in the
practitioner's context.
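One common way to report practical significance alongside a statistical test, in the spirit of the effect-size measures mentioned above, is Cohen's d. A minimal pure-Python sketch, not tied to the paper's workflow model:

```python
import statistics

def cohens_d(sample_a, sample_b):
    """Cohen's d with pooled standard deviation: a standardised effect
    size for the difference between two group means. Values near 0.2,
    0.5 and 0.8 are conventionally read as small, medium and large."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd
```

Reporting d next to a p-value lets a practitioner judge whether a statistically significant difference is large enough to matter in their context.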
Are Smell-Based Metrics Actually Useful in Effort-Aware Structural Change-Proneness Prediction? An Empirical Study
Bad code smells (also known as code smells) are symptoms of poor design choices in implementation. Existing studies have empirically confirmed that the presence of code smells increases the likelihood of subsequent changes (i.e., change-proneness). However, to the best of our knowledge, no prior study has leveraged smell-based metrics to predict a particular change type (i.e., structural changes). Moreover, when evaluating the effectiveness of smell-based metrics in structural change-proneness prediction, none of the existing studies takes into account the effort of inspecting the change-prone source code. In this paper, we consider five smell-based metrics for effort-aware structural change-proneness prediction and compare these metrics with a baseline of well-known CK metrics in predicting particular categories of change types. Specifically, we first employ univariate logistic regression to analyze the correlation between each smell-based metric and structural change-proneness. Then, we build multivariate prediction models to examine the effectiveness of smell-based metrics in effort-aware structural change-proneness prediction when used alone and when used together with the baseline metrics, respectively. Our experiments are conducted on six Java open-source projects with up to 60 versions, and the results indicate that: (1) all smell-based metrics are significantly related to structural change-proneness, except metric ANS in hive and SCM in camel after removing the confounding effect of file size; (2) in most cases, smell-based metrics outperform the baseline metrics in predicting structural change-proneness; and (3) when used together with the baseline metrics, the smell-based metrics are more effective at predicting change-prone files when inspection effort is taken into account.
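Effort-aware evaluation of the kind discussed above is commonly operationalised as recall within a fixed inspection budget. The following is a hypothetical sketch, assuming files are ranked by score density (predicted score per line of code); the abstract does not state the paper's exact protocol:

```python
def recall_at_effort(files, effort_fraction=0.2):
    """Fraction of change-prone files found within an inspection budget.

    `files`: list of (predicted_score, loc, is_change_prone) triples.
    Files are inspected in descending order of score density
    (score / LOC), a common effort-aware ranking, until the budget of
    `effort_fraction` of total LOC is exhausted."""
    budget = effort_fraction * sum(loc for _, loc, _ in files)
    ranked = sorted(files, key=lambda f: f[0] / f[1], reverse=True)
    inspected_loc, found = 0, 0
    total_prone = sum(1 for _, _, prone in files if prone)
    for score, loc, prone in ranked:
        if inspected_loc + loc > budget:
            break
        inspected_loc += loc
        found += prone
    return found / total_prone if total_prone else 0.0
```

A metric that concentrates change-prone files in small, high-scoring files scores well here even if its plain classification accuracy is unchanged.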
GFC-Robust Risk Management Under the Basel Accord Using Extreme Value Methodologies
In McAleer et al. (2010b), a robust risk management strategy to the Global Financial Crisis (GFC) was proposed under the Basel II Accord by selecting a Value-at-Risk (VaR) forecast that combines the forecasts of different VaR models. The robust forecast was based on the median of the point VaR forecasts of a set of conditional volatility models. In this paper we provide further evidence on the suitability of the median as a GFC-robust strategy by using an additional set of new extreme value forecasting models and by extending the sample period for comparison. These extreme value models include DPOT and Conditional EVT. Such models might be expected to be useful in explaining financial data, especially in the presence of extreme shocks that arise during a GFC. Our empirical results confirm that the median remains GFC-robust even in the presence of these new extreme value models. This is illustrated by using the S&P500 index before, during and after the 2008-09 GFC. We investigate the performance of a variety of single and combined VaR forecasts in terms of daily capital requirements and violation penalties under the Basel II Accord, as well as other criteria, including several tests for independence of the violations. The strategy based on the median, or more generally, on combined forecasts of single models, is straightforward to incorporate into existing computer software packages that are used by banks and other financial institutions.

Keywords: Value-at-Risk (VaR); DPOT; daily capital charges; robust forecasts; violation penalties; optimizing strategy; aggressive risk management; conservative risk management; Basel; global financial crisis
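The median-based combined forecast is indeed straightforward to compute; a minimal sketch, where the model names and forecast values are illustrative rather than taken from the paper:

```python
import statistics

def robust_var_forecast(model_forecasts):
    """GFC-robust combined forecast described in the abstract: the
    median of the point VaR forecasts produced by the single models."""
    return statistics.median(model_forecasts.values())

# Hypothetical one-day-ahead VaR forecasts from individual models.
forecasts = {"GARCH": -2.1, "EGARCH": -1.8, "Conditional EVT": -2.5}
combined = robust_var_forecast(forecasts)  # -2.1, the median forecast
```

Because the median discards the most extreme single-model forecasts each day, it avoids chasing either the most aggressive or the most conservative model.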
- …