7,357 research outputs found
The consistency of empirical comparisons of regression and analogy-based software project cost prediction
OBJECTIVE - to determine the consistency within and between results in empirical studies of software engineering cost estimation. We focus on regression and analogy techniques as these are commonly used. METHOD - we conducted an exhaustive search using predefined inclusion and exclusion criteria and identified 67 journal papers and 104 conference papers. From this sample we identified 11 journal papers and 9 conference papers that used both methods. RESULTS - our analysis found that about 25% of studies were internally inconclusive. We also found that there is approximately equal evidence in favour of, and against analogy-based methods. CONCLUSIONS - we confirm the lack of consistency in the findings and argue that this inconsistent pattern from 20 different studies comparing regression and analogy is somewhat disturbing. It suggests that we need to ask more detailed questions than just: "What is the best prediction system?"
Software project economics: A roadmap
The objective of this paper is to consider research progress in the field of software project economics with a view to identifying important challenges and promising research directions. I argue that this is an important sub-discipline since this will underpin any cost-benefit analysis used to justify the resourcing, or otherwise, of a software project. To accomplish this I conducted a bibliometric analysis of peer reviewed research articles to identify major areas of activity. My results indicate that the primary goal of more accurate cost prediction systems remains largely unachieved. However, there are a number of new and promising avenues of research including: how we can combine results from primary studies, integration of multiple predictions and applying greater emphasis upon the human aspects of prediction tasks. I conclude that the field is likely to remain very challenging due to the people-centric nature of software engineering, since it is in essence a design task. Nevertheless the need for good economic models will grow rather than diminish as software becomes increasingly ubiquitous.
Evaluating prediction systems in software project estimation
This is the Pre-print version of the Article - Copyright @ 2012 Elsevier
Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results.
Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems.
Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random "predictions", that is guessing, and (3) calculation of effect sizes.
Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing.
Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners.
Martin Shepperd was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/H050329
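The three steps named in the Method can be illustrated with a minimal sketch. It assumes the commonly cited definitions SA = 1 - MAR / MAR_P0 and effect size = (MAR - MAR_P0) / s_P0, where MAR is the mean absolute residual and P0 is random guessing (each case is "predicted" with the actual value of another randomly chosen case). The function names, the number of runs, and the seed are hypothetical choices for illustration, not taken from the abstract.

```python
import random
import statistics


def mean_absolute_residual(actual, predicted):
    """Mean absolute residual (MAR) between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def random_guessing_mars(actual, runs=1000, seed=0):
    """MAR of each random-guessing run: every case is 'predicted' with the
    actual value of a different, randomly chosen case (assumed baseline P0)."""
    rng = random.Random(seed)
    n = len(actual)
    mars = []
    for _ in range(runs):
        guesses = []
        for i in range(n):
            j = rng.randrange(n - 1)  # pick an index among the other n - 1 cases
            if j >= i:
                j += 1                # skip the case being predicted
            guesses.append(actual[j])
        mars.append(mean_absolute_residual(actual, guesses))
    return mars


def standardised_accuracy(actual, predicted, runs=1000, seed=0):
    """Return (SA, effect size) of a prediction system relative to guessing."""
    mar = mean_absolute_residual(actual, predicted)
    baseline = random_guessing_mars(actual, runs, seed)
    mar_p0 = statistics.mean(baseline)   # average MAR of random guessing
    s_p0 = statistics.stdev(baseline)    # spread of the guessing MARs
    sa = 1 - mar / mar_p0                # 1 = perfect, 0 = no better than guessing
    effect = (mar - mar_p0) / s_p0       # negative = better than guessing
    return sa, effect
```

Under these assumptions, an SA near zero or below means the prediction system does no better than guessing, and the magnitude of the effect size indicates how far the result sits from the guessing distribution.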
Daytime sensible heat flux estimation over heterogeneous surfaces using multitemporal land-surface temperature observations
Equations based on surface renewal (SR) analysis to estimate the sensible heat flux (H) require as input the mean ramp amplitude and period observed in the ramp-like pattern of the air temperature measured at high frequency. A SR-based method to estimate sensible heat flux (HSR-LST) requiring only low-frequency measurements of the air temperature, horizontal mean wind speed, and land-surface temperature as input was derived and tested under unstable conditions over a heterogeneous canopy (olive grove). HSR-LST assumes that the mean ramp amplitude can be inferred from the difference between land-surface temperature and mean air temperature through a linear relationship and that the ramp frequency is related to a wind shear scale characteristic of the canopy flow. The land-surface temperature was retrieved by integrating in situ sensing measures of thermal infrared energy emitted by the surface. The performance of HSR-LST was analyzed against flux tower measurements collected at two heights (close to and well above the canopy top). Crucial parameters involved in HSR-LST, which define the above-mentioned linear relationship, were explained using the canopy height and the land-surface temperature observed at sunrise and sunset. Although the olive grove can behave as either an isothermal or anisothermal surface, HSR-LST performed close to H measured using the eddy covariance and the Bowen ratio energy balance methods. Root mean square differences between HSR-LST and measured H were about 55 W m^-2. Thus, by using multitemporal thermal acquisitions, HSR-LST appears to bypass inconsistency between land-surface temperature and the mean aerodynamic temperature. The one-source bulk transfer formulation for estimating H performed reliably after calibration against the eddy covariance method.
After calibration, the latter performed similarly to the proposed SR-LST method.
This research was funded by projects CGL2012-37416-C04-01 and CGL2015-65627-C3-1-R (Ministerio de Ciencia y Innovación of Spain), CEI Iberus, 2014 (project funded by the Ministerio de Educación within the framework of the Programa Campus de Excelencia Internacional of Spain), and Ayuda para estancias en centros extranjeros (Ministerio de Educación, Cultura y Deporte of Spain)
The Log of Gravity
Although economists have long been aware of Jensen's inequality, many econometric applications have neglected an important implication of it: the standard practice of interpreting the parameters of log-linearized models estimated by ordinary least squares as elasticities can be highly misleading in the presence of heteroskedasticity. This paper explains why this problem arises and proposes an appropriate estimator. Our criticism of conventional practices and the solution we propose extend to a broad range of economic applications where the equation under study is log-linearized. We develop the argument using one particular illustration, the gravity equation for trade, and apply the proposed technique to provide new estimates of this equation. We find significant differences between estimates obtained with the proposed estimator and those obtained with the traditional method. These discrepancies persist even when the gravity equation takes into account multilateral resistance terms or fixed effects.
Keywords: Elasticities, Gravity equation, Heteroskedasticity, Jensen's inequality, Poisson regression, Preferential-trade agreements
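The argument in this abstract can be sketched in a few lines of algebra. The notation below (y_i, x_i, \beta, \eta_i, \sigma_i^2) is assumed for illustration and does not appear in the abstract; the log-normal error is only an example case, and the last line shows the first-order conditions of a Poisson pseudo-maximum-likelihood estimator, which the keyword list suggests is the kind of estimator proposed.

```latex
% Multiplicative model with a mean-one error term (assumed notation)
y_i = \exp(x_i \beta)\, \eta_i, \qquad E[\eta_i \mid x_i] = 1

% Log-linearised form that OLS would estimate
\ln y_i = x_i \beta + \ln \eta_i

% Jensen's inequality: E[\ln \eta_i \mid x_i] \neq \ln E[\eta_i \mid x_i] = 0.
% Example: if \eta_i is log-normal with Var[\eta_i \mid x_i] = \sigma_i^2(x_i), then
E[\ln \eta_i \mid x_i] = -\tfrac{1}{2} \ln\bigl(1 + \sigma_i^2(x_i)\bigr)

% Poisson pseudo-maximum-likelihood first-order conditions, estimated in levels
\sum_{i=1}^{n} \bigl[\, y_i - \exp(x_i \hat{\beta}) \,\bigr]\, x_i = 0
```

In this example, whenever the error variance depends on the regressors, the error of the log-linearised equation is correlated with x_i, so OLS on logs is inconsistent for \beta; the levels-based conditions in the last line avoid taking logs of y_i.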
Regression Discontinuity Designs Using Covariates
We study regression discontinuity designs when covariates are included in the
estimation. We examine local polynomial estimators that include discrete or
continuous covariates in an additive separable way, but without imposing any
parametric restrictions on the underlying population regression functions. We
recommend a covariate-adjustment approach that retains consistency under
intuitive conditions, and characterize the potential for estimation and
inference improvements. We also present new covariate-adjusted mean squared
error expansions and robust bias-corrected inference procedures, with
heteroskedasticity-consistent and cluster-robust standard errors. An empirical
illustration and an extensive simulation study are presented. All methods are
implemented in R and Stata software packages.
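As a rough illustration of including covariates "in an additive separable way", the local linear case can be written as a kernel-weighted least-squares problem. This is a sketch under assumed notation (outcome Y_i, running variable X_i, cutoff c, covariates Z_i, kernel K, bandwidth h), none of which is given in the abstract, and it should not be read as the exact estimator of the paper.

```latex
% Treatment indicator at the cutoff c (assumed notation)
T_i = \mathbf{1}\{X_i \ge c\}

% Covariate-adjusted local linear fit: \hat{\tau} estimates the jump at the cutoff,
% and the covariates Z_i enter additively with one common coefficient \gamma
(\hat{\alpha}, \hat{\tau}, \hat{\beta}_0, \hat{\beta}_1, \hat{\gamma})
  = \arg\min_{\alpha, \tau, \beta_0, \beta_1, \gamma}
    \sum_{i=1}^{n}
    \bigl( Y_i - \alpha - T_i \tau - (X_i - c)\beta_0 - T_i (X_i - c)\beta_1 - Z_i^{\top} \gamma \bigr)^2
    K\!\left( \frac{X_i - c}{h} \right)
```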
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
We empirically show that Bayesian inference can be inconsistent under
misspecification in simple linear regression problems, both in a model
averaging/selection and in a Bayesian ridge regression setting. We use the
standard linear model, which assumes homoskedasticity, whereas the data are
heteroskedastic, and observe that the posterior puts its mass on ever more
high-dimensional models as the sample size increases. To remedy the problem, we
equip the likelihood in Bayes' theorem with an exponent called the learning
rate, and we propose the Safe Bayesian method to learn the learning rate from
the data. SafeBayes tends to select small learning rates as soon as the standard
posterior is not 'cumulatively concentrated', and its results on our data are
quite encouraging.
Comment: 70 pages, 20 figures
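The idea of equipping the likelihood with an exponent can be written as a generalised posterior. The symbols below (\theta for the parameters, z^n for the data, \eta for the learning rate, \pi for the prior) are assumed for illustration, as is the restriction of \eta to (0, 1].

```latex
% Generalised posterior with learning rate \eta (assumed notation)
\pi(\theta \mid z^n) \;\propto\; p(z^n \mid \theta)^{\eta}\, \pi(\theta),
\qquad 0 < \eta \le 1
```

Setting \eta = 1 recovers the standard Bayesian posterior, while smaller values downweight the likelihood relative to the prior.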