We review the methods used in many papers to evaluate DSGE models by comparing their simulated moments and other features with data equivalents. We note that they select, scale and characterise the shocks without reference to the data; crucially they fail to use the joint distribution of the features under comparison. We illustrate this point by recomputing an assessment of a two-country model in a recent paper; we find that the paper's conclusions are essentially reversed.

Cardiff University

Le, Vo Phuong Mai

Minford, Patrick

Wickens, Michael

Online Research @ Cardiff

Cardiff Economics Working Papers

Vo Phuong Mai Le, Patrick Minford and Michael Wickens

Some problems in the testing of DSGE models

E2009/31

Cardiff Business School Working Paper Series

This working paper is produced for discussion purposes only. These working papers are expected to be published in due course, in revised form, and should not be quoted or cited without the author's written permission.

Cardiff Economics Working Papers are available online from: http://www.cardiff.ac.uk/carbs/econ/workingpapers
Enquiries: EconWP@cardiff.ac.uk

ISSN 1749-6101
Updated December 2009

Cardiff Business School, Cardiff University, Colum Drive, Cardiff CF10 3EU, United Kingdom
t: +44 (0)29 2087 4000
f: +44 (0)29 2087 4419
www.cardiff.ac.uk/carbs

Vo Phuong Mai Le (Cardiff University)
Patrick Minford (Cardiff University and CEPR)
Michael Wickens (Cardiff University, University of York and CEPR)

December 18, 2009

JEL Classification: C12, C32, C52, E1
Keywords: Bootstrap, US-EU model, DSGE, VAR, indirect inference, Wald statistic, anomaly, puzzle

We thank Arnab Bhattarachee, Huw Dixon, George Evans, Paul Levine, David Meenagh, Jo Pearlman, Christian Thoenissen, Jiang Yue and participants at the CDMA Conference in St. Andrews 2009, for helpful comments.
This work was supported by the UK's Economic and Social Research Council under grants RES-165-25-0020 and PTA-026-27-1623.

Vo Phuong Mai Le: Levp@cf.ac.uk; Cardiff Business School, Cardiff University, Aberconway Building, Colum Drive, Cardiff, CF10 3EU, UK
Patrick Minford: Patrick.minford@btinternet.com; Cardiff Business School, Cardiff University, Aberconway Building, Colum Drive, Cardiff, CF10 3EU, UK
Michael Wickens: Cardiff Business School, Cardiff University, Aberconway Building, Colum Drive, Cardiff, CF10 3EU, UK

1 Evaluating DSGE models: some problems with existing procedures

A popular way to evaluate macro DSGE models is to ask whether they can replicate "facts" such as a correlation between two variables, like consumption and the real exchange rate (generally a reduced-form relationship that is reliably supposed to be observed in various available samples of data). Thus a model, when subjected to the shocks that under its theory are supposed to strike it, should generate under such stochastic simulation a relationship on average close to the one in the data, while also similarly generating other relationships close to those found in business cycle data.

The question we address in this paper is whether this procedure, as widely carried out at present, is being correctly implemented from the viewpoint of statistical testing. We highlight three aspects of this procedure:

1) It is usual to limit the shocks applied to the DSGE model to a few: for example productivity, money supply and government spending shocks. Yet consider a DSGE model of, say, 5 endogenous variables and 5 equations. When confronted with the data, this model's implied residuals can be estimated together with the residuals' time-series properties. These residuals are the "shocks" it implies. Furthermore, unless the model includes all 5 of these shocks, "stochastic singularity" occurs, whereby at least one variable is a deterministic combination of the others; necessarily this will be false unless the "shock" in one equation is zero.
While one or more shocks might be treated as non-stochastic (e.g. as measurement errors), this must be justified on the grounds that such were one-off in nature; then the data needs to be adjusted for their effect before the comparison of the stochastic simulations with the facts.

2) One or more of these shocks may also be scaled to fit some basic variances, such as that of GDP; also their time-series properties may be imposed by the modeller. Yet when we know, as above, the true shocks conditional on the model being tested (the null), any arbitrary choice of scale and time-series properties will violate the model and data. For example, it is possible that the model greatly over- or under-predicts the data moments: by scaling the errors this discrepancy can be removed. Yet in effect one is multiplying the true errors by a set of impact parameters that now enter the model in addition to existing parameters; these parameters also impact on other variables.

3) There is no consideration of the joint distributions of the chosen descriptors. It is usual to show the distributions of the descriptors individually. Yet the test is whether as a group they are consistent with the model; for this it is necessary to use their joint distribution since the model will imply covariances between these descriptors.
Thus, for example, the Fisher equation in a DSGE model will imply that the persistence of inflation and interest rates, two favourite descriptors, will be highly correlated: thus in samples created by the DSGE model from its shocks where inflation is persistent, so will interest rates be; and vice versa.

2 The problems in practice in a well-known two-country DSGE model

We now illustrate the problems we have highlighted by reworking a well-known paper modelling a two-country world of the US and EU, Chari, Kehoe and McGrattan (2002), CKM; their work is fairly representative (see also, for example, Henriksen, Kydland and Sustek (2008), and Kollmann (2009)).

In their paper CKM calibrate a model of the US and Europe in which wage and price rigidity, adjustment costs in investment, habit persistence in consumption, and incomplete asset markets are all assumed at various stages. CKM's aim is primarily to replicate the autocorrelation and volatility of prices, the nominal and the real exchange rate in the data, but they are also concerned to replicate some business cycle moments: variability of consumption/investment/net exports relative to GDP, their autocorrelations and their cross-correlations. Their conclusion is that they can match the business cycle behaviour reasonably but are left with a minor anomaly, that the model cannot quite generate enough persistence in the real exchange rate, and a key anomaly, that the real exchange rate is highly positively correlated with relative consumption in the model but hardly at all in the data.

Our aim is to examine these claims by applying the methods of CKM to an essentially similar model that Le et al. (2009) constructed, based on the work of Smets and Wouters (2003, 2007; SW). CKM emphasise that their claims are robust to a wide range of model specifications, including adding a variety of shocks, varying the monetary rule between a money supply rule and a Taylor Rule, including complete international risk sharing as opposed to uncovered interest parity in bonds.
It is therefore natural to make use of the work Le et al. did on SW's model, where they addressed the full range of issues noted above: they replaced assumed shocks with actual and estimated shocks backed out of the model; they included all observable shocks, unscaled; and they computed the distributions (both individual and joint) of the descriptors or "facts" under the null hypothesis of the model itself. We are able, using their methods, largely to replicate the nature of the CKM results as reported, though we find that their "minor anomaly" is no longer an anomaly. Thus allowing for the "true" shocks makes some difference. However, what we will show is that when joint distributions are allowed for, the implications of these results are quite importantly different.

2.1 Replicating CKM's findings

Let us first show how our procedures replicate or alter CKM's findings: a) that the model matches business cycle behaviour; b) that it just fails to match the real exchange rate's persistence; c) that it fails massively to match the absence of correlation between the real exchange rate and relative consumption.

a) We find that the SW model we have used to represent CKM's broadly replicates their results on business cycle variables. Thus the Table below finds, as they do in Table 6 of their paper, that the SW model matches a set of business cycle moments. For all but two of the moments shown the model mean bootstrap estimate is within two (model-generated) standard deviations of the data values.
This is actually rather better than CKM find for their model, so if anything it strengthens their claim that models of this general type do well in matching the business cycle individual measures.

                                        Data      Model bootstrap   (Model bootstrap
                                                  estimate          stdev)
Standard deviations relative to GDP
  US
    Consumption                        1.1324    1.0411           (0.1414)
    Investment*                        4.1514    2.4789           (0.4564)
    Net exports                        3.3390    3.7587           (1.1972)
  EU
    Consumption                        1.2313    1.0692           (0.0881)
    Investment                         3.2080    3.9733           (0.5338)
Autocorrelations
  US
    GDP                                0.9362    0.9331           (0.0340)
    Consumption                        0.9484    0.9400           (0.0345)
    Investment                         0.9697    0.9379           (0.0290)
    Net exports                        0.9356    0.9132           (0.0460)
  EU
    GDP                                0.9502    0.8724           (0.0598)
    Consumption                        0.9651    0.8643           (0.0624)
    Investment                         0.9638    0.9269           (0.0353)
Cross-correlations
  Between foreign and domestic
    GDP                                0.4449    0.1282           (0.3551)
    Consumption                        0.0261    0.0096           (0.3749)
    Investment                        -0.2342    0.0611           (0.3339)
  Between net exports and US GDP      -0.1043   -0.4916           (0.2842)
  Between real exchange rate and
    US GDP                            -0.2312   -0.3687           (0.3185)
    Net exports*                      -0.1653    0.9786           (0.0177)

* outside 2-standard-deviation bounds

Table 1: Business cycle statistics for the model

With b) we find in fact that the model can comfortably match the persistence of the real exchange rate, taken on its own, at all lags; thus CKM's "minor anomaly" is, as they imply, not at all serious. They get close and we find that when full stochastic simulation of all shocks is carried out, they would have succeeded. The reason appears to be simply a failure to generate the model's bounds using all the shocks coming from the data. This is the main alteration we find due to using the true shocks implied by the model.

[Figure 1: RXR auto-correlation. Panel: RXR v. RXR(-i), i = 1 to 10.]

Finally, we consider c) CKM's "key anomaly". Here, as with a), we replicate CKM's results with the SW model. It too generates powerful positive correlations between the real exchange rate and relative consumption at all leads and lags.
The lower 95% bound of these does indeed lie well above the data cross-correlation.

[Figure 2: Correlation of CEU - CUS with RXR. Panels: CEU-CUS v. RXR(-i+1) and CEU-CUS v. RXR(+i-1).]

Thus we can see that the SW version of CKM does replicate qualitatively the results they report in their work, with the marginal exception of the real exchange rate's persistence, where we suggest their "minor anomaly" was due to not including the full range of shocks backed out of the data and the model.

2.2 Augmenting and reinterpreting CKM's findings

CKM's conclusion from these results is that even though the model matches business cycle data it has a particular flaw in being unable to match the real exchange rate/relative consumption behaviour. Thus they suggest future research should look for a mechanism that remedies this flaw (possibly in the international risk-sharing area), such as later, for example, Kollmann (2009).

However, now consider the joint distribution of the business cycle features in Table 1. In contrast to CKM's claims, while the SW model of the US and the EU matches these business cycle moments individually as shown, it does not match the joint behaviour of real business cycle variables at all well. The Wald statistic for the joint distribution of all these descriptors is 100 (i.e. the joint data values fall in the 100th percentile of the bootstrapped distribution), indicating strong joint rejection. Even if we exclude the two features (asterisked) that individually do not fit from this distribution, the Wald statistic remains at 100. The normalised Mahalanobis distances are respectively 1593 and 54 (against a 95% value of 1.65).

This strong rejection is general across whatever measures one likes to use of the model's fit to business cycle data. We report in Le et al. (2009) a wide variety of measures that this model fails to fit jointly, including data variances, VAR parameters and Impulse Response Functions.

Since these features are generally within their 95% bounds individually, the reason for this rejection seems to be that the joint distributions are far more demanding than the totality of the single distributions because these chosen features are highly correlated according to the DSGE model being tested.

Now if we turn to the major anomaly highlighted by CKM we see it in a rather different light. It now appears that this particular correlation rejects the model in common with general business cycle behaviour. Notice that the joint behaviour of variables in the model in general sets up tight restrictions on the joint distribution of data features. Whenever several variables are interacting, their joint behaviour according to the model follows particular patterns and these are in general not found in the data. Thus the failure to match the complex joint correlation between two countries' consumption and the real exchange rate is an unsurprising implication of a failure to match consumption's joint behaviour with other business cycle variables.

3 Conclusions

Our aim here has been to highlight the pitfalls of current procedures used to assess DSGE model performance. We have argued that the model's shocks should be estimated from the data, not imposed, and that all the shocks should be used; but more importantly that model properties should be assessed against the data using their joint distributions, which generally pose more stringent requirements than their single distributions viewed collectively. We illustrated this in a DSGE model of the US and EU characterised by a high degree of nominal rigidity; we found for such a model that while indeed data features could be matched individually they rejected the model jointly, in particular consumption's behaviour jointly with other business cycle variables including the real exchange rate.
The general point is not that this model failed in some unique respect but rather that it failed generally by imposing filigree restrictions on joint variable behaviour that were just not found in the data.

References

[1] Chari, V., P. J. Kehoe and E. McGrattan (2002) "Can sticky price models generate volatile and persistent real exchange rates?", Staff Report 277 from Federal Reserve Bank of Minneapolis, published in Review of Economic Studies, Vol. 69, No. 3, July 2002, pp. 533-563.

[2] Henriksen, E., Finn E. Kydland, and Roman Sustek (2008) "The High Cross-Country Correlations of Prices and Interest Rates", mimeo, Bank of England.

[3] Kollmann, Robert (2009) "Household Heterogeneity and the Real Exchange Rate: Still a Puzzle", CEPR Discussion Paper No 7301, C.E.P.R., London.

[4] Le, M., D. Meenagh, P. Minford and M. Wickens (2009) "Two Orthogonal Continents: Testing a Two-country DSGE Model of the US and EU Using Indirect Inference", Cardiff University Economics Working Paper http://www.cf.ac.uk/carbs/econ/workingpapers/papers/E2009_3.pdf, forthcoming in Open Economies Review.

[5] Smets, F. and R. Wouters (2003) "An Estimated Dynamic Stochastic General Equilibrium Model of the Euro Area", Journal of the European Economic Association, Vol. 1, No. 5, September 2003, pp. 1123-1175.

[6] Smets, F. and R. Wouters (2007) "Shocks and Frictions in US Business Cycles: A Bayesian DSGE Approach", American Economic Review, Vol. 97, No. 3, June 2007, pp. 586-606.
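The contrast drawn in Section 2.2 between individual and joint tests can be illustrated with a minimal sketch. It is not the procedure or code used in the paper: the function names are hypothetical, and draws from a bivariate normal with an assumed correlation of 0.95 stand in for descriptors bootstrapped from a DSGE model (for instance the persistence of inflation and of interest rates). The "data" point is chosen to lie within two standard deviations of the model mean on each descriptor separately, yet it moves against the correlation the model implies, so its Mahalanobis distance falls in the extreme tail of the simulated joint distribution.

```python
import random


def simulate_descriptors(n_draws, rho, seed=0):
    """Stand-in for bootstrapping a model: pairs of descriptors with correlation rho."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((z1, rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2))
    return draws


def mean_and_cov(draws):
    """Sample means and the (s11, s12, s22) entries of the 2x2 covariance matrix."""
    n = len(draws)
    m1 = sum(d[0] for d in draws) / n
    m2 = sum(d[1] for d in draws) / n
    s11 = sum((d[0] - m1) ** 2 for d in draws) / (n - 1)
    s22 = sum((d[1] - m2) ** 2 for d in draws) / (n - 1)
    s12 = sum((d[0] - m1) * (d[1] - m2) for d in draws) / (n - 1)
    return (m1, m2), (s11, s12, s22)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance, using the closed-form inverse of a 2x2 matrix."""
    s11, s12, s22 = cov
    det = s11 * s22 - s12 ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (s22 * dx * dx - 2.0 * s12 * dx * dy + s11 * dy * dy) / det


def within_individual_bounds(point, mean, cov):
    """True if each descriptor lies within two model standard deviations of its mean."""
    s11, _, s22 = cov
    return (abs(point[0] - mean[0]) <= 2.0 * s11 ** 0.5
            and abs(point[1] - mean[1]) <= 2.0 * s22 ** 0.5)


def joint_percentile(point, draws):
    """Percentile of the data's distance within the simulated joint distribution."""
    mean, cov = mean_and_cov(draws)
    d_data = mahalanobis_sq(point, mean, cov)
    below = sum(1 for d in draws if mahalanobis_sq(d, mean, cov) < d_data)
    return 100.0 * below / len(draws)


draws = simulate_descriptors(2000, rho=0.95, seed=1)
data_point = (1.5, -1.5)  # assumed data values, in model standard-deviation units
mean, cov = mean_and_cov(draws)
within_each = within_individual_bounds(data_point, mean, cov)
percentile = joint_percentile(data_point, draws)
print(within_each, percentile)
```

The percentile plays the role of the Wald statistic as described in Section 2.2: a value near 100 signals joint rejection even though neither descriptor is rejected on its own.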

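Point 1) of Section 1, on stochastic singularity, can be sketched in the same hedged way. The toy two-variable linear model below is purely illustrative, with hypothetical coefficients and function names that do not come from the paper. When both variables are driven by a single shock, one is an exact multiple of the other and the determinant of their covariance matrix is zero; giving the second equation its own shock removes the singularity.

```python
import random


def simulate(n_periods, n_shocks, seed=0):
    """Toy model: y1 = e1, y2 = 0.5 * e1 + e2, where e2 is switched off if n_shocks == 1."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(n_periods):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0) if n_shocks == 2 else 0.0
        y1.append(e1)
        y2.append(0.5 * e1 + e2)
    return y1, y2


def cov_determinant(y1, y2):
    """Determinant of the 2x2 sample covariance matrix of (y1, y2)."""
    n = len(y1)
    m1, m2 = sum(y1) / n, sum(y2) / n
    s11 = sum((a - m1) ** 2 for a in y1) / (n - 1)
    s22 = sum((b - m2) ** 2 for b in y2) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / (n - 1)
    return s11 * s22 - s12 ** 2


det_one_shock = cov_determinant(*simulate(500, n_shocks=1))
det_two_shocks = cov_determinant(*simulate(500, n_shocks=2))
print(det_one_shock, det_two_shocks)
```

A zero determinant means the simulated variables satisfy an exact linear relationship, which real data will not satisfy unless the omitted shock is truly zero.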
Some problems in the testing of DSGE models

https://orca.cardiff.ac.uk/id/eprint/77849/1/e2009_31.pdf
