Flexible Tweedie regression models for continuous data
Tweedie regression models provide a flexible family of distributions to deal
with non-negative highly right-skewed data as well as symmetric and heavy
tailed data and can handle continuous data with probability mass at zero. The
estimation and inference of Tweedie regression models based on the maximum
likelihood method are challenged by the presence of an infinite sum in the
probability function and non-trivial restrictions on the power parameter space.
In this paper, we propose two approaches for fitting Tweedie regression models,
namely, quasi- and pseudo-likelihood. We discuss the asymptotic properties of
the two approaches and perform simulation studies to compare our methods with
the maximum likelihood method. In particular, we show that the quasi-likelihood
method provides asymptotically efficient estimation for regression parameters.
The computational implementation of the alternative methods is faster and
easier than the orthodox maximum likelihood, relying on a simple Newton scoring
algorithm. Simulation studies showed that the quasi- and pseudo-likelihood
approaches present estimates, standard errors and coverage rates similar to the
maximum likelihood method. Furthermore, the second-moment assumptions required by the quasi- and pseudo-likelihood methods enable us to extend the Tweedie regression models to the class of quasi-Tweedie regression models in Wedderburn's style. Moreover, they allow us to eliminate the non-trivial restriction on the power parameter space, and thus provide a flexible regression model for continuous data. We provide an \texttt{R} implementation and illustrate the application of Tweedie regression models using three data sets.
Comment: 34 pages, 8 figures
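The Newton scoring algorithm the abstract mentions can be sketched from second-moment assumptions alone, E[Y_i] = μ_i = exp(x_i'β) and Var(Y_i) = φ μ_i^p with a log link. The numpy sketch below is illustrative only (the paper provides an R implementation; all names here are our own):

```python
import numpy as np

def tweedie_quasi_fit(X, y, p=1.5, tol=1e-8, max_iter=100):
    """Newton scoring for a Tweedie quasi-likelihood regression, log link.

    Only the first two moments are assumed: E[Y_i] = exp(x_i' beta) and
    Var(Y_i) = phi * mu_i**p, so p need not satisfy the usual Tweedie
    power-parameter restrictions.
    """
    n, k = X.shape
    beta = np.zeros(k)
    beta[0] = np.log(y.mean())                       # intercept-only start
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        score = X.T @ ((y - mu) * mu ** (1.0 - p))   # quasi-score (up to phi)
        W = mu ** (2.0 - p)                          # weights mu^2 / V(mu)
        info = X.T @ (W[:, None] * X)                # expected information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    phi = np.sum((y - mu) ** 2 / mu ** p) / (n - k)  # Pearson dispersion
    return beta, phi
```

With p = 2 this reduces to gamma quasi-likelihood scoring; working only with the first two moments is what lets p range freely, as the abstract notes.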
Econometrics: A bird's eye view
As a unified discipline, econometrics is still relatively young and has been transforming and expanding very rapidly over the past few decades. Major advances have taken place in the analysis of cross-sectional data by means of semi-parametric and non-parametric techniques. Heterogeneity of economic relations across individuals, firms and industries is increasingly acknowledged, and attempts have been made to take it into account either by integrating out its effects or by modeling the sources of heterogeneity when suitable panel data exist. The counterfactual considerations that underlie policy analysis and treatment evaluation have been given a more satisfactory foundation. New time series econometric techniques have been developed and employed extensively in the areas of macroeconometrics and finance. Non-linear econometric techniques are used increasingly in the analysis of cross-section and time series observations. Applications of Bayesian techniques to econometric problems have been given new impetus, largely thanks to advances in computer power and computational techniques. The use of Bayesian techniques has in turn provided investigators with a unifying framework in which the tasks of forecasting, decision making, model evaluation and learning can be considered as parts of the same interactive and iterative process, thus paving the way for establishing the foundations of "real-time econometrics". This paper attempts to provide an overview of some of these developments.
Challenges of Big Data Analysis
Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features drive paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in high-confidence sets and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
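The spurious-correlation phenomenon is easy to demonstrate numerically. The sketch below (our own illustration, not from the article) draws an outcome and d completely independent noise predictors and returns the largest absolute sample correlation, which grows with d even though every population correlation is exactly zero:

```python
import numpy as np

def max_spurious_correlation(n, d, seed=0):
    """Largest absolute sample correlation between y and d independent
    noise predictors. All population correlations are zero, yet the
    maximum grows with dimensionality d, roughly like sqrt(2 log d / n)."""
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(n)
    X = rng.standard_normal((n, d))
    yc = (y - y.mean()) / y.std()                  # standardize the outcome
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize each column
    return np.max(np.abs(Xc.T @ yc) / n)           # max |sample correlation|
```

With n = 60 observations, a few thousand pure-noise predictors typically produce a maximum sample correlation well above 0.3, which is the sense in which variable selection becomes unreliable at scale.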
Forecasting using a large number of predictors: Is Bayesian regression a valid alternative to principal components?
This paper considers Bayesian regression with normal and double-exponential priors as forecasting methods based on large panels of time series. We show that, empirically, these forecasts are highly correlated with principal component forecasts and that they perform equally well for a wide range of prior choices. Moreover, we study the asymptotic properties of the Bayesian regression under a Gaussian prior, under the assumption that the data are quasi-collinear, to establish a criterion for setting parameters in a large cross-section. JEL Classification: C11, C13, C33, C53. Keywords: Bayesian VAR, large cross-sections, Lasso regression, principal components, ridge regression
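A Gaussian prior on the coefficients yields the ridge estimator as the posterior mean, so the paper's first forecasting method has a closed form that can be set beside a principal-component forecast. The sketch below is illustrative (function names and the simulated design are ours, not the paper's):

```python
import numpy as np

def ridge_forecast(X, y, X_new, lam):
    """Forecast from Bayesian regression with an i.i.d. Gaussian prior:
    the posterior mean is the ridge estimator with penalty lam."""
    k = X.shape[1]
    beta = np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)
    return X_new @ beta

def pc_forecast(X, y, X_new, r):
    """Principal-component forecast using the first r PCs of the panel X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    F = U[:, :r] * s[:r]                         # estimated factors
    gamma = np.linalg.solve(F.T @ F, F.T @ y)    # regress y on the factors
    return (X_new @ Vt[:r].T) @ gamma            # project new data on the PCs
```

On a simulated panel with strong one-factor structure, the two forecasts come out highly correlated, which is consistent with the empirical finding the abstract reports.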
Volatility forecasting
Volatility has been one of the most active and successful areas of research in time series econometrics and economic forecasting in recent decades. This chapter provides a selective survey of the most important theoretical developments and empirical insights to emerge from this burgeoning literature, with a distinct focus on forecasting applications. Volatility is inherently latent, and Section 1 begins with a brief intuitive account of various key volatility concepts. Section 2 then discusses a series of different economic situations in which volatility plays a crucial role, ranging from the use of volatility forecasts in portfolio allocation to density forecasting in risk management. Sections 3, 4 and 5 present a variety of alternative procedures for univariate volatility modeling and forecasting based on the GARCH, stochastic volatility and realized volatility paradigms, respectively. Section 6 extends the discussion to the multivariate problem of forecasting conditional covariances and correlations, and Section 7 discusses volatility forecast evaluation methods in both univariate and multivariate cases. Section 8 concludes briefly. JEL Classification: C10, C53, G1
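As a concrete instance of the GARCH paradigm surveyed in Section 3, a GARCH(1,1) variance filter and its multi-step forecast fit in a few lines. This is a generic textbook sketch, not code from the chapter, and the parameter values in any usage are illustrative (in practice they are estimated by maximum likelihood):

```python
import numpy as np

def garch11_forecast(returns, omega, alpha, beta, horizon=1):
    """Filter the conditional variance through GARCH(1,1),
    sigma2_{t+1} = omega + alpha * r_t**2 + beta * sigma2_t,
    then forecast `horizon` steps ahead. Multi-step forecasts
    mean-revert geometrically to the unconditional variance."""
    sigma2 = np.var(returns)                 # initialize at the sample variance
    for r in returns:
        sigma2 = omega + alpha * r ** 2 + beta * sigma2
    uncond = omega / (1.0 - alpha - beta)    # requires alpha + beta < 1
    return uncond + (alpha + beta) ** (horizon - 1) * (sigma2 - uncond)
```

The persistence parameter alpha + beta governs how quickly forecasts revert: at long horizons the forecast is essentially the unconditional variance omega / (1 - alpha - beta).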
Modeling Factor Demands with SEM and VAR: An Empirical Comparison
The empirical analysis of the economic interactions between factors of production, output and corresponding prices has received much attention over the last two decades. Most contributions in this area have agreed on the neoclassical principle of a representative optimizing firm and typically use theory-based structural equation models (SEM). A popular alternative to SEM is given by the vector autoregression (VAR) methodology. The most recent attempts to link the SEM approach with VAR analysis in the area of factor demands concentrate on single-equation models, whereas no effort has been devoted to compare these alternative approaches when a firm is assumed to face a multi-factor technology and to decide simultaneously the optimal quantity for each input. This paper bridges this gap. First, we illustrate how the SEM and the VAR approaches can both represent valid alternatives to model systems of dynamic factor demands. Second, we show how to apply both methodologies to estimate dynamic factor demands derived from a cost-minimizing capital-labour-energy-materials (KLEM) technology with adjustment costs (ADC) on the quasi-fixed capital factor. Third, we explain how to use both models to calculate some widely accepted indicators of the production structure of an economic sector, such as price and quantity elasticities, and alternative measures of ADC. In particular, we propose and discuss some theoretical and empirical justifications of the differences between observed elasticities, measures of ADC, and the assumption of exogeneity of output and/or input prices. Finally, we offer some suggestions for the applied researcher. Keywords: Simultaneous equation models, Vector autoregression models, Factor demands, Dynamic duality
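The reduced-form side of the SEM–VAR comparison can be illustrated with a minimal VAR(1) estimated equation by equation with OLS; the SEM alternative would instead impose the cross-equation restrictions implied by the firm's optimization problem. This is a deliberately simplified sketch, far smaller than the paper's KLEM systems:

```python
import numpy as np

def var1_ols(Y):
    """OLS estimates of Y_t = c + A @ Y_{t-1} + e_t for a (T, m) array Y.
    Returns the intercept vector c and the coefficient matrix A."""
    Z = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])  # regressors [1, Y_{t-1}]
    B, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)       # one OLS per equation
    return B[0], B[1:].T
```

Elasticities and adjustment-cost measures of the kind the paper discusses are then functions of the estimated coefficient matrix A and the implied long-run responses.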
Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data
In domains like bioinformatics, information retrieval and social network
analysis, one can find learning tasks where the goal consists of inferring a
ranking of objects, conditioned on a particular target object. We present a
general kernel framework for learning conditional rankings from various types
of relational data, where rankings can be conditioned on unseen data objects.
We propose efficient algorithms for conditional ranking by optimizing squared
regression and ranking loss functions. We show theoretically that learning
with the ranking loss is likely to generalize better than with the regression
loss. Further, we prove that symmetry or reciprocity properties of relations
can be efficiently enforced in the learned models. Experiments on synthetic and
real-world data illustrate that the proposed methods deliver state-of-the-art
performance in terms of predictive power and computational efficiency.
Moreover, we show empirically that incorporating symmetry or reciprocity properties can improve generalization performance.
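The symmetry-enforcement result can be made concrete with a small sketch: combine a dual regularized least-squares solver with a symmetrized pairwise kernel, which guarantees f(u, v) = f(v, u) for any training labels. This numpy illustration is ours, not the authors' implementation:

```python
import numpy as np

def symmetric_pair_kernel(K, pairs):
    """Pairwise kernel 0.5 * (K[u,u']K[v,v'] + K[u,v']K[v,u']) built from an
    object-level kernel matrix K; every model in the induced hypothesis
    space is then a symmetric relation."""
    u = np.array([a for a, _ in pairs])
    v = np.array([b for _, b in pairs])
    return 0.5 * (K[np.ix_(u, u)] * K[np.ix_(v, v)]
                  + K[np.ix_(u, v)] * K[np.ix_(v, u)])

def rls_fit(G, y, lam):
    """Dual regularized least squares: alpha = (G + lam * I)^{-1} y."""
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), y)

def predict_pair(K, pairs, alpha, a, b):
    """Score a new object pair (a, b) against the training pairs."""
    u = np.array([p for p, _ in pairs])
    v = np.array([q for _, q in pairs])
    g = 0.5 * (K[a, u] * K[b, v] + K[a, v] * K[b, u])
    return g @ alpha
```

Because the pair kernel is invariant to swapping a and b, predict_pair returns identical scores for (a, b) and (b, a), which is the symmetry property the abstract refers to, enforced structurally rather than by constraint.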
Explaining Growth in Dutch Agriculture: Prices, Public R&D, and Technological Change
This paper analyzes the sources of growth of Dutch agriculture (arable, meat, and dairy sectors). Because the time series data (1950-1997) are non-stationary and not cointegrated, it is argued that a model estimated in first differences should be used. Estimated price elasticities turn out to be very inelastic, both in the short run and the long run. The direct distortionary effect of price support has therefore been rather limited. However, price support has had an important indirect effect by improving the sector's investment possibilities and thereby its capital stock. Public R&D expenditure mainly affected agriculture by contributing to yield improvement, thereby favoring intensification of production. Keywords: growth, technology, cointegration, non-stationarity, agricultural policy, Agribusiness. JEL Classification: Q18, O13
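The first-differences specification the abstract argues for amounts to regressing growth rates on growth rates. A minimal sketch of the short-run price-elasticity estimate (variable names are illustrative, not the paper's data):

```python
import numpy as np

def short_run_elasticity(log_q, log_p):
    """Short-run price elasticity b from the first-difference regression
    d log q_t = a + b * d log p_t + e_t, the appropriate specification
    when the levels are non-stationary and not cointegrated."""
    dq, dp = np.diff(log_q), np.diff(log_p)
    Z = np.column_stack([np.ones(len(dp)), dp])   # intercept plus d log p
    coef, *_ = np.linalg.lstsq(Z, dq, rcond=None)
    return coef[1]                                 # the elasticity b
```

A small estimated |b| is what the abstract means by "very inelastic" demand in the short run.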