
    Flexible Tweedie regression models for continuous data

    Tweedie regression models provide a flexible family of distributions for non-negative, highly right-skewed data as well as symmetric and heavy-tailed data, and can handle continuous data with probability mass at zero. Estimation and inference for Tweedie regression models based on the maximum likelihood method are complicated by the presence of an infinite sum in the probability function and by non-trivial restrictions on the power parameter space. In this paper, we propose two approaches for fitting Tweedie regression models, namely quasi-likelihood and pseudo-likelihood. We discuss the asymptotic properties of the two approaches and perform simulation studies to compare our methods with the maximum likelihood method. In particular, we show that the quasi-likelihood method provides asymptotically efficient estimation of the regression parameters. The computational implementation of the alternative methods is faster and easier than orthodox maximum likelihood, relying on a simple Newton scoring algorithm. Simulation studies showed that the quasi- and pseudo-likelihood approaches yield estimates, standard errors and coverage rates similar to those of the maximum likelihood method. Furthermore, the second-moment assumptions required by the quasi- and pseudo-likelihood methods enable us to extend Tweedie regression models to a class of quasi-Tweedie regression models in Wedderburn's style. Moreover, they allow us to eliminate the non-trivial restriction on the power parameter space, and thus provide a flexible regression model for continuous data. We provide an R implementation and illustrate the application of Tweedie regression models using three data sets. Comment: 34 pages, 8 figures
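    The key computational point is that, once the power parameter p is fixed, the quasi-likelihood estimating equations coincide with those of an ordinary GLM with variance function Var(Y) = phi * mu**p. Below is a minimal sketch in Python with statsmodels (the paper itself provides an R implementation; this is not the authors' code), with simulated data and a fixed p = 1.5 as illustrative assumptions. Estimating p itself is the hard part the paper addresses.

```python
# Minimal sketch of Tweedie regression with the power parameter held fixed;
# data and p = 1.5 are illustrative assumptions, not from the paper.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
X = sm.add_constant(rng.normal(size=(n, 2)))
mu = np.exp(X @ np.array([0.5, 0.3, -0.2]))
# Hypothetical response: non-negative, right-skewed, with mass at zero,
# mimicking the compound Poisson-gamma case (1 < p < 2).
y = np.where(rng.random(n) < 0.2, 0.0, rng.gamma(2.0, mu / 2.0))

# With Var(Y) = phi * mu**p and p fixed, IRLS needs only second-moment
# assumptions, exactly the quasi-likelihood setting described above.
result = sm.GLM(y, X, family=sm.families.Tweedie(var_power=1.5)).fit()
print(result.params)   # regression coefficient estimates
print(result.scale)    # dispersion (Pearson) estimate
```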

    Challenges of Big Data Analysis

    Big Data bring new opportunities to modern society and new challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as in computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set, and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity; their violation can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
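    To make the spurious-correlation point concrete, here is a small simulation of our own (a toy, not from the article): a response drawn independently of every predictor still appears noticeably correlated with some predictor once the dimensionality is large.

```python
# Spurious correlation: with d independent predictors, the largest absolute
# sample correlation with an unrelated response grows with d.
import numpy as np

rng = np.random.default_rng(1)
n = 100
for d in (10, 1_000, 100_000):
    X = rng.normal(size=(n, d))
    y = rng.normal(size=n)              # independent of every column of X
    Xc = (X - X.mean(0)) / X.std(0)     # standardize columns
    yc = (y - y.mean()) / y.std()
    max_corr = np.max(np.abs(Xc.T @ yc) / n)
    print(f"d = {d:>7}: max |corr| = {max_corr:.2f}")
```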

    Forecasting using a large number of predictors: Is Bayesian regression a valid alternative to principal components?

    This paper considers Bayesian regression with normal and double-exponential priors as a forecasting method based on large panels of time series. We show that, empirically, these forecasts are highly correlated with principal component forecasts and that they perform equally well for a wide range of prior choices. Moreover, we study the asymptotic properties of Bayesian regression under a Gaussian prior, under the assumption that the data are quasi-collinear, to establish a criterion for setting the parameters in a large cross-section. JEL Classification: C11, C13, C33, C53. Keywords: Bayesian VAR, large cross-sections, Lasso regression, principal components, ridge regression.
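    A quick way to see the claimed closeness of the two methods is to fit both on a simulated panel with factor structure. In the sketch below, ridge regression stands in for the Gaussian-prior Bayesian regression (its posterior mean with a fixed penalty); all dimensions and penalty values are assumptions for illustration, and the two fitted series typically come out almost perfectly correlated.

```python
# Ridge (Gaussian-prior) fit vs. principal-component fit on a factor panel.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(2)
T, N, k = 200, 100, 3
F = rng.normal(size=(T, k))                       # latent factors (assumed)
X = F @ rng.normal(size=(k, N)) + 0.5 * rng.normal(size=(T, N))
y = F @ np.array([1.0, -0.5, 0.3]) + 0.3 * rng.normal(size=T)

# Principal-component approach: regress y on the first k PCs of X.
pcs = PCA(n_components=k).fit_transform(X)
yhat_pc = LinearRegression().fit(pcs, y).predict(pcs)

# Ridge approach: posterior mean under a Gaussian prior, fixed penalty.
yhat_ridge = Ridge(alpha=100.0).fit(X, y).predict(X)

print(np.corrcoef(yhat_pc, yhat_ridge)[0, 1])     # typically close to 1
```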

    Volatility forecasting

    Volatility has been one of the most active and successful areas of research in time series econometrics and economic forecasting in recent decades. This chapter provides a selective survey of the most important theoretical developments and empirical insights to emerge from this burgeoning literature, with a distinct focus on forecasting applications. Volatility is inherently latent, and Section 1 begins with a brief intuitive account of various key volatility concepts. Section 2 then discusses a series of different economic situations in which volatility plays a crucial role, ranging from the use of volatility forecasts in portfolio allocation to density forecasting in risk management. Sections 3, 4 and 5 present a variety of alternative procedures for univariate volatility modeling and forecasting based on the GARCH, stochastic volatility and realized volatility paradigms, respectively. Section 6 extends the discussion to the multivariate problem of forecasting conditional covariances and correlations, and Section 7 discusses volatility forecast evaluation methods in both univariate and multivariate cases. Section 8 concludes briefly. JEL Classification: C10, C53, G1.
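    As a concrete reference point for the GARCH paradigm of Section 3, the sketch below simulates a GARCH(1,1) process and computes its multi-step variance forecasts in plain numpy; the parameter values are assumed purely for illustration.

```python
# GARCH(1,1): sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t]
import numpy as np

rng = np.random.default_rng(3)
omega, alpha, beta = 0.05, 0.08, 0.90   # assumed values, alpha + beta < 1
T = 1_000
r = np.empty(T)
sigma2 = np.empty(T + 1)
sigma2[0] = omega / (1.0 - alpha - beta)          # unconditional variance
for t in range(T):
    r[t] = np.sqrt(sigma2[t]) * rng.normal()      # return with GARCH variance
    sigma2[t + 1] = omega + alpha * r[t] ** 2 + beta * sigma2[t]

# h-step variance forecasts mean-revert geometrically to the long-run level.
h = np.arange(1, 21)
long_run = omega / (1.0 - alpha - beta)
forecast = long_run + (alpha + beta) ** (h - 1) * (sigma2[-1] - long_run)
print(forecast[:5])
```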

    Modeling Factor Demands with SEM and VAR: An Empirical Comparison

    The empirical analysis of the economic interactions between factors of production, output and the corresponding prices has received much attention over the last two decades. Most contributions in this area have agreed on the neoclassical principle of a representative optimizing firm and typically use theory-based structural equation models (SEM). A popular alternative to SEM is the vector autoregression (VAR) methodology. The most recent attempts to link the SEM approach with VAR analysis in the area of factor demands concentrate on single-equation models, whereas no effort has been devoted to comparing these alternative approaches when a firm is assumed to face a multi-factor technology and to choose the optimal quantity of each input simultaneously. This paper bridges this gap. First, we illustrate how the SEM and VAR approaches can both represent valid alternatives for modeling systems of dynamic factor demands. Second, we show how to apply both methodologies to estimate dynamic factor demands derived from a cost-minimizing capital-labour-energy-materials (KLEM) technology with adjustment costs (ADC) on the quasi-fixed capital factor. Third, we explain how to use both models to calculate some widely accepted indicators of the production structure of an economic sector, such as price and quantity elasticities, and alternative measures of ADC. In particular, we propose and discuss some theoretical and empirical justifications for the differences between observed elasticities, measures of ADC, and the assumption of exogeneity of output and/or input prices. Finally, we offer some suggestions for the applied researcher. Keywords: simultaneous equation models, vector autoregression models, factor demands, dynamic duality.
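    For the VAR side of the comparison, a minimal sketch of the workflow looks as follows (Python/statsmodels, with simulated series standing in for the KLEM data; the variable names are placeholders): fit an unrestricted VAR with a data-driven lag order, then read the dynamic input adjustments off the impulse responses.

```python
# Unrestricted VAR on a small system of (simulated) factor demands.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(4)
T = 200
A = np.array([[0.5, 0.1, 0.0],     # assumed VAR(1) coefficient matrix
              [0.0, 0.6, 0.1],
              [0.1, 0.0, 0.4]])
y = np.zeros((T, 3))
for t in range(1, T):
    y[t] = A @ y[t - 1] + 0.1 * rng.normal(size=3)

data = pd.DataFrame(y, columns=["capital", "labour", "energy"])
res = VAR(data).fit(maxlags=4, ic="aic")   # lag order chosen by AIC
print(res.summary())

# Impulse responses trace out the dynamic adjustments that the SEM instead
# encodes structurally through adjustment costs.
irf = res.irf(10)
```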

    Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

    In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we also show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.
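    The contrast between the two loss functions can be made concrete in a few lines: with kernel matrix K, the squared regression loss gives the classic regularized least-squares system, while the squared pairwise ranking loss changes the linear system only through a centering Laplacian. The sketch below is our own simplification to a single conditioning object with a Gaussian kernel on synthetic data, not the authors' code or their Kronecker-kernel framework.

```python
# Regularized least squares with a squared regression loss vs. a squared
# pairwise ranking loss (RankRLS-style) over all pairs of one query.
import numpy as np

rng = np.random.default_rng(5)
n = 60
X = rng.normal(size=(n, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=n)

# Gaussian kernel matrix over the objects to be ranked (bandwidth assumed).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / (2 * X.shape[1]))
lam = 1.0

# Squared regression loss: solve (K + lam I) a = y.
a_reg = np.linalg.solve(K + lam * np.eye(n), y)

# Squared ranking loss over all pairs: with Laplacian L = n I - 1 1^T the
# stationarity condition (for invertible K) is (L K + lam I) a = L y.
L = n * np.eye(n) - np.ones((n, n))
a_rank = np.linalg.solve(L @ K + lam * np.eye(n), L @ y)

# Both yield scoring functions f(x) = sum_i a_i k(x, x_i); the ranking-loss
# solution is invariant to constant shifts of the labels.
print(np.corrcoef(K @ a_reg, K @ a_rank)[0, 1])
```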

    Explaining Growth in Dutch Agriculture: Prices, Public R&D, and Technological Change

    This paper analyzes the sources of growth of Dutch agriculture (the arable, meat, and dairy sectors). Because the time series data (1950-1997) are non-stationary and not cointegrated, it is argued that a model estimated in first differences should be used. Estimated price responses turn out to be very inelastic, both in the short run and the long run, so the direct distortionary effect of price support has been rather limited. However, price support has an important indirect effect: it improves the sector's investment possibilities and thereby its capital stock. Public R&D expenditure mainly affected agriculture by contributing to yield improvement, thereby favoring intensification of production. Keywords: growth, technology, cointegration, non-stationarity, agricultural policy, Agribusiness. JEL Classification: Q18, O13.
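    The econometric workflow described here (unit-root tests followed by estimation in first differences) can be sketched as follows, with simulated series standing in for the 1950-1997 data; all numbers are illustrative assumptions.

```python
# Test for unit roots, then regress in first differences when the levels
# are non-stationary and not cointegrated.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(6)
T = 48                                   # 48 annual observations, 1950-1997
price = np.cumsum(rng.normal(size=T))    # I(1) input price (assumed)
output = np.cumsum(rng.normal(size=T))   # I(1) output, not cointegrated

print("ADF p-value, price levels:", adfuller(price)[1])  # fails to reject

# Difference the series and regress; with log variables the slope would be
# read as a short-run elasticity.
d_out, d_price = np.diff(output), np.diff(price)
ols = sm.OLS(d_out, sm.add_constant(d_price)).fit()
print(ols.params)
```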