FlexMix Version 2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters

Authors: Bettina Grün, Friedrich Leisch

flexmix provides infrastructure for flexible fitting of finite mixture models in R using the expectation-maximization (EM) algorithm or one of its variants. Version 2 extends the functionality of the package: concomitant variable models as well as varying and constant parameters for the component-specific generalized linear regression models can now be fitted. The application of the package is demonstrated on several examples, the implementation is described, and examples are given to illustrate how new drivers for the component-specific models and the concomitant variable models can be defined.

Research Papers in Economics

topicmodels: An R Package for Fitting Topic Models

Authors: Bettina Grün, Kurt Hornik

Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R package topicmodels provides basic infrastructure for fitting topic models based on data structures from the text mining package tm. The package includes interfaces to two algorithms for fitting topic models: the variational expectation-maximization algorithm provided by David M. Blei and co-authors and an algorithm using Gibbs sampling by Xuan-Hieu Phan and co-authors.

Research Papers in Economics

Automatic Generation of Exams in R

Authors: Achim Zeileis, Bettina Grün

Package exams provides a framework for automatic generation of standardized statistical exams which is especially useful for large-scale exams. To employ the tools, users just need to supply a pool of exercises and a master file controlling the layout of the final PDF document. The exercises are specified in separate Sweave files (containing R code for data generation and LaTeX code for problem and solution description) and the master file is a LaTeX document with some additional control commands. This paper gives an overview of the main design aims and principles as well as strategies for adaptation and extension. Hands-on illustrations, based on example exercises and control files provided in the package, are presented to get new users started easily.

Research Papers in Economics

Semi-parametric Regression under Model Uncertainty: Economic Applications

Authors: Bettina Grün, Paul Hofmarcher, Gertraud Malsiner-Walli
Publication venue: Wiley
Publication date: 19/02/2019

Economic theory does not always specify the functional relationship between dependent and explanatory variables, or even isolate a particular set of covariates. This means that model uncertainty is pervasive in empirical economics. In this paper, we indicate how Bayesian semi-parametric regression methods in combination with stochastic search variable selection can be used to address two model uncertainties simultaneously: (i) the uncertainty with respect to the variables which should be included in the model and (ii) the uncertainty with respect to the functional form of their effects. The presented approach enables the simultaneous identification of robust linear and nonlinear effects. The additional insights gained are illustrated on applications in empirical economics, namely willingness to pay for housing and cross-country growth regression.

Paris Lodron University of Salzburg

Elektronische Publikationen der Wirtschaftsuniversität Wien

arules - A Computational Environment for Mining Association Rules and Frequent Item Sets

Authors: Bettina Grün, Kurt Hornik, Michael Hahsler

Mining frequent itemsets and association rules is a popular and well-researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package also includes interfaces to two fast mining algorithms, the popular C implementations of Apriori and Eclat by Christian Borgelt. These algorithms can be used to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets and association rules.

Research Papers in Economics

Identifying Mixtures of Mixtures Using Bayesian Estimation

Authors: Sylvia Frühwirth-Schnatter, Bettina Grün, Gertraud Malsiner-Walli
Publication venue: Informa UK Limited
Publication date: 20/06/2016

The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and in general either achieved by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior where the hyperparameters are carefully selected such that they are reflective of the cluster structure aimed at. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach which resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets.

arXiv.org e-Print Archive

Elektronische Publikationen der Wirtschaftsuniversität Wien

FigShare

Automatic Generation of Exams in R

Authors: Bettina Grün, Achim Zeileis
Publication venue: Foundation for Open Access Statistics
Publication date: 01/01/2009

Package exams provides a framework for automatic generation of standardized statistical exams which is especially useful for large-scale exams. To employ the tools, users just need to supply a pool of exercises and a master file controlling the layout of the final PDF document. The exercises are specified in separate Sweave files (containing R code for data generation and LaTeX code for problem and solution description) and the master file is a LaTeX document with some additional control commands. This paper gives an overview of the main design aims and principles as well as strategies for adaptation and extension. Hands-on illustrations, based on example exercises and control files provided in the package, are presented to get new users started easily.

Crossref

Directory of Open Access Journals

Elektronische Publikationen der Wirtschaftsuniversität Wien

Journal of Statistical Software

Research Online