Optimizing B2B Product Offers with Machine Learning, Mixed Logit, and Nonlinear Programming
In B2B markets, value-based pricing and selling has become an important
alternative to discounting. This study outlines a modeling method that uses
customer data (product offers made to each current or potential customer,
features, discounts, and customer purchase decisions) to estimate a mixed logit
choice model. The model is estimated via hierarchical Bayes and machine
learning, delivering customer-level parameter estimates. Customer-level
estimates are input into a nonlinear programming next-offer maximization
problem to select optimal features and discount level for customer segments,
where segments are based on loyalty and discount elasticity. The mixed logit
model is integrated with economic theory (the random utility model), and it
predicts both customers' perceived value for, and response to, alternative future
sales offers. The methodology can be implemented to support value-based pricing
and selling efforts.
Contributions to the literature include: (a) the use of customer-level
parameter estimates from a mixed logit model, delivered via a hierarchical
Bayes estimation procedure, to support value-based pricing decisions; (b)
validation that mixed logit customer-level modeling can deliver strong
predictive accuracy, below that of a random forest but comparing favorably; and
(c) a nonlinear programming problem that uses customer-level mixed logit
estimates to select optimal features and discounts.
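To make the pipeline concrete, the sketch below scores candidate feature/discount offers for one customer by expected margin, using customer-level logit coefficients. All coefficients, prices, and the exhaustive grid search (standing in for the study's nonlinear programming formulation) are illustrative assumptions, not the paper's actual model.

```python
import math

def choice_prob(beta_feat, beta_disc, features, discount):
    """Binary-logit purchase probability from a customer-level utility:
    U = sum(beta_feat * features) + beta_disc * discount (random utility model)."""
    u = sum(b * x for b, x in zip(beta_feat, features)) + beta_disc * discount
    return 1.0 / (1.0 + math.exp(-u))

def best_offer(beta_feat, beta_disc, feature_options, discounts, list_price, unit_cost):
    """Enumerate feature bundles and discount levels; return the offer that
    maximizes expected margin = P(buy) * (discounted price - unit cost)."""
    best, best_val = None, float("-inf")
    for features in feature_options:
        for d in discounts:
            p = choice_prob(beta_feat, beta_disc, features, d)
            margin = p * (list_price * (1.0 - d) - unit_cost)
            if margin > best_val:
                best, best_val = (features, d), margin
    return best, best_val

# Hypothetical customer-level estimates and offer grid (not from the study):
beta_feat = [0.8, 0.3]   # tastes for two optional product features
beta_disc = 4.0          # utility gain per unit of discount
offer, value = best_offer(beta_feat, beta_disc,
                          feature_options=[(1, 0), (1, 1)],
                          discounts=[0.0, 0.05, 0.10, 0.15],
                          list_price=100.0, unit_cost=60.0)
```

For these made-up coefficients the margin lost to discounting outweighs the lift in purchase probability, so the full-featured, zero-discount offer wins; with more discount-elastic customers the trade-off reverses.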
Polycentric governance and the impact of special districts on fiscal common pools
Local government services are increasingly provided in fragmented polycentric systems in which overlapping jurisdictions draw resources from the same fiscal base. Developing optimal policies for the efficient management of fiscal resources requires a consideration of the total underlying fiscal pool. In this study, we evaluate the impact that special purpose districts have on debt ratios at the county "common pool" level in the State of Georgia (U.S.) between 2005 and 2014. Empirical findings suggest that inclusion of all general government and special purpose debt for each county may at times result in a greater burden on the fiscal common pool than existing rules permit. These results call into question the efficacy of fiscal policies in a polycentric governance system that neglect to account for debt levels for all actors within the confines of a single fiscal common pool unit. Results also show that total debt ratios are significantly affected by special districts that operate within the boundaries of a single county. We find no evidence that independent special districts have a differential impact on fiscal common pools compared to their dependent counterparts.
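The common-pool argument reduces to a simple aggregation: summing the debt of every overlapping jurisdiction against the shared fiscal base can breach a cap that general-government debt alone would satisfy. All figures and the 10% cap below are hypothetical, not taken from the Georgia data.

```python
def common_pool_debt_ratio(debts, fiscal_base):
    """Total debt of all overlapping jurisdictions relative to the shared base."""
    return sum(debts.values()) / fiscal_base

# Hypothetical county (illustrative figures, same units throughout):
debts = {"county_general": 120.0, "school_district": 80.0,
         "hospital_district": 40.0, "water_authority": 30.0}
base = 2000.0   # e.g. assessed property value

ratio = common_pool_debt_ratio(debts, base)    # all actors: 270 / 2000 = 0.135
general_only = debts["county_general"] / base  # 0.06 -- understates the pool
exceeds_cap = ratio > 0.10                     # vs. a hypothetical 10% debt cap
```

The county looks compliant when only its own general debt is counted, yet the combined pool of overlapping districts exceeds the cap, which is the pattern the study documents.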
Appropriate, accessible and appealing probabilistic graphical models
Appropriate - Many multivariate probabilistic models either use independent distributions or dependent Gaussian distributions. Yet, many real-world datasets contain count-valued or non-negative skewed data, e.g. bag-of-words text data and biological sequencing data. Thus, we develop novel probabilistic graphical models for use on count-valued and non-negative data including Poisson graphical models and multinomial graphical models. We develop one generalization that allows for triple-wise or k-wise graphical models going beyond the normal pairwise formulation. Furthermore, we also explore Gaussian-copula graphical models and derive closed-form solutions for the conditional distributions and marginal distributions (both before and after conditioning). Finally, we derive mixture and admixture, or topic model, generalizations of these graphical models to introduce more power and interpretability.
Accessible - Previous multivariate models, especially related to text data, often have complex dependencies without a closed form and require complex inference algorithms that have limited theoretical justification. For example, hierarchical Bayesian models often require marginalizing over many latent variables. We show that our novel graphical models (even the k-wise interaction models) have simple and intuitive estimation procedures based on node-wise regressions that likely carry theoretical guarantees similar to those of previous work in graphical models. For the copula-based graphical models, we show that simple approximations can still provide useful models; these copula models also come with closed-form conditional and marginal distributions, which make them amenable to exploratory inspection and manipulation. The parameters of these models are easy to interpret and thus may be accessible to a wide audience.
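A minimal sketch of the node-wise regression idea, assuming a Poisson graphical model: each node's counts are regressed on the remaining variables by gradient ascent on the Poisson log-likelihood, and clearly nonzero coefficients suggest edges. The data-generating step and all settings below are illustrative, not the thesis's estimator.

```python
import math, random

def rpois(lam, rng):
    """Knuth's product-of-uniforms Poisson sampler."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def poisson_nodewise(data, node, lr=0.1, iters=1500):
    """Fit y_node ~ Poisson(exp(b0 + sum_j b_j x_j)) over the other columns
    by gradient ascent on the log-likelihood; return intercept and coefficients."""
    n, p = len(data), len(data[0])
    others = [j for j in range(p) if j != node]
    b0, b = 0.0, [0.0] * len(others)
    for _ in range(iters):
        g0, g = 0.0, [0.0] * len(others)
        for row in data:
            eta = b0 + sum(bj * row[j] for bj, j in zip(b, others))
            resid = row[node] - math.exp(eta)   # Poisson score: y - E[y]
            g0 += resid
            for k, j in enumerate(others):
                g[k] += resid * row[j]
        b0 += lr * g0 / n
        b = [bj + lr * gk / n for bj, gk in zip(b, g)]
    return b0, dict(zip(others, b))

# Simulate counts where column 0 depends on column 1 but not column 2:
rng = random.Random(0)
data = []
for _ in range(200):
    x1, x2 = rng.randint(0, 1), rng.randint(0, 1)
    y = rpois(math.exp(0.2 + 0.8 * x1), rng)
    data.append([y, x1, x2])
b0, coef = poisson_nodewise(data, node=0)
```

Repeating this for every node and symmetrizing the recovered neighborhoods yields the graph estimate; penalized versions add a sparsity term to the same objective.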
Appealing - High-level visualization and interpretation of graphical models with even 100 variables has often been difficult even for a graphical model expert, despite visualization being one of the original motivators for graphical models. This difficulty is likely due to the lack of collaboration between graphical model experts and visualization experts. To begin bridging this gap, we develop a novel "what if?" interaction that manipulates and leverages the probabilistic power of graphical models. Our approach defines: the probabilistic mechanism via conditional probability; the query language to map text input to a conditional probability query; and the formal underlying probabilistic model. We then propose to visualize these query-specific probabilistic graphical models by combining the intuitiveness of force-directed layouts with the beauty and readability of word clouds, which pack many words into valuable screen space while ensuring words do not overlap via pixel-level collision detection. Although both the force-directed layout and the pixel-level packing problems are challenging in their own right, we approximate both simultaneously via adaptive simulated annealing starting from careful initialization. For visualizing mixture distributions, we also design a meaningful mapping from the properties of the mixture distribution to a color in the perceptually uniform CIELUV color space. Finally, we demonstrate our approach via illustrative visualizations of several real-world datasets.
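At its core, the "what if?" interaction issues conditional probability queries against the underlying model. A brute-force sketch on a tiny binary pairwise MRF shows the shape of such a query; the three-variable model and its weights are made up for illustration, and real vocabularies would need the approximate inference the thesis develops.

```python
import math
from itertools import product

def mrf_conditional(theta, pair, query, evidence):
    """P(x_query = 1 | evidence) in a binary pairwise MRF with
    P(x) proportional to exp(sum_i theta[i]*x_i + sum_{(i,j)} pair[i,j]*x_i*x_j),
    computed by enumerating all 2^n configurations (toy sizes only)."""
    n = len(theta)
    num = den = 0.0
    for x in product((0, 1), repeat=n):
        if any(x[i] != v for i, v in evidence.items()):
            continue                      # inconsistent with the evidence
        e = sum(theta[i] * x[i] for i in range(n))
        e += sum(wij * x[i] * x[j] for (i, j), wij in pair.items())
        wt = math.exp(e)
        den += wt
        if x[query] == 1:
            num += wt
    return num / den

# Toy 3-word model (illustrative weights): words 0 and 1 co-occur strongly.
theta = [-1.0, -1.0, -1.0]
pair = {(0, 1): 2.0, (1, 2): 0.5}
p_base = mrf_conditional(theta, pair, query=1, evidence={})
p_what_if = mrf_conditional(theta, pair, query=1, evidence={0: 1})  # "what if word 0 appears?"
```

Conditioning on word 0 raises the probability of its strongly coupled neighbor, and it is exactly this query-specific shift that the proposed word-cloud layout then visualizes.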
Time-varying coefficient models and measurement error
This thesis is concerned with presenting and developing modeling approaches which allow for a time-varying effect of covariates by using time-varying coefficients. The different approaches are compared in simulation studies. In doing so, we investigate how well different components of the simulated models can be identified. The models performing best in the simulation study are then applied to data collected within the study "Improved Air Quality and its Influences on Short-Term Health Effects in Erfurt, Eastern Germany". One specific aspect in this analysis is to assess the necessity of a time-varying estimate compared to a more parsimonious, time-constant fit.
A further topic is the estimation of time-varying coefficient models in the presence of measurement errors in the exposure variable. We specify a measurement error model
and present methods to estimate parameters and measurement error variances of the model in the case of autocorrelated latent exposure as well as measurement errors. Furthermore,
two methods adjusting for measurement errors in the context of time-varying coefficients are developed. The first one is based on a hierarchical Bayesian model and the Bayesian error correction principle. The second method is an extension of the well-known regression calibration approach to the case of autocorrelated data. The resulting estimates of the true values can then be included in the main model to assess the effect of the variable of interest. Finally, the approaches are again applied to the Erfurt data.
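For intuition, the classical (i.i.d.) regression calibration step replaces each error-prone observation by its conditional expectation given the measurement, shrinking toward the mean by the reliability ratio; the thesis's extension to autocorrelated exposures is more involved. A minimal sketch assuming the measurement error variance is known:

```python
def regression_calibration(w, sigma_u2):
    """Classical regression calibration for W = X + U with U ~ (0, sigma_u2):
    replace each w_i by E[X | W = w_i] = mu_w + lam * (w_i - mu_w), where
    lam = sigma_x2 / (sigma_x2 + sigma_u2) is the reliability ratio and
    sigma_x2 is estimated as var(W) - sigma_u2."""
    n = len(w)
    mu = sum(w) / n
    var_w = sum((wi - mu) ** 2 for wi in w) / (n - 1)
    sigma_x2 = max(var_w - sigma_u2, 0.0)      # truncate at 0 for stability
    lam = sigma_x2 / (sigma_x2 + sigma_u2) if sigma_x2 + sigma_u2 > 0 else 0.0
    return [mu + lam * (wi - mu) for wi in w]

w_obs = [0.0, 2.0, 4.0, 6.0, 8.0]                     # error-prone measurements
x_hat = regression_calibration(w_obs, sigma_u2=2.0)   # assumed error variance
```

The calibrated values are then substituted for the mismeasured exposure in the main regression, which is the "include the estimated true values" step the abstract describes.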
Monitoring dugongs within the Reef 2050 Integrated Monitoring and Reporting Program: final report of the dugong team in the megafauna expert group
The objectives of this report are to provide, for the dugong:
An assessment of the current status of the relevant elements of the Great Barrier Reef (the Reef), including an evaluation of primary drivers, pressures and responses using the Driving Forces, Pressures, States, Impacts, Responses (DPSIR) Framework;
Identification of priority indicators for monitoring the key values associated with these elements;
Summary of potential sources of data;
Evaluation of adequacy of existing monitoring activities within each theme to achieve the objectives and requirements of RIMReP;
Recommendations for the design of an integrated monitoring program as a component of RIMReP, specifically considering:
The information requirements for each key element of the Reef to ensure that appropriate data and information are being collected to meet the fundamental objectives of RIMReP;
The spatial and temporal sampling design to ensure that greatest value can be extracted from the data collected;
The logistics of the design to ensure that it can be implemented efficiently;
Likely funding required to implement the recommended monitoring design.
Safety Investigation of Traffic Crashes Incorporating Spatial Correlation Effects
One main interest in crash frequency modeling is to predict crash counts over a spatial domain of interest (e.g., traffic analysis zones (TAZs)). Macro-level crash prediction models can assist transportation planners with a comprehensive perspective to consider safety in the long-range transportation planning process. Most of the previous studies that have examined traffic crashes at the macro-level are related to high-income countries, whereas there is a lack of similar studies among lower- and middle-income countries, where most road traffic deaths (90%) occur. This includes Middle Eastern countries, necessitating a thorough investigation and diagnosis of the issues and factors instigating traffic crashes in the region in order to reduce these serious traffic crashes. Since pedestrians are more vulnerable to traffic crashes compared to other road users, especially in this region, a safety investigation of pedestrian crashes is crucial to improving traffic safety. Riyadh, Saudi Arabia, one of the largest Middle Eastern metropolises, is used as an example representative of these countries' characteristics; Saudi Arabia has a rather distinct situation in that it is considered a high-income country, and yet it has the highest rate of traffic fatalities among its high-income counterparts. Therefore, in this research, several statistical methods are used to investigate the association between traffic crash frequency and contributing factors in crash data, which are characterized by 1) geographical referencing (i.e., observed at specific locations) or spatial variation over geographic units when modeled; 2) correlation between different response variables (e.g., crash counts by severity or type levels); and 3) temporal correlation. A Bayesian multivariate spatial model is developed for predicting crash counts by severity and type.
Based on the findings of this study, policy makers can suggest appropriate safety countermeasures for each type of crash in each zone.
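To illustrate the spatial-correlation idea (not the study's full Bayesian multivariate model), the sketch below shrinks each zone's raw crash rate toward the mean rate of its neighbors, a one-step CAR-style smoother. The TAZ counts, exposures, and adjacency are hypothetical.

```python
def spatial_smooth(counts, exposure, neighbors, w=0.5):
    """One-step CAR-style smoothing of zone-level crash rates: each zone's
    raw rate (crashes / exposure) is pulled toward its neighbors' mean rate,
    borrowing strength across adjacent zones as a spatial model would."""
    raw = {z: counts[z] / exposure[z] for z in counts}
    smoothed = {}
    for z, nbrs in neighbors.items():
        nbr_mean = sum(raw[n] for n in nbrs) / len(nbrs) if nbrs else raw[z]
        smoothed[z] = (1 - w) * raw[z] + w * nbr_mean
    return smoothed

# Illustrative TAZ data (hypothetical): zone "B" has a noisy high count.
counts   = {"A": 4, "B": 12, "C": 5}
exposure = {"A": 10.0, "B": 10.0, "C": 10.0}   # e.g. population or VMT, in 1000s
adj      = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
rates = spatial_smooth(counts, exposure, adj)
```

Zone B's extreme rate is pulled down toward its neighbors while A's is pulled up, the same stabilizing effect that motivates spatial random effects in zone-level crash models.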