Variational Bayesian Decision-making for Continuous Utilities
Bayesian decision theory outlines a rigorous framework for making optimal decisions based on maximizing expected utility over a model posterior. However, practitioners often do not have access to the full posterior and resort to approximate inference strategies. In such cases, taking the eventual decision-making task into account while performing the inference allows for calibrating the posterior approximation to maximize the utility. We present an automatic pipeline that co-opts continuous utilities into variational inference algorithms to account for decision-making. We provide practical strategies for approximating and maximizing the gain, and empirically demonstrate consistent improvement when calibrating approximations for specific utilities. Peer reviewed.
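As a minimal illustration of the decision rule this abstract starts from (maximizing expected utility over a posterior), the sketch below picks a point decision from posterior predictive samples by Monte Carlo. The function names, the candidate grid, and the toy samples are assumptions made for illustration; they are not taken from the paper's pipeline, which calibrates the variational approximation itself.

```python
from statistics import mean

def expected_utility(decision, samples, utility):
    """Monte Carlo estimate of E[u(decision, y)] over posterior predictive samples."""
    return mean(utility(decision, y) for y in samples)

def bayes_decision(candidates, samples, utility):
    """Return the candidate decision with the highest estimated expected utility."""
    return max(candidates, key=lambda h: expected_utility(h, samples, utility))

# Two continuous utilities (negated losses); both are hypothetical examples.
def neg_squared_error(h, y):
    return -(h - y) ** 2

def neg_absolute_error(h, y):
    return -abs(h - y)

# Toy "posterior predictive" samples with a heavy right tail (assumed data).
samples = [0.0, 1.0, 2.0, 3.0, 14.0]
candidates = [i / 10 for i in range(0, 151)]  # grid from 0.0 to 15.0

best_sq = bayes_decision(candidates, samples, neg_squared_error)    # 4.0, the sample mean
best_abs = bayes_decision(candidates, samples, neg_absolute_error)  # 2.0, the sample median
```

The same samples yield different optimal decisions under the two utilities, which is why an approximation tuned for one utility need not suit another.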
Uplift Modeling with High Class Imbalance
Uplift modeling refers to estimating the causal effect of a treatment on an individual observation, used for instance to identify customers worth targeting with a discount in e-commerce. We introduce a simple yet effective undersampling strategy for dealing with the prevalent problem of high class imbalance (low conversion rate) in such applications. Our strategy is agnostic to the base learners and produces a 6.5% improvement over the best published benchmark for the largest public uplift data which incidentally exhibits high class imbalance. We also introduce a new metric on calibration for uplift modeling and present a strategy to improve the calibration of the proposed method. Peer reviewed.
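To make the link between undersampling and calibration concrete, here is a generic sketch, not the paper's strategy: non-converted rows are dropped at random, and probabilities learned on the reduced data are mapped back to the original scale before the uplift is taken. The row layout (a dict with key "y"), the function names, and the use of one shared keep rate for the treated and control groups are all assumptions.

```python
import random

def undersample_negatives(rows, keep_rate, rng):
    """Keep every converted row (y == 1) and each other row with probability keep_rate."""
    return [r for r in rows if r["y"] == 1 or rng.random() < keep_rate]

def correct_probability(p_sampled, keep_rate):
    """Map a conversion probability learned on undersampled data to the original scale.

    Dropping negatives at rate keep_rate inflates the odds by 1 / keep_rate,
    so the original odds equal keep_rate times the sampled odds.
    """
    return keep_rate * p_sampled / (keep_rate * p_sampled + (1.0 - p_sampled))

def uplift(p_treated_sampled, p_control_sampled, keep_rate):
    """Uplift as the difference of corrected conversion probabilities."""
    return (correct_probability(p_treated_sampled, keep_rate)
            - correct_probability(p_control_sampled, keep_rate))

# Toy data with a 1% conversion rate (assumed).
rng = random.Random(0)
rows = [{"y": 1 if i % 100 == 0 else 0} for i in range(10_000)]
reduced = undersample_negatives(rows, keep_rate=0.1, rng=rng)
```

Without the correction step, the difference of the two sampled probabilities would overstate the uplift, because both are inflated by the undersampling.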
Estimating glycemic impact of cooking recipes via online crowdsourcing and machine learning
National Research Foundation (NRF) Singapore under its International Research Centres in Singapore Funding Initiative
1,2,3,4,6-Penta-O-galloyl-β-D-glucose modulates perivascular inflammation and prevents vascular dysfunction in angiotensin II-induced hypertension
Background and Purpose:
Hypertension is a multifactorial disease, manifested by vascular dysfunction, increased superoxide production and perivascular inflammation. In this study, we have hypothesized that 1,2,3,4,6-Penta-O-Galloyl-β-D-Glucose (PGG) would inhibit vascular inflammation and protect from vascular dysfunction in an experimental model of hypertension.
Experimental Approach:
PGG was administered every two days in a dose of 10 mg·kg⁻¹ i.p. during 14 days of Ang II infusion and was used in a final concentration of 20 μM for in vitro studies.
Key Results:
Ang II administration increased leukocyte and T cell content in perivascular adipose tissue (pVAT) and administration of PGG significantly decreased total leukocyte and T cell infiltration in pVAT (1640±150 vs. 1028±57, p<0.01; 321±22 vs 158±18, cells/mg; p<0.01, respectively). This effect was observed in relation to all T cell subsets. PGG also decreased the content of T cells bearing CD25, CCR5 and CD44 receptors and the expression of both MCP-1 in aorta and RANTES in pVAT. PGG administration decreased the content of TNF+ and IFN-γ+ CD8 T cells and IL-17A+ CD4+ and CD3+CD4−CD8− cells. Importantly, these effects of PGG were associated with improved vascular function and decreased ROS production in the aortas of Ang II-infused animals independently of blood pressure increase. Mechanistically, PGG (20 μM) directly inhibited CD25 and CCR5 expression in cultured T cells. It also decreased the content of IFN-γ+ by CD8+ and CD3+CD4−CD8− cells and IL-17A+ by CD3+CD4−CD8− cells.
Conclusion and Implication:
PGG may constitute an interesting immunomodulating strategy in the regulation of vascular dysfunction and hypertension.
Correcting Predictions for Approximate Bayesian Inference
Bayesian models quantify uncertainty and facilitate optimal decision-making in downstream applications. For most models, however, practitioners are forced to use approximate inference techniques that lead to sub-optimal decisions due to incorrect posterior predictive distributions. We present a novel approach that corrects for inaccuracies in posterior inference by altering the decision-making process. We train a separate model to make optimal decisions under the approximate posterior, combining interpretable Bayesian modeling with optimization of direct predictive accuracy in a principled fashion. The solution is generally applicable as a plug-in module for predictive decision-making for arbitrary probabilistic programs, irrespective of the posterior inference strategy. We demonstrate the approach empirically in several problems, confirming its potential. Peer reviewed.
Investigating and predicting online food recipe upload behavior
Studying online food behavior has recently become an active field of research. While there is a growing body of studies that investigate, for example, how online recipes are consumed, in the form of views or ratings, little effort has yet been devoted to understanding how they are created. To help close this gap, we present in this paper the results of a large-scale study of nearly 200k users posting over 400k recipes on the online recipe platform Kochbar.de. The main objective of this study is to reveal (i) to what extent recipe upload patterns can be explained by socio-demographic features and (ii) to what extent they can be predicted. To do so, we investigate the utility of several features such as user history, social connections of the users, temporal aspects, and geographic embedding of the users. Statistical analysis confirms that recipe uploads can be explained by socio-demographic features. Extensive simulations show that among all features investigated, the social signal, in the form of friendship connections to other users, appears to be the strongest one and hence is the best predictor of what type of recipe will be uploaded and what ingredients will be used in the future. The research conducted in this work contributes to a better understanding of online food behavior and is relevant for researchers working on online social information systems and engineers interested in predictive modeling and recommender systems.
Harnessing Natural Experiments to Quantify the Causal Effect of Badges
A wide variety of online platforms use digital badges to encourage users to
take certain types of desirable actions. However, despite their growing
popularity, their causal effect on users' behavior is not well understood. This
is partly due to the lack of counterfactual data and the myriad of complex
factors that influence users' behavior over time. As a consequence, their
design and deployment lack general principles.
In this paper, we focus on first-time badges, which are awarded after a user
takes a particular type of action for the first time, and study their causal
effect by harnessing the delayed introduction of several badges in a popular
Q&A website. In doing so, we introduce a novel causal inference framework for
badges whose main technical innovations are a robust survival-based hypothesis
testing procedure, which controls for the utility heterogeneity across users,
and a bootstrap difference-in-differences method, which controls for the random
fluctuations in users' behavior over time. We find that first-time badges steer
users' behavior if the utility a user obtains from taking the corresponding
action is sufficiently low; otherwise, the badge does not have a significant
effect. Moreover, for badges that successfully steered user behavior, we
perform a counterfactual analysis and show that they significantly improved the
functioning of the site at a community level.
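For readers unfamiliar with the second ingredient named above, the following is a generic textbook-style sketch of a difference-in-differences estimate with a percentile bootstrap interval. It is not the paper's procedure: the four-group layout, the independent resampling of each group, the function names, and the toy numbers are all assumptions.

```python
import random
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (mean(treated_post) - mean(treated_pre)) - (mean(control_post) - mean(control_pre))

def resample(values, rng):
    """Draw a bootstrap sample of the same size, with replacement."""
    return [rng.choice(values) for _ in values]

def bootstrap_did(treated_pre, treated_post, control_pre, control_post,
                  n_boot=2000, seed=0):
    """Return the point estimate and a 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    groups = (treated_pre, treated_post, control_pre, control_post)
    draws = sorted(
        did_estimate(*(resample(g, rng) for g in groups)) for _ in range(n_boot)
    )
    lo = draws[int(0.025 * n_boot)]
    hi = draws[int(0.975 * n_boot) - 1]
    return did_estimate(*groups), (lo, hi)

# Toy activity counts before and after a badge is introduced (assumed data).
estimate, (lo, hi) = bootstrap_did(
    treated_pre=[1.0, 2.0, 3.0], treated_post=[4.0, 5.0, 6.0],
    control_pre=[1.0, 2.0, 3.0], control_post=[2.0, 3.0, 4.0],
)
```

Subtracting the control group's change removes fluctuations over time that affect both groups, and the bootstrap interval indicates how much of the remaining difference could be sampling noise.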
Prior Specification for Bayesian Matrix Factorization via Prior Predictive Matching
The behavior of many Bayesian models used in machine learning critically depends on the choice of prior distributions, controlled by some hyperparameters typically selected through Bayesian optimization or cross-validation. This requires repeated, costly, posterior inference. We provide an alternative for selecting good priors without carrying out posterior inference, building on the prior predictive distribution that marginalizes the model parameters. We estimate virtual statistics for data generated by the prior predictive distribution and then optimize over the hyperparameters to learn those for which the virtual statistics match the target values provided by the user or estimated from (a subset of) the observed data. We apply the principle for probabilistic matrix factorization, for which good solutions for prior selection have been missing. We show that for Poisson factorization models we can analytically determine the hyperparameters, including the number of factors, that best replicate the target statistics, and we empirically study the sensitivity of the approach to model mismatch. We also present a model-independent procedure that determines the hyperparameters for general models by stochastic optimization and demonstrate this extension in the context of hierarchical matrix factorization models. Peer reviewed.
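The idea of matching prior predictive statistics can be illustrated with a small moment calculation. The sketch below assumes a specific Poisson factorization model that the abstract does not spell out: y ~ Poisson(sum over K factors of theta_k * beta_k), with independent priors theta_k ~ Gamma(shape a, rate b) and beta_k ~ Gamma(shape c, rate d). Under that assumption the prior predictive mean and variance of a single entry follow from the law of total variance, and with the shapes held fixed the number of factors that reproduces a target mean and variance has a closed form. The function names and the choice of statistics are illustrative and are not claimed to be the paper's derivation.

```python
def prior_predictive_moments(K, a, b, c, d):
    """Mean and variance of one entry y under the assumed Gamma-Poisson model."""
    mu_theta, mu_beta = a / b, c / d
    m2_theta = a * (a + 1) / b**2  # E[theta^2] for Gamma(shape a, rate b)
    m2_beta = c * (c + 1) / d**2   # E[beta^2] for Gamma(shape c, rate d)
    mean = K * mu_theta * mu_beta
    # Var[y] = E[rate] + Var[rate], with the K products independent.
    var = mean + K * (m2_theta * m2_beta - (mu_theta * mu_beta) ** 2)
    return mean, var

def factors_matching(target_mean, target_var, a, c):
    """Number of factors K reproducing the target mean and variance for fixed shapes.

    Requires target_var > target_mean, i.e. overdispersed targets.
    """
    rho = (a + 1) * (c + 1) / (a * c) - 1
    return target_mean**2 * rho / (target_var - target_mean)

mean, var = prior_predictive_moments(K=5, a=2.0, b=1.0, c=3.0, d=2.0)
K_recovered = factors_matching(mean, var, a=2.0, c=3.0)
```

No posterior inference is involved at any point: the hyperparameters are chosen purely from statistics of data that the prior would generate.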
Can Circulating Cardiac Biomarkers Be Helpful in the Assessment of LMNA Mutation Carriers?
Mutations in the lamin A/C gene are variably phenotypically expressed; however, it is unclear whether circulating cardiac biomarkers are helpful in the detection and risk assessment of cardiolaminopathies. We sought to assess (1) clinical characteristics including serum biomarkers: high sensitivity troponin T (hsTnT) and N-terminal prohormone brain natriuretic peptide (NT-proBNP) in clinically stable cardiolaminopathy patients, and (2) outcome among pathogenic/likely pathogenic lamin A/C gene (LMNA) mutation carriers. Our single-centre cohort included 53 patients from 21 families. Clinical, laboratory, and follow-up data were analysed. Median follow-up was 1522 days. The earliest abnormality, emerging in the second and third decades of life, was elevated hsTnT (in 12% and in 27% of patients, respectively), followed by the presence of atrioventricular block, heart failure, and malignant ventricular arrhythmia (MVA). In patients with missense vs. other mutations, we found no difference in MVA occurrence and, surprisingly, worse transplant-free survival. Increased levels of both hsTnT and NT-proBNP were strongly associated with MVA occurrence (HR > 13, p ≤ 0.02 in both) in univariable analysis. In multivariable analysis, NT-proBNP level > 150 pg/mL was the only independent indicator of MVA. We conclude that assessment of circulating cardiac biomarkers may help in the detection and risk assessment of cardiolaminopathies.