
1,200 research outputs found

Automatic Variational Inference in Stan

Author: Blei David M.
Gelman Andrew
Kucukelbir Alp
Ranganath Rajesh
Publication date: 01/01/2015

Variational inference is a scalable technique for approximate Bayesian inference. Deriving variational inference algorithms requires tedious model-specific calculations; this makes it difficult to automate. We propose an automatic variational inference algorithm, automatic differentiation variational inference (ADVI). The user only provides a Bayesian model and a dataset; nothing else. We make no conjugacy assumptions and support a broad class of models. The algorithm automatically determines an appropriate variational family and optimizes the variational objective. We implement ADVI in Stan (code available now), a probabilistic programming framework. We compare ADVI to MCMC sampling across hierarchical generalized linear models, nonconjugate matrix factorization, and a mixture model. We train the mixture model on a quarter million images. With ADVI we can use variational inference on any model we write in Stan

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Automatic Differentiation Variational Inference

Author: Blei David M.
Gelman Andrew
Kucukelbir Alp
Ranganath Rajesh
Tran Dustin
Publication date: 02/03/2016

Probabilistic modeling is iterative. A scientist posits a simple model, fits it to her data, refines it according to her analysis, and repeats. However, fitting complex models to large data is a bottleneck in this process. Deriving algorithms for new models can be both mathematically and computationally challenging, which makes it difficult to efficiently cycle through the steps. To this end, we develop automatic differentiation variational inference (ADVI). Using our method, the scientist only provides a probabilistic model and a dataset, nothing else. ADVI automatically derives an efficient variational inference algorithm, freeing the scientist to refine and explore many models. ADVI supports a broad class of models: no conjugacy assumptions are required. We study ADVI across ten different models and apply it to a dataset with millions of observations. ADVI is integrated into Stan, a probabilistic programming system; it is available for immediate use

arXiv.org e-Print Archive

Princeton University Open Access Repository

Abandon Statistical Significance

Author: Gal David
Gelman Andrew
McShane Blakeley B.
Robert Christian
Tackett Jennifer L.
Publication date: 08/09/2018

We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain unresolved by proposals involving modified p-value thresholds, confidence intervals, and Bayes factors. We then discuss our own proposal, which is to abandon statistical significance. We recommend dropping the NHST paradigm--and the p-value thresholds intrinsic to it--as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with currently subordinate factors (e.g., related prior evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain) as just one among many pieces of evidence. We have no desire to "ban" p-values or other purely statistical measures. Rather, we believe that such measures should not be thresholded and that, thresholded or not, they should not take priority over the currently subordinate factors. We also argue that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures. We offer recommendations for how our proposal can be implemented in the scientific publication process as well as in statistical decision making more broadly

arXiv.org e-Print Archive

Warwick Research Archives Portal Repository

Efficient computation of the first passage time distribution of the generalized master equation by steady-state relaxation

Author: Faradjian Anton K.
Shalloway David
Gelman A. B.
Kubo R.
Van Kampen N. G.
Zwanzig R.
Publication venue: AIP Publishing
Publication date: 24/10/2005

The generalized master equation or the equivalent continuous time random walk equations can be used to compute the macroscopic first passage time distribution (FPTD) of a complex stochastic system from short-term microscopic simulation data. The computation of the mean first passage time and additional low-order FPTD moments can be simplified by directly relating the FPTD moment generating function to the moments of the local FPTD matrix. This relationship can be physically interpreted in terms of steady-state relaxation, an extension of steady-state flow. Moreover, it is amenable to a statistical error analysis that can be used to significantly increase computational efficiency. The efficiency improvement can be extended to the FPTD itself by modelling it using a Gamma distribution or rational function approximation to its Laplace transform

arXiv.org e-Print Archive

Crossref

CERN Document Server

Logical Segmentation of Source Code

Author: Dormuth Jacob
Gelman Ben
Moore Jessica
Slater David
Publication venue: KSI Research Inc.
Publication date: 18/07/2019

Many software analysis methods have come to rely on machine learning approaches. Code segmentation, the process of decomposing source code into meaningful blocks, can augment these methods by featurizing code, reducing noise, and limiting the problem space. Traditionally, code segmentation has been done using syntactic cues; current approaches do not intentionally capture logical content. We develop a novel deep learning approach to generate logical code segments regardless of the language or syntactic correctness of the code. Due to the lack of logically segmented source code, we introduce a unique data set construction technique to approximate ground truth for logically segmented code. Logical code segmentation can improve tasks such as automatically commenting code, detecting software vulnerabilities, repairing bugs, labeling code functionality, and synthesizing new code. Comment: SEKE2019 Conference Full Paper

arXiv.org e-Print Archive

Crossref

A mitotic kinase scaffold depleted in testicular seminomas impacts spindle orientation in germ line stem cells

Author: Bucko Paula
Canton David
Gelman Irwin
Hehnly Heidi
Langeberg Lorene K
Ogier Leah
Santana L Fernando
Scott John D
Wordeman Linda
Publication venue: eScholarship, University of California
Publication date: 01/09/2015

Correct orientation of the mitotic spindle in stem cells underlies organogenesis. Spindle abnormalities correlate with cancer progression in germ line-derived tumors. We discover a macromolecular complex between the scaffolding protein Gravin/AKAP12 and the mitotic kinases, Aurora A and Plk1, that is down regulated in human seminoma. Depletion of Gravin correlates with an increased mitotic index and disorganization of seminiferous tubules. Biochemical, super-resolution imaging, and enzymology approaches establish that this Gravin scaffold accumulates at the mother spindle pole during metaphase. Manipulating elements of the Gravin-Aurora A-Plk1 axis prompts mitotic delay and prevents appropriate assembly of astral microtubules to promote spindle misorientation. These pathological responses are conserved in seminiferous tubules from Gravin(-/-) mice where an overabundance of Oct3/4 positive germ line stem cells displays randomized orientation of mitotic spindles. Thus, we propose that Gravin-mediated recruitment of Aurora A and Plk1 to the mother (oldest) spindle pole contributes to the fidelity of symmetric cell division

PubMed Central

eScholarship - University of California


Rich state, poor state, red state, blue state: What's the matter with Connecticut?

Author: Bafumi Joseph
Gelman Andrew E.
Park David K.
Shor Boris
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2007

For decades, the Democrats have been viewed as the party of the poor, with the Republicans representing the rich. Recent presidential elections, however, have shown a reverse pattern, with Democrats performing well in the richer blue states in the northeast and coasts, and Republicans dominating in the red states in the middle of the country and the south. Through multilevel modeling of individual-level survey data and county- and state-level demographic and electoral data, we reconcile these patterns. Furthermore, we find that income matters more in red America than in blue America. In poor states, rich people are much more likely than poor people to vote for the Republican presidential candidate, but in rich states (such as Connecticut), income has a very low correlation with vote preference

Columbia University Academic Commons

A Language-Agnostic Model for Semantic Source Code Labeling

Author: Gelman Ben
Hoyle Bryan
Moore Jessica
Saxe Joshua
Slater David
Publication venue: Association for Computing Machinery (ACM)
Publication date: 03/06/2019

Code search and comprehension have become more difficult in recent years due to the rapid expansion of available source code. Current tools lack a way to label arbitrary code at scale while maintaining up-to-date representations of new programming languages, libraries, and functionalities. Comprehensive labeling of source code enables users to search for documents of interest and obtain a high-level understanding of their contents. We use Stack Overflow code snippets and their tags to train a language-agnostic, deep convolutional neural network to automatically predict semantic labels for source code documents. On Stack Overflow code snippets, we demonstrate a mean area under ROC of 0.957 over a long-tailed list of 4,508 tags. We also manually validate the model outputs on a diverse set of unlabeled source code documents retrieved from GitHub, and we obtain a top-1 accuracy of 86.6%. This strongly indicates that the model successfully transfers its knowledge from Stack Overflow snippets to arbitrary source code documents. Comment: MASES 2018 Publication

arXiv.org e-Print Archive

Crossref

Fossa Navicularis Strictures Due to 22F Catheters Used in Robotic Radical Prostatectomy

Author: Ahlering Thomas E.
Gelman Joel
Skarecky Douglas W.
Yee David S.
Publication venue: Society of Laparoendoscopic Surgeons
Publication date: 01/07/2007

Background and objectives: Fossa navicularis strictures following radical prostatectomy are reported infrequently. We recently experienced a series of fossa strictures following robot-assisted laparoscopic radical prostatectomy. Fossa strictures are usually procedure-induced, arising from urethral trauma or infection; catheter size has not been reported as a factor. We describe herein our experience to determine and prevent fossa navicularis stricture development. Methods: From June 2002 until February 2005, 248 patients underwent robot-assisted laparoscopic prostatectomy with the da Vinci surgical system at our institution. Fossa strictures were diagnosed based on acute onset of obstructive voiding symptoms, IPSS and flow pattern changes, and bougie calibration. During our series, we switched from an 18F to a 22F catheter to avoid inadvertent stapling of the urethra when dividing the dorsal venous complex. All patients had an 18F catheter placed after the anastomosis for 1 week. Parameters were evaluated using Fisher's exact test and the Student t test for means. Results: The 18F catheter group (n=117) developed 1 fossa stricture, whereas the 22F catheter group (n=131) developed 9 fossa strictures (P=0.02). The fossa stricture rate in the 18F group was 0.9% versus 6.9% in the 22F group. The 2 groups had no differences in age, body mass index, cardiovascular disease, International Prostate Symptom Score, urinary bother score, SHIM score, preoperative PSA, operative time, estimated blood loss, cautery use, prostate size, or catheterization time. Conclusions: Using a larger urethral catheter size during intraoperative dissection appears to increase the risk 8-fold for fossa stricture as compared with the 18F catheter. The pneumoperitoneum and prolonged extreme Trendelenburg position could potentially contribute to local urethral ischemia
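
The stricture counts in this abstract can be checked with a short calculation. The sketch below is not the authors' analysis code. It assumes the reported P value comes from a two-sided Fisher's exact test on the 2x2 table of catheter group by stricture outcome, which the abstract does not state explicitly. The function name `fisher_exact_two_sided` is a hypothetical helper written for this illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration; assumes the usual convention of
    summing all tables no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in row 1 with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )


# Counts taken from the abstract: 1 of 117 (18F group), 9 of 131 (22F group).
strictures_18f, n_18f = 1, 117
strictures_22f, n_22f = 9, 131

rate_18f = strictures_18f / n_18f
rate_22f = strictures_22f / n_22f
fold_increase = rate_22f / rate_18f
p_value = fisher_exact_two_sided(
    strictures_18f, n_18f - strictures_18f,
    strictures_22f, n_22f - strictures_22f,
)

print(f"18F stricture rate: {rate_18f:.1%}")
print(f"22F stricture rate: {rate_22f:.1%}")
print(f"Ratio of rates: {fold_increase:.1f}")
print(f"Two-sided Fisher exact p-value: {p_value:.3f}")
```

The "8-fold" figure in the conclusions corresponds to the ratio of the two raw rates, which is a crude risk ratio rather than an adjusted estimate.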

PubMed Central

eScholarship - University of California

The World Congress on Engineering 2015, WCE 2015

Author: Ao A.I.
Gelman Len
Hukins David WL
Hunter Andrew
Korsunsky A.M.
Publication venue: Newswood Ltd
Publication date: 01/07/2015

The University of Manchester - Institutional Repository