Automatic Variational Inference in Stan
Variational inference is a scalable technique for approximate Bayesian
inference. Deriving variational inference algorithms requires tedious
model-specific calculations; this makes it difficult to automate. We propose an
automatic variational inference algorithm, automatic differentiation
variational inference (ADVI). The user only provides a Bayesian model and a
dataset; nothing else. We make no conjugacy assumptions and support a broad
class of models. The algorithm automatically determines an appropriate
variational family and optimizes the variational objective. We implement ADVI
in Stan (code available now), a probabilistic programming framework. We compare
ADVI to MCMC sampling across hierarchical generalized linear models,
nonconjugate matrix factorization, and a mixture model. We train the mixture
model on a quarter million images. With ADVI we can use variational inference
on any model we write in Stan.
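To make the idea in the abstract above concrete, here is a minimal sketch of the kind of computation ADVI automates: map a constrained parameter to the real line, posit a Gaussian over the transformed value, and climb the variational objective with reparameterized Monte Carlo gradients. Everything specific here is an assumption for illustration and not taken from the paper or from Stan: the toy model (a Gamma prior on an exponential rate), the function name `fit_advi`, the step size, the number of draws, and the starting values. The gradient of the log joint is also written by hand, whereas ADVI obtains it by automatic differentiation.

```python
import math
import random


def fit_advi(y, a=2.0, b=1.0, steps=3000, draws=20, lr=0.01, seed=0):
    """Mean-field Gaussian VI for theta ~ Gamma(a, b), y_i ~ Exponential(theta).

    Hypothetical helper: works on zeta = log(theta) and returns (mu, sigma)
    of the Gaussian approximation q(zeta) = Normal(mu, sigma^2).
    """
    rng = random.Random(seed)
    n, total = len(y), sum(y)
    mu, omega = 0.0, -1.0  # sigma = exp(omega); starting values are arbitrary

    for _step in range(steps):
        sigma = math.exp(omega)
        grad_mu, grad_omega = 0.0, 0.0
        for _draw in range(draws):
            eps = rng.gauss(0.0, 1.0)
            zeta = mu + sigma * eps  # reparameterized draw from q
            # log p(y, theta) + log|d theta / d zeta|
            #   = (n + a) * zeta - (total + b) * exp(zeta) + const,
            # so its derivative with respect to zeta is:
            dlogp = (n + a) - (total + b) * math.exp(zeta)
            grad_mu += dlogp
            grad_omega += dlogp * eps * sigma
        grad_mu /= draws
        grad_omega = grad_omega / draws + 1.0  # +1 comes from the Gaussian entropy
        mu += lr * grad_mu        # plain stochastic gradient ascent on the ELBO
        omega += lr * grad_omega
    return mu, math.exp(omega)


# Simulated data (assumption: true rate 2.0, 50 observations).
data_rng = random.Random(1)
y = [data_rng.expovariate(2.0) for _ in range(50)]

mu, sigma = fit_advi(y)
vi_mean = math.exp(mu + 0.5 * sigma ** 2)     # E_q[theta], since theta is log-normal under q
exact_mean = (2.0 + len(y)) / (1.0 + sum(y))  # conjugate Gamma posterior mean
print(vi_mean, exact_mean)
```

This toy model is conjugate, so the exact posterior mean is available as a check on the approximation; the approach matters for models where no such closed form exists.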
Automatic Differentiation Variational Inference
Probabilistic modeling is iterative. A scientist posits a simple model, fits
it to her data, refines it according to her analysis, and repeats. However,
fitting complex models to large data is a bottleneck in this process. Deriving
algorithms for new models can be both mathematically and computationally
challenging, which makes it difficult to efficiently cycle through the steps.
To this end, we develop automatic differentiation variational inference (ADVI).
Using our method, the scientist only provides a probabilistic model and a
dataset, nothing else. ADVI automatically derives an efficient variational
inference algorithm, freeing the scientist to refine and explore many models.
ADVI supports a broad class of models; no conjugacy assumptions are required. We
study ADVI across ten different models and apply it to a dataset with millions
of observations. ADVI is integrated into Stan, a probabilistic programming
system; it is available for immediate use.
Abandon Statistical Significance
We discuss problems the null hypothesis significance testing (NHST) paradigm
poses for replication and more broadly in the biomedical and social sciences as
well as how these problems remain unresolved by proposals involving modified
p-value thresholds, confidence intervals, and Bayes factors. We then discuss
our own proposal, which is to abandon statistical significance. We recommend
dropping the NHST paradigm--and the p-value thresholds intrinsic to it--as the
default statistical paradigm for research, publication, and discovery in the
biomedical and social sciences. Specifically, we propose that the p-value be
demoted from its threshold screening role and instead, treated continuously, be
considered along with currently subordinate factors (e.g., related prior
evidence, plausibility of mechanism, study design and data quality, real world
costs and benefits, novelty of finding, and other factors that vary by research
domain) as just one among many pieces of evidence. We have no desire to "ban"
p-values or other purely statistical measures. Rather, we believe that such
measures should not be thresholded and that, thresholded or not, they should
not take priority over the currently subordinate factors. We also argue that it
seldom makes sense to calibrate evidence as a function of p-values or other
purely statistical measures. We offer recommendations for how our proposal can
be implemented in the scientific publication process as well as in statistical
decision making more broadly.
Efficient computation of the first passage time distribution of the generalized master equation by steady-state relaxation
The generalized master equation or the equivalent continuous time random walk
equations can be used to compute the macroscopic first passage time
distribution (FPTD) of a complex stochastic system from short-term microscopic
simulation data. The computation of the mean first passage time and additional
low-order FPTD moments can be simplified by directly relating the FPTD moment
generating function to the moments of the local FPTD matrix. This relationship
can be physically interpreted in terms of steady-state relaxation, an extension
of steady-state flow. Moreover, it is amenable to a statistical error analysis
that can be used to significantly increase computational efficiency. The
efficiency improvement can be extended to the FPTD itself by modelling it using
a Gamma distribution or rational function approximation to its Laplace
transform.
Logical Segmentation of Source Code
Many software analysis methods have come to rely on machine learning
approaches. Code segmentation - the process of decomposing source code into
meaningful blocks - can augment these methods by featurizing code, reducing
noise, and limiting the problem space. Traditionally, code segmentation has
been done using syntactic cues; current approaches do not intentionally capture
logical content. We develop a novel deep learning approach to generate logical
code segments regardless of the language or syntactic correctness of the code.
Due to the lack of logically segmented source code, we introduce a unique data
set construction technique to approximate ground truth for logically segmented
code. Logical code segmentation can improve tasks such as automatically
commenting code, detecting software vulnerabilities, repairing bugs, labeling
code functionality, and synthesizing new code. Comment: SEKE2019 Conference Full Pape
A mitotic kinase scaffold depleted in testicular seminomas impacts spindle orientation in germ line stem cells.
Correct orientation of the mitotic spindle in stem cells underlies organogenesis. Spindle abnormalities correlate with cancer progression in germ line-derived tumors. We discover a macromolecular complex between the scaffolding protein Gravin/AKAP12 and the mitotic kinases Aurora A and Plk1 that is downregulated in human seminoma. Depletion of Gravin correlates with an increased mitotic index and disorganization of seminiferous tubules. Biochemical, super-resolution imaging, and enzymology approaches establish that this Gravin scaffold accumulates at the mother spindle pole during metaphase. Manipulating elements of the Gravin-Aurora A-Plk1 axis prompts mitotic delay and prevents appropriate assembly of astral microtubules to promote spindle misorientation. These pathological responses are conserved in seminiferous tubules from Gravin(-/-) mice, where an overabundance of Oct3/4-positive germ line stem cells displays randomized orientation of mitotic spindles. Thus, we propose that Gravin-mediated recruitment of Aurora A and Plk1 to the mother (oldest) spindle pole contributes to the fidelity of symmetric cell division.
Rich state, poor state, red state, blue state: What's the matter with Connecticut?
For decades, the Democrats have been viewed as the party of the poor, with the Republicans representing the rich. Recent presidential elections, however, have shown a reverse pattern, with Democrats performing well in the richer blue states in the northeast and coasts, and Republicans dominating in the red states in the middle of the country and the south. Through multilevel modeling of individual-level survey data and county- and state-level demographic and electoral data, we reconcile these patterns. Furthermore, we find that income matters more in red America than in blue America. In poor states, rich people are much more likely than poor people to vote for the Republican presidential candidate, but in rich states (such as Connecticut), income has a very low correlation with vote preference.
A Language-Agnostic Model for Semantic Source Code Labeling
Code search and comprehension have become more difficult in recent years due
to the rapid expansion of available source code. Current tools lack a way to
label arbitrary code at scale while maintaining up-to-date representations of
new programming languages, libraries, and functionalities. Comprehensive
labeling of source code enables users to search for documents of interest and
obtain a high-level understanding of their contents. We use Stack Overflow code
snippets and their tags to train a language-agnostic, deep convolutional neural
network to automatically predict semantic labels for source code documents. On
Stack Overflow code snippets, we demonstrate a mean area under ROC of 0.957
over a long-tailed list of 4,508 tags. We also manually validate the model
outputs on a diverse set of unlabeled source code documents retrieved from
GitHub, and we obtain a top-1 accuracy of 86.6%. This strongly indicates that
the model successfully transfers its knowledge from Stack Overflow snippets to
arbitrary source code documents. Comment: MASES 2018 Publicatio
Fossa Navicularis Strictures Due to 22F Catheters Used in Robotic Radical Prostatectomy
Background and objectives: Fossa navicularis strictures following radical prostatectomy are reported infrequently. We recently experienced a series of fossa strictures following robot-assisted laparoscopic radical prostatectomy. Fossa strictures are usually procedure-induced, arising from urethral trauma or infection; catheter size has not been reported as a factor. We describe herein our experience to determine and prevent fossa navicularis stricture development. Methods: From June 2002 until February 2005, 248 patients underwent robot-assisted laparoscopic prostatectomy with the da Vinci surgical system at our institution. Fossa strictures were diagnosed based on acute onset of obstructive voiding symptoms, IPSS and flow pattern changes, and bougie calibration. During our series, we switched from an 18F to a 22F catheter to avoid inadvertent stapling of the urethra when dividing the dorsal venous complex. All patients had an 18F catheter placed after the anastomosis for 1 week. Parameters were evaluated using Fisher's exact test and the Student t test for means. Results: The 18F catheter group (n=117) developed 1 fossa stricture, whereas the 22F catheter group (n=131) developed 9 fossa strictures (P=0.02). The fossa stricture rate in the 18F group was 0.9% versus 6.9% in the 22F group. The 2 groups had no differences in age, body mass index, cardiovascular disease, International Prostate Symptom Score, urinary bother score, SHIM score, preoperative PSA, operative time, estimated blood loss, cautery use, prostate size, or catheterization time. Conclusions: Using a larger urethral catheter size during intraoperative dissection appears to increase the risk of fossa stricture 8-fold as compared with the 18F catheter. The pneumoperitoneum and prolonged extreme Trendelenburg position could potentially contribute to local urethral ischemia.
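The stricture counts reported above (1 of 117 with the 18F catheter, 9 of 131 with the 22F catheter) are enough to rework the comparison by hand. The sketch below is one plausible reading of the analysis, not the authors' code: it assumes a two-sided Fisher's exact test that sums the probabilities of all tables no more likely than the observed one, and the function name `fisher_exact_two_sided` is made up for illustration. Under that assumption the p-value should land close to the reported P=0.02, and the ratio of the two stricture rates close to the reported 8-fold increase.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 events fall in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Counts taken from the abstract: strictures / no strictures per catheter group.
strictures_18f, n_18f = 1, 117
strictures_22f, n_22f = 9, 131

p_value = fisher_exact_two_sided(
    strictures_18f, n_18f - strictures_18f,
    strictures_22f, n_22f - strictures_22f,
)
rate_18f = strictures_18f / n_18f
rate_22f = strictures_22f / n_22f
risk_ratio = rate_22f / rate_18f

print(f"18F rate: {rate_18f:.1%}, 22F rate: {rate_22f:.1%}")
print(f"risk ratio: {risk_ratio:.2f}, two-sided p: {p_value:.3f}")
```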