Phase transition in PCA with missing data: Reduced signal-to-noise ratio, not sample size!
How does missing data affect our ability to learn signal structures? It has
been shown that learning signal structure in terms of principal components is
dependent on the ratio of sample size and dimensionality and that a critical
number of observations is needed before learning starts (Biehl and Mietzner,
1993). Here we generalize this analysis to include missing data. Probabilistic
principal component analysis is regularly used for estimating signal structures
in datasets with missing data. Our analytic result suggests that the effect of
missing data is to reduce the signal-to-noise ratio rather than, as
generally believed, to reduce sample size. The theory predicts a phase
transition in the learning curves and this is indeed found both in simulation
data and in real datasets.
Comment: Accepted to ICML 2019. This version is the submitted pape
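The phase transition described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's probabilistic PCA estimator: it assumes a single-spike Gaussian model, zero-mean data, and (when entries are missing) a simple available-case covariance estimate as a stand-in. The function name `top_pc_overlap` and the parameters `snr` and `p_obs` are hypothetical names introduced for illustration. For complete data in this model, classical results place the transition near n/d = 1/snr^2, so the squared overlap between the estimated and true component should be near zero well below that ratio and clearly positive well above it.

```python
import numpy as np


def top_pc_overlap(n, d, snr, rng, p_obs=1.0):
    """Squared overlap between the true signal direction and the top
    principal component estimated from n samples in d dimensions.

    Assumed model (illustrative only): x = sqrt(snr) * z * u + noise,
    with z and noise standard normal and u a random unit vector.
    Each entry is observed independently with probability p_obs.
    """
    # True signal direction (unit norm).
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)

    # Draw samples from the single-spike model.
    z = rng.standard_normal((n, 1))
    x = np.sqrt(snr) * z * u + rng.standard_normal((n, d))

    # Missing-completely-at-random mask; p_obs=1.0 keeps every entry.
    mask = rng.random((n, d)) < p_obs
    x_obs = np.where(mask, x, 0.0)
    m = mask.astype(float)

    # Available-case covariance: average products over jointly observed pairs.
    counts = m.T @ m
    cov = (x_obs.T @ x_obs) / np.maximum(counts, 1.0)

    # eigh returns eigenvalues in ascending order; take the last eigenvector.
    _, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]
    return float((u @ v) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, snr = 200, 1.0
    for alpha in (0.25, 1.0, 4.0, 16.0):
        n = int(alpha * d)
        full = top_pc_overlap(n, d, snr, rng)
        half = top_pc_overlap(n, d, snr, rng, p_obs=0.5)
        print(f"n/d={alpha:5.2f}  complete={full:.3f}  50% observed={half:.3f}")
```

Running the script prints the overlap for several sample-to-dimension ratios, with and without missing entries, so the two learning curves can be compared directly.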
Calculation of pressure fields from arbitrarily shaped, apodized, and excited ultrasound transducers
Statistical modelling of conidial discharge of entomophthoralean fungi using a newly discovered Pandora species
Entomophthoralean fungi are insect-pathogenic fungi characterized by the
active discharge of conidia that infect insects. Our aim was to
study the effects of temperature on the discharge and to characterize the
variation in the associated temporal pattern of a newly discovered Pandora
species with focus on peak location and shape of the discharge. Mycelia were
incubated at various temperatures in darkness, and conidial discharge was
measured over time. We used a novel modification of a statistical model
(pavpop), which simultaneously estimates phase and amplitude effects, adapted
to the setting of generalized linear models. This model is used to test hypotheses about
peak location and discharge of conidia. The statistical analysis showed that
high temperature leads to an early and fast decreasing peak, whereas there were
no significant differences in total number of discharged conidia. Using the
proposed model we also quantified the biological variation in the timing of the
peak location at a fixed temperature.
Comment: 23 pages including supplementary materia
not-MIWAE: Deep Generative Modelling with Missing not at Random Data
When a missing process depends on the missing values themselves, it needs to
be explicitly modelled and taken into account while doing likelihood-based
inference. We present an approach for building and fitting deep latent variable
models (DLVMs) in cases where the missing process is dependent on the missing
data. Specifically, a deep neural network enables us to flexibly model the
conditional distribution of the missingness pattern given the data. This allows
for incorporating prior information about the type of missingness (e.g.
self-censoring) into the model. Our inference technique, based on
importance-weighted variational inference, involves maximising a lower bound of
the joint likelihood. Stochastic gradients of the bound are obtained by using
the reparameterisation trick both in latent space and data space. We show on
various kinds of data sets and missingness patterns that explicitly modelling
the missing process can be invaluable.
Comment: Camera-ready version for ICLR 202
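As a hedged illustration of the kind of bound this abstract describes, one way to write an importance-weighted lower bound on the joint likelihood of the observed values and the missingness pattern is given below. The notation is assumed rather than taken from the abstract: $x^o$ and $x^m$ are the observed and missing parts of a data point, $s$ is the missingness mask, $z$ is the latent variable, $q$ is the variational posterior, and the observation model is assumed to factorize so that $x^o$ and $x^m$ are conditionally independent given $z$.

```latex
\mathcal{L}_K
  = \mathbb{E}\left[
      \log \frac{1}{K} \sum_{k=1}^{K}
      \frac{p(s \mid x^o, x^m_k)\, p(x^o \mid z_k)\, p(z_k)}
           {q(z_k \mid x^o)}
    \right]
  \le \log p(x^o, s),
\qquad
z_k \sim q(z \mid x^o), \quad x^m_k \sim p(x^m \mid z_k).
```

Under these assumptions the samples $z_k$ and $x^m_k$ are both drawn through reparameterised distributions, which is consistent with the abstract's statement that the reparameterisation trick is applied in latent space and in data space.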
Enterococcus faecalis bacteremia: please do the echo
Infective endocarditis (IE) caused by Enterococcus faecalis (E. faecalis) is a disease of the elderly with an increasing incidence; it is often health-care associated and has in-hospital mortality rates of around 10-20%. E. faecalis IE is notoriously challenging to diagnose because of nonspecific symptoms, often presenting with a complex clinical picture with low-grade fever and only moderately elevated infectious parameters. In a newly published prospective multicenter study using echocardiography to screen E. faecalis bacteremia patients, we found an IE prevalence as high as 26%. The 344 included patients with E. faecalis bacteremia had a mean age of 74 (±12) years, confirming that it is indeed a disease of the elderly. The key feature of the study was that echocardiography was performed in all patients, including transesophageal echocardiography (TEE) in 74%. Transthoracic echocardiography (TTE) missed vegetations in half of the cases where TEE demonstrated vegetations, underlining the importance of TEE