Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck
The Symmetric Information Bottleneck (SIB), an extension of the more familiar
Information Bottleneck, is a dimensionality reduction technique that
simultaneously compresses two random variables to preserve information between
their compressed versions. We introduce the Generalized Symmetric Information
Bottleneck (GSIB), which explores different functional forms of the cost of
such simultaneous reduction. We then analyze the dataset size requirements of
such simultaneous compression by deriving bounds and root-mean-squared
estimates of the statistical fluctuations of the involved loss
functions. We show that, in typical situations, the simultaneous GSIB
compression requires qualitatively less data to achieve the same errors
compared to compressing variables one at a time. We suggest that this is an
example of a more general principle: simultaneous compression is more data
efficient than independent compression of each of the input variables.
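In schematic form (the notation here is an illustration, not quoted from the paper), the symmetric bottleneck compresses X into Z_X and Y into Z_Y and trades the preserved information against the two compression costs,

    L_{SIB} = I(Z_X; Z_Y) - \beta_X I(X; Z_X) - \beta_Y I(Y; Z_Y),

with a generalized version replacing the weighted sum of the compression terms I(X; Z_X) and I(Y; Z_Y) by other functional forms of the cost.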
Deep Variational Multivariate Information Bottleneck -- A Framework for Variational Losses
Variational dimensionality reduction methods are known for their high
accuracy, generative abilities, and robustness. We introduce a framework to
unify many existing variational methods and design new ones. The framework is
based on an interpretation of the multivariate information bottleneck, in which
an encoder graph, specifying what information to compress, is traded-off
against a decoder graph, specifying a generative model. Using this framework,
we rederive existing dimensionality reduction methods including the deep
variational information bottleneck and variational auto-encoders. The framework
naturally introduces a trade-off parameter extending the deep variational CCA
(DVCCA) family of algorithms to beta-DVCCA. We derive a new method, the deep
variational symmetric information bottleneck (DVSIB), which simultaneously
compresses two variables to preserve information between their compressed
representations. We implement these algorithms and evaluate their ability to
produce shared low-dimensional latent spaces on the Noisy MNIST dataset. We show
that algorithms that are better matched to the structure of the data (in our
case, beta-DVCCA and DVSIB) produce better latent spaces as measured by
classification accuracy, dimensionality of the latent variables, and sample
efficiency. We believe that this framework can be used to unify other
multi-view representation learning algorithms and to derive and implement novel
problem-specific loss functions.
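Schematically (again in assumed notation), each loss in the framework trades the information terms defined by the encoder graph against those defined by the decoder graph,

    L = I_{enc} - \beta I_{dec},

and variational bounds on the individual mutual-information terms turn this trade-off into a trainable objective; different choices of the two graphs then recover DVIB, variational auto-encoders, beta-DVCCA, or DVSIB.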
Fluctuations and response in complex biological systems: Watching stochastic evolutionary and ecological pattern dynamics
My research uses computational and analytical techniques from statistical physics to examine spatial patterns and dynamics in complex biological systems. More specifically, I have used these techniques to analyze aspects of three different complex biological systems: stochastic Turing patterns, transposon and retrotransposon dynamics in live cells, and bistability in ant foraging.
In collaboration with experimentalists at MIT and UIUC, I have shown how noise can stabilize emergent behaviors such as Turing patterns in biofilms. Normally, one would expect noise to destroy patterns, but we found that fluctuations in the copy numbers of signaling molecules acting as activators and inhibitors of gene expression lead to pattern formation. Surprisingly, we can show theoretically that these fluctuations increase the range of experimental conditions in which patterns can form.
In collaboration with experimentalists at UIUC, we have observed how evolution acts on variation in time, space, and genome locus by imaging live cells with fluorescent reporters that allow us to track transposon dynamics. Transposons, also known as "jumping genes," are found in all organisms, and their activity can cause mutations and drive evolution. As part of this collaboration, I developed the software for image analysis of the cells and analyzed the resulting event statistics. We discovered that the excision rate of transposons depends on the orientation of the element, the spatial location of the cell, and some heritable factors.
In a follow-up experiment, I recently developed a model to explain our collaborators' observation that the number of retrotransposon transcripts, i.e., transcripts produced by a copy-and-paste type of mobile genetic element, is exponentially related to a growth-rate defect. I modeled the copy-number dynamics of retroelements and the time it takes these elements to be lost from a population of cells as a function of the observed growth-rate defect, the transposition rate, and the inactivation rate. This model explains why Group II introns are present in about 30% of bacterial species, while retrotransposons are essentially absent. This research sheds light on the early evolution of the eukaryotic spliceosome, the cellular machinery that allows complex organisms to remove intra-gene junk DNA during gene expression.
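A minimal sketch of the population dynamics at play (illustrative only; the symbols are my assumptions, not the paper's notation): let x be the fraction of cells carrying an active element, s the growth-rate defect of carriers relative to element-free cells, and d the rate at which the element is inactivated or excised. Ignoring new insertions,

    \dot{x} = -s x (1 - x) - d x,

so carriers decay on a time scale set by s and d, and long-term persistence of the element requires transposition into new hosts to outpace this loss.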
I have extended a model of ants foraging from two food sources to include indirect recruitment via pheromone trails rather than direct recruitment by the ants themselves. This model continues to show bistable foraging when the ant population is below a critical size that depends on the deposition and evaporation rates of the pheromones.
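A standard ingredient of such pheromone-recruitment models (used here for illustration; the specific functional forms are assumptions): the pheromone level p_i on trail i grows with the number of foragers f_i using it and decays by evaporation,

    \dot{p}_i = q f_i - \lambda p_i,

while each ant chooses trail i with probability proportional to (k + p_i)^2. The nonlinearity of this choice function is what destabilizes symmetric foraging and produces bistability, with the critical colony size set by the deposition rate q and evaporation rate \lambda.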
Simultaneous Dimensionality Reduction: A Data Efficient Approach for Multimodal Representations Learning
We explore two primary classes of approaches to dimensionality reduction
(DR): Independent Dimensionality Reduction (IDR) and Simultaneous
Dimensionality Reduction (SDR). In IDR methods, of which Principal Components
Analysis is a paradigmatic example, each modality is compressed independently,
striving to retain as much variation within each modality as possible. In
contrast, in SDR, one simultaneously compresses the modalities to maximize the
covariation between the reduced descriptions while paying less attention to how
much individual variation is preserved. Paradigmatic examples include Partial
Least Squares and Canonical Correlation Analysis. Even though these DR methods
are a staple of statistics, their relative accuracy and data set size
requirements are poorly understood. We introduce a generative linear model to
synthesize multimodal data with known variance and covariance structures to
examine these questions. We assess the accuracy of the reconstruction of the
covariance structures as a function of the number of samples, signal-to-noise
ratio, and the number of varying and covarying signals in the data. Using
numerical experiments, we demonstrate that linear SDR methods consistently
outperform linear IDR methods and yield higher-quality, more succinct
reduced-dimensional representations with smaller datasets. Remarkably,
regularized CCA can identify low-dimensional weak covarying structures even
when the number of samples is much smaller than the dimensionality of the data,
a regime that is challenging for all dimensionality reduction methods. Our
work corroborates and explains previous observations in the literature that SDR
can be more effective in detecting covariation patterns in data. These findings
suggest that SDR should be preferred to IDR in real-world data analysis when
detecting covariation is more important than preserving variation.
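As a toy illustration of the IDR/SDR contrast (a sketch with assumed parameters and off-the-shelf sklearn estimators, not the paper's generative model or experiments), one can create two modalities that share a low-dimensional signal and compare per-modality PCA with joint CCA:

```python
# A sketch of IDR vs. SDR on synthetic data (illustrative parameters;
# this is not the paper's generative model or its experimental setup).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_samples, dim, n_shared = 500, 50, 2

Z = rng.normal(size=(n_samples, n_shared))            # shared covarying signal
Wx = rng.normal(size=(n_shared, dim))                 # loadings, modality X
Wy = rng.normal(size=(n_shared, dim))                 # loadings, modality Y
X = Z @ Wx + 3.0 * rng.normal(size=(n_samples, dim))  # modality X plus noise
Y = Z @ Wy + 3.0 * rng.normal(size=(n_samples, dim))  # modality Y plus noise

# IDR: compress each modality on its own (PCA keeps per-modality variance).
px = PCA(n_components=n_shared).fit_transform(X)
py = PCA(n_components=n_shared).fit_transform(Y)

# SDR: compress both modalities together (CCA targets their covariation).
cx, cy = CCA(n_components=n_shared).fit_transform(X, Y)

corr = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
print("IDR (PCA):", [round(corr(px[:, i], py[:, i]), 2) for i in range(n_shared)])
print("SDR (CCA):", [round(corr(cx[:, i], cy[:, i]), 2) for i in range(n_shared)])
```

Because independent PCA directions chase each modality's dominant (largely noise) variance, the cross-modal correlations of their scores are typically much weaker than those of the CCA scores, which seek the shared signal directly.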
Loose packings of frictional spheres
We have produced loose packings of cohesionless, frictional spheres by
sequential deposition of highly spherical, monodisperse particles through a
fluid. By varying the properties of the fluid and the particles, we have
identified the Stokes number (St) - rather than the buoyancy of the particles
in the fluid - as the parameter controlling the approach to the loose packing
limit. The loose packing limit is attained at a threshold value of St at which
the kinetic energy of a particle impinging on the packing is fully dissipated
by the fluid. Thus, for cohesionless particles, the dynamics of the deposition
process, rather than the stability of the static packing, defines the random
loose packing limit. We have made direct measurements of the interparticle
friction in the fluid, and present an experimental measurement of the loose
packing volume fraction, \phi_{RLP}, as a function of the friction coefficient
\mu_s.
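For context (one common convention, assumed here rather than taken from the paper): the Stokes number compares the inertial response time of a particle to the time scale of its motion through the fluid. For a sphere of diameter d and density \rho_p moving at speed v through a fluid of viscosity \mu_f, Stokes drag gives a response time \tau_p = \rho_p d^2 / (18 \mu_f), so

    St = \tau_p v / d = \rho_p v d / (18 \mu_f),

and small St corresponds to deposition in which viscous dissipation dominates particle inertia.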
Model for Screened, Charge-Regulated Electrostatics of an Eye Lens Protein: Bovine GammaB-Crystallin
We model screened, site-specific charge regulation of the eye lens protein bovine gammaB-crystallin (γB) and study the probability distributions of its proton occupancy patterns. Using a simplified dielectric model, we solve the linearized Poisson-Boltzmann equation to calculate a 54 × 54 work-of-charging matrix, each entry being the modeled voltage at a given titratable site due to an elementary charge at another site. The matrix quantifies interactions within patches of sites, including γB charge pairs. We model intrinsic pK values that would occur hypothetically in the absence of other charges, using experimental data on the dependence of pK values on aqueous solution conditions, the dielectric model, and literature values. We use Monte Carlo simulations to calculate a model grand-canonical partition function that incorporates both the work-of-charging and the intrinsic pK values for isolated γB molecules, and we calculate the probabilities of the leading proton occupancy configurations for 4 < pH < 8 and Debye screening lengths from 6 to 20 Å. We select the interior dielectric value to model γB titration data. At pH 7.1 and Debye length 6.0 Å, the predicted top occupancy pattern is present on a given γB molecule nearly 20% of the time, and 90% of the time one or another of the first 100 patterns will be present. Many of these occupancy patterns differ in net charge sign as well as in surface voltage profile. We illustrate how charge pattern probabilities deviate from the multinomial distribution that would result from the use of effective pK values alone, and estimate the extents to which γB charge pattern distributions broaden at lower pH and narrow as ionic strength is lowered. These results suggest that accurate modeling of orientation-dependent γB-γB interactions will require consideration of numerous pairs of proton occupancy patterns.
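Schematically (notation assumed here for illustration), the statistical weight of a proton occupancy pattern s = (s_1, ..., s_{54}), with s_i \in {0, 1}, in such a grand-canonical model takes the form

    P(s) \propto \exp[ -\ln(10) \sum_i s_i (pH - pK_i) - (\beta/2) \sum_{i \neq j} s_i s_j W_{ij} ],

where the pK_i are the intrinsic values and W_{ij} is the work-of-charging matrix; dropping the W_{ij} coupling makes the site occupancies independent, which is why the full model deviates from the multinomial distribution obtained from effective pK values alone.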
Measuring Controlled-NOT and two-qubit gate operation
Accurate characterisation of two-qubit gates will be critical for any
realisation of quantum computation. We discuss a range of measurements aimed at
characterising a two-qubit gate, specifically the CNOT gate. These measurements
are architecture-independent, and range from simple truth table measurements,
to single figure measures such as the fringe visibility, parity, fidelity, and
entanglement witnesses, through to whole-state and whole-gate measures achieved
respectively via quantum state and process tomography. In doing so, we examine
critical differences between classical and quantum gate operation.
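For reference, the ideal CNOT flips the target qubit conditioned on the control, |c, t⟩ → |c, t ⊕ c⟩, or in the computational basis

    CNOT = ( 1 0 0 0 ; 0 1 0 0 ; 0 0 0 1 ; 0 0 1 0 )   (rows separated by semicolons),

so a truth-table measurement probes only these basis-state mappings, while visibility, parity, witness, and tomographic measures probe the gate's action on superposition and entangled inputs, where genuinely quantum operation shows up.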
Neutrino Quasielastic Scattering on Nuclear Targets: Parametrizing Transverse Enhancement (Meson Exchange Currents)
We present a parametrization of the observed enhancement in the transverse
electron quasielastic (QE) response function for nucleons bound in carbon as a
function of the square of the four-momentum transfer (Q^2) in terms of a
correction to the magnetic form factors of bound nucleons. The parametrization
should also be applicable to the transverse cross section in neutrino
scattering. If the transverse enhancement originates from meson exchange
currents (MEC), then it is theoretically expected that any enhancement in the
longitudinal or axial contributions is small. We present the predictions of the
"Transverse Enhancement" model (which is based on electron scattering data
only) for the differential and total QE cross sections
for nucleons bound in carbon. The Q^2 dependence of the transverse
enhancement is observed to resolve much of the long standing discrepancy in the
QE total cross sections and differential distributions between low energy and
high energy neutrino experiments on nuclear targets.
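In schematic form (a commonly quoted version of such a parametrization; the constants A and B are fit to electron scattering data and are not reproduced here), the enhancement multiplies the magnetic form factor of bound nucleons:

    G_M^{bound}(Q^2) = G_M^{free}(Q^2) \sqrt{1 + A Q^2 e^{-Q^2 / B}},

so the transverse response is enhanced at moderate Q^2 and reverts to the free-nucleon form as Q^2 grows.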
Decreased protein binding of moxifloxacin in patients with sepsis?
The mean (SD) unbound fraction of moxifloxacin in plasma from patients with severe sepsis or septic shock, determined by ultrafiltration, was 85.5±3.0% (range 81.9–91.6%), indicating decreased protein binding of moxifloxacin in this population compared with the value of 58–60% provided in the Summary of Product Characteristics. However, previous investigations neglected the influence of pH and temperature on the protein binding of moxifloxacin. When physiological conditions (pH 7.4, 37°C) were maintained, as in the present study, the unbound fraction of moxifloxacin in plasma from healthy volunteers was 84%. In contrast, the unbound fraction was 77% at 4°C and 66–68% in unbuffered plasma or at pH 8.5, in fair agreement with previously published data. PK/PD parameters, e.g. fAUC/MIC or ratios between interstitial fluid and free plasma concentrations, that were obtained assuming a protein binding of moxifloxacin of 40% or more should be revised.
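As a worked illustration of the needed revision (hypothetical numbers): fAUC/MIC scales linearly with the unbound fraction f_u, so a ratio computed with f_u = 0.60 would be rescaled by 0.855/0.60 ≈ 1.4 under the present findings,

    fAUC/MIC_revised ≈ 1.4 × fAUC/MIC_reported,

e.g. a reported fAUC/MIC of 30 would become approximately 43.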
Spontaneous emission and level shifts in absorbing disordered dielectrics and dense atomic gases: A Green's function approach
Spontaneous emission and Lamb shift of atoms in absorbing dielectrics are
discussed. A Green's-function approach is used based on the multipolar
interaction Hamiltonian of a collection of atomic dipoles with the quantised
radiation field. The rate of decay and level shifts are determined by the
retarded Green's-function of the interacting electric displacement field, which
is calculated from a Dyson equation describing multiple scattering. The
positions of the atomic dipoles forming the dielectrics are assumed to be
uncorrelated and a continuum approximation is used. The associated unphysical
interactions between different atoms at the same location are eliminated by
removing the point-interaction term from the free-space Green's-function (local
field correction). For the case of an atom in a purely dispersive medium the
spontaneous emission rate is altered by the well-known Lorentz local-field
factor. In the presence of absorption a result different from previously
suggested expressions is found and nearest-neighbour interactions are shown to
be important.
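For the purely dispersive case mentioned above, the well-known result (quoted here for context, in the virtual-cavity convention) multiplies the free-space decay rate \Gamma_0 by the refractive index n and the squared Lorentz local-field factor,

    \Gamma = \Gamma_0 \, n \left( (n^2 + 2)/3 \right)^2,

which is the expression any absorptive calculation must reduce to when the imaginary part of the permittivity vanishes.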