55 research outputs found
Guessing probability distributions from small samples
We propose a new method for calculating the statistical properties, such as
the entropy, of unknown generators of symbolic sequences. The probability
distribution of the elements of a population can be approximated by
the frequencies of a sample provided the sample is long enough so that
each element occurs many times. Our method yields an approximation if this
precondition does not hold. From a given sample we recalculate the Zipf-ordered
probability distribution by optimizing the parameters of a guessed
distribution. We demonstrate that our method yields reliable results.
Comment: 10 pages, uuencoded compressed PostScript
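The idea can be sketched in code: the naive plug-in estimator assigns zero probability to unseen elements and so underestimates the entropy of a small sample, whereas fitting a guessed parametric (here Zipf-like) distribution to the rank-ordered frequencies accounts for them. This is a hedged illustration only; the grid-search fit below is an assumption, not the paper's actual optimization procedure.

```python
import math
from collections import Counter

def plugin_entropy(sample):
    """Naive entropy estimate (bits) from raw sample frequencies."""
    n = len(sample)
    return -sum((c / n) * math.log2(c / n) for c in Counter(sample).values())

def zipf_entropy(sample, n_elements, exponents=None):
    """Fit a Zipf law p_k ~ k^(-a) to the rank-ordered sample frequencies
    by least squares over a grid of exponents, then return the entropy
    (bits) of the fitted distribution over all n_elements outcomes."""
    if exponents is None:
        exponents = [i / 100 for i in range(1, 301)]  # a in (0, 3]
    n = len(sample)
    freqs = sorted((c / n for c in Counter(sample).values()), reverse=True)
    freqs += [0.0] * (n_elements - len(freqs))  # pad with unseen elements
    best_a, best_err = exponents[0], float("inf")
    for a in exponents:
        z = sum(k ** -a for k in range(1, n_elements + 1))
        err = sum((f - k ** -a / z) ** 2
                  for k, f in enumerate(freqs, start=1))
        if err < best_err:
            best_a, best_err = a, err
    z = sum(k ** -best_a for k in range(1, n_elements + 1))
    p = [k ** -best_a / z for k in range(1, n_elements + 1)]
    return -sum(q * math.log2(q) for q in p)
```

The fitted estimate spreads probability mass over elements absent from the sample, which is where the plug-in estimator fails when the sample is short.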
Topological self-similarity on the random binary-tree model
Asymptotic analysis of some statistical properties of the random binary-tree
model is developed. We quantify a hierarchical structure of branching patterns
based on the Horton-Strahler analysis. We introduce a transformation of a
binary tree, and derive a recursive equation about branch orders. As an
application of the analysis, topological self-similarity and its generalization
are proved in an asymptotic sense. Some important examples are also presented.
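The Horton-Strahler order underlying the branching analysis has a simple recursive definition, sketched below; the nested-tuple tree encoding (None for a leaf, a pair for an internal node) is an illustrative assumption, not the paper's formalism.

```python
def strahler(tree):
    """Horton-Strahler order of a binary tree.

    A leaf (None) has order 1; an internal node whose two children
    have equal order s gets order s + 1, otherwise the larger order.
    """
    if tree is None:  # leaf
        return 1
    left, right = tree
    a, b = strahler(left), strahler(right)
    return a + 1 if a == b else max(a, b)
```

For example, a "cherry" of two leaves has order 2, and only a balanced merge of two order-2 subtrees reaches order 3; this is the hierarchy that the Horton-Strahler statistics quantify.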
A high-resolution infrared spectroscopic investigation of the halogen atom-HCN entrance channel complexes solvated in superfluid helium droplets
Rotationally resolved infrared spectra are reported for the X-HCN (X = Cl,
Br, I) binary complexes solvated in helium nanodroplets. These results are
directly compared with that obtained previously for the corresponding X-HF
complexes [J. M. Merritt, J. K\"upper, and R. E. Miller, PCCP, 7, 67 (2005)].
For bromine and iodine atoms complexed with HCN, two linear structures are
observed and assigned to the ground electronic states of the nitrogen and
hydrogen bound geometries, respectively.
Experiments for HCN + chlorine atoms give rise to only a single band which is
attributed to the nitrogen bound isomer. That the hydrogen bound isomer is not
stabilized is rationalized in terms of a lowering of the isomerization barrier
by spin-orbit coupling. Theoretical calculations with and without spin-orbit
coupling have also been performed and are compared with our experimental
results. The possibility of stabilizing high-energy structures containing
multiple radicals is discussed, motivated by preliminary spectroscopic evidence
for the di-radical Br-HCCCN-Br complex. Spectra for the corresponding molecular
halogen HCN-X complexes are also presented.
Comment: 20 pages, 15 figures, 6 tables, RevTeX
Non-locality and Communication Complexity
Quantum information processing is the emerging field that defines and
realizes computing devices that make use of quantum mechanical principles, like
the superposition principle, entanglement, and interference. In this review we
study the information counterpart of computing. The abstract form of the
distributed computing setting is called communication complexity. It studies
the amount of information, in terms of bits or in our case qubits, that two
spatially separated computing devices need to exchange in order to perform some
computational task. Surprisingly, quantum mechanics can be used to obtain
dramatic advantages for such tasks.
We review the area of quantum communication complexity, and show how it
connects the foundational physics questions regarding non-locality with those
of communication complexity studied in theoretical computer science. The first
examples exhibiting the advantage of the use of qubits in distributed
information-processing tasks were based on non-locality tests. However, by now
the field has produced strong and interesting quantum protocols and algorithms
of its own that demonstrate that entanglement, although it cannot be used to
replace communication, can be used to reduce the communication exponentially.
In turn, these new advances yield a new outlook on the foundations of physics,
and could even yield new proposals for experiments that test the foundations of
physics.
Comment: Survey paper, 63 pages LaTeX. A reformatted version will appear in
Reviews of Modern Physics
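The non-locality tests mentioned above can be made concrete with the CHSH inequality: any classical (local hidden variable) strategy is bounded by 2, while quantum correlations of a maximally entangled pair reach 2√2. A minimal sketch, assuming the correlation E(x, y) = cos(x − y) obtained for the state |Φ+⟩ with measurements in a common plane:

```python
import math

def chsh_value(a, a2, b, b2):
    """CHSH combination E(a,b) + E(a,b') + E(a',b) - E(a',b') for
    measurement angles, using the quantum correlation E(x, y) = cos(x - y)
    of a maximally entangled pair."""
    E = lambda x, y: math.cos(x - y)
    return E(a, b) + E(a, b2) + E(a2, b) - E(a2, b2)
```

With the standard optimal angles (0, π/2 for one party and ±π/4 for the other) the value is 2√2, exceeding the classical bound of 2; such violations were the seed of the first quantum communication-complexity separations.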
Young and Intermediate-age Distance Indicators
Distance measurements beyond geometrical and semi-geometrical methods, rely
mainly on standard candles. As the name suggests, these objects have known
luminosities by virtue of their intrinsic properties and play a major role in
our understanding of modern cosmology. The main caveats associated with
standard candles are their absolute calibration, contamination of the sample
from other sources and systematic uncertainties. The absolute calibration
mainly depends on their chemical composition and age. To understand the impact
of these effects on the distance scale, it is essential to develop methods
based on different samples of standard candles. Here we review the fundamental
properties of young and intermediate-age distance indicators such as Cepheids,
Mira variables and Red Clump stars and the recent developments in their
application as distance indicators.
Comment: Review article, 63 pages (28 figures), Accepted for publication in
Space Science Reviews (Chapter 3 of a special collection resulting from the
May 2016 ISSI-BJ workshop on Astronomical Distance Determination in the Space
Age)
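The standard-candle logic for a Cepheid can be sketched in a few lines: a period-luminosity relation predicts the absolute magnitude M, and the distance modulus μ = m − M gives the distance via d = 10^((μ + 5)/5) pc. The coefficients below are illustrative assumptions for demonstration, not the calibrations reviewed in the paper.

```python
import math

def cepheid_distance_pc(period_days, apparent_mag, a=-2.43, b=-4.05):
    """Distance (parsec) from a classical Cepheid.

    Uses an assumed V-band period-luminosity relation
    M = a * (log10(P) - 1) + b, then the distance modulus
    mu = m - M and d = 10^((mu + 5) / 5) pc.
    """
    M = a * (math.log10(period_days) - 1.0) + b
    mu = apparent_mag - M  # distance modulus
    return 10 ** ((mu + 5.0) / 5.0)
```

The absolute-calibration caveat discussed above enters here through a and b: shifting the zero point b by 0.1 mag changes every derived distance by about 5 percent.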
Deciphering the intracellular metabolism of Listeria monocytogenes by mutant screening and modelling
Background: The human pathogen Listeria monocytogenes resides and proliferates within the cytoplasm of epithelial cells. While the virulence factors essentially contributing to this step of the infection cycle are well characterized, the set of listerial genes contributing to intracellular replication remains to be defined on a genome-wide level.

Results: A comprehensive library of L. monocytogenes strain EGD knockout mutants was constructed by insertion-duplication mutagenesis, and 1491 mutants were tested for their phenotypes in rich medium and in a Caco-2 cell culture assay. Following sequencing of the plasmid insertion sites, 141 different genes required for invasion of and replication in Caco-2 cells were identified. Ten in-frame deletion mutants were constructed that confirmed the data. The genes with known functions are mainly involved in cellular processes including transport, in the intermediary metabolism of sugars, nucleotides and lipids, and in information pathways such as regulatory functions. No function could be ascribed to 18 genes, and a counterpart of eight genes is missing in the apathogenic species L. innocua. Mouse infection studies revealed the in vivo requirement of IspE (Lmo0190), involved in mevalonate synthesis, and of the novel ABC transporter Lmo0135-0137, associated with cysteine transport. Based on the data of this genome-scale screening, an extreme pathway and elementary mode analysis was applied that demonstrates the critical role of glycerol and purine metabolism, of fucose utilization, and of the synthesis of glutathione, aspartate semialdehyde, serine and branched-chain amino acids during intracellular replication of L. monocytogenes.

Conclusion: The combination of a genetic screening and a modelling approach revealed that a series of transporters help L. monocytogenes to overcome a putative lack of nutrients within cells, and that a high metabolic flexibility contributes to the intracellular replication of this pathogen.
Learning Poisson Binomial Distributions
We consider a basic problem in unsupervised learning: learning an unknown
\emph{Poisson Binomial Distribution}. A Poisson Binomial Distribution (PBD)
over {0, 1, ..., n} is the distribution of a sum of n independent
Bernoulli random variables which may have arbitrary, potentially non-equal,
expectations. These distributions were first studied by S. Poisson in 1837
\cite{Poisson:37} and are a natural n-parameter generalization of the
familiar Binomial Distribution. Surprisingly, prior to our work this basic
learning problem was poorly understood, and known results for it were far from
optimal.
We essentially settle the complexity of the learning problem for this basic
class of distributions. As our first main result we give a highly efficient
algorithm which learns to \eps-accuracy (with respect to the total variation
distance) using \tilde{O}(1/\eps^3) samples, \emph{independent of n}. The
running time of the algorithm is \emph{quasilinear} in the size of its input
data, i.e., \tilde{O}(\log(n)/\eps^3) bit-operations. (Observe that each draw
from the distribution is a \log(n)-bit string.) Our second main result is a
{\em proper} learning algorithm that learns to \eps-accuracy using
\tilde{O}(1/\eps^2) samples, and runs in time (1/\eps)^{\poly (\log
(1/\eps))} \cdot \log n. This is nearly optimal, since any algorithm {for this
problem} must use \Omega(1/\eps^2) samples. We also give positive and
negative results for some extensions of this learning problem to weighted sums
of independent Bernoulli random variables.
Comment: Revised full version. Improved sample complexity bound of O~(1/eps^2)
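The object being learned is easy to state in code. Below is a hedged sketch that samples a Poisson Binomial Distribution and computes its exact PMF by dynamic programming; the paper's learning algorithm itself is far more involved and is not reproduced here.

```python
import random

def sample_pbd(ps, rng=random):
    """One draw from the PBD with success probabilities ps: the number
    of successes among len(ps) independent Bernoulli trials."""
    return sum(rng.random() < p for p in ps)

def pbd_pmf(ps):
    """Exact PMF of the PBD over {0, ..., n}, built one Bernoulli at a
    time by dynamic programming (O(n^2) total work)."""
    pmf = [1.0]
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)      # this trial fails
            new[k + 1] += q * p        # this trial succeeds
        pmf = new
    return pmf
```

With all p_i equal this reduces to the familiar Binomial; with unequal p_i the n parameters make naive density estimation costly, which is why a sample complexity independent of n is striking.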
- …