
    Guessing probability distributions from small samples

    We propose a new method for calculating statistical properties, such as the entropy, of unknown generators of symbolic sequences. The probability distribution p(k) of the elements k of a population can be approximated by the frequencies f(k) of a sample, provided the sample is long enough that each element k occurs many times. Our method yields an approximation even when this precondition does not hold. For a given f(k), we recalculate the Zipf-ordered probability distribution by optimizing the parameters of a guessed distribution. We demonstrate that our method yields reliable results.
    Comment: 10 pages, uuencoded compressed PostScript
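The optimization step can be illustrated with a small sketch (not the authors' exact procedure): given sample counts, sort them into Zipf (rank) order, fit the exponent of an assumed Zipf law p(r) ∝ r^{-a} by minimizing the cross-entropy against the sample frequencies, and read off the entropy of the fitted distribution rather than of the raw frequencies.

```python
import math

def fit_zipf_entropy(counts):
    """Fit p(r) ~ r^-a to Zipf-ordered sample frequencies by a grid
    search over the exponent a; return (exponent, entropy in nats)
    of the fitted distribution. Illustrative sketch only."""
    f = sorted(counts, reverse=True)          # Zipf order: largest first
    total = float(sum(f))
    f = [c / total for c in f]                # sample frequencies f(k)
    best = None
    for i in range(401):                      # grid search a in [0, 4]
        a = 0.01 * i
        w = [r ** -a for r in range(1, len(f) + 1)]
        z = sum(w)                            # normalization of the guess
        cross = -sum(fr * math.log(wr / z) for fr, wr in zip(f, w))
        if best is None or cross < best[0]:
            ent = -sum((wr / z) * math.log(wr / z) for wr in w)
            best = (cross, a, ent)
    return best[1], best[2]
```

For a uniform sample the fit returns exponent 0 and entropy log(K); for a skewed sample it recovers a positive exponent even when each element occurs only a few times.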

    Topological self-similarity on the random binary-tree model

    An asymptotic analysis of some statistical properties of the random binary-tree model is developed. We quantify the hierarchical structure of branching patterns using the Horton-Strahler analysis. We introduce a transformation of a binary tree and derive a recursive equation for branch orders. As an application of the analysis, topological self-similarity and its generalization are proved in an asymptotic sense. Some important examples are also presented.
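The Horton-Strahler order underlying this analysis follows a simple recursion; a minimal sketch (the tuple representation is illustrative, not the paper's formalism):

```python
def strahler(t):
    """Horton-Strahler order: a leaf (None) has order 1; an internal
    node (left, right) takes the larger child order, plus 1 on a tie."""
    if t is None:
        return 1
    a, b = strahler(t[0]), strahler(t[1])
    return max(a, b) + (a == b)

def perfect(depth):
    # perfect binary tree of given depth; its Strahler order is depth + 1
    return None if depth == 0 else (perfect(depth - 1), perfect(depth - 1))
```

The tie rule is what produces the hierarchy of branch orders: order increases only where two branches of equal order merge, which is why perfectly balanced trees maximize the order for a given size.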

    A high-resolution infrared spectroscopic investigation of the halogen atom-HCN entrance channel complexes solvated in superfluid helium droplets

    Rotationally resolved infrared spectra are reported for the X-HCN (X = Cl, Br, I) binary complexes solvated in helium nanodroplets. These results are directly compared with those obtained previously for the corresponding X-HF complexes [J. M. Merritt, J. K\"upper, and R. E. Miller, PCCP, 7, 67 (2005)]. For bromine and iodine atoms complexed with HCN, two linear structures are observed and assigned to the $^{2}\Sigma_{1/2}$ and $^{2}\Pi_{3/2}$ ground electronic states of the nitrogen- and hydrogen-bound geometries, respectively. Experiments for HCN + chlorine atoms give rise to only a single band, which is attributed to the nitrogen-bound isomer. That the hydrogen-bound isomer is not stabilized is rationalized in terms of a lowering of the isomerization barrier by spin-orbit coupling. Theoretical calculations with and without spin-orbit coupling have also been performed and are compared with our experimental results. The possibility of stabilizing high-energy structures containing multiple radicals is discussed, motivated by preliminary spectroscopic evidence for the di-radical Br-HCCCN-Br complex. Spectra for the corresponding molecular halogen HCN-X$_2$ complexes are also presented.
    Comment: 20 pages, 15 figures, 6 tables, RevTeX

    Non-locality and Communication Complexity

    Quantum information processing is the emerging field that defines and realizes computing devices that make use of quantum mechanical principles, like the superposition principle, entanglement, and interference. In this review we study the information counterpart of computing. The abstract form of the distributed computing setting is called communication complexity. It studies the amount of information, in terms of bits or in our case qubits, that two spatially separated computing devices need to exchange in order to perform some computational task. Surprisingly, quantum mechanics can be used to obtain dramatic advantages for such tasks. We review the area of quantum communication complexity, and show how it connects the foundational physics questions regarding non-locality with those of communication complexity studied in theoretical computer science. The first examples exhibiting the advantage of the use of qubits in distributed information-processing tasks were based on non-locality tests. However, by now the field has produced strong and interesting quantum protocols and algorithms of its own that demonstrate that entanglement, although it cannot be used to replace communication, can be used to reduce the communication exponentially. In turn, these new advances yield a new outlook on the foundations of physics, and could even yield new proposals for experiments that test the foundations of physics.
    Comment: Survey paper, 63 pages LaTeX. A reformatted version will appear in Reviews of Modern Physics
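The communication-complexity setting can be illustrated with a classical baseline (not one of the review's quantum protocols): for EQUALITY on n-bit inputs, any deterministic protocol must exchange n bits, but a shared random-prime fingerprint succeeds with small one-sided error at O(log n) bits of communication. A hedged sketch:

```python
import random

def primes_up_to(n):
    """Sieve of Eratosthenes; returns all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def equality_protocol(x, y, n_bits=64, rng=random):
    """Randomized EQUALITY: Alice sends (q, x mod q) for a random prime
    q = O(n_bits^2); Bob replies whether y mod q matches. Communication
    is O(log n_bits) bits; a false 'equal' requires q to divide x - y,
    which happens for only a small fraction of the candidate primes."""
    q = rng.choice(primes_up_to(max(4, n_bits ** 2)))
    return x % q == y % q
```

The quantum results surveyed in the review go beyond such classical randomization, using entanglement and qubit communication to obtain exponential savings for other tasks.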

    Young and Intermediate-age Distance Indicators

    Distance measurements beyond geometrical and semi-geometrical methods rely mainly on standard candles. As the name suggests, these objects have known luminosities by virtue of their intrinsic properties and play a major role in our understanding of modern cosmology. The main caveats associated with standard candles are their absolute calibration, contamination of the sample by other sources, and systematic uncertainties. The absolute calibration depends mainly on chemical composition and age. To understand the impact of these effects on the distance scale, it is essential to develop methods based on different samples of standard candles. Here we review the fundamental properties of young and intermediate-age distance indicators such as Cepheids, Mira variables and Red Clump stars, and recent developments in their application as distance indicators.
    Comment: Review article, 63 pages (28 figures). Accepted for publication in Space Science Reviews (Chapter 3 of a special collection resulting from the May 2016 ISSI-BJ workshop on Astronomical Distance Determination in the Space Age)
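The role of absolute calibration can be made concrete with the distance modulus, m - M = 5 log10(d / 10 pc): the apparent magnitude m is observed, and the absolute magnitude M comes from the candle's calibration, e.g. a period-luminosity (Leavitt) relation for Cepheids. The sketch below uses placeholder relation coefficients, not a calibrated fit:

```python
import math

def cepheid_distance_pc(m_apparent, period_days, alpha=-2.43, beta=-4.05):
    """Illustrative Cepheid distance: absolute magnitude from a
    period-luminosity relation M = alpha * (log10 P - 1) + beta
    (alpha, beta are placeholder values, not a published calibration),
    then invert the distance modulus m - M = 5 * log10(d / 10 pc)."""
    M = alpha * (math.log10(period_days) - 1.0) + beta
    mu = m_apparent - M
    return 10 ** (mu / 5.0 + 1.0)     # distance in parsecs
```

Because errors in alpha and beta shift M directly, any systematic offset in the calibration propagates multiplicatively into every distance derived from the relation, which is why the chemical-composition and age dependence discussed above matters.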

    Deciphering the intracellular metabolism of Listeria monocytogenes by mutant screening and modelling

    Background: The human pathogen Listeria monocytogenes resides and proliferates within the cytoplasm of epithelial cells. While the virulence factors essentially contributing to this step of the infection cycle are well characterized, the set of listerial genes contributing to intracellular replication remains to be defined on a genome-wide level.

    Results: A comprehensive library of L. monocytogenes strain EGD knockout mutants was constructed by insertion-duplication mutagenesis, and 1491 mutants were tested for their phenotypes in rich medium and in a Caco-2 cell culture assay. Following sequencing of the plasmid insertion sites, 141 different genes required for invasion of and replication in Caco-2 cells were identified. Ten in-frame deletion mutants were constructed that confirmed the data. The genes with known functions are mainly involved in cellular processes including transport, in the intermediary metabolism of sugars, nucleotides and lipids, and in information pathways such as regulatory functions. No function could be ascribed to 18 genes, and a counterpart of eight genes is missing in the apathogenic species L. innocua. Mouse infection studies revealed the in vivo requirement of IspE (Lmo0190), involved in mevalonate synthesis, and of the novel ABC transporter Lmo0135-0137, associated with cysteine transport. Based on the data of this genome-scale screening, an extreme pathway and elementary mode analysis was applied that demonstrates the critical role of glycerol and purine metabolism, of fucose utilization, and of the synthesis of glutathione, aspartate semialdehyde, serine and branched-chain amino acids during intracellular replication of L. monocytogenes.

    Conclusion: The combination of a genetic screening and a modelling approach revealed that a series of transporters help L. monocytogenes to overcome a putative lack of nutrients within cells, and that a high metabolic flexibility contributes to the intracellular replication of this pathogen.

    Learning Poisson Binomial Distributions

    We consider a basic problem in unsupervised learning: learning an unknown Poisson Binomial Distribution. A Poisson Binomial Distribution (PBD) over $\{0,1,\dots,n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by S. Poisson in 1837 \cite{Poisson:37} and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result, we give a highly efficient algorithm which learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^3)$ samples, independent of $n$. The running time of the algorithm is quasilinear in the size of its input data, i.e., $\tilde{O}(\log(n)/\epsilon^3)$ bit operations. (Observe that each draw from the distribution is a $\log(n)$-bit string.) Our second main result is a proper learning algorithm that learns to $\epsilon$-accuracy using $\tilde{O}(1/\epsilon^2)$ samples and runs in time $(1/\epsilon)^{\mathrm{poly}(\log(1/\epsilon))} \cdot \log n$. This is nearly optimal, since any algorithm for this problem must use $\Omega(1/\epsilon^2)$ samples. We also give positive and negative results for some extensions of this learning problem to weighted sums of independent Bernoulli random variables.
    Comment: Revised full version. Improved sample complexity bound of $\tilde{O}(1/\epsilon^2)$
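The objects involved can be made concrete with a small sketch (this is not the paper's learning algorithm): the exact pmf of a PBD computed by a dynamic program over its Bernoulli components, and the total variation distance in which learning accuracy is measured, here evaluated against a binomial:

```python
from math import comb

def pbd_pmf(ps):
    """Exact pmf of a sum of independent Bernoulli(p_i) variables:
    fold each component into the pmf one at a time."""
    pmf = [1.0]                               # the empty sum is 0 w.p. 1
    for p in ps:
        pmf = [(pmf[k] if k < len(pmf) else 0.0) * (1 - p)
               + (pmf[k - 1] * p if k > 0 else 0.0)
               for k in range(len(pmf) + 1)]
    return pmf

def tv_to_binomial(ps, q):
    """Total variation distance between the PBD on ps and Binomial(n, q)."""
    n, pmf = len(ps), pbd_pmf(ps)
    binom = [comb(n, k) * q ** k * (1 - q) ** (n - k) for k in range(n + 1)]
    return 0.5 * sum(abs(a - b) for a, b in zip(pmf, binom))
```

For equal expectations the PBD is exactly binomial and the distance is zero; spreading the same total expectation over unequal p_i concentrates the pmf and pushes the distance up, which is the gap a PBD learner must capture.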