
    Probabilistic models of information retrieval based on measuring the divergence from randomness

    We introduce and create a framework for deriving probabilistic models of Information Retrieval. The models are nonparametric models of IR obtained in the language model approach. We derive term-weighting models by measuring the divergence of the actual term distribution from that obtained under a random process. Among the random processes we study the binomial distribution and Bose-Einstein statistics. We define two types of term frequency normalization for tuning term weights in the document-query matching process. The first normalization assumes that documents have the same length and measures the information gain with the observed term once it has been accepted as a good descriptor of the observed document. The second normalization is related to the document length and to other statistics. These two normalization methods are applied to the basic models in succession to obtain weighting formulae. Results show that our framework produces different nonparametric models forming baseline alternatives to the standard tf-idf model.
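
The pipeline this abstract describes (a basic randomness model, then two normalizations applied in succession) can be sketched as follows. This is a minimal illustration, not the authors' formulation: the abstract gives no formulae, so the sketch assumes one instantiation commonly presented in the divergence-from-randomness literature (a geometric approximation of Bose-Einstein statistics, a Laplace-style information gain, and a logarithmic document-length rescaling). The function name `dfr_weight`, its parameter names, and the default `c=1.0` are hypothetical.

```python
import math


def dfr_weight(tf, doc_len, avg_doc_len, coll_freq, num_docs, c=1.0):
    """Weight of one term in one document (assumed DFR-style instantiation).

    tf          -- occurrences of the term in the document
    doc_len     -- length of the document in tokens
    avg_doc_len -- average document length in the collection
    coll_freq   -- occurrences of the term in the whole collection
    num_docs    -- number of documents in the collection
    c           -- free parameter of the length normalization (assumption)
    """
    if tf <= 0:
        return 0.0

    # Length normalization: rescale the raw frequency so that documents
    # of different lengths become comparable.
    tfn = tf * math.log2(1.0 + c * avg_doc_len / doc_len)

    # Basic model: information content of seeing tfn occurrences under a
    # geometric approximation of Bose-Einstein statistics (assumption).
    lam = coll_freq / num_docs  # mean occurrences of the term per document
    inf1 = -math.log2(1.0 / (1.0 + lam)) - tfn * math.log2(lam / (1.0 + lam))

    # Information gain: the more often the term has already been seen,
    # the less one further occurrence adds (Laplace-style, assumption).
    gain = 1.0 / (tfn + 1.0)

    return gain * inf1


def score(query_terms, doc_tf, doc_len, avg_doc_len, coll_stats, num_docs):
    """Sum the weights of the query terms that occur in the document."""
    total = 0.0
    for term in query_terms:
        if term in doc_tf and term in coll_stats:
            total += dfr_weight(doc_tf[term], doc_len, avg_doc_len,
                                coll_stats[term], num_docs)
    return total
```

Under these assumptions a term that is rare in the collection receives a larger weight than a common one at the same in-document frequency, which is the behaviour one would expect from an alternative to tf-idf.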

    Combining vocal tract length normalization with hierarchial linear transformations

    Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR-based adaptation techniques, being much closer in quality to that generated by the original average voice model. However, with only a single parameter, VTLN captures very few speaker-specific characteristics when compared to linear transform based adaptation techniques. This paper proposes that the merits of VTLN can be combined with those of linear transform based adaptation in a hierarchical Bayesian framework, where VTLN is used as the prior information. A novel technique for propagating the gender information from the VTLN prior through constrained structural maximum a posteriori linear regression (CSMAPLR) adaptation is presented. Experiments show that the resulting transformation has improved speech quality with better naturalness, intelligibility and improved speaker similarity. Index Terms: statistical parametric speech synthesis, hidden Markov models, speaker adaptation, vocal tract length normalization, constrained structural maximum a posteriori linear regression.

    Constraining Implicit Space with Minimum Description Length: An Unsupervised Attention Mechanism across Neural Network Layers

    Inspired by the adaptation phenomenon of neuronal firing, we propose regularity normalization (RN) as an unsupervised attention mechanism (UAM) which computes the statistical regularity in the implicit space of neural networks under the Minimum Description Length (MDL) principle. Treating the neural network optimization process as a partially observable model selection problem, UAM constrains the implicit space by a normalization factor, the universal code length. We compute this universal code incrementally across neural network layers and demonstrate the flexibility to include data priors such as top-down attention and other oracle information. Empirically, our approach outperforms existing normalization methods in tackling limited, imbalanced and non-stationary input distributions in image classification, classic control, procedurally-generated reinforcement learning, generative modeling, handwriting generation and question answering tasks with various neural network architectures. Lastly, UAM tracks dependency and critical learning stages across layers and recurrent time steps of deep networks.

    Correlation Between the Deuteron Characteristics and the Low-energy Triplet np Scattering Parameters

    The correlation relationship between the deuteron asymptotic normalization constant, $A_{S}$, and the triplet np scattering length, $a_{t}$, is investigated. It is found that 99.7% of the asymptotic constant $A_{S}$ is determined by the scattering length $a_{t}$. It is shown that the linear correlation relationship between the quantities $A_{S}^{-2}$ and $1/a_{t}$ provides a good test of correctness of various models of nucleon-nucleon interaction. It is revealed that, for the normalization constant $A_{S}$ and for the root-mean-square deuteron radius $r_{d}$, the results obtained with the experimental value recommended at present for the triplet scattering length $a_{t}$ are exaggerated with respect to their experimental counterparts. By using the latest experimental phase shifts of Arndt et al., we obtain, for the low-energy scattering parameters ($a_{t}$, $r_{t}$, $P_{t}$) and for the deuteron characteristics ($A_{S}$, $r_{d}$), results that comply well with experimental data. Comment: 19 pages, 1 figure, To be published in Physics of Atomic Nucle

    Bounding normalization time through intersection types

    Non-idempotent intersection types are used to give a bound on the length of the normalizing beta-reduction sequence of a lambda term: namely, the bound is expressed as a function of the size of the term. Comment: In Proceedings ITRS 2012, arXiv:1307.784