
    Statistical modelling of summary values leads to accurate Approximate Bayesian Computations

    Approximate Bayesian Computation (ABC) methods rely on asymptotic arguments, implying that parameter inference can be systematically biased even when sufficient statistics are available. We propose to construct the ABC accept/reject step from decision-theoretic arguments on a suitable auxiliary space. This framework, referred to as ABC*, fully specifies which test statistics to use, how to combine them, how to set the tolerances and how long to simulate in order to obtain accuracy properties on the auxiliary space. Akin to maximum-likelihood indirect inference, regularity conditions establish when the ABC* approximation to the posterior density is accurate on the original parameter space in terms of the Kullback-Leibler divergence and the maximum a posteriori point estimate. Fundamentally, escaping asymptotic arguments requires knowledge of the distribution of test statistics, which we obtain through modelling the distribution of summary values, data points on a summary level. Synthetic examples and an application to time series data of influenza A (H3N2) infections in the Netherlands illustrate ABC* in action.
    Comment: Videos can be played with Acrobat Reader. Manuscript under review and not accepted.
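For readers unfamiliar with the accept/reject step that the abstract refers to, the following is a minimal sketch of plain rejection ABC with a single summary statistic and a hand-picked tolerance. It is not the ABC* procedure, which derives the test statistics and tolerances from decision-theoretic arguments. The function name `abc_rejection`, the toy normal model, the uniform prior and the tolerance value are all assumptions made for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, tolerance, n_draws, seed=0):
    """Plain rejection ABC: keep prior draws whose simulated summary is close to the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                       # draw a parameter from the prior
        simulated = simulator(theta, len(observed), rng)  # simulate data under that parameter
        if abs(summary(simulated) - s_obs) <= tolerance:  # accept/reject step
            accepted.append(theta)
    return accepted


# Toy example (hypothetical): infer the mean of a unit-variance normal distribution.
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]

accepted = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),
    simulator=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    tolerance=0.1,  # chosen by hand here; ABC* instead specifies how to set it
    n_draws=5000,
)

print(len(accepted), "draws accepted")
print("approximate posterior mean:", statistics.fmean(accepted))
```

In this sketch the accuracy of the approximation depends entirely on the hand-picked tolerance and summary, which is the kind of ad hoc choice the abstract says ABC* is designed to replace.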

    Molecular evolution of candidate male reproductive genes in the brown algal model Ectocarpus

    Background: Evolutionary studies of genes that mediate recognition between sperm and egg contribute to our understanding of reproductive isolation and speciation. Surface receptors involved in fertilization are targets of sexual selection, reinforcement, and other evolutionary forces including positive selection. This observation was made across different lineages of the eukaryotic tree from land plants to mammals, and is particularly evident in free-spawning animals. Here we use the brown algal model species Ectocarpus (Phaeophyceae) to investigate the evolution of candidate gamete recognition proteins in a distant major phylogenetic group of eukaryotes. Results: Male gamete-specific genes were identified by comparing transcriptome data covering different stages of the Ectocarpus life cycle and screened for characteristics expected from gamete recognition receptors. Selected genes were sequenced in a representative number of strains from distant geographical locations and varying stages of reproductive isolation, to search for signatures of adaptive evolution. One of the genes (Esi0130_0068) showed evidence of selective pressure. Interestingly, that gene displayed domain similarities to the receptor for egg jelly (REJ) protein involved in sperm-egg recognition in sea urchins. Conclusions: We have identified a male gamete-specific gene with similarity to known gamete recognition receptors and signatures of adaptation. Altogether, this gene could contribute to gamete interaction during reproduction as well as reproductive isolation in Ectocarpus and is therefore a good candidate for further functional evaluation.

    Julian Ernst Besag, 26 March 1945 -- 6 August 2010, a biographical memoir

    Julian Besag was an outstanding statistical scientist, distinguished for his pioneering work on the statistical theory and analysis of spatial processes, especially conditional lattice systems. His work has been seminal in statistical developments over the last several decades ranging from image analysis to Markov chain Monte Carlo methods. He clarified the role of auto-logistic and auto-normal models as instances of Markov random fields and paved the way for their use in diverse applications. Later work included investigations into the efficacy of nearest neighbour models to accommodate spatial dependence in the analysis of data from agricultural field trials, image restoration from noisy data, and texture generation using lattice models.
    Comment: 26 pages, 14 figures; minor revisions, omission of full bibliography.

    Learning Tractable Word Alignment Models with Complex Constraints

    Word-level alignment of bilingual text is a critical resource for a growing variety of tasks. Probabilistic models for word alignment present a fundamental trade-off between richness of captured constraints and correlations versus efficiency and tractability of inference. In this article, we use the Posterior Regularization framework (Graça, Ganchev, and Taskar 2007) to incorporate complex constraints into probabilistic models during learning without changing the efficiency of the underlying model. We focus on the simple and tractable hidden Markov model, and present an efficient learning algorithm for incorporating approximate bijectivity and symmetry constraints. Models estimated with these constraints produce a significant boost in performance as measured by both precision and recall of manually annotated alignments for six language pairs. We also report experiments on two different tasks where word alignments are required: phrase-based machine translation and syntax transfer, and show promising improvements over standard methods.
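As a rough illustration of why the hidden Markov model is called simple and tractable here, the sketch below decodes the most likely alignment with Viterbi under a generic HMM alignment model: hidden states are source positions, emissions are word-translation probabilities, and transitions depend only on the jump width between consecutive source positions. It does not implement Posterior Regularization or the bijectivity and symmetry constraints from the abstract. The function name `viterbi_alignment`, the toy sentence pair and all probability values are assumptions for illustration.

```python
import math


def viterbi_alignment(source, target, translation_prob, jump_prob, floor=1e-12):
    """Return, for each target word, the index of the source word it most likely aligns to."""
    if not source or not target:
        return []
    n_src = len(source)

    def emit(i, j):
        # log p(target[j] | source[i]); unseen pairs get a small floor probability
        return math.log(translation_prob.get((source[i], target[j]), floor))

    def trans(prev, cur):
        # log p(jump of width cur - prev); unseen jumps get a small floor probability
        return math.log(jump_prob.get(cur - prev, floor))

    # Uniform initial distribution over source positions.
    score = [[-math.log(n_src) + emit(i, 0) for i in range(n_src)]]
    back = []
    for j in range(1, len(target)):
        row, ptr = [], []
        for i in range(n_src):
            best_prev = max(range(n_src), key=lambda p: score[-1][p] + trans(p, i))
            row.append(score[-1][best_prev] + trans(best_prev, i) + emit(i, j))
            ptr.append(best_prev)
        score.append(row)
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_src), key=lambda i: score[-1][i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Toy sentence pair with hand-set (hypothetical) probabilities.
source = ["das", "haus", "ist", "klein"]
target = ["the", "house", "is", "small"]
t_prob = {("das", "the"): 0.9, ("haus", "house"): 0.9, ("ist", "is"): 0.9, ("klein", "small"): 0.9}
jump = {-1: 0.1, 0: 0.2, 1: 0.6, 2: 0.1}

alignment = viterbi_alignment(source, target, t_prob, jump)
print(alignment)
```

Decoding costs time proportional to the target length times the square of the source length, which is the kind of efficiency the abstract says the constrained learning procedure leaves unchanged.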

    Multilingual unsupervised word alignment models and their application

    Word alignment is an essential task in natural language processing because of its critical role in training statistical machine translation (SMT) models, error analysis for neural machine translation (NMT), building bilingual lexicons, and annotation transfer. In this thesis, we explore models for word alignment, how they can be extended to incorporate linguistically-motivated alignment types, and how they can be neuralized in an end-to-end fashion. In addition to these methodological developments, we apply our word alignment models to cross-lingual part-of-speech projection. First, we present a new probabilistic model for word alignment where word alignments are associated with linguistically-motivated alignment types. We propose a novel task of joint prediction of word alignment and alignment types and propose novel semi-supervised learning algorithms for this task. We also solve a sub-task of predicting the alignment type given an aligned word pair. The proposed joint generative models (alignment-type-enhanced models) significantly outperform the models without alignment types in terms of word alignment and translation quality. Next, we present an unsupervised neural Hidden Markov Model for word alignment, where emission and transition probabilities are modeled using neural networks. The model is simpler in structure, allows for seamless integration of additional context, and can be used in an end-to-end neural network. Finally, we tackle the part-of-speech tagging task for the zero-resource scenario where no part-of-speech (POS) annotated training data is available. We present a cross-lingual projection approach where neural HMM aligners are used to obtain high quality word alignments between resource-poor and resource-rich languages. Moreover, high quality neural POS taggers are used to provide annotations for the resource-rich language side of the parallel data, as well as to train a tagger on the projected data. Our experimental results on truly low-resource languages show that our methods outperform their corresponding baselines.

    The bracteatus pineapple genome and domestication of clonally propagated crops

    Domestication of clonally propagated crops such as pineapple from South America was hypothesized to be a 'one-step operation'. We sequenced the genome of Ananas comosus var. bracteatus CB5 and assembled 513 Mb into 25 chromosomes with 29,412 genes. Comparison of the genomes of CB5, F153 and MD2 elucidated the genomic basis of fiber production, color formation, sugar accumulation and fruit maturation. We also resequenced 89 Ananas genomes. Cultivars 'Smooth Cayenne' and 'Queen' exhibited ancient and recent admixture, while 'Singapore Spanish' supported a one-step operation of domestication. We identified 25 selective sweeps, including a strong sweep containing a pair of tandemly duplicated bromelain inhibitors. Four candidate genes for self-incompatibility were linked in F153, but were not functional in self-compatible CB5. Our findings support the coexistence of sexual recombination and a one-step operation in the domestication of clonally propagated crops. This work guides the exploration of sexual and asexual domestication trajectories in other clonally propagated crops.