Bayesian Inference in a Sample Selection Model

Author: Van Hasselt Martijn
Publication date: 01/01/2011

This paper develops methods of Bayesian inference in a sample selection model. The main feature of this model is that the outcome variable is only partially observed. We first present a Gibbs sampling algorithm for a model in which the selection and outcome errors are normally distributed. The algorithm is then extended to analyze models that are characterized by nonnormality. Specifically, we use a Dirichlet process prior and model the distribution of the unobservables as a mixture of normal distributions with a random number of components. The posterior distribution in this model can simultaneously detect the presence of selection effects and departures from normality. Our methods are illustrated using some simulated data and an abstract from the RAND health insurance experiment.
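As a rough illustration of the partial observability this abstract describes, the Python sketch below simulates correlated normal selection and outcome errors and keeps the outcome error only for selected units. It is not the paper's Gibbs sampler. The function name `simulate_selection`, the correlation value 0.8, and the rule "observed when the selection error is positive" are all assumptions made for illustration.

```python
import math
import random

def simulate_selection(n, rho, seed=0):
    """Draw n pairs (selection error u, outcome error e) with correlation rho.

    Assumed, simplified selection rule: the outcome is observed only
    when u > 0. Returns the outcome errors of the observed units.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        # Construct e so that corr(u, e) = rho and var(e) = 1.
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        if u > 0.0:
            observed.append(e)
    return observed

errors = simulate_selection(n=20000, rho=0.8)
mean_selected = sum(errors) / len(errors)
# For this setup, E[e | u > 0] = rho * sqrt(2 / pi), roughly 0.64 when rho = 0.8,
# so the selected sample's outcome errors do not average zero.
print(mean_selected)
```

With `rho=0.0` the same function should give a mean near zero, which is the sense in which a nonzero error correlation produces a selection effect.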

The University of North Carolina at Greensboro

Succinct Representations of Permutations and Functions

Author: Munro J. Ian
Raman Rajeev
Raman Venkatesh
Rao S. Srinivasa
Publication date: 09/08/2011

We investigate the problem of succinctly representing an arbitrary permutation, \pi, on {0,...,n-1} so that \pi^k(i) can be computed quickly for any i and any (positive or negative) integer power k. A representation taking (1+\epsilon) n lg n + O(1) bits suffices to compute arbitrary powers in constant time, for any positive constant \epsilon <= 1. A representation taking the optimal \ceil{\lg n!} + o(n) bits can be used to compute arbitrary powers in O(lg n / lg lg n) time. We then consider the more general problem of succinctly representing an arbitrary function, f: [n] \rightarrow [n] so that f^k(i) can be computed quickly for any i and any integer power k. We give a representation that takes (1+\epsilon) n lg n + O(1) bits, for any positive constant \epsilon <= 1, and computes arbitrary positive powers in constant time. It can also be used to compute f^k(i), for any negative integer k, in optimal O(1+|f^k(i)|) time. We place emphasis on the redundancy, or the space beyond the information-theoretic lower bound that the data structure uses in order to support operations efficiently. A number of lower bounds have recently been shown on the redundancy of data structures. These lower bounds confirm the space-time optimality of some of our solutions. Furthermore, the redundancy of one of our structures "surpasses" a recent lower bound by Golynski [Golynski, SODA 2009], thus demonstrating the limitations of this lower bound. Comment: Preliminary versions of these results have appeared in the Proceedings of ICALP 2003 and 2004. However, all results in this version are improved over the earlier conference version.
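To make the operation \pi^k(i) concrete, here is a minimal Python sketch that answers power queries in constant time from an explicit cycle decomposition. It is a plain baseline that stores several machine words per element, not the succinct structure from the paper. The names `build_cycle_index` and `power` are hypothetical.

```python
def build_cycle_index(perm):
    """Decompose a permutation of {0,...,n-1} into cycles.

    Returns (cycle_of, pos_in_cycle, cycles), where cycles[c] lists the
    elements of cycle c in the order visited by repeatedly applying perm.
    """
    n = len(perm)
    cycle_of = [-1] * n
    pos_in_cycle = [0] * n
    cycles = []
    for start in range(n):
        if cycle_of[start] != -1:
            continue  # already placed in an earlier cycle
        cycle = []
        j = start
        while cycle_of[j] == -1:
            cycle_of[j] = len(cycles)
            pos_in_cycle[j] = len(cycle)
            cycle.append(j)
            j = perm[j]
        cycles.append(cycle)
    return cycle_of, pos_in_cycle, cycles

def power(index, i, k):
    """Return perm^k(i) for any integer k (negative k applies the inverse)."""
    cycle_of, pos_in_cycle, cycles = index
    cycle = cycles[cycle_of[i]]
    # Python's % always returns a non-negative result for a positive modulus,
    # so negative powers walk the cycle backwards correctly.
    return cycle[(pos_in_cycle[i] + k) % len(cycle)]

perm = [2, 0, 1, 4, 3]          # cycles: (0 2 1) and (3 4)
index = build_cycle_index(perm)
print(power(index, 0, 2))       # 1, since perm(perm(0)) = perm(2) = 1
print(power(index, 0, -1))      # 1, since perm(1) = 0
```

The abstract's contribution is achieving the same query with space close to the information-theoretic minimum of \ceil{\lg n!} bits, which this baseline does not attempt.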

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Leicester Research Archive

Ontology Evolution in Law (Extended Abstract)

Author: Bundy Alan
McNeill F.
Priddle-Higson A.
Schafer B.
Publication date: 01/01/2008

Edinburgh Research Explorer

Provenance for SPARQL queries

Author: A.P. Antoine Zimmermann
F. Geerts
G. Flouris
J. Pérez
K. Amer
R. Dividino
T.J. Green
Y. Theoharis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Determining trust of data available in the Semantic Web is fundamental for applications and users, in particular for linked open data obtained from SPARQL endpoints. There exist several proposals in the literature to annotate SPARQL query results with values from abstract models, adapting the seminal works on provenance for annotated relational databases. We provide an approach capable of providing provenance information for a large and significant fragment of SPARQL 1.1, including for the first time the major non-monotonic constructs under multiset semantics. The approach is based on the translation of SPARQL into relational queries over annotated relations with values of the most general m-semiring, and in this way also refuting a claim in the literature that the OPTIONAL construct of SPARQL cannot be captured appropriately with the known abstract models. Comment: 22 pages, extended version of the ISWC 2012 paper including proof
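The idea of annotated relations can be sketched with one simple m-semiring: the natural numbers, where addition models union, multiplication models join, and truncated subtraction (monus) models difference. The Python below is an illustrative toy under that assumption. It is not the paper's translation of SPARQL, and the helper names `mapping`, `join`, `union`, and `difference` are hypothetical.

```python
def mapping(**bindings):
    """A solution mapping as a hashable, order-independent tuple of pairs."""
    return tuple(sorted(bindings.items()))

def union(r, s):
    """Union adds annotations (semiring +)."""
    out = dict(r)
    for m, a in s.items():
        out[m] = out.get(m, 0) + a
    return out

def join(r, s):
    """Join compatible mappings and multiply annotations (semiring *)."""
    out = {}
    for m1, a1 in r.items():
        for m2, a2 in s.items():
            d1, d2 = dict(m1), dict(m2)
            # Two mappings are compatible when shared variables agree.
            if all(d2[v] == x for v, x in d1.items() if v in d2):
                merged = tuple(sorted({**d1, **d2}.items()))
                out[merged] = out.get(merged, 0) + a1 * a2
    return out

def difference(r, s):
    """Difference uses monus: a - b truncated at zero."""
    out = {}
    for m, a in r.items():
        rest = max(a - s.get(m, 0), 0)
        if rest:
            out[m] = rest
    return out

r = {mapping(x="a", y="b"): 2}
s = {mapping(y="b", z="c"): 3}
print(join(r, s))   # one merged mapping annotated with 2 * 3 = 6
```

Replacing the integers with symbolic annotations would record which source tuples contributed to each result, which is the provenance reading of the same operations.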

arXiv.org e-Print Archive

CiteSeerX

Crossref

Filtration of non-monotonic rules for fuzzy rule base compression

Author: Gegov Alexander
Gobalakrishnan Neelamugilan
Sanders David
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2014

Portsmouth University Research Portal (Pure)

Computable de Finetti measures

Author: Cameron E. Freer
Daniel M. Roy
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

We prove a computable version of de Finetti's theorem on exchangeable sequences of real random variables. As a consequence, exchangeable stochastic processes expressed in probabilistic functional programming languages can be automatically rewritten as procedures that do not modify non-local state. Along the way, we prove that a distribution on the unit interval is computable if and only if its moments are uniformly computable. Comment: 32 pages. Final journal version; expanded somewhat, with minor corrections. To appear in Annals of Pure and Applied Logic. Extended abstract appeared in Proceedings of CiE '09, LNCS 5635, pp. 218-23

arXiv.org e-Print Archive

CiteSeerX

Crossref

Elsevier - Publisher Connector

A comprehensive evaluation of alignment algorithms in the context of RNA-seq.

Author: Friedel Caroline C.
Lindner Robert
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/2012

Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of such experiments is the mapping of sequencing reads to a transcriptomic origin including the identification of splicing events. In recent years, a large number of such mapping algorithms have been developed, all of which have in common that they require algorithms for aligning a vast number of reads to genomic or transcriptomic sequences. Although the FM-index based aligner Bowtie has become a de facto standard within mapping pipelines, a much larger number of possible alignment algorithms have been developed also including other variants of FM-index based aligners. Accordingly, developers and users of RNA-seq mapping pipelines have the choice among a large number of available alignment algorithms. To provide guidance in the choice of alignment algorithms for these purposes, we evaluated the performance of 14 widely used alignment programs from three different algorithmic classes: algorithms using either hashing of the reference transcriptome, hashing of reads, or a compressed FM-index representation of the genome. Here, special emphasis was placed on both precision and recall and the performance for different read lengths and numbers of mismatches and indels in a read. Our results clearly showed the significant reduction in memory footprint and runtime provided by FM-index based aligners at a precision and recall comparable to the best hash table based aligners. Furthermore, the recently developed Bowtie 2 alignment algorithm shows a remarkable tolerance to both sequencing errors and indels, thus, essentially making hash-based aligners obsolete.
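For readers unfamiliar with the FM-index mentioned in this abstract, the Python sketch below shows exact-match counting by backward search over a Burrows-Wheeler transform. It is a teaching-sized illustration under stated assumptions: the text must not contain the sentinel `$`, the index is built by naively sorting rotations, and the occurrence table is stored uncompressed. It does not reflect how Bowtie or any evaluated aligner is implemented, and the function names are hypothetical.

```python
from collections import Counter

def bwt(text):
    """Burrows-Wheeler transform via naive rotation sorting (small inputs only)."""
    s = text + "$"  # assumed sentinel, smaller than every text character
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

def build_fm(text):
    """Return (C, occ, n) for backward search.

    C[c]      = number of characters in the text strictly smaller than c
    occ[c][i] = number of occurrences of c in the first i BWT characters
    """
    last = bwt(text)
    counts = Counter(last)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    occ = {c: [0] * (len(last) + 1) for c in counts}
    for i, ch in enumerate(last):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ, len(last)

def count_matches(fm, pattern):
    """Count exact occurrences of pattern, scanning it right to left."""
    C, occ, n = fm
    lo, hi = 0, n  # half-open interval of matching suffix ranks
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo

fm = build_fm("ACGTACGTAC")
print(count_matches(fm, "AC"))   # 3
print(count_matches(fm, "GG"))   # 0
```

Each pattern character costs a constant number of table lookups, so query time depends on the read length rather than the reference length. Practical aligners additionally compress the tables and tolerate mismatches and indels, which this sketch omits.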

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

Open Access LMU

PubMed Central