
    On Parsimonious Explanations For 2-D Tree- and Linearly-Ordered Data

    This paper studies the "explanation problem" for tree- and linearly-ordered array data, a problem motivated by database applications and recently solved for the one-dimensional tree-ordered case. In this paper, one is given a matrix A=(a_{ij}) whose rows and columns have semantics: special subsets of the rows and special subsets of the columns are meaningful, others are not. A submatrix in A is said to be meaningful if and only if it is the cross product of a meaningful row subset and a meaningful column subset, in which case we call it an "allowed rectangle." The goal is to "explain" A as a sparse sum of weighted allowed rectangles. Specifically, we wish to find as few weighted allowed rectangles as possible such that, for all i,j, a_{ij} equals the sum of the weights of all rectangles which include cell (i,j). In this paper we consider the natural cases in which the matrix dimensions are tree-ordered or linearly-ordered. In the tree-ordered case, we are given a rooted tree T_1 whose leaves are the rows of A and another, T_2, whose leaves are the columns. Nodes of the trees correspond in an obvious way to the sets of their leaf descendants. In the linearly-ordered case, a set of rows or columns is meaningful if and only if it is contiguous. For tree-ordered data, we prove the explanation problem NP-hard and give a randomized 2-approximation algorithm for it. For linearly-ordered data, we prove the explanation problem NP-hard and give a 2.56-approximation algorithm. To our knowledge, these are the first results for the problem of sparsely and exactly representing matrices by weighted rectangles.
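
    The definition above can be checked mechanically in the linearly-ordered case, where an allowed rectangle is a contiguous block of rows crossed with a contiguous block of columns. The following is a minimal sketch of a verifier only: it tests whether a given set of weighted rectangles explains a matrix exactly, and it is not the paper's approximation algorithm. The names `Rect`, `sum_of_rectangles`, and `explains`, and the example matrix, are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# A weighted allowed rectangle in the linearly-ordered case (assumed encoding):
# (row_start, row_end, col_start, col_end, weight), with inclusive 0-based bounds.
Rect = Tuple[int, int, int, int, int]


def sum_of_rectangles(n_rows: int, n_cols: int, rects: List[Rect]) -> List[List[int]]:
    """Return the matrix whose cell (i, j) is the total weight of rectangles covering it."""
    total = [[0] * n_cols for _ in range(n_rows)]
    for r0, r1, c0, c1, w in rects:
        for i in range(r0, r1 + 1):
            for j in range(c0, c1 + 1):
                total[i][j] += w
    return total


def explains(matrix: List[List[int]], rects: List[Rect]) -> bool:
    """True if every a_ij equals the sum of weights of the rectangles containing (i, j)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    return sum_of_rectangles(n_rows, n_cols, rects) == matrix


# Illustrative 3x3 matrix explained by two overlapping rectangles.
A = [
    [1, 1, 0],
    [1, 3, 2],
    [0, 2, 2],
]
explanation = [
    (0, 1, 0, 1, 1),  # rows 0-1, columns 0-1, weight 1
    (1, 2, 1, 2, 2),  # rows 1-2, columns 1-2, weight 2
]
```

    Here `explains(A, explanation)` holds because the two rectangles overlap only in cell (1,1), where their weights add to 3. The optimization problem studied in the paper asks for the smallest such set, which this sketch does not attempt to find.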

    Complexity of Non-Monotonic Logics

    Over the past few decades, non-monotonic reasoning has developed to be one of the most important topics in computational logic and artificial intelligence. Different ways to introduce non-monotonic aspects to classical logic have been considered, e.g., extension with default rules, extension with modal belief operators, or modification of the semantics. In this survey we consider a logical formalism from each of the above possibilities, namely Reiter's default logic, Moore's autoepistemic logic and McCarthy's circumscription. Additionally, we consider abduction, where one is not interested in inferences from a given knowledge base but in computing possible explanations for an observation with respect to a given knowledge base. Complexity results for different reasoning tasks for propositional variants of these logics were already studied in the nineties. In recent years, however, a renewed interest in complexity issues can be observed. One current focal approach is to consider parameterized problems and identify reasonable parameters that allow for FPT algorithms. In another approach, the emphasis lies on identifying fragments, i.e., restrictions of the logical language, that allow more efficient algorithms for the most important reasoning tasks. In this survey we focus on this second aspect. We describe complexity results for fragments of logical languages obtained by either restricting the allowed set of operators (e.g., forbidding negations one might consider only monotone formulae) or by considering only formulae in conjunctive normal form but with generalized clause types. The algorithmic problems we consider are suitable variants of satisfiability and implication in each of the logics, but also counting problems, where one is not only interested in the existence of certain objects (e.g., models of a formula) but asks for their number. Comment: To appear in Bulletin of the EATCS

    A framework for evaluating the influence of climate, dispersal limitation, and biotic interactions using fossil pollen associations across the late Quaternary

    Environmental conditions, dispersal lags, and interactions among species are major factors structuring communities through time and across space. Ecologists have emphasized the importance of biotic interactions in determining local patterns of species association. In contrast, abiotic limits, dispersal limitation, and historical factors have commonly been invoked to explain community structure patterns at larger spatiotemporal scales, such as the appearance of late Pleistocene no-analog communities or latitudinal gradients of species richness in both modern and fossil assemblages. Quantifying the relative influence of these processes on species co-occurrence patterns is not straightforward. We provide a framework for assessing causes of species associations by combining a null-model analysis of co-occurrence with additional analyses of climatic differences and spatial pattern for pairs of pollen taxa that are significantly associated across geographic space. We tested this framework with data on associations among 106 fossil pollen taxa and paleoclimate simulations from eastern North America across the late Quaternary. The number and proportion of significantly associated taxon pairs increased over time, but only 449 of 56 194 taxon pairs were significantly different from random. Within this significant subset of pollen taxa, biotic interactions were rarely the exclusive cause of associations. Instead, climatic or spatial differences among sites were most frequently associated with significant patterns of taxon association. Most taxon pairs that exhibited co-occurrence patterns indicative of biotic interactions at one time did not exhibit significant associations at other times. Evidence for environmental filtering and dispersal limitation was weakest for aggregated pairs between 16 and 11 kyr BP, suggesting enhanced importance of positive species interactions during this interval. 
    The framework can thus be used to identify species associations that may reflect biotic interactions because these associations are not tied to environmental or spatial differences. Furthermore, temporally repeated analyses of spatial associations can reveal whether such associations persist through time.

    Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT

    Background: Haplotype inference based on unphased SNP markers is an important task in population genetics. Although there are different approaches to the inference of haplotypes in diploid species, the existing software is not suitable for inferring haplotypes from unphased SNP data in polyploid species, such as the cultivated potato (Solanum tuberosum). Potato species are tetraploid and highly heterozygous. Results: Here we present the software SATlotyper which is able to handle polyploid and polyallelic data. SATlotyper uses the Boolean satisfiability problem to formulate Haplotype Inference by Pure Parsimony. The software excludes existing haplotype inferences, thus allowing for calculation of alternative inferences. As it is not known which of the multiple haplotype inferences are best supported by the given unphased data set, we use a bootstrapping procedure that allows for scoring of alternative inferences. Finally, by means of the bootstrapping scores, it is possible to optimise the phased genotypes belonging to a given haplotype inference. The program is evaluated with simulated and experimental SNP data generated for heterozygous tetraploid populations of potato. We show that, instead of taking the first haplotype inference reported by the program, we can significantly improve the quality of the final result by applying additional methods that include scoring of the alternative haplotype inferences and genotype optimisation. For a sub-population of nineteen individuals, the predicted results computed by SATlotyper were directly compared with results obtained by experimental haplotype inference via sequencing of cloned amplicons. Prediction and experiment gave similar results regarding the inferred haplotypes and phased genotypes. Conclusion: Our results suggest that Haplotype Inference by Pure Parsimony can be solved efficiently by the SAT approach, even for data sets of unphased SNP from heterozygous polyploids. SATlotyper is freeware and is distributed as a Java JAR file. The software can be downloaded from the webpage of the GABI Primary Database at http://www.gabipd.org/projects/satlotyper/. The application of SATlotyper will provide haplotype information, which can be used in haplotype association mapping studies of polyploid plants.

    Are any growth theories linear? Why we should care about what the evidence tells us

    Recent research on macroeconomic growth has been focused on resolving several key issues, two of which, specification uncertainty of the growth process and variable uncertainty, have received much attention in the recent literature. The standard procedure has been to assume a linear growth process and then to proceed with investigating the relevant variables that determine growth across countries. However, a more appropriate approach would be to recognize that a misspecified model may lead one to conclude that a variable is relevant when in fact it is not. This paper takes a step in this direction by considering conditional variable uncertainty with full-blown specification uncertainty. We use recently developed nonparametric model selection techniques to deal with nonlinearities and competing growth theories. We show how one can interpret our results and use them to motivate more intriguing specifications within the traditional studies that use Bayesian Model Averaging or other model selection criteria. We find that the inclusion of nonlinearities is necessary for determining the empirically relevant variables that dictate growth and that nonlinearities are especially important in uncovering key mechanisms of the growth process. Keywords: Growth Nonlinearities, Irrelevant Variables, Least Squares Cross Validation, Bayesian Model Averaging, Parameter Heterogeneity

    A Simple Generative Model of Collective Online Behaviour

    Human activities increasingly take place in online environments, providing novel opportunities for relating individual behaviours to population-level outcomes. In this paper, we introduce a simple generative model for the collective behaviour of millions of social networking site users who are deciding between different software applications. Our model incorporates two distinct components: one is associated with recent decisions of users, and the other reflects the cumulative popularity of each application. Importantly, although various combinations of the two mechanisms yield long-time behaviour that is consistent with data, the only models that reproduce the observed temporal dynamics are those that strongly emphasize the recent popularity of applications over their cumulative popularity. This demonstrates, even when using purely observational data without experimental design, that temporal data-driven modelling can effectively distinguish between competing microscopic mechanisms, allowing us to uncover new aspects of collective online behaviour. Comment: Updated, with new figures and Supplementary Information
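
    The two-component idea in this abstract can be illustrated with a toy simulation. The following is a rough sketch under stated assumptions, not the authors' fitted model: each new adoption copies a uniformly chosen entry from a sliding window of recent adoptions with probability `recent_weight`, and otherwise picks an application in proportion to its cumulative count. The function `simulate` and the parameters `recent_weight`, `window`, and `seed` are hypothetical names introduced here.

```python
import random
from collections import deque
from typing import List


def simulate(n_apps: int, n_steps: int, recent_weight: float,
             window: int, seed: int = 0) -> List[int]:
    """Toy mixture of recent-popularity and cumulative-popularity adoption (assumed form)."""
    rng = random.Random(seed)
    # Assumption: every application starts with one adoption so that
    # cumulative-proportional sampling can reach all of them.
    cumulative = [1] * n_apps
    recent = deque(maxlen=window)  # the last `window` adoptions

    for _ in range(n_steps):
        if recent and rng.random() < recent_weight:
            # Recent component: copy one of the most recent adoptions.
            app = recent[rng.randrange(len(recent))]
        else:
            # Cumulative component: choose in proportion to all-time counts.
            app = rng.choices(range(n_apps), weights=cumulative)[0]
        cumulative[app] += 1
        recent.append(app)

    return cumulative


# Example run: strong emphasis on recent adoptions.
counts = simulate(n_apps=5, n_steps=200, recent_weight=0.9, window=10)
```

    Varying `recent_weight` between 0 and 1 moves the toy model between purely cumulative and mostly recent copying, which is the kind of contrast the abstract describes; the specific mixture form and window mechanism are assumptions made for this sketch.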