    A decision-theoretic approach for segmental classification

    This paper is concerned with statistical methods for the segmental classification of linear sequence data where the task is to segment and classify the data according to an underlying hidden discrete state sequence. Such analysis is commonplace in the empirical sciences including genomics, finance and speech processing. In particular, we are interested in answering the following question: given data $y$ and a statistical model $\pi(x,y)$ of the hidden states $x$, what should we report as the prediction $\hat{x}$ under the posterior distribution $\pi(x|y)$? That is, how should you make a prediction of the underlying states? We demonstrate that traditional approaches such as reporting the most probable state sequence or most probable set of marginal predictions can give undesirable classification artefacts and offer limited control over the properties of the prediction. We propose a decision-theoretic approach using a novel class of Markov loss functions and report $\hat{x}$ via the principle of minimum expected loss (maximum expected utility). We demonstrate that the sequence of minimum expected loss under the Markov loss function can be enumerated exactly using dynamic programming methods and that it offers flexibility and performance improvements over existing techniques. The result is generic and applicable to any probabilistic model on a sequence, such as hidden Markov models, change point or product partition models. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/13-AOAS657 by the Institute of Mathematical Statistics (http://www.imstat.org

    Analysis of Software Binaries for Reengineering-Driven Product Line Architecture: An Industrial Case Study

    This paper describes a method for recovering software architectures from a set of similar (but unrelated) software products in binary form. One intention is to drive refactoring into software product lines and combine architecture recovery with runtime binary analysis and existing clustering methods. Using our runtime binary analysis, we create graphs that capture the dependencies between different software parts. These are clustered into smaller component graphs that group software parts with high interactions into larger entities. The component graphs serve as a basis for further software product line work. In this paper, we concentrate on the analysis part of the method and the graph clustering. We apply the graph clustering method to a real application in the context of automation / robot configuration software tools. Comment: In Proceedings FMSPLE 2015, arXiv:1504.0301

    Unsupervised extraction of recurring words from infant-directed speech

    To date, most computational models of infant word segmentation have worked from phonemic or phonetic input, or have used toy datasets. In this paper, we present an algorithm for word extraction that works directly from naturalistic acoustic input: infant-directed speech from the CHILDES corpus. The algorithm identifies recurring acoustic patterns that are candidates for identification as words or phrases, and then clusters together the most similar patterns. The recurring patterns are found in a single pass through the corpus using an incremental method, where only a small number of utterances are considered at once. Despite this limitation, we show that the algorithm is able to extract a number of recurring words, including some that infants learn earliest, such as Mommy and the child’s name. We also introduce a novel information-theoretic evaluation measure

    Loanword adaptation as first-language phonological perception

    We show that loanword adaptation can be understood entirely in terms of phonological and phonetic comprehension and production mechanisms in the first language. We provide explicit accounts of several loanword adaptation phenomena (in Korean) in terms of an Optimality-Theoretic grammar model with the same three levels of representation that are needed to describe L1 phonology: the underlying form, the phonological surface form, and the auditory-phonetic form. The model is bidirectional, i.e., the same constraints and rankings are used by the listener and by the speaker. These constraints and rankings are the same for L1 processing and loanword adaptation

    Arithmetic, Set Theory, Reduction and Explanation

    Philosophers of science since Nagel have been interested in the links between intertheoretic reduction and explanation, understanding and other forms of epistemic progress. Although intertheoretic reduction is widely agreed to occur in pure mathematics as well as empirical science, the relationship between reduction and explanation in the mathematical setting has rarely been investigated in a similarly serious way. This paper examines an important particular case: the reduction of arithmetic to set theory. I claim that the reduction is unexplanatory. In defense of this claim, I offer evidence from mathematical practice, and I respond to contrary suggestions due to Steinhart, Maddy, Kitcher and Quine. I then show how, even if set-theoretic reductions are generally not explanatory, set theory can nevertheless serve as a legitimate foundation for mathematics. Finally, some implications of my thesis for philosophy of mathematics and philosophy of science are discussed. In particular, I suggest that some reductions in mathematics are probably explanatory, and I propose that differing standards of theory acceptance might account for the apparent lack of unexplanatory reductions in the empirical sciences

    Lattice initial segments of the hyperdegrees

    We affirm a conjecture of Sacks [1972] by showing that every countable distributive lattice is isomorphic to an initial segment of the hyperdegrees, $\mathcal{D}_{h}$. In fact, we prove that every sublattice of any hyperarithmetic lattice (and so, in particular, every countable locally finite lattice) is isomorphic to an initial segment of $\mathcal{D}_{h}$. Corollaries include the decidability of the two-quantifier theory of $\mathcal{D}_{h}$ and the undecidability of its three-quantifier theory. The key tool in the proof is a new lattice representation theorem that provides a notion of forcing for which we can prove a version of the fusion lemma in the hyperarithmetic setting and so the preservation of $\omega_{1}^{CK}$. Somewhat surprisingly, the set-theoretic analog of this forcing does not preserve $\omega_{1}$. On the other hand, we construct countable lattices that are not isomorphic to an initial segment of $\mathcal{D}_{h}$