    Information Compression, Intelligence, Computing, and Mathematics

    This paper presents evidence for the idea that much of artificial intelligence, human perception and cognition, mainstream computing, and mathematics, may be understood as compression of information via the matching and unification of patterns. This is the basis for the "SP theory of intelligence", outlined in the paper and fully described elsewhere. Relevant evidence may be seen: in empirical support for the SP theory; in some advantages of information compression (IC) in terms of biology and engineering; in our use of shorthands and ordinary words in language; in how we merge successive views of any one thing; in visual recognition; in binocular vision; in visual adaptation; in how we learn lexical and grammatical structures in language; and in perceptual constancies. IC via the matching and unification of patterns may be seen in both computing and mathematics: in IC via equations; in the matching and unification of names; in the reduction or removal of redundancy from unary numbers; in the workings of Post's Canonical System and the transition function in the Universal Turing Machine; in the way computers retrieve information from memory; in systems like Prolog; and in the query-by-example technique for information retrieval. The chunking-with-codes technique for IC may be seen in the use of named functions to avoid repetition of computer code. The schema-plus-correction technique may be seen in functions with parameters and in the use of classes in object-oriented programming. And the run-length coding technique may be seen in multiplication, in division, and in several other devices in mathematics and computing. The SP theory resolves the apparent paradox of "decompression by compression". And computing and cognition as IC is compatible with the uses of redundancy in such things as backup copies to safeguard data and understanding speech in a noisy environment.
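
    As a toy illustration of one of the techniques named in the abstract (not code from the SP theory itself; function names are illustrative only), the Python sketch below shows run-length coding: a run of repeated symbols is matched, unified into a single instance plus a count, and the original can be regenerated without loss.

```python
# Minimal run-length coding sketch: repeated symbols are "unified" into one
# (symbol, count) pair, removing the redundancy of the repetition.

def run_length_encode(text: str) -> list[tuple[str, int]]:
    """Compress a string into a list of (symbol, count) pairs."""
    pairs: list[tuple[str, int]] = []
    for ch in text:
        if pairs and pairs[-1][0] == ch:
            pairs[-1] = (ch, pairs[-1][1] + 1)   # extend the current run
        else:
            pairs.append((ch, 1))                # start a new run
    return pairs

def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    """Regenerate the original string from its compressed form."""
    return "".join(ch * count for ch, count in pairs)

original = "aaaaabbbcc"
packed = run_length_encode(original)      # [('a', 5), ('b', 3), ('c', 2)]
assert run_length_decode(packed) == original
```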

    A memory-based classification approach to marker-based EBMT

    We describe a novel approach to example-based machine translation that makes use of marker-based chunks, in which the decoder is a memory-based classifier. The classifier is trained to map trigrams of source-language chunks onto trigrams of target-language chunks; then, in a second decoding step, the predicted trigrams are rearranged according to their overlap. We present the first results of this method on a Dutch-to-English translation system using Europarl data. Sparseness of the class space causes the results to lag behind a baseline phrase-based SMT system. In a further comparison, we also apply the method to a word-aligned version of the same data, and report a smaller difference with a word-based SMT system. We explore the scaling abilities of the memory-based approach, and observe linear scaling behavior in training and classification speed and memory costs, and log-linear BLEU improvements with the number of training examples.
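
    The following Python sketch (not the authors' implementation; the memory contents and names are hypothetical) illustrates the two decoding steps just described: a memory-based classifier maps each source-chunk trigram to a stored target-chunk trigram, and the predicted trigrams are then merged greedily according to their overlap.

```python
# Toy sketch of marker-based EBMT decoding with a memory-based classifier:
# (1) look up each source-chunk trigram in a "memory" of training examples;
# (2) rearrange/merge the predicted target-chunk trigrams by their overlap.

# Hypothetical memory: source-chunk trigram -> target-chunk trigram.
memory = {
    ("de kat", "zat op", "de mat"): ("the cat", "sat on", "the mat"),
    ("zat op", "de mat", "<eos>"): ("sat on", "the mat", "<eos>"),
}

def classify(src_trigram):
    """Exact-match lookup; a real memory-based learner would back off to the
    nearest stored neighbour instead of returning None on a miss."""
    return memory.get(src_trigram)

def stitch(trigrams):
    """Greedily merge trigrams by the longest suffix/prefix chunk overlap."""
    output = list(trigrams[0])
    for tri in trigrams[1:]:
        overlap = 0
        for k in range(min(len(output), len(tri)), 0, -1):
            if tuple(output[-k:]) == tuple(tri[:k]):
                overlap = k
                break
        output.extend(tri[overlap:])
    return [chunk for chunk in output if chunk != "<eos>"]

source = [("de kat", "zat op", "de mat"), ("zat op", "de mat", "<eos>")]
predicted = [classify(t) for t in source]
print(" ".join(stitch(predicted)))   # -> "the cat sat on the mat"
```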

    A Neural Attention Model for Abstractive Sentence Summarization

    Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines. Comment: Proceedings of EMNLP 2015.
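
    As a rough illustration (not the authors' exact model, which uses its own local attention formulation), the sketch below shows the generic attention step the abstract relies on: when generating each summary word, the decoder scores the input-word representations, normalises the scores, and conditions on the resulting weighted context vector. Array sizes here are arbitrary.

```python
# Toy attention step for conditioning each generated summary word on the input:
# score every input position against the current decoder state, normalise with
# softmax, and take the weighted average as the context vector.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(decoder_state, input_states):
    """Return (context vector, attention weights) for one generation step."""
    scores = input_states @ decoder_state      # one score per input word
    weights = softmax(scores)                  # distribution over input words
    context = weights @ input_states           # weighted average of inputs
    return context, weights

rng = np.random.default_rng(0)
input_states = rng.normal(size=(6, 8))   # 6 input words, 8-dim representations
decoder_state = rng.normal(size=8)       # state while generating one word
context, weights = attention_context(decoder_state, input_states)
print(weights.round(3), context.shape)   # attention weights, (8,) context
```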