43 research outputs found

    Towards heuristic algorithmic memory

    Get PDF
    We propose a long-term memory design for artificial general intelligence based on Solomonoff's incremental machine learning methods. We introduce four synergistic update algorithms that use a Stochastic Context-Free Grammar as a guiding probability distribution of programs. The update algorithms accomplish adjusting production probabilities, re-using previous solutions, learning programming idioms and discovery of frequent subprograms. A controlled experiment with a long training sequence shows that our incremental learning approach is effective. © 2011 Springer-Verlag Berlin Heidelberg

    K-Space at TRECVID 2008

    Get PDF
    In this paper we describe K-Space’s participation in TRECVid 2008 in the interactive search task. For 2008 the K-Space group performed one of the largest interactive video information retrieval experiments conducted in a laboratory setting. We had three institutions participating in a multi-site multi-system experiment. In total 36 users participated, 12 each from Dublin City University (DCU, Ireland), University of Glasgow (GU, Scotland) and Centrum Wiskunde and Informatica (CWI, the Netherlands). Three user interfaces were developed, two from DCU which were also used in 2007 as well as an interface from GU. All interfaces leveraged the same search service. Using a latin squares arrangement, each user conducted 12 topics, leading in total to 6 runs per site, 18 in total. We officially submitted for evaluation 3 of these runs to NIST with an additional expert run using a 4th system. Our submitted runs performed around the median. In this paper we will present an overview of the search system utilized, the experimental setup and a preliminary analysis of our results

    Metaheuristics for Natural Language Tagging

    Full text link

    Knowledge discovery with CRF-based clustering of named entities without a priori classes

    Get PDF
    International audienceKnowledge discovery aims at bringing out coherent groups of entities. It is usually based on clustering which necessitates defining a notion of similarity between the relevant entities. In this paper, we propose to divert a supervised machine learning technique (namely Conditional Random Fields, widely used for supervised labeling tasks) in order to calculate, indirectly and without supervision, similarities among text sequences. Our approach consists in generating artificial labeling problems on the data to reveal regularities between entities through their labeling. We describe how this framework can be implemented and experiment it on two information extraction/discovery tasks. The results demonstrate the usefulness of this unsupervised approach, and open many avenues for defining similarities for complex representations of textual data

    Unsupervised joint PoS tagging and stemming for agglutinative languages

    Get PDF
    This is an accepted manuscript of an article published by Association for Computing Machinery (ACM) in ACM Transactions on Asian and Low-Resource Language Information Processing on 25/01/2019, available online: https://doi.org/10.1145/3292398 The accepted version of the publication may differ from the final published version.The number of possible word forms is theoretically infinite in agglutinative languages. This brings up the out-of-vocabulary (OOV) issue for part-of-speech (PoS) tagging in agglutinative languages. Since inflectional morphology does not change the PoS tag of a word, we propose to learn stems along with PoS tags simultaneously. Therefore, we aim to overcome the sparsity problem by reducing word forms into their stems. We adopt a Bayesian model that is fully unsupervised. We build a Hidden Markov Model for PoS tagging where the stems are emitted through hidden states. Several versions of the model are introduced in order to observe the effects of different dependencies throughout the corpus, such as the dependency between stems and PoS tags or between PoS tags and affixes. Additionally, we use neural word embeddings to estimate the semantic similarity between the word form and stem. We use the semantic similarity as prior information to discover the actual stem of a word since inflection does not change the meaning of a word. We compare our models with other unsupervised stemming and PoS tagging models on Turkish, Hungarian, Finnish, Basque, and English. The results show that a joint model for PoS tagging and stemming improves on an independent PoS tagger and stemmer in agglutinative languages.This research is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) with the project number EEEAG-115E464.Published versio

    EURECOM at TRECVID 2016. The Adhoc Video Search and Video Hyperlinking Tasks

    Get PDF
    Contains fulltext : 166346.pdf (publisher's version ) (Open Access)TRECVID 2016, 14 november 201

    A Korean Part-of-Speech Tagging System Using Resolution Rules for Individual Ambiguous Word

    No full text

    Tagging with Small Training Corpora

    No full text
    corecore