    Distributed Representations of Sentences and Documents

    Many machine learning algorithms require the input to be represented as a fixed-length feature vector. When it comes to texts, one of the most common fixed-length features is bag-of-words. Despite their popularity, bag-of-words features have two major weaknesses: they lose the ordering of the words and they also ignore semantics of the words. For example, "powerful," "strong" and "Paris" are equally distant. In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents. Our algorithm represents each document by a dense vector which is trained to predict words in the document. Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models. Empirical results show that Paragraph Vectors outperform bag-of-words models as well as other techniques for text representations. Finally, we achieve new state-of-the-art results on several text classification and sentiment analysis tasks.
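
The two bag-of-words weaknesses named in the abstract above can be illustrated with a minimal sketch. The toy vocabulary, the helper names `bag_of_words` and `euclidean`, and the example sentences are assumptions made for illustration; none of them come from the paper, and the sketch does not implement Paragraph Vector itself.

```python
from collections import Counter
from math import sqrt

def bag_of_words(text, vocab):
    """Count how often each vocabulary word occurs in text (word order is discarded)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]

def euclidean(u, v):
    """Euclidean distance between two equal-length count vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical toy vocabulary, chosen only for this illustration.
vocab = ["powerful", "strong", "paris", "dog", "bites", "man"]

powerful = bag_of_words("powerful", vocab)
strong = bag_of_words("strong", vocab)
paris = bag_of_words("Paris", vocab)

# Weakness 1: semantics are ignored, so every pair of distinct single words
# is the same distance apart, related or not.
d_powerful_strong = euclidean(powerful, strong)
d_powerful_paris = euclidean(powerful, paris)
d_strong_paris = euclidean(strong, paris)

# Weakness 2: ordering is lost, so sentences with opposite meanings
# can map to identical vectors.
same_bag = bag_of_words("dog bites man", vocab) == bag_of_words("man bites dog", vocab)

print(d_powerful_strong, d_powerful_paris, d_strong_paris)
print(same_bag)
```

In this sketch all three pairwise distances are equal, and the two sentences share one vector. A learned dense representation is meant to avoid both effects.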

    A semantic and language-based representation of an environmental scene

    The modeling of a landscape environment is a cognitive activity that requires appropriate spatial representations. The research presented in this paper introduces a structural and semantic categorization of a landscape view based on panoramic photographs that act as a substitute for a given natural environment. Verbal descriptions of a landscape scene provide the modeling input of our approach. This structure-based model identifies the spatial, relational, and semantic constructs that emerge from these descriptions. Concepts in the environment are qualified according to a semantic classification, their proximity and direction to the observer, and the spatial relations that qualify them. The resulting model is represented in a way that constitutes a modeling support for the study of environmental scenes, and a contribution to further research oriented to the mapping of a verbal description onto a geographical information system-based representation.

    Interpreting Quantum Particles as Conceptual Entities

    We elaborate an interpretation of quantum physics founded on the hypothesis that quantum particles are conceptual entities playing the role of communication vehicles between material entities composed of ordinary matter which function as memory structures for these quantum particles. We show in which way this new interpretation gives rise to a natural explanation for the quantum effects of interference and entanglement by analyzing how interference and entanglement emerge for the case of human concepts. We put forward a scheme to derive a metric based on similarity as a predecessor for the structure of 'space, time, momentum, energy' and 'quantum particles interacting with ordinary matter' underlying standard quantum physics, within the new interpretation, and making use of aspects of traditional quantum axiomatics. More specifically, we analyze how the effect of non-locality arises as a consequence of the confrontation of such an emerging metric type of structure and the remaining presence of the basic conceptual structure on the fundamental level, with the potential of being revealed in specific situations. Comment: 19 pages, 1 figure

    Individual and Domain Adaptation in Sentence Planning for Dialogue

    One of the biggest challenges in the development and deployment of spoken dialogue systems is the design of the spoken language generation module. This challenge arises from the need for the generator to adapt to many features of the dialogue domain, user population, and dialogue context. A promising approach is trainable generation, which uses general-purpose linguistic knowledge that is automatically adapted to the features of interest, such as the application domain, individual user, or user group. In this paper we present and evaluate a trainable sentence planner for providing restaurant information in the MATCH dialogue system. We show that trainable sentence planning can produce complex information presentations whose quality is comparable to the output of a template-based generator tuned to this domain. We also show that our method easily supports adapting the sentence planner to individuals, and that the individualized sentence planners generally perform better than models trained and tested on a population of individuals. Previous work has documented and utilized individual preferences for content selection, but to our knowledge, these results provide the first demonstration of individual preferences for sentence planning operations, affecting the content order, discourse structure and sentence structure of system responses. Finally, we evaluate the contribution of different feature sets, and show that, in our application, n-gram features often do as well as features based on higher-level linguistic representations.