71 research outputs found

    Question Quality in Community Question Answering Forums: A Survey

    Community Question Answering (CQA) websites offer a new opportunity for users to provide, search, and share knowledge. Although the idea of receiving a direct, targeted response to a question sounds very attractive, the quality of the question itself can have an important effect on the likelihood of getting useful answers. High-quality questions improve the CQA experience, and it is therefore essential for CQA forums to better understand what characterizes questions that are more appealing to the forum community. In this survey, we review existing research on question quality in CQA websites. We discuss possible measures of question quality and the question features that have been shown to influence it.

    Learning language through pictures

    Representation of linguistic form and function in recurrent neural networks

    We present novel methods for analyzing the activation patterns of recurrent neural networks from a linguistic point of view and explore the types of linguistic structure they learn. As a case study, we use a standard standalone language model and a multi-task gated recurrent network architecture consisting of two parallel pathways with shared word embeddings: the Visual pathway is trained to predict the representation of the visual scene corresponding to an input sentence, and the Textual pathway is trained to predict the next word in the same sentence. We propose a method for estimating the contribution of individual input tokens to the networks' final prediction. Using this method, we show that the Visual pathway pays selective attention to lexical categories and grammatical functions that carry semantic information, and learns to treat word types differently depending on their grammatical function and their position in the sequential structure of the sentence. In contrast, the language models are comparatively more sensitive to words with a syntactic function. Further analysis of the most informative n-gram contexts for each model shows that, in comparison with the Visual pathway, the language models react more strongly to abstract contexts that represent syntactic constructions.
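
    A minimal sketch of one way such token contributions can be estimated: omit one token at a time and measure how much the sentence representation changes. The encoder interface, the function name, and the cosine-based score below are illustrative assumptions, not necessarily the paper's exact procedure.

        import numpy as np

        def omission_scores(tokens, encode):
            # `encode` is assumed to map a list of tokens to a 1-D numpy
            # vector, e.g. the final hidden state of a recurrent network.
            # This interface is an assumption made for illustration.
            full = encode(tokens)
            scores = []
            for i in range(len(tokens)):
                reduced = encode(tokens[:i] + tokens[i + 1:])
                cos = np.dot(full, reduced) / (
                    np.linalg.norm(full) * np.linalg.norm(reduced))
                scores.append(1.0 - cos)  # larger = token contributed more
            return scores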

    Learning to normalize text from few examples

    I propose a text normalization model based on learning edit operations from labeled data while incorporating features induced from unlabeled text and from dictionaries. These features enable effective learning with little supervision, as demonstrated on an English tweet normalization dataset.
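
    To make the feature side concrete, the sketch below builds a feature map for one character position of a noisy token, combining local character context with a dictionary-membership flag and a word-cluster id induced from unlabeled text. The resources and feature names here are hypothetical, not the model's actual inventory.

        def char_features(word, i, lexicon, clusters):
            # lexicon: a set of in-vocabulary words (the dictionary feature);
            # clusters: a word -> cluster-id map induced from unlabeled text.
            # Both resources and the feature inventory are hypothetical.
            return {
                "char=" + word[i]: 1.0,
                "prev=" + (word[i - 1] if i > 0 else "<s>"): 1.0,
                "next=" + (word[i + 1] if i + 1 < len(word) else "</s>"): 1.0,
                "in_lexicon": float(word in lexicon),
                "cluster=" + clusters.get(word, "UNK"): 1.0,
            }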

    Architectures and representations for string transduction

    String transduction problems are ubiquitous in natural language processing: they include transliteration, grapheme-to-phoneme conversion, text normalization, and translation. String transduction can be reduced to the simpler problem of sequence labeling by expressing the target string as a sequence of edit operations applied to the source string. Thanks to this reduction, all sequence labeling models become applicable in typical transduction settings. These models range from simple linear ones, such as the sequence perceptron, which require external feature extractors, to recurrent neural networks with long short-term memory (LSTM) units, which can extract features internally. Some versions of recurrent neural networks are also capable of solving string transduction natively, without reformulating it in terms of edit operations. In this talk I analyze the effect of these variations in model architecture and input representation on performance and engineering effort for string transduction, focusing especially on the text normalization task.
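
    A minimal sketch of the decoding half of this reduction: given one predicted edit operation per source character, the target string is reconstructed deterministically, so the learning problem becomes ordinary sequence labeling. The three-label inventory used here is a simplifying assumption, not the exact edit-script formalism of the work above.

        def apply_edits(source, edits):
            # One edit label per source character (assumed inventory):
            #   ("COPY",)   keep the character
            #   ("DEL",)    drop it
            #   ("SUB", s)  replace it with string s; s may be longer than
            #               one character, which also covers insertions
            out = []
            for ch, op in zip(source, edits):
                if op[0] == "COPY":
                    out.append(ch)
                elif op[0] == "SUB":
                    out.append(op[1])
                # ("DEL",): emit nothing
            return "".join(out)

        # e.g. normalizing "l8r" to "later":
        # apply_edits("l8r", [("COPY",), ("SUB", "ate"), ("COPY",)]) == "later"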

    Symbolic Inductive Bias for Visually Grounded Learning of Spoken Language

    A widespread approach to processing spoken language is to first automatically transcribe it into text. An alternative is an end-to-end approach: recent work has proposed learning semantic embeddings of spoken language from images with spoken captions, without an intermediate transcription step. We propose to use multitask learning to exploit existing transcribed speech within the end-to-end setting. We describe a three-task architecture which combines the objectives of matching spoken captions with corresponding images, speech with text, and text with images. We show that adding the speech/text task leads to substantial performance improvements on image retrieval compared to training the speech/image task in isolation. We conjecture that this is due to the strong inductive bias that transcribed speech provides to the model, and offer supporting evidence for this conjecture.
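
    A minimal sketch of how the three matching objectives could be combined, assuming each modality has already been encoded into a fixed-size embedding. The margin-based contrastive loss below is a standard choice for cross-modal retrieval and stands in for the paper's actual objective, which the abstract does not specify.

        import torch
        import torch.nn.functional as F

        def matching_loss(a, b, margin=0.2):
            # Pull matching pairs (a[i], b[i]) together and push mismatched
            # pairs in the batch apart; a stand-in contrastive ranking loss,
            # not necessarily the paper's exact formulation.
            a = F.normalize(a, dim=1)
            b = F.normalize(b, dim=1)
            sims = a @ b.t()                      # pairwise cosine similarities
            pos = sims.diag().unsqueeze(1)        # similarities of true pairs
            cost = (margin + sims - pos).clamp(min=0)
            mask = torch.eye(sims.size(0), dtype=torch.bool, device=sims.device)
            return cost.masked_fill(mask, 0).mean()

        def three_task_loss(speech, image, text):
            # speech/image + speech/text + text/image: one term per task
            return (matching_loss(speech, image)
                    + matching_loss(speech, text)
                    + matching_loss(text, image))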

    Learning character-wise text representations with Elman nets

    Simple recurrent networks (SRNs) were introduced by Elman (1990) to model temporal structure in general and sequential structure in language in particular. More recently, SRN-based language models have become practical to train on large datasets and have been shown to outperform n-gram language models for speech recognition (Mikolov et al., 2010). In a parallel development, word embeddings induced using feedforward neural networks have proved to provide expressive and informative features for many language processing tasks (Collobert et al., 2011; Socher et al., 2012).

    The majority of text representations used in computational linguistics are based on words as the smallest units. Words are not always the most appropriate atomic unit: this is the case for languages where orthographic words correspond to whole English phrases or sentences, and equally when the text analysis task needs to be performed at the character level, for example when segmenting text into tokens or when normalizing corrupted text into its canonical form.

    In this work we propose a mechanism to learn character-level representations of text. Our representations are low-dimensional real-valued embeddings which form an abstraction over the character string preceding each position in a stream of characters. They correspond to the activations of the hidden layer in a simple recurrent neural network. The network is trained as a language model: it is sequentially presented with each character in a string (encoded as a one-hot vector) and learns to predict the next character in the sequence. The representation of history is stored in a limited number of hidden units (we use 400), which forces the network to create a compressed and abstract representation rather than memorize strings verbatim. After training the network on large amounts of unlabeled text, it can be run on unseen character sequences, and the activations of its hidden layer can be recorded at each position and used as features in a supervised learning model.

    We use these representations as input features (in addition to character n-grams) for text analysis tasks: learning to detect and label programming-language code samples embedded in natural language text (Chrupala, 2013), learning to segment text into words and sentences (Evang et al., 2013), and learning to translate non-canonical user-generated content into a normalized form (Chrupala, 2014). For all tasks and languages we obtain consistent performance boosts in comparison with using only character n-gram features, with relative error reductions ranging from around 12% for English tweet normalization to around 85% for Dutch word and sentence segmentation.

    References:
    Chrupala, G. (2013). Text segmentation with character-level text embeddings. ICML Workshop on Deep Learning for Audio, Speech and Language Processing.
    Chrupala, G. (2014). Normalizing tweets with edit scripts and recurrent neural embeddings. ACL.
    Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12, 2493–2537.
    Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.
    Evang, K., Basile, V., Chrupala, G., & Bos, J. (2013). Elephant: Sequence labeling for word and sentence segmentation. EMNLP.
    Mikolov, T., Karafiát, M., Burget, L., Černocký, J., & Khudanpur, S. (2010). Recurrent neural network based language model. INTERSPEECH.
    Socher, R., Huval, B., Manning, C. D., & Ng, A. Y. (2012). Semantic compositionality through recursive matrix-vector spaces. EMNLP-CoNLL.
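
    A minimal sketch of the setup the abstract describes: an Elman-style network trained as a character-level language model, whose per-position hidden activations can later be recorded as features. PyTorch is used here for brevity; the original work predates it, so this is an illustration rather than the authors' implementation.

        import torch
        import torch.nn as nn

        class CharSRN(nn.Module):
            # A sketch of the described setup, not the authors' code.
            def __init__(self, n_chars, hidden=400):  # 400 units, per the abstract
                super().__init__()
                self.register_buffer("onehot", torch.eye(n_chars))
                self.rnn = nn.RNN(n_chars, hidden, batch_first=True)  # tanh Elman net
                self.out = nn.Linear(hidden, n_chars)  # next-character prediction

            def forward(self, char_ids):               # (batch, time) integer tensor
                x = self.onehot[char_ids]              # one-hot encode each character
                states, _ = self.rnn(x)                # (batch, time, hidden)
                return self.out(states), states

        # Train with cross-entropy against the input shifted by one position;
        # afterwards, `states` at each position serve as real-valued text
        # embeddings, usable as features alongside character n-grams.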

    Visually grounded models of spoken language: A survey of datasets, architectures and evaluation techniques

    This survey provides an overview of the evolution of visually grounded models of spoken language over the last 20 years. Such models are inspired by the observation that when children pick up a language, they rely on a wide range of indirect and noisy clues, crucially including signals from the visual modality co-occurring with spoken utterances. Several fields have made important contributions to this approach to modeling or mimicking the process of learning language: Machine Learning, Natural Language and Speech Processing, Computer Vision, and Cognitive Science. The current paper brings together these contributions in order to provide a useful introduction and overview for practitioners in all these areas. We discuss the central research questions addressed, the timeline of developments, and the datasets which enabled much of this work. We then summarize the main modeling architectures and offer an exhaustive overview of the evaluation metrics and analysis techniques.