
    Synonym Detection Using Syntactic Dependency And Neural Embeddings

    Recent advances in the Vector Space Model have significantly improved NLP applications such as neural machine translation and natural language generation. Although word co-occurrences in context have been widely used in counting- and prediction-based distributional models, the role of syntactic dependencies in deriving distributional semantics has not yet been thoroughly investigated. By comparing various Vector Space Models on synonym detection in the TOEFL test, we systematically study the salience of syntactic dependencies in accounting for distributional similarity. We separate syntactic dependencies into different groups according to their grammatical roles and then use context counting to construct the corresponding raw and SVD-compressed matrices. Moreover, using the same training hyperparameters and corpora, we include typical neural embeddings in the evaluation. We further study the effectiveness of injecting human-compiled semantic knowledge into neural embeddings for computing distributional similarity. Our results show that syntactically conditioned contexts capture lexical semantics better than unconditioned ones, and that retrofitting neural embeddings with semantic knowledge can significantly improve synonym detection.
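    The counting-based side of such a pipeline is easy to sketch. The snippet below is not the authors' code: it uses a plain token window rather than dependency-conditioned contexts, and the toy corpus, window size, and dimensionality are placeholders. It only illustrates the general idea of a raw co-occurrence matrix, SVD compression, and a TOEFL-style synonym choice by cosine similarity.

```python
# Minimal sketch: raw co-occurrence counts -> truncated SVD -> synonym choice.
import numpy as np

corpus = [
    "the quick brown fox jumps over the lazy dog".split(),
    "a fast brown fox leaps over a sleepy dog".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Raw symmetric co-occurrence counts within a +/-2 token window.
window = 2
counts = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[idx[w], idx[sent[j]]] += 1

# SVD-compressed representation of the raw matrix (k is an arbitrary choice here).
U, S, _ = np.linalg.svd(counts, full_matrices=False)
k = 5
vectors = U[:, :k] * S[:k]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def answer(target, choices):
    """Pick the candidate most similar to the target, as in a TOEFL synonym item."""
    return max(choices, key=lambda c: cosine(vectors[idx[target]], vectors[idx[c]]))

print(answer("quick", ["fast", "dog", "over"]))
```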

    Testing word embeddings for Polish

    Distributional semantics postulates representing word meaning as numeric vectors that reflect, directly or indirectly, the contexts in which words occur in large text collections. This paper addresses the problem of constructing such models for the Polish language. It compares the effectiveness of models based on lemmas and on word forms, created with the Continuous Bag of Words (CBOW) and skip-gram approaches and trained on different Polish corpora. For the purposes of this comparison, the results of two typical tasks solved with distributional semantics, i.e. synonymy and analogy recognition, are compared. The results show that it is not possible to identify one universal approach to vector creation that is applicable across tasks. The most important factor is the quality and size of the data, but different training-strategy choices can also lead to significantly different results.
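    As a rough illustration of the CBOW versus skip-gram comparison, the sketch below trains both variants with gensim's Word2Vec (4.x API) on a placeholder corpus and probes them with synonymy- and analogy-style queries. The toy sentences and hyperparameters are illustrative assumptions, not the corpora or settings used in the paper.

```python
from gensim.models import Word2Vec

# Placeholder corpus: in the study this would be lemmas or surface forms
# drawn from large Polish corpora.
sentences = [
    ["kot", "siedzi", "na", "macie"],
    ["pies", "lezy", "na", "dywanie"],
] * 100  # repeat so the toy data has enough co-occurrences to train on

common = dict(vector_size=100, window=5, min_count=1, epochs=10, seed=1)
cbow = Word2Vec(sentences, sg=0, **common)       # Continuous Bag of Words
skipgram = Word2Vec(sentences, sg=1, **common)   # skip-gram

for name, model in [("CBOW", cbow), ("skip-gram", skipgram)]:
    # Synonymy-style probe: nearest neighbours of a target word.
    print(name, model.wv.most_similar("kot", topn=3))
    # Analogy-style probe (schematic): a - b + c answered via positive/negative terms.
    print(name, model.wv.most_similar(positive=["pies", "macie"],
                                      negative=["kot"], topn=1))
```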

    A Socio-mathematical and Structure-Based Approach to Model Sentiment Dynamics in Event-Based Text

    Natural language texts are often meant to express or influence the emotions of individuals. Recognizing the emotions expressed in or triggered by textual content is therefore essential to understanding the full meaning that the content conveys. Sentiment analysis (SA) researchers are increasingly investigating natural language processing techniques, as well as emotion theory, in order to detect, extract, and classify the sentiments that natural language text expresses. Most SA research focuses on analysing subjective documents from the writer’s perspective and classifying them into categorical labels or sentiment polarity, in which text is associated with a descriptive label or with a point on a continuum between two polarities. Researchers often perform sentiment or polarity classification using machine learning (ML) techniques, sentiment lexicons, or hybrid approaches. Most ML methods rely on count-based word representations that fail to take word order into account. Despite the successful use of these flat word representations in topic-modelling problems, SA problems require a deeper understanding of sentence structure, since the meaning of words can be reversed entirely through negations or word modifiers. Approaches based on semantic lexicons, on the other hand, are limited by the relatively small number of words they contain, which cannot keep pace with the extensive and growing vocabulary used on the Internet. The research presented in this thesis tackles sentiment analysis from a different viewpoint than those underlying current mainstream studies in this area. A cross-disciplinary approach is proposed that incorporates affect control theory (ACT) into a structured model for determining the sentiment polarity of event-based articles from the perspectives of readers and interactants. A socio-mathematical theory, ACT provides valuable resources for handling interactions between words (event entities) and for predicting situational sentiments triggered by social events. ACT models the human emotions arising from social event terms through multidimensional representations that have been verified both empirically and theoretically. To model human emotions regarding textual content, the first step was to develop a fine-grained event extraction algorithm that extracts events and their entities from event-based text using semantic and syntactic parsing techniques. The results of the event extraction method were compared against a supervised learning approach on two human-coded corpora (one grammatically correct and one grammatically incorrect structured corpus). For both corpora, the semantic-syntactic event extraction method yielded higher accuracy than the supervised learning approach. The three-dimensional ACT lexicon was also augmented in a semi-supervised fashion using graph-based label propagation built from semantic and neural-network word embeddings. The word embeddings were obtained by training commonly used count-based and neural-network-based algorithms on a single corpus, and each method was evaluated with respect to the reconstruction of a sentiment lexicon. The results show that, relative to other word embeddings and state-of-the-art methods, combining semantic and neural word embeddings yielded the highest correlation scores and the lowest error rates.
Using the augmented lexicon and the ACT mathematical equations, human emotions were modelled at different levels of granularity (i.e., at the sentence and document levels). The first stage was the development of an entity-based SA approach that models reader emotions triggered by event-based sentences. The emotions are modelled in a three-dimensional space based on reader sentiment toward different entities (e.g., subject and object) in the sentence. The new approach was evaluated on a human-annotated news-headline corpus; the results showed the proposed method to be competitive with benchmark ML techniques. The second stage was the creation of an ACT-based model for predicting the temporal progression of the interactants' emotions and their optimal behaviour over a sequence of interactions. The model was evaluated on three different corpora: fairy tales, news articles, and a handcrafted corpus. The results demonstrate that, despite the challenging sentence structures, the estimated emotions and behaviours agreed reasonably well with the corresponding ground truth.
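The lexicon-augmentation step can be illustrated with a small label-propagation sketch: seed words carry three-dimensional ACT ratings (evaluation, potency, activity), a similarity graph is built from word embeddings, and ratings spread to unrated words. The embeddings, seed ratings, and propagation settings below are toy placeholders and the sketch is not the thesis's exact algorithm.

```python
# Toy sketch: propagate 3-D (E, P, A) ratings from seed words over an
# embedding-similarity graph.
import numpy as np

words = ["good", "great", "bad", "awful", "table"]
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(words), 50))          # stand-in for trained embeddings
emb[1] = emb[0] + 0.05 * rng.normal(size=50)     # make "great" close to "good"
emb[3] = emb[2] + 0.05 * rng.normal(size=50)     # make "awful" close to "bad"

# Seed EPA ratings (evaluation, potency, activity) for a few words.
seeds = {"good": np.array([2.5, 1.0, 0.5]), "bad": np.array([-2.3, 0.2, 0.4])}

# Cosine-similarity graph, negatives clipped, diagonal zeroed, rows normalized.
norm = emb / np.linalg.norm(emb, axis=1, keepdims=True)
W = np.clip(norm @ norm.T, 0.0, None)
np.fill_diagonal(W, 0.0)
P = W / (W.sum(axis=1, keepdims=True) + 1e-12)

# Iterative propagation: unlabeled nodes take the weighted average of their
# neighbours' labels; seed nodes are clamped to their known ratings each step.
labels = np.zeros((len(words), 3))
seed_vals = np.zeros((len(words), 3))
is_seed = np.array([w in seeds for w in words])
for i, w in enumerate(words):
    if w in seeds:
        seed_vals[i] = seeds[w]
labels[is_seed] = seed_vals[is_seed]
for _ in range(50):
    labels = P @ labels
    labels[is_seed] = seed_vals[is_seed]

for w, epa in zip(words, labels.round(2)):
    print(w, epa)
```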

    Structural competition in second language production: towards a constraint-satisfaction model

    Second language (L2) learners often show inconsistent production of some aspects of L2 grammar. One view, based primarily on data from L2 article production, suggests that grammatical patterns licensed by learners’ native language (L1) and those licensed by their L2 compete for selection, leading to variability in the production of L2 functional morphology. In this study, we show that the idea of structural competition has broader applicability, in that it correctly predicts certain asymmetries in the production of both the definite article the and the plural marking -s by Thai learners of English. At the same time, we recognize that learners’ growing sensitivity to structural regularities in the L2 may be an additional contributing factor, and we therefore make a novel proposal for how the L1–L2 structural competition model and the sensitivity-to-L2-regularities account could be integrated, and their respective contributions studied, under a constraint-satisfaction model of language processing. We argue that this approach is particularly well suited to studying bilingual processing, as it provides a natural framework for explaining how highly disparate factors, including partially activated options from both languages, interact during processing.
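    For readers unfamiliar with constraint-satisfaction accounts, the toy sketch below illustrates the competition idea numerically: two candidate forms (a bare noun, as licensed by Thai, versus the + noun, as licensed by English) each receive weighted constraint support, and a Luce-style choice rule turns the summed support into production probabilities. All weights are hypothetical and purely illustrative; they are not drawn from the study.

```python
# Toy constraint-satisfaction competition between an L1-licensed and an
# L2-licensed structure.
import math

def choice_probabilities(support):
    """Luce-style choice rule over summed constraint support."""
    exps = {form: math.exp(s) for form, s in support.items()}
    z = sum(exps.values())
    return {form: e / z for form, e in exps.items()}

# Hypothetical summed support: L1 pattern vs. L2 pattern plus a term for the
# learner's growing sensitivity to L2 distributional regularities.
l2_sensitivity = 0.8   # assumed to grow with proficiency and exposure
support = {
    "bare noun": 1.0,                    # pattern licensed by L1 Thai
    "the + noun": 0.6 + l2_sensitivity,  # pattern licensed by L2 English
}
print(choice_probabilities(support))
```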

    Speech Patterns in Child Speech and Their Origins


    The role of English as L2 in the acquisition of L3 Norwegian by Italian native speakers. Cross-linguistic Influence in adult multilingualism.

    As the field of third language acquisition (L3A) expands, the question of whether one or more previously acquired languages influence the acquisition of additional languages remains central. This thesis aims to contribute to the ongoing debate in the field of L3A by examining the role of English as a second language (L2) in the acquisition of L3 Norwegian by Italian native speakers (L1). English is the most commonly taught L2 in Italy and in Europe, making it crucial to investigate its potential influence on L3 learning. The thesis incorporates a main experiment consisting of a self-paced reading task (SPR) that tests the acquisition of four Norwegian properties: (i) pre-nominal gender agreement, adverbial word order in (ii) main and (iii) subordinate clauses, and (iv) topicalization. The participants were Norwegian native speakers (n=10), who served as a control group, and Italian native speakers (n=16) with L2 English who were learning L3 Norwegian. Overall, results from the L1 Italian group indicated that L3 performance was influenced by their L2 English and that length of English instruction correlated significantly with this behavior, while other extra-linguistic variables such as proficiency and exposure did not affect participants’ ability to detect grammatical violations.

    An Attention-Based Model for Predicting Contextual Informativeness and Curriculum Learning Applications

    Both humans and machines learn the meaning of unknown words through contextual information in a sentence, but not all contexts are equally helpful for learning. We introduce an effective method for capturing the level of contextual informativeness with respect to a given target word. Our study makes three main contributions. First, we develop models for estimating contextual informativeness, focusing on the instructional aspect of sentences. Our attention-based approach using pre-trained embeddings demonstrates state-of-the-art performance on our single-context dataset and on an existing multi-sentence context dataset. Second, we show how our model identifies key contextual elements in a sentence that are likely to contribute most to a reader's understanding of the target word. Third, we examine how our contextual informativeness model, originally developed for vocabulary-learning applications for students, can be used to develop better training curricula for word embedding models in batch learning and few-shot machine learning settings. We believe our results open new possibilities for applications that support language learning for both human and machine learners.
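    A minimal sketch of what such an attention-based scorer might look like is shown below. It is not the authors' architecture: the dimensions, layers, and toy inputs are illustrative assumptions. The target word's pre-trained embedding acts as a query over the context-token embeddings, and the attention-pooled context is regressed to a scalar informativeness score.

```python
import torch
import torch.nn as nn

class InformativenessScorer(nn.Module):
    def __init__(self, dim=300):
        super().__init__()
        self.query = nn.Linear(dim, dim)  # project the target word to a query
        self.key = nn.Linear(dim, dim)    # project context tokens to keys
        self.score = nn.Linear(dim, 1)    # regress pooled context to a score

    def forward(self, target_vec, context_vecs):
        # target_vec: (dim,); context_vecs: (num_tokens, dim) pre-trained embeddings.
        q = self.query(target_vec)                                 # (dim,)
        k = self.key(context_vecs)                                 # (num_tokens, dim)
        attn = torch.softmax(k @ q / q.shape[0] ** 0.5, dim=0)     # (num_tokens,)
        pooled = attn @ context_vecs                               # (dim,)
        return torch.sigmoid(self.score(pooled)), attn             # score in [0, 1], weights

# Toy usage with random vectors standing in for pre-trained embeddings.
model = InformativenessScorer()
target = torch.randn(300)
context = torch.randn(12, 300)       # a 12-token sentence
score, weights = model(target, context)
print(float(score), weights.shape)   # the weights indicate the most helpful tokens
```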