Mixing and blending syntactic and semantic dependencies
Our system for the CoNLL 2008 shared task uses a set of individual parsers, a set of stand-alone semantic role labellers, and a joint system for parsing and semantic role labelling, all blended together. The system achieved a macro-averaged labelled F1-score of 79.79 (WSJ 80.92, Brown 70.49) for the overall task. The labelled attachment score for syntactic dependencies was 86.63 (WSJ 87.36, Brown 80.77), and the labelled F1-score for semantic dependencies was 72.94 (WSJ 74.47, Brown 60.18).
Topic-based mixture language modelling
This paper describes an approach for constructing a mixture of language models based on simple statistical notions of semantics using probabilistic models developed for information retrieval. The approach encapsulates corpus-derived semantic information and is able to model varying styles of text. Using such information, the corpus texts are clustered in an unsupervised manner and a mixture of topic-specific language models is automatically created. The principal contribution of this work is to characterise the document space resulting from information retrieval techniques and to demonstrate the approach for mixture language modelling.
A comparison is made between manual and automatic clustering in order to elucidate how the global content information is expressed in the space. We also compare (in terms of association with manual clustering and language modelling accuracy) alternative term-weighting schemes and the effect of singular value decomposition dimension reduction (latent semantic analysis). Test set perplexity results using the British National Corpus indicate that the approach can improve the potential of statistical language modelling. Using an adaptive procedure, the conventional model may be tuned to track text data with a slight increase in computational cost.
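As a rough illustration of the mixture idea described above (not the paper's actual models), the sketch below interpolates add-alpha smoothed unigram models built from two hypothetical topic clusters and computes test-set perplexity under the mixture; the clusters, weights, and tokenisation are all invented for the example.

```python
from collections import Counter
import math

def unigram_lm(docs, alpha=1.0):
    """Add-alpha smoothed unigram model estimated from a list of token lists."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    vocab_size = len(counts) + 1          # +1 reserves mass for unseen tokens
    def prob(tok):
        return (counts[tok] + alpha) / (total + alpha * vocab_size)
    return prob

# Hypothetical topic clusters; in the paper these come from unsupervised
# clustering of an information-retrieval document space.
clusters = {
    "sport":   [["the", "team", "won", "the", "match"]],
    "finance": [["the", "bank", "raised", "interest", "rates"]],
}
models = {t: unigram_lm(docs) for t, docs in clusters.items()}
weights = {"sport": 0.5, "finance": 0.5}  # mixture weights (could be adapted)

def mixture_prob(token):
    """Probability of a token under the weighted mixture of topic models."""
    return sum(weights[t] * models[t](token) for t in models)

sentence = ["the", "bank", "won"]
log_prob = sum(math.log(mixture_prob(tok)) for tok in sentence)
perplexity = math.exp(-log_prob / len(sentence))
```

In an adaptive setup, the weights would be re-estimated as text is observed so that the mixture tracks the topic of the incoming data.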
Us and them: identifying cyber hate on Twitter across multiple protected characteristics
Hateful and antagonistic content published and propagated via the World Wide Web has the potential to cause harm and suffering on an individual basis, and lead to social tension and disorder beyond cyber space. Despite new legislation aimed at prosecuting those who misuse new forms of communication to post threatening, harassing, or grossly offensive language - or cyber hate - and the fact that large social media companies have committed to protecting their users from harm, it goes largely unpunished due to difficulties in policing online public spaces. To support the automatic detection of cyber hate online, specifically on Twitter, we build multiple individual models to classify cyber hate for a range of protected characteristics including race, disability and sexual orientation. We use text parsing to extract typed dependencies, which represent syntactic and grammatical relationships between words, and are shown to capture "othering" language, consistently improving machine classification for different types of cyber hate beyond the use of a Bag of Words and known hateful terms. Furthermore, we build a data-driven blended model of cyber hate to improve classification where more than one protected characteristic may be attacked (e.g. race and sexual orientation), contributing to the nascent study of intersectionality in hate crime.
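The feature combination the abstract describes can be sketched as follows. The dependency triples here are hand-written stand-ins for real parser output (the paper extracts typed dependencies with a text parser); the relation names follow the usual Stanford-style labels.

```python
# Combine Bag-of-Words features with typed-dependency features.
# The dependency triples below are invented stand-ins for parser output.
def features(tokens, deps):
    feats = set()
    for tok in tokens:
        feats.add("BOW=" + tok.lower())
    for rel, head, dep in deps:
        # e.g. ("nsubj", "go", "They") -> "DEP=nsubj(go,they)"
        feats.add(f"DEP={rel}({head.lower()},{dep.lower()})")
    return feats

tokens = ["They", "should", "go", "home"]
deps = [("nsubj", "go", "They"), ("aux", "go", "should"), ("advmod", "go", "home")]
feats = features(tokens, deps)
```

The resulting feature set would feed a standard classifier; the typed-dependency features let it condition on who does what to whom, which is what makes "othering" constructions detectable beyond surface word lists.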
Code-Switching by Phase
We show that the theoretical construct "phase" underlies a number of restrictions on code-switching, in particular those formalized under the Principle of Functional Restriction (González-Vilbazo 2005) and the Phonetic Form Interface Condition (MacSwan and Colina 2014). The fundamental hypothesis that code-switching should be studied using the same tools that we use for monolingual phenomena is reinforced.
Funded by the Federal Ministry of Education and Research. Peer reviewed.
The man_i said she_i would return: English pronominal gender in native Mandarin-speaking learners, examined within a comprehensive theory of language acquisition
The project that led to this honors thesis was begun in the Fall semester of 2010, in a graduate-level psycholinguistics course taught by Dr. T. Daniel Seely. At that time, I was intensively studying Mandarin and had been living with a native speaker who was also in the process of learning English. The types of speech errors in her English, particularly the ones that appeared to have resulted from influence from her native Mandarin, interested me greatly. One of the most striking errors that she tended to make, however, was mis-matching English gender-marked pronouns with the gender of the referent. That is, she would frequently say things like "The man driving the bus said she could bring me to Ann Arbor," or "I love Lady Gaga, his style is so interesting."
Deep Learning With Sentiment Inference For Discourse-Oriented Opinion Analysis
Opinions are omnipresent in written and spoken text ranging from editorials, reviews, blogs, guides, and informal conversations to written and broadcast news. However, past research in NLP has mainly addressed explicit opinion expressions, ignoring implicit opinions. As a result, research in opinion analysis has plateaued at a somewhat superficial level, providing methods that only recognize what is explicitly said and do not understand what is implied.
In this dissertation, we develop machine learning models for two tasks that presumably support propagation of sentiment in discourse, beyond one sentence. The first task we address is opinion role labeling, i.e., the task of detecting who expressed a given attitude toward what or whom. The second task is abstract anaphora resolution, i.e., the task of finding a (typically) non-nominal antecedent of pronouns and noun phrases that refer to abstract objects like facts, events, actions, or situations in the preceding discourse.
We propose a neural model for labeling of opinion holders and targets and circumvent the problems that arise from the limited labeled data. In particular, we extend the baseline model with different multi-task learning frameworks. We obtain clear performance improvements using semantic role labeling as the auxiliary task. We conduct a thorough analysis to demonstrate how multi-task learning helps, what has been solved for the task, and what is next. We show that future developments should improve the ability of the models to capture long-range dependencies and consider other auxiliary tasks such as dependency parsing or recognizing textual entailment. We emphasize that future improvements can be measured more reliably if opinion expressions with missing roles are curated and if the evaluation considers all mentions in opinion role coreference chains as well as discontinuous roles.
To the best of our knowledge, we propose the first abstract anaphora resolution model that handles the unrestricted phenomenon in a realistic setting.
We cast abstract anaphora resolution as the task of learning attributes of the relation that holds between the sentence with the abstract anaphor and its antecedent. We propose a Mention-Ranking siamese-LSTM model (MR-LSTM) for learning what characterizes the mentioned relation in a data-driven fashion. The current resources for abstract anaphora resolution are quite limited. However, we can train our models without conventional data for abstract anaphora resolution. In particular, we can train our models on many instances of antecedent-anaphoric sentence pairs. Such pairs can be automatically extracted from parsed corpora by searching for a common construction which consists of a verb with an embedded sentence (complement or adverbial), applying a simple transformation that replaces the embedded sentence with an abstract anaphor, and using the cut-off embedded sentence as the antecedent. We refer to the extracted data as silver data.
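The silver-data extraction step above amounts to a single transformation over a parsed sentence. In the sketch below the example sentence, the clause span, and the choice of "this" as the inserted anaphor are illustrative; in practice the embedded clause is located automatically in a parsed corpus.

```python
def make_silver_pair(tokens, clause_start, clause_end, anaphor="this"):
    """Replace the embedded clause with an abstract anaphor; the cut-off
    clause becomes the antecedent of the resulting anaphoric sentence."""
    antecedent = tokens[clause_start:clause_end]
    anaphoric_sentence = tokens[:clause_start] + [anaphor] + tokens[clause_end:]
    return anaphoric_sentence, antecedent

tokens = "she said that the talks had failed".split()
sent, ante = make_silver_pair(tokens, 2, 7)
# sent -> ['she', 'said', 'this']
# ante -> ['that', 'the', 'talks', 'had', 'failed']
```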
We evaluate our MR-LSTM models in a realistic task setup in which models need to rank embedded sentences and verb phrases from the sentence with the anaphor as well as a few preceding sentences. We report the first benchmark results on an abstract anaphora subset of the ARRAU corpus (Uryupina et al., 2016), which presents a greater challenge due to a mixture of nominal and pronominal anaphors as well as a greater range of confounders. We also use two additional evaluation datasets: a subset of the CoNLL-12 shared task dataset (Pradhan et al., 2012) and a subset of the ASN corpus (Kolhatkar et al., 2013). We show that our MR-LSTM models outperform the baselines in all evaluation datasets, except for events in the CoNLL-12 dataset. We conclude that training on the small-scale gold data works well if we encounter the same type of anaphors at evaluation time. However, the gold training data contains only six shell nouns and events, and thus resolution of anaphors in the ARRAU corpus, which covers a variety of anaphor types, benefits from the silver data. Our MR-LSTM models for resolution of abstract anaphors outperform the prior work for shell noun resolution (Kolhatkar et al., 2013) in their restricted task setup. Finally, we try to get the best out of the gold and silver training data by mixing them. Moreover, we speculate that we could improve the training on a mixture if we: (i) handle artifacts in the silver data with adversarial training and (ii) use multi-task learning to enable our models to make ranking decisions dependent on the type of anaphor. These proposals give us mixed results, and hence a robust mixed training strategy remains a challenge.
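As a purely illustrative stand-in for the mention-ranking setup (the actual MR-LSTM learns a relation representation with siamese LSTMs), the toy ranker below scores each candidate antecedent against the anaphoric sentence with cosine similarity over bag-of-words counts and returns the candidates best-first; all sentences are invented.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two Counter-based bag-of-words vectors."""
    num = sum(a[k] * b[k] for k in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def rank_antecedents(anaphor_sent, candidates):
    """Sort candidate antecedents best-first by similarity to the anaphoric sentence."""
    a = Counter(anaphor_sent)
    return sorted(candidates, key=lambda c: cosine(a, Counter(c)), reverse=True)

anaphor_sent = ["this", "surprised", "the", "ministers"]
candidates = [["a", "cat", "slept"], ["the", "talks", "had", "failed"]]
best = rank_antecedents(anaphor_sent, candidates)[0]
```

A learned model replaces the fixed cosine score with a trained scoring function, which is what allows it to resolve anaphors whose antecedents share no surface words with the anaphoric sentence.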
Composition in distributional models of semantics
Distributional models of semantics have proven themselves invaluable both in cognitive modelling of semantic phenomena and also in practical applications. For example, they have been used to model judgments of semantic similarity (McDonald, 2000) and association (Denhière and Lemaire, 2004; Griffiths et al., 2007) and have been shown to achieve human-level performance on synonymy tests (Landauer and Dumais, 1997; Griffiths et al., 2007) such as those included in the Test of English as a Foreign Language (TOEFL). This ability has been put to practical use in automatic thesaurus extraction (Grefenstette, 1994). However, while there has been a considerable amount of research directed at the most effective ways of constructing representations for individual words, the representation of larger constructions, e.g., phrases and sentences, has received relatively little attention. In this thesis we examine the issue of how to compose meanings within distributional models of semantics to form representations of multi-word structures.
Natural language data typically consists of such complex structures, rather than just individual isolated words. Thus, a model of composition, in which individual word meanings are combined into phrases and phrases combine to form sentences, is of central importance in modelling this data. Commonly, however, distributional representations are combined in terms of addition (Landauer and Dumais, 1997; Foltz et al., 1998), without any empirical evaluation of alternative choices. Constructing effective distributional representations of phrases and sentences requires that we have both a theoretical foundation to direct the development of models of composition and also a means of empirically evaluating those models.
The approach we take is to first consider the general properties of semantic composition and from that basis define a comprehensive framework in which to consider the composition of distributional representations. The framework subsumes existing proposals, such as addition and tensor products, but also allows us to define novel composition functions. We then show that the effectiveness of these models can be evaluated on three empirical tasks.
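The composition functions named above can be written down directly; the toy vectors below are invented for illustration, not derived from a corpus.

```python
def add(u, v):
    """Additive composition."""
    return [a + b for a, b in zip(u, v)]

def multiply(u, v):
    """Element-wise (simple multiplicative) composition."""
    return [a * b for a, b in zip(u, v)]

def tensor(u, v):
    """Tensor (outer) product: a len(u) x len(v) matrix."""
    return [[a * b for b in v] for a in u]

u = [1.0, 0.5, 0.0]   # toy co-occurrence vector, e.g. for "horse"
v = [0.5, 1.0, 2.0]   # toy co-occurrence vector, e.g. for "run"
```

Note that addition and multiplication keep the composed representation in the same space as the word vectors, while the tensor product grows quadratically with phrase length, which is one practical motivation for comparing the simpler functions empirically.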
The first of these tasks involves modelling similarity judgements for short phrases gathered in human experiments. Distributional representations of individual words are commonly evaluated on tasks based on their ability to model semantic similarity relations, e.g., synonymy or priming. Thus, it seems appropriate to evaluate phrase representations in a similar manner. We then apply compositional models to language modelling, demonstrating that the issue of composition has practical consequences, and also providing an evaluation based on large amounts of natural data. In our third task, we use these language models in an analysis of reading times from an eye-movement study. This allows us to investigate the relationship between the composition of distributional representations and the processes involved in comprehending phrases and sentences.
We find that these tasks do indeed allow us to evaluate and differentiate the proposed composition functions and that the results show a reasonable consistency across tasks. In particular, a simple multiplicative model is best for a semantic space based on word co-occurrence, whereas an additive model is better for the topic-based model we consider. More generally, employing compositional models to construct representations of multi-word structures typically yields improvements in performance over non-compositional models, which only represent individual words.