248 research outputs found

    Deep Learning With Sentiment Inference For Discourse-Oriented Opinion Analysis

    Opinions are omnipresent in written and spoken text ranging from editorials, reviews, blogs, guides, and informal conversations to written and broadcast news. However, past research in NLP has mainly addressed explicit opinion expressions, ignoring implicit opinions. As a result, research in opinion analysis has plateaued at a somewhat superficial level, providing methods that only recognize what is explicitly said and do not understand what is implied. In this dissertation, we develop machine learning models for two tasks that presumably support propagation of sentiment in discourse, beyond one sentence. The first task we address is opinion role labeling, i.e., the task of detecting who expressed a given attitude toward what or whom. The second task is abstract anaphora resolution, i.e., the task of finding a (typically) non-nominal antecedent of pronouns and noun phrases that refer to abstract objects like facts, events, actions, or situations in the preceding discourse. We propose a neural model for labeling of opinion holders and targets and circumvent the problems that arise from the limited labeled data. In particular, we extend the baseline model with different multi-task learning frameworks. We obtain clear performance improvements using semantic role labeling as the auxiliary task. We conduct a thorough analysis to demonstrate how multi-task learning helps, what has been solved for the task, and what is next. We show that future developments should improve the ability of the models to capture long-range dependencies and consider other auxiliary tasks such as dependency parsing or recognizing textual entailment. We emphasize that future improvements can be measured more reliably if opinion expressions with missing roles are curated and if the evaluation considers all mentions in opinion role coreference chains as well as discontinuous roles.
To the best of our knowledge, we propose the first abstract anaphora resolution model that handles the unrestricted phenomenon in a realistic setting. We cast abstract anaphora resolution as the task of learning attributes of the relation that holds between the sentence with the abstract anaphor and its antecedent. We propose a Mention-Ranking siamese-LSTM model (MR-LSTM) for learning what characterizes the mentioned relation in a data-driven fashion. The current resources for abstract anaphora resolution are quite limited. However, we can train our models without conventional data for abstract anaphora resolution. In particular, we can train our models on many instances of antecedent-anaphoric sentence pairs. Such pairs can be automatically extracted from parsed corpora by searching for a common construction which consists of a verb with an embedded sentence (complement or adverbial), applying a simple transformation that replaces the embedded sentence with an abstract anaphor, and using the cut-off embedded sentence as the antecedent. We refer to the extracted data as silver data. We evaluate our MR-LSTM models in a realistic task setup in which models need to rank embedded sentences and verb phrases from the sentence with the anaphor as well as a few preceding sentences. We report the first benchmark results on an abstract anaphora subset of the ARRAU corpus (Uryupina et al., 2016) which presents a greater challenge due to a mixture of nominal and pronominal anaphors as well as a greater range of confounders. We also use two additional evaluation datasets: a subset of the CoNLL-12 shared task dataset (Pradhan et al., 2012) and a subset of the ASN corpus (Kolhatkar et al., 2013). We show that our MR-LSTM models outperform the baselines in all evaluation datasets, except for events in the CoNLL-12 dataset. We conclude that training on the small-scale gold data works well if we encounter the same type of anaphors at evaluation time.
However, the gold training data contains only six shell nouns and events, and thus resolution of anaphors in the ARRAU corpus, which covers a variety of anaphor types, benefits from the silver data. Our MR-LSTM models for resolution of abstract anaphors outperform the prior work for shell noun resolution (Kolhatkar et al., 2013) in their restricted task setup. Finally, we try to get the best out of the gold and silver training data by mixing them. Moreover, we speculate that we could improve the training on a mixture if we: (i) handle artifacts in the silver data with adversarial training and (ii) use multi-task learning to enable our models to make ranking decisions dependent on the type of anaphor. These proposals give us mixed results and hence a robust mixed training strategy remains a challenge.

    Generating referring expressions in a domain of objects and processes

    This thesis presents a collection of algorithms and data structures for the generation of pronouns, anaphoric definite noun phrases, and one-anaphoric phrases. After a close analysis of the particular kinds of referring expressions that appear in a particular domain, that of cookery recipes, the thesis presents an appropriate ontology and a corresponding representation language. This ontology is then integrated into a wider framework for language generation as a whole, whereupon we show how the representation language can be successfully used to produce appropriate referring expressions for a range of complex object types. Amongst the more important ideas explored in the thesis are the following:
    • We introduce the notion of a generalized physical object as a way of representing singular entities, mass entities, and entities which are sets.
    • We adopt the view that planning operators are essentially underspecified events, and use this, in conjunction with a simple model of the hearer, to allow us to determine the appropriate level of detail at which a given plan should be described.
    • We make use of a discourse model that distinguishes local and global focus, and is closely tied to a notion of discourse structure; and we introduce a notion of DISCRIMINATORY POWER as a means of choosing the content of a referring expression.
    • We present a model of the generation of referring expressions that makes use of two levels of intermediate representation, and integrate this model with the use of a linguistically-founded grammar for noun phrases.
    The thesis ends by making some suggestions for further extensions to the work reported here.

    Resolving Other-Anaphora

    Institute for Communicating and Collaborative Systems
    Reference resolution is a major component of any natural language system. In the past 30 years significant progress has been made in coreference resolution. However, there is more anaphora in texts than coreference. I present a computational treatment of other-anaphora, i.e., referential noun phrases (NPs) with non-pronominal heads modified by “other” or “another”: [...] the move is designed to more accurately reflect the value of products and to put steel on more equal footing with other commodities. Such NPs are anaphoric (i.e., they cannot be interpreted in isolation), with an antecedent that may occur in the previous discourse or the speaker’s and hearer’s mutual knowledge. For instance, in the example above, the NP “other commodities” refers to a set of commodities excluding steel, and it can be paraphrased as “commodities other than steel”. Resolving such cases requires first identifying the correct antecedent(s) of the other-anaphors. This task is the major focus of this dissertation. Specifically, the dissertation achieves two goals. First, it describes a procedure by which antecedents of other-anaphors can be found, including constraints and preferences which narrow down the search. Second, it presents several symbolic, machine learning and hybrid resolution algorithms designed specifically for other-anaphora. All the algorithms have been implemented and tested on a corpus of examples from the Wall Street Journal. The major results of this research are the following:
    1. Grammatical salience plays a lesser role in resolving other-anaphors than in resolving pronominal anaphora. Algorithms that solely rely on grammatical features achieved worse results than algorithms that used semantic features as well.
    2. Semantic knowledge (such as “steel is a commodity”) is crucial in resolving other-anaphors. Algorithms that operate solely on semantic features outperformed those that operate on grammatical knowledge.
    3. The quality and relevance of the semantic knowledge base is important to success. WordNet proved insufficient as a source of semantic information for resolving other-anaphora. Algorithms that use the Web as a knowledge base achieved better performance than those using WordNet, because the Web contains domain specific and general world knowledge which is not available from WordNet.
    4. But semantic information by itself is not sufficient to resolve other-anaphors, as it seems to overgenerate, leading to many false positives.
    5. Although semantic information is more useful than grammatical information, only integration of semantic and grammatical knowledge sources can handle the full range of phenomena. The best results were obtained from a combination of semantic and grammatical resources.
    6. A probabilistic framework is best at handling the full spectrum of features, both because it does not require commitment as to the order in which the features should be applied, and because it allows features to be treated as preferences, rather than as absolute constraints.
    7. A full resolution procedure for other-anaphora requires both a probabilistic model and a set of informed heuristics and back-off procedures. Such a hybrid system achieved the best results so far on other-anaphora.

    Optimization issues in machine learning of coreference resolution


    Pattern Based Information Extraction System in Business News Articles

    Business news journals provide a rich resource of business events, which enable domain experts to further understand the spatio-temporal changes that occur among a set of firms and people. However, extracting structured data from a journal resource that is text-based and unstructured is a non-trivial challenge. This project designs and implements a Business Information Extraction System, which combines advanced natural language processing (NLP) tools and knowledge-based extraction patterns to automatically process and extract information about target business events from news journals. The performance evaluation of the proposed system suggests that information extraction (IE) techniques work well on business event extraction and that it is promising to apply the technique to extract more types of business events.
    Master of Science in Information Science