6 research outputs found

    Deep Learning With Sentiment Inference For Discourse-Oriented Opinion Analysis

    Opinions are omnipresent in written and spoken text ranging from editorials, reviews, blogs, guides, and informal conversations to written and broadcast news. However, past research in NLP has mainly addressed explicit opinion expressions, ignoring implicit opinions. As a result, research in opinion analysis has plateaued at a somewhat superficial level, providing methods that only recognize what is explicitly said and do not understand what is implied. In this dissertation, we develop machine learning models for two tasks that presumably support propagation of sentiment in discourse, beyond one sentence. The first task we address is opinion role labeling, i.e., the task of detecting who expressed a given attitude toward what or whom. The second task is abstract anaphora resolution, i.e., the task of finding a (typically) non-nominal antecedent of pronouns and noun phrases that refer to abstract objects like facts, events, actions, or situations in the preceding discourse. We propose a neural model for labeling opinion holders and targets and circumvent the problems that arise from the limited labeled data. In particular, we extend the baseline model with different multi-task learning frameworks. We obtain clear performance improvements using semantic role labeling as the auxiliary task. We conduct a thorough analysis to demonstrate how multi-task learning helps, what has been solved for the task, and what is next. We show that future developments should improve the ability of the models to capture long-range dependencies and consider other auxiliary tasks such as dependency parsing or recognizing textual entailment. We emphasize that future improvements can be measured more reliably if opinion expressions with missing roles are curated and if the evaluation considers all mentions in opinion role coreference chains as well as discontinuous roles.
To the best of our knowledge, we propose the first abstract anaphora resolution model that handles the unrestricted phenomenon in a realistic setting. We cast abstract anaphora resolution as the task of learning attributes of the relation that holds between the sentence with the abstract anaphor and its antecedent. We propose a Mention-Ranking siamese-LSTM model (MR-LSTM) for learning what characterizes this relation in a data-driven fashion. The current resources for abstract anaphora resolution are quite limited. However, we can train our models without conventional data for abstract anaphora resolution. In particular, we can train our models on many instances of antecedent-anaphoric sentence pairs. Such pairs can be automatically extracted from parsed corpora by searching for a common construction which consists of a verb with an embedded sentence (complement or adverbial), applying a simple transformation that replaces the embedded sentence with an abstract anaphor, and using the cut-off embedded sentence as the antecedent. We refer to the extracted data as silver data. We evaluate our MR-LSTM models in a realistic task setup in which models need to rank embedded sentences and verb phrases from the sentence with the anaphor as well as a few preceding sentences. We report the first benchmark results on an abstract anaphora subset of the ARRAU corpus (Uryupina et al., 2016), which presents a greater challenge due to a mixture of nominal and pronominal anaphors as well as a greater range of confounders. We also use two additional evaluation datasets: a subset of the CoNLL-12 shared task dataset (Pradhan et al., 2012) and a subset of the ASN corpus (Kolhatkar et al., 2013). We show that our MR-LSTM models outperform the baselines on all evaluation datasets, except for events in the CoNLL-12 dataset. We conclude that training on the small-scale gold data works well if we encounter the same type of anaphors at evaluation time.
However, the gold training data contains only six shell nouns and events, and thus resolution of anaphors in the ARRAU corpus, which covers a variety of anaphor types, benefits from the silver data. Our MR-LSTM models for resolution of abstract anaphors outperform the prior work on shell noun resolution (Kolhatkar et al., 2013) in its restricted task setup. Finally, we try to get the best out of the gold and silver training data by mixing them. Moreover, we speculate that we could improve the training on a mixture if we: (i) handle artifacts in the silver data with adversarial training and (ii) use multi-task learning to enable our models to make ranking decisions dependent on the type of anaphor. These proposals give us mixed results, and hence a robust mixed training strategy remains a challenge.
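
The silver-data extraction described in the abstract above (find a verb with an embedded sentence, replace the embedded sentence with an abstract anaphor, keep the cut-off clause as the antecedent) can be illustrated with a minimal sketch. This is not the dissertation's code: the `ParsedSentence` type, the `make_silver_pair` function, the choice of "this" as the anaphor, and the rule of dropping a leading "that" are all assumptions made for illustration, and the clause span is assumed to come from an external parser.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ParsedSentence:
    """Hypothetical container: tokens plus the span of the embedded clause."""
    tokens: List[str]
    # [start, end) token indices of the embedded sentence; assumed to be
    # supplied by a syntactic parser, which is not modelled here.
    clause_span: Tuple[int, int]


# Assumption: a leading complementizer is not part of the antecedent.
COMPLEMENTIZERS = {"that"}


def make_silver_pair(sent: ParsedSentence, anaphor: str = "this") -> Tuple[str, str]:
    """Return (anaphoric sentence, antecedent) for one parsed sentence."""
    start, end = sent.clause_span
    clause = sent.tokens[start:end]
    # Drop a leading complementizer so the antecedent is a bare clause.
    if clause and clause[0].lower() in COMPLEMENTIZERS:
        clause = clause[1:]
    # Replace the whole embedded clause with the abstract anaphor.
    anaphoric = sent.tokens[:start] + [anaphor] + sent.tokens[end:]
    return " ".join(anaphoric), " ".join(clause)


example = ParsedSentence(
    tokens=["She", "said", "that", "the", "market", "would", "recover", "."],
    clause_span=(2, 7),
)
anaphoric_sentence, antecedent = make_silver_pair(example)
print(anaphoric_sentence)
print(antecedent)
```

In a setup like the one the abstract describes, each such pair would serve as a positive training instance for a ranking model, with other clauses from the surrounding context acting as negative candidates.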

    Proteomics and protein activity profiling: an investigation into the salivary proteome and kinase activities in various systems using mass spectrometry

    Protein identification and quantitation using mass spectrometry has become the dominant technique for studying the protein complement of a system: cell, tissue or organism. The proteomics of body fluids is a very active research area as there is great potential for protein biomarker discovery; application of such technologies would revolutionise medical practice and treatment. Because it can be sampled non-invasively, saliva is an ideal body fluid for disease diagnosis, screening and monitoring. Gingivitis is a gum disease with symptoms including bleeding, swollen, and receding gums. After dental decay, gingivitis is estimated to be the most common disease worldwide, and around 40% of the population in the US are reported to have gingivitis. The ultimate goal of this project was to identify salivary biomarkers for gingivitis. This dissertation presents an investigation of: 1) the salivary proteome; 2) developments and applications of a mass spectrometry kinase assay; and 3) salivary biomarkers for gingivitis using proteomics and kinase activities. The soluble portion of the human salivary proteome (saliva supernatant) has been studied by several research groups, but very few proteomic studies have been performed on the insoluble, cellular and bacterial portion of saliva. Presented here is the first global proteomics study performed on the saliva residue and supernatant from the same test subject. A total of 834 and 1426 proteins were identified in the saliva supernatant and residue, respectively. A global analysis of protein complexes in saliva was also performed, the first such analysis to date. KAYAK (‘Kinase ActivitY Assay for Kinome analysis’) was further developed for application to a number of cell types, tissue types, and a variety of organisms.
Proof-of-concept work for in-gel kinase activity/kinase abundance correlation profiling using blue native gels was performed, and experiments using anion exchange chromatographic kinase activity/kinase abundance correlation profiling were performed to identify kinase-substrate pairs. KAYAK applications included the analysis of kinase activities in Saccharomyces cerevisiae, Drosophila, mouse, and human saliva, in which significant kinase activity was detected in the saliva supernatant, a novel finding. Finally, gingivitis was induced in patients, and the saliva samples were analysed using proteomics and kinase activity profiling. Although this work is ongoing, preliminary data indicate that there are increases in various inflammatory proteins, certain bacteria and also in the activity of particular kinases as a result of the induction of gingivitis. The overall study provided insights into the salivary proteome for both the human and bacterial complement, and revealed the presence of significant kinase activity in saliva. In the induced gingivitis study, almost half of all the proteins identified in the residue were from bacteria (1274 bacterial proteins, 198 species identified), and there may be more potential for biomarker discovery for certain diseases in the saliva residue than in the supernatant. A very large overlap was observed between the human proteins in the saliva supernatant and residue, indicating that many of the salivary proteins originate from lysed cells. The origin of the kinase activity in the saliva supernatant is not known but is also proposed to be predominantly lysed cells. A range of novel KAYAK applications have been investigated, demonstrating that KAYAK has a wide variety of future uses ranging from target compound evaluation in pharmaceutical companies to patient testing in the clinic.

    Cyber-Physical Systems of Systems: Foundations – A Conceptual Model and Some Derivations: The AMADEOS Legacy

    Computer Systems Organization and Communication Networks; Software Engineering; Complex Systems; Information Systems Applications (incl. Internet); Computer Application