617 research outputs found

    Low- and high-resource opinion summarization

    Customer reviews play a vital role in the online purchasing decisions we make. The reviews express user opinions that are useful for setting realistic expectations and uncovering important details about products. However, some products receive hundreds or even thousands of reviews, making them time-consuming to read. Moreover, many reviews contain uninformative content, such as irrelevant personal experiences. Automatic summarization offers an alternative – short text summaries capturing the essential information expressed in reviews. Automatically produced summaries can reflect overall or particular opinions and be tailored to user preferences. Besides being presented on major e-commerce platforms, they can also be vocalized by home assistants. This approach can improve user satisfaction by assisting in making faster and better decisions. Modern summarization approaches are based on neural networks, often requiring thousands of annotated samples for training. However, human-written summaries for products are expensive to produce because annotators need to read many reviews. This has led to annotated data scarcity where only a few datasets are available. Data scarcity is the central theme of our work, and we propose a number of approaches to alleviate the problem. The thesis consists of two parts where we discuss low- and high-resource data settings. In the first part, we propose self-supervised learning methods applied to customer reviews and few-shot methods for learning from small annotated datasets. Customer reviews without summaries are available in large quantities, contain a breadth of in-domain specifics, and provide a powerful training signal. We show that reviews can be used for learning summarizers via a self-supervised objective. Further, we address two main challenges associated with learning from small annotated datasets. First, large models rapidly overfit on small datasets, leading to poor generalization. Second, it is not possible to learn a wide range of in-domain specifics (e.g., product aspects and usage) from a handful of gold samples. This leads to subtle semantic mistakes in generated summaries, such as ‘great dead on arrival battery.’ We address the first challenge by explicitly modeling summary properties (e.g., content coverage and sentiment alignment). Furthermore, we leverage small modules – adapters – that are more robust to overfitting. As we show, despite their size, these modules can be used to store in-domain knowledge to reduce semantic mistakes. Lastly, we propose a simple method for learning personalized summarizers based on aspects, such as ‘price,’ ‘battery life,’ and ‘resolution.’ This task is harder to learn, and we present a few-shot method for training a query-based summarizer on small annotated datasets. In the second part, we focus on the high-resource setting and present a large dataset with summaries collected from various online resources. The dataset has more than 33,000 human-written summaries, each linked to up to thousands of reviews. This, however, makes it challenging to apply an ‘expensive’ deep encoder due to memory and computational costs. To address this problem, we propose selecting small subsets of informative reviews. Only these subsets are encoded by the deep encoder and subsequently summarized. We show that the selector and summarizer can be trained end-to-end via amortized inference and policy gradient methods.

    Tracking the Structure and Sentiment of Vaccination Discussions on Mumsnet

    Vaccination is one of the most impactful healthcare interventions in terms of lives saved at a given cost, leading the anti-vaccination movement to be identified as one of the top 10 threats to global health in 2019 by the World Health Organization. This issue increased in importance during the COVID-19 pandemic where, despite good overall adherence to vaccination, specific communities still showed high rates of refusal. Online social media has been identified as a breeding ground for anti-vaccination discussions. In this work, we study how vaccination discussions are conducted in the discussion forum of Mumsnet, a United Kingdom-based website aimed at parents. By representing vaccination discussions as networks of social interactions, we can apply techniques from network analysis to characterize these discussions, namely network comparison, a task aimed at quantifying similarities and differences between networks. Using network comparison based on graphlets -- small connected network subgraphs -- we show how the topological structure of vaccination discussions on Mumsnet differs over time, in particular before and after COVID-19. We also perform sentiment analysis on the content of the discussions and show how the sentiment towards vaccinations changes over time. Our results highlight an association between differences in network structure and changes to sentiment, demonstrating how network comparison can be used as a tool to guide and enhance the conclusions from sentiment analysis.

    Subgroup discovery for structured target concepts

    The main object of study in this thesis is subgroup discovery, a theoretical framework for finding subgroups in data (i.e., named sub-populations) whose behaviour with respect to a specified target concept is exceptional when compared to the rest of the dataset. This is a powerful tool that conveys crucial information to a human audience, but despite past advances has been limited to simple target concepts. In this work we propose algorithms that bring this framework to novel application domains. We introduce the concept of representative subgroups, which we use not only to ensure the fairness of a sub-population with regard to a sensitive trait, such as race or gender, but also to go beyond known trends in the data. For entities with additional relational information that can be encoded as a graph, we introduce a novel measure of robust connectedness which improves on established alternative measures of density; we then provide a method that uses this measure to discover which named sub-populations are better connected. Our contributions within subgroup discovery culminate in the introduction of kernelised subgroup discovery: a novel framework that enables the discovery of subgroups on i.i.d. target concepts with virtually any kind of structure. Importantly, our framework additionally provides a concrete and efficient tool that works out-of-the-box without any modification, apart from specifying the Gramian of a positive definite kernel. For use within kernelised subgroup discovery, but also with any other kind of kernel method, we additionally introduce a novel random walk graph kernel. Our kernel allows the fine-tuning of the alignment between the vertices of the two compared graphs during the count of the random walks, and we also propose meaningful structure-aware vertex labels to utilise this new capability. 
With these contributions we thoroughly extend the applicability of subgroup discovery and ultimately re-define it as a kernel method.

    Detecting Team Conflict From Multiparty Dialogue

    The emergence of online collaboration platforms has dramatically changed the dynamics of human teamwork, creating a veritable army of virtual teams composed of workers in different physical locations. The global world requires a tremendous amount of collaborative problem solving, primarily virtual, making it an excellent domain for computer scientists and team cognition researchers who seek to understand the dynamics involved in collaborative tasks to provide a solution that can support effective collaboration. Mining and analyzing data from collaborative dialogues can yield insights into virtual teams' thought processes and help develop virtual agents to support collaboration. Good communication is indubitably the foundation of effective collaboration. Over time, teams develop their own communication styles and often exhibit entrainment, a conversational phenomenon in which humans synchronize their linguistic choices. This dissertation presents several technical innovations in the usage of machine learning towards analyzing, monitoring, and predicting collaboration success from multiparty dialogue by successfully handling the problems of resource scarcity and natural distribution shifts. First, we examine the problem of predicting team performance from embeddings learned from multiparty dialogues such that teams with similar conflict scores lie close to one another in vector space. We extract the embeddings from three types of features: 1) dialogue acts, 2) sentiment polarity, and 3) syntactic entrainment. Although all of these features can be used to predict team performance effectively, their utility varies by the teamwork phase. We separate the dialogues of players playing a cooperative game into stages: 1) early (knowledge building), 2) middle (problem-solving), and 3) late (culmination). Unlike syntactic entrainment, both dialogue act and sentiment embeddings effectively classify team performance, even during the initial phase. 
Second, we address the problem of learning generalizable models of collaboration. Machine learning models often suffer from domain shifts; one advantage of encoding the semantic features is their adaptability across multiple domains. We evaluate the generalizability of different embeddings to other goal-oriented teamwork dialogues. Finally, in addition to identifying the features predictive of successful collaboration, we propose multi-feature embedding (MFeEmb) to improve the generalizability of collaborative task success prediction models under natural distribution shifts and resource scarcity. MFeEmb leverages the strengths of semantic, structural, and textual features of the dialogues by incorporating the most meaningful information from dialogue acts (DAs), sentiment polarities, and vocabulary of the dialogues. To further enhance the performance of MFeEmb under a resource-scarce scenario, we employ synthetic data generation and few-shot learning. We use the method proposed by Bailey and Chopra (2018) for few-shot learning from the FsText Python library. We replaced the universal embedding with our proposed multi-feature embedding to compare the performance of the two. For data augmentation, we propose using synonym replacement from collaborative dialogue vocabulary instead of synonym replacement from WordNet. The research was conducted on several multiparty dialogue datasets, including ASIST, SwDA, Hate Speech, Diplomacy, Military, SAMSum, AMI, and GitHub. Results show that the proposed multi-feature embedding is an excellent choice for the meta-training stage of few-shot learning, even when it learns from a training set as small as 62 samples. Also, our proposed data augmentation method showed significant performance improvement. 
Our research has potential ramifications for the development of conversational agents that facilitate teaming, as well as for the creation of more effective social coding platforms to better support teamwork between software engineers.

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding site comparison tools were developed, competing with the state-of-the-art over benchmark datasets. The models have the ability to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.

    Investigating compositional visual knowledge through challenging visual tasks

    Human vision manifests remarkable robustness in recognizing objects from a visual world filled with a chaotic, dynamic assortment of information. Computationally, our visual system is challenged by the enormous variability in two-dimensional projected images as a function of viewpoint, lighting, material, articulation as well as occlusion. Much past research has investigated the underlying representations and computational principles that support human vision robustness with controlled and simplified visual stimuli. Nevertheless, the generality of these findings was unclear until tested on more challenging and more naturalistic stimuli. In this thesis, I study human vision robustness with several challenging visual tasks and more naturalistic stimuli, including the recognition of occluded objects and the recognition of non-rigid human bodies from natural images of scenes. I use psychophysics, functional magnetic resonance imaging as well as computational modeling approaches to measure human vision robustness and examine the hierarchical, compositional framework as the underlying principle where the representation of the whole is composed of the representation of its parts through different hierarchies. I show that human vision has impressive abilities to recognize heavily occluded natural objects, and that human behavioral performance is better explained by compositional models than by standard deep convolutional neural networks. In addition, I show that human vision can rapidly and robustly extract information about spatial relationships between human body parts and discriminate three-dimensional non-rigid human poses even from a mere glance. Lastly, I show that there exists a distributed cortical network that encodes compositional pose representations with different view invariance and depth sensitivity, and the difference in these neural representations might be driven by the diversity of the supported behavior tasks. 
Taken together, this thesis demonstrates that human vision manifests great robustness even in these challenging visual tasks, and that the hierarchical, compositional framework may be one of the underlying principles supporting such robustness.

    Measuring the impact of COVID-19 on hospital care pathways

    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns of monthly mean values of length of stay nor conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.

    Entity Linking for the Biomedical Domain

    Entity linking is the process of detecting mentions of different concepts in text documents and linking them to canonical entities in a target lexicon. However, one of the biggest issues in entity linking is the ambiguity in entity names. Ambiguity is an issue that many text mining tools have yet to address, since different names can represent the same thing and every mention could indicate a different thing. For instance, search engines that rely on heuristic string matches frequently return irrelevant results, because they are unable to satisfactorily resolve ambiguity. Thus, resolving named entity ambiguity is a crucial step in entity linking. To solve the problem of ambiguity, this work proposes a heuristic method for entity recognition and entity linking over the biomedical knowledge graph concerning the semantic similarity of entities in the knowledge graph. Named entity recognition (NER), relation extraction (RE), and relationship linking (RL) make up a conventional entity linking (EL) system pipeline. We have used the accuracy metric in this thesis. Therefore, for each identified relation or entity, the solution comprises identifying the correct one and matching it to its corresponding unique CUI in the knowledge base. Because KBs contain a substantial number of relations and entities, each with only one natural language label, the second phase is directly dependent on the accuracy of the first. The framework developed in this thesis enables the extraction of relations and entities from the text and their mapping to the associated CUI in the UMLS knowledge base. This approach derives a new representation of the knowledge base that lends itself to easy comparison. Our idea to select the best candidates is to build a graph of relations and determine the shortest path distance using a ranking approach. 
We test our suggested approach on two well-known benchmarks in the biomedical field and show that our method outperforms the search engine's top result, improving accuracy by around 4%. In general, when it comes to fine-tuning, we notice that entity linking contains subjective characteristics and modifications may be required depending on the task at hand. The performance of the framework is evaluated based on a Python implementation.

    Learning representations for effective and explainable software bug detection and fixing

    Software has an integral role in modern life; hence software bugs, which undermine software quality and reliability, have substantial societal and economic implications. The advent of machine learning and deep learning in software engineering has led to major advances in bug detection and fixing approaches, yet they fall short of desired precision and recall. This shortfall arises from the absence of a 'bridge,' known as learning code representations, that can transform information from source code into a suitable representation for effective processing via machine and deep learning. This dissertation builds such a bridge. Specifically, it presents solutions for effectively learning code representations using four distinct methods (context-based, testing results-based, tree-based, and graph-based), thus improving bug detection and fixing approaches, as well as providing developers with insight into the foundational reasoning. The experimental results demonstrate that using learning code representations can significantly enhance explainable bug detection and fixing, showcasing the practicability and meaningfulness of the approaches formulated in this dissertation toward improving software quality and reliability.