
    BYOC: Personalized Few-Shot Classification with Co-Authored Class Descriptions

    Text classification is a well-studied and versatile building block for many NLP applications. Yet existing approaches require either large annotated corpora to train a model or, when using large language models as a base, a carefully crafted prompt and a long context that can fit many examples. As a result, it is not possible for end-users to build classifiers for themselves. To address this issue, we propose a novel approach to few-shot text classification using an LLM. Rather than few-shot examples, the LLM is prompted with descriptions of the salient features of each class. These descriptions are co-authored by the user and the LLM interactively: while the user annotates each few-shot example, the LLM asks relevant questions that the user answers. Examples, questions, and answers are summarized to form the classification prompt. Our experiments show that our approach yields high-accuracy classifiers, within 82% of the performance of models trained with significantly larger datasets while using only 1% of their training sets. Additionally, in a study with 30 participants, we show that end-users are able to build classifiers to suit their specific needs. The personalized classifiers show an average accuracy of 90%, which is 15% higher than the state-of-the-art approach. Comment: Accepted at EMNLP 2023 (Findings)
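    The idea of prompting with class descriptions instead of few-shot examples can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_classification_prompt`, the prompt wording, and the two example classes are all hypothetical, and the interactive question-and-answer step that produces the descriptions is assumed to have already happened.

```python
from typing import Dict


def build_classification_prompt(class_descriptions: Dict[str, str], text: str) -> str:
    """Assemble a prompt that lists one description per class (hypothetical wording)."""
    lines = ["Classify the text into exactly one of the following classes.", ""]
    # One bullet per class: the label followed by its salient-feature description.
    for label, description in class_descriptions.items():
        lines.append(f"- {label}: {description}")
    lines.append("")
    lines.append(f"Text: {text}")
    lines.append("Answer with the class name only.")
    return "\n".join(lines)


# Hypothetical class descriptions, standing in for the co-authored summaries.
descriptions = {
    "urgent": "Messages that ask for action today or mention a deadline.",
    "newsletter": "Bulk mailings with no request addressed to the reader.",
}
prompt = build_classification_prompt(descriptions, "Please send the report by 5pm.")
print(prompt)
```

    The resulting string would then be sent to whichever LLM is in use; that call is left out because the abstract does not name a model or an API.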

    Characterizing and Predicting Email Deferral Behavior

    Email triage involves going through unhandled emails and deciding what to do with them. This familiar process can become increasingly challenging as the number of unhandled emails grows. During a triage session, users commonly defer handling emails that they cannot immediately deal with to later. These deferred emails are often related to tasks that are postponed until the user has more time or the right information to deal with them. In this paper, through qualitative interviews and a large-scale log analysis, we study when and what enterprise email users tend to defer. We found that users are more likely to defer emails when handling them involves replying, reading carefully, or clicking on links and attachments. We also learned that the decision to defer emails depends on many factors, such as the user's workload and the importance of the sender. Our qualitative results suggested that deferring is very common, and our quantitative log analysis confirms that 12% of triage sessions and 16% of daily active users had at least one deferred email on weekdays. We also discuss several deferral strategies reported by our interviewees, such as marking emails as unread and flagging, and illustrate how such patterns can also be observed in user logs. Inspired by the characteristics of deferred emails and the contextual factors involved in deciding whether an email should be deferred, we train a classifier for predicting whether a recently triaged email is actually deferred. Our experimental results suggest that deferral can be classified with modest effectiveness. Overall, our work provides novel insights about how users handle their emails and how deferral can be modeled.

    Genres in young learner L2 English writing: A genre typology for the TRAWL (Tracking Written Learner Language) corpus

    In learner corpus research, it is well known that one should control for genre when collecting and analysing written L2 (second language) English data, as genre is one factor that has been shown to account for language variation. This article presents a genre typology for annotating learner texts from the lower secondary level in Norway (ages 13-15, school years 8-10). The data are drawn from TRAWL (Tracking Written Learner Language), a new learner corpus currently under compilation. As the TRAWL corpus will be openly available for research, it is important that the typology is clearly described, which is the primary aim of the present study. Little research has been carried out on younger learners, and no detailed genre typology exists for classifying learner texts at the lower secondary level. Therefore, a genre typology developed by Ørevik (2019) for the upper secondary level was tested on data from TRAWL using a functional, social semiotic perspective and a mixed-methods (quantitative and qualitative) approach. The analysis showed that Ørevik’s typology was largely suitable for annotating the selected TRAWL data and only had to be slightly modified. By highlighting some of the theoretical and methodological challenges with the genre typology, the analysis may inform discussions about genre in L2 English teaching, which was a secondary aim of the present study. Not only do the results mirror the tensions in the international debate within genre research, they also mirror the everyday challenges of lower secondary school teachers/examiners, who seem to adopt an eclectic approach to genre.

    CVE-driven Attack Technique Prediction with Semantic Information Extraction and a Domain-specific Language Model

    This paper addresses a critical challenge in cybersecurity: the gap between vulnerability information represented by Common Vulnerabilities and Exposures (CVEs) and the resulting cyberattack actions. CVEs provide insights into vulnerabilities, but often lack details on potential threat actions (tactics, techniques, and procedures, or TTPs) within the ATT&CK framework. This gap hinders accurate CVE categorization and proactive countermeasure initiation. The paper introduces the TTPpredictor tool, which uses innovative techniques to analyze CVE descriptions and infer plausible TTP attacks resulting from CVE exploitation. TTPpredictor overcomes challenges posed by limited labeled data and semantic disparities between CVE and TTP descriptions. It initially extracts threat actions from unstructured cyber threat reports using Semantic Role Labeling (SRL) techniques. These actions, along with their contextual attributes, are correlated with MITRE's attack functionality classes. This automated correlation facilitates the creation of labeled data, essential for categorizing novel threat actions into threat functionality classes and TTPs. The paper presents an empirical assessment, demonstrating TTPpredictor's effectiveness with accuracy rates of approximately 98% and F1-scores ranging from 95% to 98% in precise CVE classification to ATT&CK techniques. TTPpredictor outperforms state-of-the-art language model tools like ChatGPT. Overall, this paper offers a robust solution for linking CVEs to potential attack techniques, enhancing cybersecurity practitioners' ability to proactively identify and mitigate threats.

    Caring for the patient, caring for the record: an ethnographic study of 'back office' work in upholding quality of care in general practice

    © 2015 Swinglehurst and Greenhalgh; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Additional file 1: Box 1. Field notes on summarising (Clover Surgery). Box 2. Extract of document prepared for GPs by summarisers at Clover Surgery. Box 3. Fieldnotes on coding incoming post, Clover (original notes edited for brevity). This work was funded by a research grant from the UK Medical Research Council (Healthcare Electronic Records in Organisations 07/133) and a National Institute of Health Research doctoral fellowship award for DS (RDA/03/07/076). The funders were not involved in the selection or analysis of data nor did they make any contribution to the content of the final manuscript.

    Delivering Behaviour Change Interventions: Development of a Mode of Delivery Ontology [version 1; peer review: 1 approved, 1 approved with reservations]

    Background: Investigating and improving the effects of behaviour change interventions requires detailed and consistent specification of all aspects of interventions. An important feature of interventions is the way in which they are delivered, i.e. their mode of delivery. This paper describes an ontology for specifying the mode of delivery of interventions, which forms part of the Behaviour Change Intervention Ontology, currently being developed in the Wellcome Trust funded Human Behaviour-Change Project. / Methods: The Mode of Delivery Ontology was developed in an iterative process of annotating behaviour change intervention evaluation reports and consulting with expert stakeholders. It consisted of seven steps: 1) annotation of 110 intervention reports to develop a preliminary classification of modes of delivery; 2) open review from international experts (n=25); 3) second round of annotations with 55 reports to test inter-rater reliability and identify limitations; 4) second round of expert review feedback (n=16); 5) final round of testing of the refined ontology by two annotators familiar and two annotators unfamiliar with the ontology; 6) specification of ontological relationships between entities; and 7) transformation into a machine-readable format using the Web Ontology Language (OWL) and publishing online. / Results: The resulting ontology is a four-level hierarchical structure comprising 65 unique modes of delivery, organised by 15 upper-level classes: Informational, Environmental change, Somatic, Somatic alteration, Individual-based/Pair-based/Group-based, Uni-directional/Interactional, Synchronous/Asynchronous, Push/Pull, Gamification, Arts feature. Relationships between entities consist of is_a. Inter-rater reliability of the Mode of Delivery Ontology for annotating intervention evaluation reports was α=0.80 (very good) for those familiar with the ontology and α=0.58 (acceptable) for those unfamiliar with it. / Conclusion: The ontology can be used for both annotating and writing behaviour change intervention evaluation reports in a consistent and coherent manner, thereby improving evidence comparison, synthesis, replication, and implementation of effective interventions.
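    A hierarchy built only from is_a relationships can be walked with a simple parent lookup. The sketch below is illustrative only and does not reproduce the published ontology: "Informational" is a class name taken from the abstract, while the root name, the two lower-level entity names, the assumption that the root counts as level one, and the helper names `ancestors` and `level` are hypothetical.

```python
from typing import Dict, List, Optional

# child -> parent "is_a" links. Only "Informational" comes from the abstract;
# the root and the two "Example ..." entities are hypothetical placeholders.
IS_A: Dict[str, Optional[str]] = {
    "Mode of delivery": None,  # assumed root, counted here as level 1
    "Informational": "Mode of delivery",
    "Example level-3 mode": "Informational",
    "Example level-4 mode": "Example level-3 mode",
}


def ancestors(entity: str, is_a: Dict[str, Optional[str]] = IS_A) -> List[str]:
    """Return every class reachable by following is_a links upward, nearest first."""
    chain: List[str] = []
    parent = is_a[entity]
    while parent is not None:
        chain.append(parent)
        parent = is_a[parent]
    return chain


def level(entity: str) -> int:
    """Depth of an entity in the hierarchy, with the root at level 1."""
    return len(ancestors(entity)) + 1


print(ancestors("Example level-4 mode"))
print(level("Example level-4 mode"))
```

    Because is_a is transitive, an annotation made with a lower-level entity can also be counted under each of its ancestors when reports are compared.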