1,989 research outputs found

    Explainability for Large Language Models: A Survey

    Full text link
    Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations, and discuss how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional machine learning models

    SWAT: A System for Detecting Salient Wikipedia Entities in Texts

    Full text link
    We study the problem of entity salience by proposing the design and implementation of SWAT, a system that identifies the salient Wikipedia entities occurring in an input document. SWAT consists of several modules that are able to detect and classify on-the-fly Wikipedia entities as salient or not, based on a large number of syntactic, semantic and latent features properly extracted via a supervised process which has been trained over millions of examples drawn from the New York Times corpus. The validation process is performed through a large experimental assessment, eventually showing that SWAT improves known solutions over all publicly available datasets. We release SWAT via an API that we describe and comment in the paper in order to ease its use in other software

    All mixed up? Finding the optimal feature set for general readability prediction and its application to English and Dutch

    Get PDF
    Readability research has a long and rich tradition, but there has been too little focus on general readability prediction without targeting a specific audience or text genre. Moreover, though NLP-inspired research has focused on adding more complex readability features there is still no consensus on which features contribute most to the prediction. In this article, we investigate in close detail the feasibility of constructing a readability prediction system for English and Dutch generic text using supervised machine learning. Based on readability assessments by both experts and a crowd, we implement different types of text characteristics ranging from easy-to-compute superficial text characteristics to features requiring a deep linguistic processing, resulting in ten different feature groups. Both a regression and classification setup are investigated reflecting the two possible readability prediction tasks: scoring individual texts or comparing two texts. We show that going beyond correlation calculations for readability optimization using a wrapper-based genetic algorithm optimization approach is a promising task which provides considerable insights in which feature combinations contribute to the overall readability prediction. Since we also have gold standard information available for those features requiring deep processing we are able to investigate the true upper bound of our Dutch system. Interestingly, we will observe that the performance of our fully-automatic readability prediction pipeline is on par with the pipeline using golden deep syntactic and semantic information

    State-of-the-art generalisation research in NLP: a taxonomy and review

    Get PDF
    The ability to generalise well is one of the primary desiderata of natural language processing (NLP). Yet, what `good generalisation' entails and how it should be evaluated is not well understood, nor are there any common standards to evaluate it. In this paper, we aim to lay the ground-work to improve both of these issues. We present a taxonomy for characterising and understanding generalisation research in NLP, we use that taxonomy to present a comprehensive map of published generalisation studies, and we make recommendations for which areas might deserve attention in the future. Our taxonomy is based on an extensive literature review of generalisation research, and contains five axes along which studies can differ: their main motivation, the type of generalisation they aim to solve, the type of data shift they consider, the source by which this data shift is obtained, and the locus of the shift within the modelling pipeline. We use our taxonomy to classify over 400 previous papers that test generalisation, for a total of more than 600 individual experiments. Considering the results of this review, we present an in-depth analysis of the current state of generalisation research in NLP, and make recommendations for the future. Along with this paper, we release a webpage where the results of our review can be dynamically explored, and which we intend to up-date as new NLP generalisation studies are published. With this work, we aim to make steps towards making state-of-the-art generalisation testing the new status quo in NLP.Comment: 35 pages of content + 53 pages of reference

    The educational research-practice interface revisited

    Get PDF
    The question of how the realms of research and practice might successfully relate to one another is a persisting one, and especially so in education. The article takes a fresh look at this issue by using the terminology of collaboration scripts to reflect upon various forms of this relationship. Under this perspective, several approaches towards bridging the research/ practice gap are being described with regard to the type and closeness of interaction between the two realms. As different focuses and blind spots become discernible, the issue is raised concerning which 'script' might be appropriate depending upon the starting conditions of research interacting with practice

    Biomedical Event Extraction with Machine Learning

    Get PDF
    Biomedical natural language processing (BioNLP) is a subfield of natural language processing, an area of computational linguistics concerned with developing programs that work with natural language: written texts and speech. Biomedical relation extraction concerns the detection of semantic relations such as protein--protein interactions (PPI) from scientific texts. The aim is to enhance information retrieval by detecting relations between concepts, not just individual concepts as with a keyword search. In recent years, events have been proposed as a more detailed alternative for simple pairwise PPI relations. Events provide a systematic, structural representation for annotating the content of natural language texts. Events are characterized by annotated trigger words, directed and typed arguments and the ability to nest other events. For example, the sentence ``Protein A causes protein B to bind protein C&#39;&#39; can be annotated with the nested event structure CAUSE(A, BIND(B, C)). Converted to such formal representations, the information of natural language texts can be used by computational applications. Biomedical event annotations were introduced by the BioInfer and GENIA corpora, and event extraction was popularized by the BioNLP&#39;09 Shared Task on Event Extraction. In this thesis we present a method for automated event extraction, implemented as the Turku Event Extraction System (TEES). A unified graph format is defined for representing event annotations and the problem of extracting complex event structures is decomposed into a number of independent classification tasks. These classification tasks are solved using SVM and RLS classifiers, utilizing rich feature representations built from full dependency parsing.&nbsp; Building on earlier work on pairwise relation extraction and using a generalized graph representation, the resulting TEES system is capable of detecting binary relations as well as complex event structures. We show that this event extraction system has good performance, reaching the first place in the BioNLP&#39;09 Shared Task on Event Extraction. Subsequently, TEES has achieved several first ranks in the BioNLP&#39;11 and BioNLP&#39;13 Shared Tasks, as well as shown competitive performance in the binary relation Drug-Drug Interaction Extraction 2011 and 2013 shared tasks. The Turku Event Extraction System is published as a freely available open-source project, documenting the research in detail as well as making the method available for practical applications. In particular, in this thesis we describe the application of the event extraction method to PubMed-scale text mining, showing how the developed approach not only shows good performance, but is generalizable and applicable to large-scale real-world text mining projects. Finally, we discuss related literature, summarize the contributions of the work and present some thoughts on future directions for biomedical event extraction. This thesis includes and builds on six original research publications. The first of these introduces the analysis of dependency parses that leads to development of TEES. The entries in the three BioNLP Shared Tasks, as well as in the DDIExtraction 2011 task are covered in four publications, and the sixth one demonstrates the application of the system to PubMed-scale text mining.</p

    Data discovery through profile-based similarity metrics

    Get PDF
    The most essential step in a data integration process is to find the datasets whose combined information provides relevant insights. This task, defined as data discovery, is highly dependent on the definition of the similarity between the candidate attributes to join, which commonly involves assessing the closeness of the semantic concepts that the two attributes represent. Most of the state-of-the-art approaches to this issue rely on syntactic methodologies, that is, procedures in which the instances of the two columns are compared to determine whether they are similar or not. These approaches suffice when the two sets of instances share the same syntactic representation but fail to detect cases in which the same semantic idea is represented by different sets of values. This latter case is ever-increasing in proportion, given the characteristics of big-data environments and the lack of standardization of the data. The aim of this project is to develop a system that can solve this issue and facilitate the establishment of relationships between related data that do not share a syntactic relationship. The approach presented in this work leverages the extensively studied syntactic methodologies to data discovery in conjunction with a new formulation for semantic similarity: the resemblance of probability distributions. Additionally, this system will be made scalable and able to handle vast quantities of data

    Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations

    Full text link
    Identifying academic plagiarism is a pressing task for educational and research institutions, publishers, and funding agencies. Current plagiarism detection systems reliably find instances of copied and moderately reworded text. However, reliably detecting concealed plagiarism, such as strong paraphrases, translations, and the reuse of nontextual content and ideas is an open research problem. In this paper, we extend our prior research on analyzing mathematical content and academic citations. Both are promising approaches for improving the detection of concealed academic plagiarism primarily in Science, Technology, Engineering and Mathematics (STEM). We make the following contributions: i) We present a two-stage detection process that combines similarity assessments of mathematical content, academic citations, and text. ii) We introduce new similarity measures that consider the order of mathematical features and outperform the measures in our prior research. iii) We compare the effectiveness of the math-based, citation-based, and text-based detection approaches using confirmed cases of academic plagiarism. iv) We demonstrate that the combined analysis of math-based and citation-based content features allows identifying potentially suspicious cases in a collection of 102K STEM documents. Overall, we show that analyzing the similarity of mathematical content and academic citations is a striking supplement for conventional text-based detection approaches for academic literature in the STEM disciplines.Comment: Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL) 2019. The data and code of our study are openly available at https://purl.org/hybridP
    corecore