227 research outputs found

    Aggregating and Analysing Opinions for Argument-based Relations

    We present measurements of hadronic resonance, strange and multi-strange particle production in collisions of Xe-Xe and Pb-Pb at the center-of-mass energies of √s_NN = 5.44 and 5.02 TeV, respectively, by the ALICE collaboration at the LHC. Particle ratios are presented as a function of multiplicity for K0S, Λ, Ξ−, Ξ̄+, Ω−, Ω̄+, ρ(770)0, K∗(892)0, φ(1020) and Λ(1520). Our results are discussed and compared with predictions of QCD-inspired event generators. Additionally, comparisons with lower energy measurements and smaller systems are also presented.

    A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Directions

    The powerful ability of large language models (LLMs) to understand, follow, and generate complex language has caused LLM-generated text to flood many areas of our daily lives at an incredible speed, and such text is widely accepted by humans. As LLMs continue to expand, there is an imperative need to develop detectors that can detect LLM-generated text. This is crucial to mitigate potential misuse of LLMs and safeguard realms like artistic expression and social networks from the harmful influence of LLM-generated content. LLM-generated text detection aims to discern whether a piece of text was produced by an LLM, which is essentially a binary classification task. Detection techniques have witnessed notable advancements recently, propelled by innovations in watermarking techniques, zero-shot methods, fine-tuning LM methods, adversarial learning methods, LLMs as detectors, and human-assisted methods. In this survey, we collate recent research breakthroughs in this area and underscore the pressing need to bolster detector research. We also delve into prevalent datasets, elucidating their limitations and developmental requirements. Furthermore, we analyze various LLM-generated text detection paradigms, shedding light on challenges like out-of-distribution problems, potential attacks, and data ambiguity. Finally, we highlight interesting directions for future research in LLM-generated text detection to advance the implementation of responsible artificial intelligence (AI). Our aim with this survey is to provide a clear and comprehensive introduction for newcomers while also offering seasoned researchers a valuable update in the field of LLM-generated text detection. The useful resources are publicly available at: https://github.com/NLP2CT/LLM-generated-Text-Detection

    Do peers see more in a paper than its authors?

    Recent years have shown a gradual shift in the content of biomedical publications that is freely accessible, from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that in citances (sentences containing citations to that article). We contrast the important points of an article as judged by its authors versus as seen by peers. Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects other citations and time have on the content of citances.

    The Detection of Contradictory Claims in Biomedical Abstracts

    Research claims in the biomedical domain are not always consistent, and may even be contradictory. This thesis explores contradictions between research claims in order to determine whether or not it is possible to develop a solution to automate the detection of such phenomena. Such a solution will help decision-makers, including researchers, to alleviate the effects of contradictory claims on their decisions. This study develops two methodologies to construct corpora of contradictions. The first methodology utilises systematic reviews to construct a manually-annotated corpus of contradictions. The second methodology uses a different approach to construct a corpus of contradictions which does not rely on human annotation. This methodology is proposed to overcome the limitations of the manual annotation approach. Moreover, this thesis proposes a pipeline to detect contradictions in abstracts. The pipeline takes a question and a list of research abstracts which may contain answers to it. The output of the pipeline is a list of sentences extracted from abstracts which answer the question, where each sentence is annotated with an assertion value with respect to the question. Claims which feature opposing assertion values are considered as potentially contradictory claims. The research demonstrates that automating the detection of contradictory claims in research abstracts is a feasible problem

    EMIL: Extracting Meaning from Inconsistent Language

    Developments in formal and computational theories of argumentation reason with inconsistency. Developments in Computational Linguistics extract arguments from large textual corpora. Both developments head in the direction of automated processing and reasoning with inconsistent, linguistic knowledge so as to explain and justify arguments in a humanly accessible form. Yet, there is a gap between the coarse-grained, semi-structured knowledge-bases of computational theories of argumentation and fine-grained, highly-structured inferences from knowledge-bases derived from natural language. We identify several subproblems which must be addressed in order to bridge the gap. We provide a direct semantics for argumentation. It has attractive properties in terms of expressivity and complexity, enables reasoning by cases, and can be more highly structured. For language processing, we work with an existing controlled natural language (CNL), which interfaces with our computational theory of argumentation; the tool processes natural language input, translates it into a form for automated inference engines, outputs argument extensions, then generates natural language statements. The key novel adaptation incorporates the defeasible expression ‘it is usual that’. This is an important, albeit incremental, step to incorporate linguistic expressions of defeasibility. Overall, the novel contribution of the paper is an integrated, end-to-end argumentation system which bridges between automated defeasible reasoning and a natural language interface. Specific novel contributions are the theory of ‘direct semantics’, motivations for our theory, results with respect to the direct semantics, an implementation, experimental results, the tie between the formalisation and the CNL, the introduction into a CNL of a natural language expression of defeasibility, and an ‘engineering’ approach to fine-grained argument analysis.

    A CORPUS-BASED STUDY OF CONNECTORS AND THEMATIC PROGRESSION IN THE ACADEMIC WRITING OF THAI EFL STUDENTS

    The objective of the current study is to compare how Thai EFL writers develop and express their oppositional ideas in arguments and to compare their use of oppositional connectors in arguments to those of published scholars in the field of health science. An investigation of thematic progression patterns was conducted to examine whether a certain connector frequently occurs in a particular type of thematic progression. Classifications of oppositional meaning categories (Izutsu, 2008) and thematic progression patterns (Daneš, 1974) were incorporated as the framework of study. For the purpose of the analysis, two substantially large corpora, the Mahidol University Learner Corpus (MULC, 4.5 million words) and the Scholar Corpus of Health Science (SCHS, 2 million words), were developed by the researcher. Five hundred segments from each corpus (a total of 1,000 segments, approximately 1,000,000 words), containing oppositional connectors and thematic progression, written by 50 Thai EFL graduate students and 50 scholars in health sciences, were analyzed as sample texts. Coding schemes for the analysis were validated and achieved absolute agreement between raters. The British National Corpus (BNC) was used as a reference corpus in a pilot trial, while the Corpus of Contemporary American English (COCA) was referenced in the actual analysis. One-way, two-way and three-way ANOVAs, and the Universidad Autónoma de Madrid (UAM) corpus tool, which provides chi-square statistics, were used for data analyses. Findings revealed that both groups of writers preferred to use concessive connectors to express their oppositional ideas and to use the derived thematic progression pattern to organize their texts (ps < .001). Additionally, although no major differences were found in the use of concessive connectors, the accuracy with which these connectors were used showed that student writers did not use concessive ideas in the same way as scholars and, at times, used them inaccurately. Important findings of differences in the use of oppositional connectors and thematic progression patterns are discussed from the perspectives of cognitive linguistics, cultural influences, and EFL academic writing teaching and instruction. The current study also provides evidence-based recommendations for EFL academic writing curriculum and instructional development.