556 research outputs found

    Automatic generation of benchmarks for plagiarism detection tools using grammatical evolution

    Full text link
    This is the author's version of the work. It is posted here for your personal use, not for redistribution. The definitive Version of Record was published in {Source Publication}, http://dx.doi.org/10.1145/1276958.1277388. An extended version of this poster is available at arXiv; see: http://arxiv.org/abs/cs/0703134v4
    Student plagiarism is a major problem in universities worldwide. In this paper, we focus on plagiarism in answers to computer programming assignments, where students mix and/or modify one or more original solutions to obtain counterfeits. Although several software tools have been developed to help with the tedious and time-consuming task of detecting plagiarism, little has been done to assess their quality, because determining the real authorship of the whole submission corpus is practically impossible for graders. In this article we present a Grammatical Evolution technique which generates benchmarks for testing plagiarism detection tools. Given a programming language, our technique generates a set of original solutions to an assignment, together with a set of plagiarisms of the former set which mimic the basic plagiarism techniques performed by students. The authorship of the submission corpus is predefined by the user, providing a basis for the assessment and further comparison of copy-catching tools. We give empirical evidence of the suitability of our approach by studying the behavior of one state-of-the-art detection tool (AC) on four benchmarks coded in APL2, generated with our technique.
    Work supported by grant TSI2005-08255-C07-06 of the Spanish Ministry of Education and Science.
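    To make the benchmark-generation idea concrete, below is a minimal Python sketch of grammatical evolution: an integer genome is mapped through a BNF-style grammar to an "original" program, and a simple identifier-renaming pass mimics one basic student plagiarism technique. The toy grammar, genome length, and renaming map are illustrative assumptions, not the authors' actual APL2 setup.

    ```python
    # Minimal grammatical-evolution sketch: genome -> program via a toy grammar.
    import random

    GRAMMAR = {
        "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"], ["<num>"]],
        "<op>":   [["+"], ["-"], ["*"]],
        "<var>":  [["a"], ["b"], ["c"]],
        "<num>":  [["1"], ["2"], ["3"]],
    }

    def derive(genome, symbol="<expr>", pos=0, depth=0, max_depth=6):
        """Map a genome (list of ints) to a program string via the grammar."""
        if symbol not in GRAMMAR:
            return symbol, pos
        rules = GRAMMAR[symbol]
        # Pick a rule by codon value; force a terminal-leaning rule when too deep.
        choice = rules[genome[pos % len(genome)] % len(rules)] if depth < max_depth else rules[1]
        pos += 1
        out = []
        for sym in choice:
            text, pos = derive(genome, sym, pos, depth + 1, max_depth)
            out.append(text)
        return " ".join(out), pos

    def plagiarize(program, renaming={"a": "x", "b": "y", "c": "z"}):
        """Mimic a basic student plagiarism technique: identifier renaming."""
        return " ".join(renaming.get(tok, tok) for tok in program.split())

    random.seed(0)
    genome = [random.randint(0, 255) for _ in range(32)]
    original, _ = derive(genome)
    print("original:  ", original)
    print("plagiarism:", plagiarize(original))
    ```

    Because the genome fully determines each derived "original", the authorship of every program and of every transformed copy is known in advance, which is exactly what makes such corpora usable as ground truth.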

    Automated scholarly paper review: Technologies and challenges

    Full text link
    Peer review is a widely accepted mechanism for research evaluation, playing a pivotal role in scholarly publishing. However, criticisms have long been leveled at this mechanism, mostly because of its inefficiency and subjectivity. Recent years have seen the application of artificial intelligence (AI) in assisting the peer review process. Nonetheless, with the involvement of humans, such limitations remain inevitable. In this review paper, we propose the concept and pipeline of automated scholarly paper review (ASPR) and review the relevant literature and technologies for achieving a full-scale computerized review process. On the basis of the review and discussion, we conclude that there is already corresponding research and implementation at each stage of ASPR. We further look into the challenges that ASPR faces with existing technologies. The major difficulties lie in imperfect document parsing and representation, inadequate data, defective human-computer interaction, and flawed deep logical reasoning. Moreover, we discuss possible moral and ethical issues and point out future directions for ASPR. In the foreseeable future, ASPR and peer review will coexist in a reinforcing manner before ASPR is able to fully take over the reviewing workload from humans.
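    As a concrete illustration of the staged pipeline the paper envisions, the following is a minimal sketch covering parsing, screening, quality assessment, and review generation. Every stage below is a hypothetical stub with invented heuristics and thresholds, not an existing ASPR system.

    ```python
    # Minimal ASPR-style pipeline sketch: parse -> screen -> assess -> review.
    from dataclasses import dataclass, field

    @dataclass
    class Manuscript:
        title: str
        abstract: str
        body: str
        scores: dict = field(default_factory=dict)

    def parse(raw: str) -> Manuscript:
        """Document parsing: split a raw submission into sections (naive stub)."""
        lines = raw.strip().splitlines()
        return Manuscript(title=lines[0], abstract=lines[1], body="\n".join(lines[2:]))

    def screen(ms: Manuscript) -> bool:
        """Desk screening: trivially reject manuscripts with an empty body."""
        return bool(ms.body.strip())

    def assess(ms: Manuscript) -> Manuscript:
        """Quality assessment: toy heuristics standing in for learned models."""
        ms.scores["novelty"] = min(1.0, len(set(ms.body.split())) / 500)
        ms.scores["clarity"] = 1.0 if len(ms.abstract.split()) < 250 else 0.5
        return ms

    def write_review(ms: Manuscript) -> str:
        """Review generation: template-based summary of the scores."""
        verdict = "accept" if sum(ms.scores.values()) / len(ms.scores) > 0.6 else "revise"
        return f"'{ms.title}': scores={ms.scores}, recommendation={verdict}"

    raw = "A Study\nWe study X.\nMethods ... Results ... Discussion ..."
    ms = parse(raw)
    if screen(ms):
        print(write_review(assess(ms)))
    ```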

    An Inclusive Report on Robust Malware Detection and Analysis for Cross-Version Binary Code Optimizations

    Get PDF
    Numerous practices exist for binary code similarity detection (BCSD), such as control-flow-graph analysis, semantic scrutiny, code obfuscation handling, malware detection and analysis, and vulnerability search. On the basis of professional knowledge, existing solutions often compare particular syntactic aspects retrieved from binary code; they either incur substantial performance overheads or detect inaccurately. Furthermore, few tools are available for comparing cross-version binaries, which may differ not only in syntax but also marginally in semantics. Binary code similarity detection has existed for the past ten years, but the research area has not yet been systematically analysed. This paper presents a comprehensive analysis of existing cross-version binary code optimization techniques along four characteristics: (1) structural analysis, (2) semantic analysis, (3) syntactic analysis, and (4) validation metrics. It helps researchers select the most suitable tool for their binary code analysis needs. Furthermore, the paper outlines the scope of the area along with future research directions.
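    For readers new to the area, the sketch below shows one classic syntactic BCSD baseline: instructions are normalized to their mnemonics, and functions are compared by Jaccard similarity over instruction 3-grams. The hard-coded disassembly is invented toy data; no surveyed tool is being reproduced here.

    ```python
    # Syntactic BCSD baseline: instruction n-gram Jaccard similarity.
    def normalize(insn: str) -> str:
        """Keep the mnemonic, abstract operands to blunt register/offset changes."""
        parts = insn.split(None, 1)
        return parts[0]  # e.g. "mov eax, [rbp-8]" -> "mov"

    def ngrams(insns, n=3):
        toks = [normalize(i) for i in insns]
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    def similarity(f1, f2, n=3):
        a, b = ngrams(f1, n), ngrams(f2, n)
        return len(a & b) / len(a | b) if a | b else 0.0

    # Two hypothetical cross-version disassemblies of the same function.
    v1 = ["push rbp", "mov rbp, rsp", "mov eax, [rbp-8]", "add eax, 1", "pop rbp", "ret"]
    v2 = ["push rbp", "mov rbp, rsp", "mov ecx, [rbp-16]", "add ecx, 2", "pop rbp", "ret"]
    print(f"3-gram similarity: {similarity(v1, v2):.2f}")  # 1.00: operands abstracted away
    ```

    Operand abstraction is what gives such baselines their (limited) robustness to cross-version changes: register allocation and stack offsets shift between compiler versions, while opcode sequences tend to be more stable.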

    A review on the use of large language models as virtual tutors

    Get PDF
    Transformer architectures contribute to managing long-term dependencies in natural language processing, representing one of the most recent changes in the field. These architectures are the basis of the innovative, cutting-edge large language models (LLMs) that have produced a huge buzz in several fields and industrial sectors, among which education stands out. Accordingly, these generative artificial intelligence-based solutions have driven changes in techniques and the evolution of educational methods and content, along with network infrastructure, towards high-quality learning. Given the popularity of LLMs, this review seeks to provide a comprehensive overview of those solutions designed specifically to generate and evaluate educational materials and which involve students and teachers in their design or experimental plan. To the best of our knowledge, this is the first review of educational applications (e.g., student assessment) of LLMs. As expected, the most common role of these systems is as virtual tutors for automatic question generation. Moreover, the most popular models are GPT-3 and BERT. However, due to the continuous launch of new generative models, new works are expected to be published shortly.
    Xunta de Galicia | Ref. ED481B-2021-118; Xunta de Galicia | Ref. ED481B-2022-093; Universidade de Vigo/CISU
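    As a minimal illustration of the most common application the review identifies, automatic question generation, the sketch below shows the prompt-construction pattern such systems typically use; `complete` is a hypothetical stand-in for any LLM call, stubbed here so the example runs offline.

    ```python
    # Prompt-construction sketch for LLM-based quiz question generation.
    def build_prompt(passage: str, n_questions: int = 3, level: str = "undergraduate") -> str:
        return (
            f"You are a virtual tutor. From the passage below, write {n_questions} "
            f"{level}-level quiz questions with answers.\n\nPassage:\n{passage}"
        )

    def complete(prompt: str) -> str:
        """Hypothetical LLM call; a real system would send `prompt` to a model."""
        return "Q1: What problem do transformers address? A1: Long-term dependencies."

    passage = "Transformer architectures manage long-term dependencies in NLP."
    print(complete(build_prompt(passage)))
    ```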

    How should the advent of large language models affect the practice of science?

    Full text link
    Large language models (LLMs) are being increasingly incorporated into scientific workflows. However, we have yet to fully grasp the implications of this integration. How should the advent of large language models affect the practice of science? For this opinion piece, we have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate. Schulz et al. argue that working with LLMs is not fundamentally different from working with human collaborators, while Bender et al. argue that LLMs are often misused and over-hyped, and that their limitations warrant a focus on more specialized, easily interpretable tools. Marelli et al. emphasize the importance of transparent attribution and responsible use of LLMs. Finally, Botvinick and Gershman advocate that humans should retain responsibility for determining the scientific roadmap. To facilitate the discussion, the four perspectives are complemented with a response from each group. By putting these different perspectives in conversation, we aim to bring attention to important considerations within the academic community regarding the adoption of LLMs and their impact on both current and future scientific practices.

    Quantifying the impact of Twitter activity in political battlegrounds

    Get PDF
    When using social media platforms, notably Twitter, to engage the public in advocating a parliamentary act, or during a global health emergency, it can be challenging to determine the reach of the information, how well it corresponds with the domain design, and how to utilize the platform as a communication medium. Chapter 3 offers a broad overview of how candidates running in the 2020 US Elections used Twitter as a communication tool to interact with voters. More precisely, it seeks to identify components related to internal collaboration and public participation (in terms of content and stance similarity among the candidates from the same political front and to the official Twitter accounts of their political parties). The 2020 US Presidential and Vice Presidential candidates from the two main political parties, the Republicans and the Democrats, are our main subjects. Along with content similarity, their tweets were assessed for social reach and stance similarity on 22 topics. This study complements previous research on efficiently using social media platforms for election campaigns. Chapter 4 empirically examines the online social associations of the leaders and healthcare institutions of the top-10 COVID-19 resilient nations according to the Bloomberg COVID-19 Resilience Ranking. In order to measure the strength of the online social association in terms of public engagement, sentiment strength, inclusivity and diversity, we used the attributes provided by the Twitter Academic Research API, coupled with the tweets of leaders and healthcare organizations from these nations. This study makes it easier to understand how leaders and healthcare organizations may utilize Twitter to establish digital connections with the public during health emergencies. The thesis proposes methods for efficiently using Twitter in various domains, drawing on several language models and a range of data mining and analytics techniques.
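    To make the content-similarity component concrete, the sketch below compares accounts by cosine similarity over TF-IDF vectors of their concatenated tweets, a standard baseline rather than necessarily the thesis's exact method; the four accounts and their tweets are invented toy data.

    ```python
    # Content similarity between accounts via TF-IDF + cosine similarity.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    accounts = {
        "candidate_A": "healthcare jobs economy healthcare recovery",
        "candidate_B": "economy jobs taxes small business recovery",
        "party_X":     "healthcare recovery jobs for every family",
        "candidate_C": "climate energy climate justice transition",
    }
    names = list(accounts)
    X = TfidfVectorizer().fit_transform(accounts.values())
    sim = cosine_similarity(X)

    for i, a in enumerate(names):
        for j in range(i + 1, len(names)):
            print(f"{a} vs {names[j]}: {sim[i, j]:.2f}")
    ```

    In a real analysis, each account's document would be the concatenation of thousands of tweets, and intra-party versus inter-party similarity scores would be compared statistically.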

    A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Directions

    Full text link
    The powerful ability to understand, follow, and generate complex language that has emerged from large language models (LLMs) has caused LLM-generated text to flood many areas of our daily lives at incredible speed, and such text is widely accepted by humans. As LLMs continue to expand, there is an imperative need to develop detectors that can identify LLM-generated text. This is crucial to mitigate potential misuse of LLMs and to safeguard realms like artistic expression and social networks from the harmful influence of LLM-generated content. LLM-generated text detection aims to discern whether a piece of text was produced by an LLM, which is essentially a binary classification task. Detector techniques have witnessed notable advancements recently, propelled by innovations in watermarking techniques, zero-shot methods, fine-tuned LM methods, adversarial learning methods, LLMs as detectors, and human-assisted methods. In this survey, we collate recent research breakthroughs in this area and underscore the pressing need to bolster detector research. We also delve into prevalent datasets, elucidating their limitations and developmental requirements. Furthermore, we analyze various LLM-generated text detection paradigms, shedding light on challenges like out-of-distribution problems, potential attacks, and data ambiguity. In conclusion, we highlight interesting directions for future research in LLM-generated text detection to advance the implementation of responsible artificial intelligence (AI). Our aim with this survey is to provide a clear and comprehensive introduction for newcomers while also offering seasoned researchers a valuable update on the field of LLM-generated text detection. Useful resources are publicly available at: https://github.com/NLP2CT/LLM-generated-Text-Detection
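    Of the detector families listed, watermarking is perhaps the easiest to illustrate. The sketch below implements green-list detection in the spirit of Kirchenbauer et al. (2023): the previous token seeds a hash that marks a fraction of candidate tokens "green"; watermarked generators over-sample green tokens, so a detector counts them and computes a z-score. The hashing scheme and toy text are simplifying assumptions.

    ```python
    # Watermark detection sketch: green-token counting with a z-score test.
    import hashlib
    import math

    GAMMA = 0.5  # fraction of the vocabulary marked green at each step

    def is_green(prev_token: str, token: str) -> bool:
        """Pseudo-randomly assign `token` to the green list, seeded by `prev_token`."""
        digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
        return digest[0] < 256 * GAMMA

    def z_score(tokens) -> float:
        """Standardized green-token count; a large z suggests a watermark."""
        hits = sum(is_green(tokens[i - 1], tokens[i]) for i in range(1, len(tokens)))
        n = len(tokens) - 1
        return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

    text = "the model writes fluent text that a detector can test for watermarks"
    print(f"z = {z_score(text.split()):.2f}")  # expected near 0: unwatermarked text
    ```

    Zero-shot methods replace the green-list count with model statistics such as perplexity, but the decision structure is the same binary test over a score.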

    Ontologies for automatic question generation

    Get PDF
    Assessment is an important tool for formal learning, especially in higher education. At present, many universities use online assessment systems where questions are entered manually into a question bank; this kind of system requires the instructor's time and effort to construct questions by hand. The main aim of this thesis is, therefore, to contribute to the investigation of new question generation strategies for short/long answer questions, in order to allow for the development of automatic factual question generation from an ontology for educational assessment purposes. This research is guided by four research questions: (1) How well can an ontology be used for generating factual assessment questions? (2) How can questions be generated from a course ontology? (3) Are the ontological question generation strategies able to generate acceptable assessment questions? and (4) Is topic-based indexing able to improve the feasibility of AQGen? We first conduct ontology validation to evaluate the appropriateness of concept representation using a competency question approach, using revision questions from the textbook to match keywords (in revision questions) against concepts (in the ontology). The results show that only half of the ontology concepts matched the keywords. We investigated the unmatched concepts further, found some incorrect concept naming, and suggest a guideline for appropriate concept naming. At the same time, we introduce validation of the ontology using revision questions as competency questions to check for ontology completeness. Furthermore, we propose 17 short/long answer question templates for 3 question categories, namely definition, concept completion and comparison. In the subsequent part of the thesis, we develop the AQGen tool and evaluate the generated questions. Two Computer Science subjects, namely OS and CNS, are chosen to evaluate the questions AQGen generates. We conduct a questionnaire survey of 17 domain experts to identify their agreement with the acceptability of the generated questions. The experts' agreement on the acceptability measure is favourable, and three of the four QG strategies proposed can generate acceptable questions. Because AQGen has generated thousands of questions across the 3 question categories, it is updated with a question selection step to produce a feasible question set from this tremendous number of generated questions. We suggest topic-based indexing in order to assert knowledge about topic chapters into the ontology representation for question selection; topic indexing shows feasible results for filtering questions by topic. Finally, our results contribute to an understanding of ontology element representation for question generation and of how to automatically generate questions from an ontology for educational assessment.
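    To ground the template idea, the sketch below generates questions from toy ontology triples for the thesis's three categories (definition, concept completion, comparison); the triples and templates are illustrative, not the actual 17 AQGen templates.

    ```python
    # Template-based question generation from ontology triples.
    TRIPLES = [
        ("process", "isDefinedAs", "a program in execution"),
        ("thread", "isDefinedAs", "a unit of CPU scheduling within a process"),
        ("deadlock", "requires", "circular wait"),
    ]

    TEMPLATES = {
        "definition":         "What is a {s}?",
        "concept_completion": "A {s} requires ____. (answer: {o})",
        "comparison":         "What is the difference between a {a} and a {b}?",
    }

    def generate(triples):
        questions = []
        defined = [s for s, p, o in triples if p == "isDefinedAs"]
        for s, p, o in triples:
            if p == "isDefinedAs":
                questions.append(TEMPLATES["definition"].format(s=s))
            elif p == "requires":
                questions.append(TEMPLATES["concept_completion"].format(s=s, o=o))
        # Comparison questions: pair up every two defined concepts.
        for i, a in enumerate(defined):
            for b in defined[i + 1:]:
                questions.append(TEMPLATES["comparison"].format(a=a, b=b))
        return questions

    for q in generate(TRIPLES):
        print(q)
    ```

    A topic-based index, as the thesis suggests, would then tag each triple with its chapter so that question selection can filter the generated pool by topic.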