556 research outputs found
Automatic generation of benchmarks for plagiarism detection tools using grammatical evolution
This is the author's version of the work. It is posted here for your personal use, not for redistribution. The definitive Version of Record was published in {Source Publication}, http://dx.doi.org/10.1145/1276958.1277388. An extended version of this poster is available at arXiv: http://arxiv.org/abs/cs/0703134v4
Student plagiarism is a major problem in universities worldwide. In this
paper, we focus on plagiarism in answers to computer programming assignments,
where students mix and/or modify one or more original solutions
to obtain counterfeits. Although several software tools have been
developed to help with the tedious and time-consuming task of detecting
plagiarism, little has been done to assess their quality, because determining the
real authorship of the whole submission corpus is practically impossible
for graders. In this article we present a Grammatical Evolution technique
which generates benchmarks for testing plagiarism detection tools. Given
a programming language, our technique generates a set of original solutions
to an assignment, together with a set of plagiarisms of the former
set which mimic the basic plagiarism techniques performed by students.
The authorship of the submission corpus is predefined by the user, providing
a base for the assessment and further comparison of copy-catching
tools. We give empirical evidence of the suitability of our approach by
studying the behavior of one state-of-the-art detection tool (AC) on four
benchmarks coded in APL2, generated with our technique.
Work supported by grant TSI2005-08255-C07-06 of the Spanish Ministry of Education and Science.
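As a hedged illustration of the core mechanism this abstract relies on, the sketch below shows how grammatical evolution maps a genome of integer codons to a program through a BNF grammar; changing a single codon yields a structurally related variant, which is how originals and their "plagiarisms" can be generated with known authorship. The grammar and genomes are invented toy examples, not the APL2 grammar used in the paper.

```python
# Minimal grammatical-evolution mapper: integer codons select productions
# from a BNF grammar during a leftmost derivation.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<var>": [["x"], ["y"], ["z"]],
}

def ge_map(genome, start="<expr>", max_steps=100):
    """Map integer codons to a program string via leftmost derivation."""
    symbols = [start]
    i = 0
    for _ in range(max_steps):
        nt = next((s for s in symbols if s in GRAMMAR), None)
        if nt is None:
            break  # no non-terminals left: derivation complete
        rules = GRAMMAR[nt]
        chosen = rules[genome[i % len(genome)] % len(rules)]  # codon picks rule
        i += 1
        pos = symbols.index(nt)
        symbols[pos:pos + 1] = chosen
    return "".join(symbols)

original = ge_map([0, 1, 0, 1, 1, 2, 1, 0])
variant = ge_map([0, 1, 0, 2, 1, 2, 1, 0])  # operator codon changed
print(original, variant)  # x-z x*z
```

Because authorship is fixed by construction (the variant's genome is a known mutation of the original's), such corpora give detection tools a ground truth that real student submissions cannot provide.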
Automated scholarly paper review: Technologies and challenges
Peer review is a widely accepted mechanism for research evaluation, playing a
pivotal role in scholarly publishing. However, criticisms have long been
leveled on this mechanism, mostly because of its inefficiency and subjectivity.
Recent years have seen the application of artificial intelligence (AI) in
assisting the peer review process. Nonetheless, with the involvement of humans,
such limitations remain inevitable. In this review paper, we propose the
concept and pipeline of automated scholarly paper review (ASPR) and review the
relevant literature and technologies of achieving a full-scale computerized
review process. On the basis of the review and discussion, we conclude that
there is already corresponding research and implementation at each stage of
ASPR. We further look into the challenges in ASPR with the existing
technologies. The major difficulties lie in imperfect document parsing and
representation, inadequate data, defective human-computer interaction and
flawed deep logical reasoning. Moreover, we discuss the possible moral &
ethical issues and point out the future directions of ASPR. In the foreseeable
future, ASPR and peer review will coexist in a reinforcing manner before ASPR
is able to fully undertake the reviewing workload from humans.
Using algorithmic information theory and stochastic modeling to improve classification and evolutionary computation: a dissertation submitted to the department of computer science of the Universidad Autonoma de Madrid in partial fulfillment of the requirements for the degree of doctor of philosophy
Unpublished doctoral thesis. Universidad Autónoma de Madrid, Escuela Politécnica Superior, June 200
An Inclusive Report on Robust Malware Detection and Analysis for Cross-Version Binary Code Optimizations
Binary code similarity detection (BCSD) underpins numerous practices, such as control-flow-graph analysis, semantic scrutiny, code obfuscation, malware detection and analysis, and vulnerability search. Relying on expert knowledge, existing solutions often compare particular syntactic aspects retrieved from binary code; they either incur substantial performance overheads or detect inaccurately. Furthermore, few tools are available for comparing cross-version binaries, which may differ not only in syntax but also marginally in semantics. Although BCSD has existed for the past ten years, the research area has not yet been systematically analysed. This paper presents a comprehensive analysis of existing cross-version binary code optimization techniques along four characteristics: (1) structural analysis, (2) semantic analysis, (3) syntactic analysis, and (4) validation metrics. It helps researchers select the most suitable tool for their binary code analysis needs. Furthermore, the paper outlines the scope of the area along with future research directions.
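To make the syntactic side of BCSD concrete, here is a minimal sketch (not taken from the paper) of one common baseline: scoring two functions by the Jaccard similarity of their instruction-mnemonic n-grams. The mnemonic sequences are hand-written stand-ins for disassembler output; real systems layer structural and semantic features on top of comparisons like this.

```python
def ngrams(seq, n=2):
    """Set of overlapping n-grams of a mnemonic sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def jaccard_similarity(f1, f2, n=2):
    """Jaccard index of the two functions' n-gram sets."""
    a, b = ngrams(f1, n), ngrams(f2, n)
    return len(a & b) / len(a | b) if a | b else 1.0

# Two versions of the same function; one instruction changed by the compiler.
v1 = ["push", "mov", "cmp", "jne", "add", "ret"]
v2 = ["push", "mov", "cmp", "jne", "lea", "ret"]
print(round(jaccard_similarity(v1, v2), 2))  # 0.43
```

The weakness the survey points out is visible even here: a marginal semantic-preserving change (add vs. lea) already pulls a purely syntactic score well below 1.0, which is why cross-version comparison needs semantic and structural analysis as well.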
A review on the use of large language models as virtual tutors
Transformer architectures contribute to managing long-term dependencies in natural language processing, representing one of the most recent changes in the field. These architectures are the basis of the innovative, cutting-edge large language models (LLMs) that have produced a huge buzz in several fields and industrial sectors, among which education stands out. Accordingly, these generative artificial intelligence-based solutions have steered the change in techniques and the evolution of educational methods and contents, along with network infrastructure, towards high-quality learning. Given the popularity of LLMs, this review seeks to provide a comprehensive overview of those solutions designed specifically to generate and evaluate educational materials and which involve students and teachers in their design or experimental plan. To the best of our knowledge, this is the first review of educational applications (e.g., student assessment) of LLMs. As expected, the most common role of these systems is as virtual tutors for automatic question generation. Moreover, the most popular models are GPT-3 and BERT. However, due to the continuous launch of new generative models, new works are expected to be published shortly.
Xunta de Galicia | Ref. ED481B-2021-118
Xunta de Galicia | Ref. ED481B-2022-093
Universidade de Vigo/CISU
How should the advent of large language models affect the practice of science?
Large language models (LLMs) are being increasingly incorporated into
scientific workflows. However, we have yet to fully grasp the implications of
this integration. How should the advent of large language models affect the
practice of science? For this opinion piece, we have invited four diverse
groups of scientists to reflect on this query, sharing their perspectives and
engaging in debate. Schulz et al. make the argument that working with LLMs is
not fundamentally different from working with human collaborators, while Bender
et al. argue that LLMs are often misused and over-hyped, and that their
limitations warrant a focus on more specialized, easily interpretable tools.
Marelli et al. emphasize the importance of transparent attribution and
responsible use of LLMs. Finally, Botvinick and Gershman advocate that humans
should retain responsibility for determining the scientific roadmap. To
facilitate the discussion, the four perspectives are complemented with a
response from each group. By putting these different perspectives in
conversation, we aim to bring attention to important considerations within the
academic community regarding the adoption of LLMs and their impact on both
current and future scientific practices.
Quantifying the impact of Twitter activity in political battlegrounds
When using social media platforms, notably Twitter, to engage the public in advocating
for a parliamentary act, or during a global health emergency, it may be challenging to
determine the reach of the information, how well it corresponds with the domain design,
and how to utilize the platform as a communication medium. Chapter 3 offers a broad overview of how candidates running in the 2020 US Elections used Twitter as a communication tool to interact with voters.
More precisely, it seeks to identify components related to internal collaboration and public
participation (in terms of content and stance similarity among the candidates from the same
political front and to the official Twitter accounts of their political parties). The 2020 US
Presidential and Vice Presidential candidates from the two main political parties, the Republicans and Democrats, are our main subjects. Along with the content similarity, their tweets
were assessed for social reach and stance similarity on 22 topics. This study complements
previous research on efficiently using social media platforms for election campaigns. Chapter 4 empirically examines the online social associations of the top-10 COVID-19 resilient
nations’ leaders and healthcare institutions based on the Bloomberg COVID-19 Resilience
Ranking. In order to measure the strength of the online social association in terms of public
engagement, sentiment strength, inclusivity and diversity, we used the attributes provided
by the Twitter Academic Research API, coupled with the tweets of leaders and healthcare
organizations from these nations. This study helps clarify how leaders and healthcare
organizations may utilize Twitter to establish digital connections with the public during
health emergencies. The thesis proposes methods for efficiently using Twitter in various
domains, drawing on several language models and a range of data mining and analytics
techniques.
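The content-similarity comparison the abstract describes can be sketched, under simplifying assumptions, as cosine similarity between bag-of-words vectors of two accounts' tweets. The tweets below are invented; the thesis builds on richer language-model representations, but the comparison step has this shape.

```python
import math
from collections import Counter

def cosine(c1: Counter, c2: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(c1[t] * c2[t] for t in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Invented example tweets: a candidate's account vs. the party's account.
candidate = Counter("healthcare for every family in america".split())
party = Counter("affordable healthcare for every american family".split())
print(round(cosine(candidate, party), 2))  # 0.67
```

Averaging such pairwise scores within a political front, against the party's official account, yields the kind of internal-collaboration signal discussed above.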
A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Directions
The powerful ability of large language models (LLMs) to understand, follow,
and generate complex language has caused LLM-generated text to flood many areas
of daily life at incredible speed, where it is widely accepted by humans. As
LLMs continue to expand, there is an imperative need to develop
detectors that can detect LLM-generated text. This is crucial to mitigate
potential misuse of LLMs and safeguard realms like artistic expression and
social networks from the harmful influence of LLM-generated content.
LLM-generated text detection aims to discern whether a piece of text was
produced by an LLM, which is essentially a binary classification task. Detector
techniques have witnessed notable advancements recently, propelled by
innovations in watermarking techniques, zero-shot methods, fine-tuning LM
methods, adversarial learning methods, LLMs as detectors, and human-assisted
methods. In this survey, we collate recent research breakthroughs in this area
and underscore the pressing need to bolster detector research. We also delve
into prevalent datasets, elucidating their limitations and developmental
requirements. Furthermore, we analyze various LLM-generated text detection
paradigms, shedding light on challenges like out-of-distribution problems,
potential attacks, and data ambiguity. Conclusively, we highlight interesting
directions for future research in LLM-generated text detection to advance the
implementation of responsible artificial intelligence (AI). Our aim with this
survey is to provide a clear and comprehensive introduction for newcomers while
also offering seasoned researchers a valuable update in the field of
LLM-generated text detection. The useful resources are publicly available at:
https://github.com/NLP2CT/LLM-generated-Text-Detection
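One of the detection families the survey names, watermarking, admits a compact sketch. At generation time a hash of the previous token "green-lists" part of the vocabulary and the sampler favours green tokens; the detector recomputes the lists and z-scores the observed green count. The hash, vocabulary, and 50/50 split below are toy assumptions for illustration, not any deployed scheme.

```python
import hashlib
import math

VOCAB = [f"w{i}" for i in range(100)]  # toy vocabulary

def is_green(prev_token: str, token: str) -> bool:
    """Hash the (previous token, candidate) pair to split the vocabulary."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0  # candidate falls in the "green" half

def watermark_z_score(tokens) -> float:
    """z-score of the observed green-token count vs. the 50% chance rate."""
    greens = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)

# Toy "generator": always emit a green continuation, as a watermarking
# sampler would favour doing.
text = ["w0"]
for _ in range(50):
    text.append(next(t for t in VOCAB if is_green(text[-1], t)))
print(watermark_z_score(text))  # well above a typical detection threshold
```

Unwatermarked text hits green tokens only about half the time, so its z-score stays near zero; the detector needs no access to the model, only to the hashing scheme, which is what makes this family attractive despite its fragility to paraphrasing attacks.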
Ontologies for automatic question generation
Assessment is an important tool for formal learning, especially in higher education. At present, many universities use online assessment systems where questions are entered manually into a question bank system. This kind of system requires the instructor's time and effort to construct questions manually. The main aim of this thesis is, therefore, to contribute to the investigation of new question generation strategies for short/long answer questions in order to allow for the development of automatic factual question generation from an ontology for educational assessment purposes. This research is guided by four research questions: (1) How well can an ontology be used for generating factual assessment questions? (2) How can questions be generated from a course ontology? (3) Are the ontological question generation strategies able to generate acceptable assessment questions? and (4) Is topic-based indexing able to improve the feasibility of AQGen?
We first conduct ontology validation to evaluate the appropriateness of concept representation using a competency question approach. We used revision questions from the textbook to match keywords (in the revision questions) against concepts (in the ontology). The results show that only half of the ontology concepts matched the keywords. We investigated the unmatched concepts further, found some incorrect concept naming, and subsequently suggest a guideline for appropriate concept naming. At the same time, we introduce validation of the ontology using revision questions as competency questions to check for ontology completeness. Furthermore, we propose 17 short/long answer question templates for 3 question categories, namely definition, concept completion and comparison.
In the subsequent part of the thesis, we develop the AQGen tool and evaluate the generated questions. Two Computer Science subjects, namely OS and CNS, are chosen to evaluate AQGen's generated questions. We conduct a questionnaire survey of 17 domain experts to identify their agreement on the acceptability of the generated questions. The experts' agreement on the acceptability measure is favourable, and three of the four QG strategies proposed can generate acceptable questions. AQGen has generated thousands of questions across the 3 question categories, and is therefore updated with question selection to derive a feasible question set from this large pool of generated questions. We suggest topic-based indexing in order to assert knowledge about topic chapters into the ontology representation for question selection. Topic indexing shows a feasible result for filtering questions by topic.
Finally, our results contribute to an understanding of ontology element representation for question generation and of how to automatically generate questions from an ontology for educational assessment.
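The template-driven generation described above can be illustrated, in a deliberately reduced form, with the three question categories (definition, concept completion, comparison) applied to a hand-made mini "ontology". The dict below is a stand-in for OWL classes and relations; AQGen's 17 templates and real course ontologies are far richer.

```python
# Toy course "ontology": concepts with a definition and a part-of relation.
ONTOLOGY = {
    "Process": {"definition": "a program in execution",
                "part_of": "Operating System"},
    "Thread": {"definition": "a lightweight unit of CPU scheduling",
               "part_of": "Process"},
}

def generate_questions(concept, onto=ONTOLOGY):
    """Return (question, answer) pairs for the three template categories."""
    facts = onto[concept]
    pairs = [(f"Define the term '{concept}'.", facts["definition"])]  # definition
    pairs.append((f"{concept} is part of _____.", facts["part_of"]))  # concept completion
    pairs += [(f"Compare {concept} and {other}.", None)               # comparison
              for other in onto if other != concept]
    return pairs

for question, answer in generate_questions("Thread"):
    print(question)
```

Because every template is instantiated once per matching concept, even a small ontology multiplies into a large question pool, which is exactly why the thesis adds topic-based indexing to select a feasible subset.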