Evaluating the Capability of Large-scale Language Models on Chinese Grammatical Error Correction Task
Large-scale language models (LLMs) have shown remarkable capability on various
Natural Language Processing (NLP) tasks and have attracted much attention
recently. However, some studies indicate that large language models fail to
achieve promising results beyond the state-of-the-art models on English
grammatical error correction (GEC) tasks. In this report, we aim to explore
how large language models perform on Chinese grammatical error correction tasks
and to provide guidance for future work. We conduct experiments with 3 different
LLMs of different model scales on 4 Chinese GEC datasets. Our experimental
results indicate that the performance of LLMs on automatic evaluation metrics
falls short of previous state-of-the-art models because of the problem of
over-correction. Furthermore, we also discover notable variations in the
performance of LLMs when evaluated on different data distributions. Our
findings demonstrate that further investigation is required before LLMs can be
applied to Chinese GEC tasks.
Automated Assessment of Students' Code Comprehension using LLMs
Assessing students' answers, and in particular natural language answers, is a
crucial challenge in the field of education. Advances in machine learning,
including transformer-based models such as Large Language Models (LLMs), have
led to significant progress on various natural language tasks. Nevertheless,
amidst the growing trend of evaluating LLMs across diverse tasks, evaluating
LLMs in the realm of automated answer assessment has not received much
attention. To address this gap, we explore the potential of using LLMs for
automated assessment of students' short and open-ended answers. In particular,
we use LLMs to compare students' explanations with expert explanations in the
context of line-by-line explanations of computer programs.
For comparison purposes, we assess both Large Language Models (LLMs) and
encoder-based Semantic Textual Similarity (STS) models in the context of
assessing the correctness of students' explanations of computer code. Our
findings indicate that LLMs, when prompted in few-shot and chain-of-thought
settings, perform comparably to fine-tuned encoder-based models in evaluating
students' short answers in the programming domain.
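The few-shot, chain-of-thought prompting the abstract describes can be sketched roughly as follows. This is a hypothetical illustration, not the authors' actual prompt or grading pipeline: the example exchange, field names, and wording are assumptions, and the model call itself is omitted so only prompt assembly is shown.

```python
# Hypothetical sketch: grading a student's line-by-line code explanation
# against an expert explanation with a few-shot chain-of-thought prompt.

FEW_SHOT_EXAMPLES = [
    {
        "line": "total += x",
        "expert": "Adds the current element x to the running total.",
        "student": "It increases total by x.",
        "reasoning": "The student captures the accumulation of x into total.",
        "verdict": "correct",
    },
]

def build_assessment_prompt(code_line, expert_expl, student_expl):
    """Assemble a few-shot chain-of-thought prompt that asks the model to
    reason step by step before judging the student's explanation."""
    parts = [
        "You grade student explanations of code, line by line.",
        "Think step by step, then answer 'correct' or 'incorrect'.\n",
    ]
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(
            f"Line: {ex['line']}\nExpert: {ex['expert']}\n"
            f"Student: {ex['student']}\nReasoning: {ex['reasoning']}\n"
            f"Verdict: {ex['verdict']}\n"
        )
    # Leave 'Reasoning:' open so the model produces its chain of thought.
    parts.append(
        f"Line: {code_line}\nExpert: {expert_expl}\n"
        f"Student: {student_expl}\nReasoning:"
    )
    return "\n".join(parts)

prompt = build_assessment_prompt(
    "for i in range(n):",
    "Loops i over the integers 0 to n-1.",
    "Repeats the body n times with i counting up from 0.",
)
print(prompt)
```

The returned string would be sent to an LLM, whose final "Verdict:" line can then be compared against a human rubric.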
Several categories of Large Language Models (LLMs): A Short Survey
Large Language Models (LLMs) have become effective tools for natural language
processing and have been used in many different fields. This essay offers a
succinct summary of various LLM subcategories. The survey emphasizes recent
developments and efforts made for various kinds of LLMs, including task-based
financial LLMs, multilingual LLMs, biomedical and clinical LLMs,
vision-language LLMs, and code language models. The survey gives a general
summary of the methods, attributes, datasets, transformer models, and
comparison metrics applied in each category of LLMs. Furthermore, it highlights
unresolved problems in the field of developing chatbots and virtual assistants,
such as boosting natural language processing, enhancing chatbot intelligence,
and resolving moral and legal dilemmas. The purpose of this study is to provide
readers, developers, academics, and users interested in LLM-based chatbots and
virtual intelligent assistant technologies with useful information and future
directions.
Benchmarking Large Language Models in Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) is a promising approach for mitigating
the hallucination of large language models (LLMs). However, existing research
lacks rigorous evaluation of the impact of retrieval-augmented generation on
different large language models, which makes it challenging to identify the
potential bottlenecks in the capabilities of RAG for different LLMs. In this
paper, we systematically investigate the impact of Retrieval-Augmented
Generation on large language models. We analyze the performance of different
large language models on 4 fundamental abilities required for RAG:
noise robustness, negative rejection, information integration, and
counterfactual robustness. To this end, we establish the Retrieval-Augmented
Generation Benchmark (RGB), a new corpus for RAG evaluation in both English and
Chinese. RGB divides the instances within the benchmark into 4 separate
testbeds based on the aforementioned fundamental abilities required to resolve
each case. We then evaluate 6 representative LLMs on RGB to diagnose the
challenges current LLMs face when applying RAG. The evaluation reveals that
while LLMs exhibit a certain degree of noise robustness, they still struggle
significantly with negative rejection, information integration, and
dealing with false information. These assessment outcomes indicate
that there is still a considerable journey ahead to effectively apply RAG to
LLMs.
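The abilities RGB probes can be illustrated with a toy RAG setup. This is not the benchmark's code: the word-overlap retriever, the corpus, and the "cannot answer" instruction (which loosely mirrors the negative-rejection idea) are all illustrative assumptions.

```python
# Illustrative sketch of a RAG pipeline: retrieve passages (possibly noisy),
# then assemble a prompt that permits refusal when evidence is insufficient.

def retrieve(query, corpus, k=2):
    """Rank passages by naive word overlap with the query (toy retriever)."""
    qwords = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(qwords & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(query, passages):
    """Number the retrieved passages and append the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the passages below; reply 'cannot answer' "
        f"if they are insufficient.\n{context}\nQuestion: {query}\nAnswer:"
    )

corpus = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Bananas are rich in potassium.",  # irrelevant noise passage
    "Paris is the capital city of France.",
]
passages = retrieve("When was the Eiffel Tower completed?", corpus)
print(build_rag_prompt("When was the Eiffel Tower completed?", passages))
```

A noise-robustness testbed in this spirit would deliberately mix irrelevant passages (like the banana sentence) into the retrieved context and check whether the model's answer survives.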
The Role of Large Language Models in Enhancing Cybersecurity Measures: Empirical Evidence from Regional Banking Institutions
The rapid advancements in artificial intelligence (AI) and machine learning (ML) have significantly influenced the cybersecurity landscape, particularly in the banking sector, where threats are increasingly sophisticated. Large Language Models (LLMs) such as OpenAI's GPT-4 and Google's BERT offer novel approaches to threat detection, fraud prevention, and automated risk assessment. This paper explores the integration of LLMs into cybersecurity frameworks within financial institutions, highlighting their role in real-time anomaly detection, predictive analytics, and intelligent automation of security operations. By leveraging LLMs, banks can enhance their cybersecurity resilience, mitigate cyber threats, and improve regulatory compliance. However, challenges such as data privacy concerns, adversarial attacks, and computational resource demands must be addressed to ensure the secure and ethical deployment of these models. This study provides insights into the current applications, benefits, and limitations of LLMs in strengthening cybersecurity measures in the banking sector.
After-School Care and Parents' Labor Supply
Does after-school care provision promote mothers' employment and balance the allocation of paid work among parents of schoolchildren? We address this question by exploiting variation in cantonal (state) regulations of after-school care provision in Switzerland. To establish exogeneity of cantonal regulations with respect to employment opportunities and preferences of the population, we restrict our analysis to confined regions along cantonal borders. Using semi-parametric instrumental variable methods, we find a positive impact of after-school care provision on mothers' full-time employment, but a negative impact on fathers' full-time employment. Thus, the supply of after-school care fosters a convergence of parental working hours.
Visual Literacy and New Technologies
This body of research addresses the connection between arts, identity, and new technology. It investigates the impact of images on adolescent identities and the relationship between online modes of communication and cyber-bullying, examines the increasing visualization of information, and explores the way drawing and critical analysis of imagery develop visual literacy.
Commissioned by Adobe Systems Pty Ltd, Australia (2003) to compile the Visual Literacy White Paper, Bamford's report defines visual literacy and highlights its importance in the learning of such skills as problem solving and critical thinking. Providing strategies to promote visual literacy and emphasizing the role of technology in visual communication, this report has become a major reference for policy on visual literacy and cyber-bullying in the UK, USA, and Asia.
Supervised Knowledge Makes Large Language Models Better In-context Learners
Large Language Models (LLMs) exhibit emerging in-context learning abilities
through prompt engineering. The recent progress in large-scale generative
models has further expanded their use in real-world language applications.
However, the critical challenge of improving the generalizability and
factuality of LLMs in natural language understanding and question answering
remains under-explored. While previous in-context learning research has focused
on enhancing models to adhere to users' specific instructions and quality
expectations, and to avoid undesired outputs, little to no work has explored
the use of task-specific fine-tuned Language Models (SLMs) to improve LLMs'
in-context learning during the inference stage. Our primary contribution is the
establishment of a simple yet effective framework that enhances the reliability
of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how
LLMs benefit from discriminative models, and 3) minimizes hallucinations in
generative tasks. Using our proposed plug-in method, enhanced versions of Llama
2 and ChatGPT surpass their original versions in generalizability and
factuality. We offer a comprehensive suite of resources, including 16 curated
datasets, prompts, model checkpoints, and LLM outputs across 9 distinct tasks.
The code and data are released at:
https://github.com/YangLinyi/Supervised-Knowledge-Makes-Large-Language-Models-Better-In-context-Learners.
Our empirical analysis sheds light on the advantages of incorporating
discriminative models into LLMs and highlights the potential of our methodology
in fostering more reliable LLMs.
Comment: Accepted to ICLR 202
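The general plug-in idea, an SLM's prediction injected into the LLM prompt at inference time, can be sketched as below. This is not the paper's released code: the sentiment task, the keyword stand-in for a fine-tuned classifier, and the prompt wording are all illustrative assumptions.

```python
# Hedged sketch: supply a task-specific discriminative model's prediction
# as supervised knowledge inside the LLM prompt at inference time.

def slm_predict(text):
    """Stand-in for a fine-tuned task-specific model (SLM); a trivial
    keyword rule plays the role of a sentiment classifier here."""
    return "positive" if "great" in text.lower() else "negative"

def build_plugin_prompt(text):
    """Embed the SLM's label as a reference the LLM may accept or override."""
    hint = slm_predict(text)
    return (
        "Classify the sentiment of the review as positive or negative.\n"
        f"A fine-tuned classifier predicts: {hint}. "
        "Use it as a reference, but judge the text yourself.\n"
        f"Review: {text}\nAnswer:"
    )

print(build_plugin_prompt("The battery life is great and the screen is sharp."))
```

Phrasing the hint as a reference rather than a verdict is what lets the LLM override a miscalibrated classifier, which is how a plug-in of this kind could reduce hallucination without blindly trusting the smaller model.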
