902 research outputs found

LearnedSort as a learning-augmented SampleSort: Analysis and Parallelization

Authors: Carvalho Ivan; Lawrence Ramon
Publication date: 17/07/2023

This work analyzes and parallelizes LearnedSort, the novel algorithm that sorts using machine learning models based on the cumulative distribution function. LearnedSort is analyzed under the lens of algorithms with predictions, and it is argued that LearnedSort is a learning-augmented SampleSort. A parallel LearnedSort algorithm is developed combining LearnedSort with the state-of-the-art SampleSort implementation, IPS4o. Benchmarks on synthetic and real-world datasets demonstrate improved parallel performance for parallel LearnedSort compared to IPS4o and other sorting algorithms.

Comment: Published in SSDBM 202

arXiv.org e-Print Archive
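
As a rough illustration of the idea summarized in the LearnedSort abstract above (routing each element to a bucket according to an estimated cumulative distribution function, then sorting within buckets), the following is a minimal sketch only. It is not the authors' implementation: the name `cdf_bucket_sort` and its parameters are hypothetical, and an empirical CDF built from a sorted random sample is assumed here in place of a trained model.

```python
import bisect
import random


def cdf_bucket_sort(values, num_buckets=16, sample_size=64, seed=0):
    """Sort numbers by routing each one to a bucket chosen by an estimated CDF."""
    if len(values) < 2:
        return list(values)
    rng = random.Random(seed)
    # Assumption: a sorted random sample stands in for the learned CDF model.
    sample = sorted(rng.sample(list(values), min(sample_size, len(values))))
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        # Empirical CDF estimate in [0, 1]: fraction of sample elements <= v.
        cdf = bisect.bisect_right(sample, v) / len(sample)
        index = min(int(cdf * num_buckets), num_buckets - 1)
        buckets[index].append(v)
    # The CDF estimate is monotone in v, so the buckets are already ordered
    # relative to one another; only each bucket's contents need sorting.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result


if __name__ == "__main__":
    print(cdf_bucket_sort([5, 3, 9, 1, 3, 7, 2, 8]))
```

Because the bucket index never decreases as the value grows, concatenating the sorted buckets yields a fully sorted list. With a sample as the predictor this is essentially SampleSort, which is the connection the abstract draws.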

A Case Study on Record Matching of Individuals in Historical Archives of Indigenous Databases

Authors: Currie Matthew; Lawrence Ramon
Publication date: 15/02/2023

Digitization of historical records has produced a significant amount of data for analysis and interpretation. A critical challenge is the ability to relate historical information across different archives to allow for the data to be framed in the appropriate historical context. This paper presents a real-world case study on historical information integration and record matching with the goal of improving the historical value of archives containing data in the period 1800 to 1920. The archives contain unique information about Métis and Indigenous people in Canada and interactions with European settlers. The archives contain thousands of records that have increased relevance when relationships and interconnections are discovered. The contribution is a record linking approach suitable for historical archives and an evaluation of its effectiveness. Experimental results demonstrate potential for discovering historical linkage with high precision, enabling new historical discoveries.

Comment: Published in 20th International Conference on Information & Knowledge Engineering (IKE'21

arXiv.org e-Print Archive

Using Assignment Incentives to Reduce Student Procrastination and Encourage Code Review Interactions

Authors: Lawrence Ramon; Wang Kevin
Publication date: 25/11/2023

Procrastination causes student stress, reduced learning and performance, and results in very busy help sessions immediately before deadlines. A key challenge is encouraging students to complete assignments earlier rather than waiting until right before the deadline, so that the focus is on the learning objectives rather than just meeting deadlines. This work presents an incentive system encouraging students to complete assignments many days before deadlines. Completed assignments are code reviewed by staff for correctness and feedback, which results in more student-instructor interactions and may help reduce student use of generative AI. The incentives result in a change in student behavior with 45% of assignments completed early and 30% up to 4 days before the deadline. Students receive real-time feedback with no increase in marking time.

Comment: 6 pages, To be published in 2023 International Conference on Computational Science and Computational Intelligence Research Track on Education (CSCI-RTED) IEEE CP

arXiv.org e-Print Archive

Detecting Argumentative Fallacies in the Wild: Problems and Limitations of Large Language Models

Authors: Lawrence John; Ruiz-Dolz Ramon
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 10/12/2023

Previous work on the automatic identification of fallacies in natural language text has typically approached the problem in constrained experimental setups that make it difficult to understand the applicability and usefulness of the proposals in the real world. In this paper, we present the first analysis of the limitations that these data-driven approaches could show in real situations. For that purpose, we first create a validation corpus consisting of natural language argumentation schemes. Second, we provide new empirical results to the emerging task of identifying fallacies in natural language text. Third, we analyse the errors observed outside of the testing data domains considering the new validation corpus. Finally, we point out some important limitations observed in our analysis that should be taken into account in future research in this topic. Specifically, if we want to deploy these systems in the Wild

University of Dundee Online Publications

ChatEd: A Chatbot Leveraging ChatGPT for an Enhanced Learning Experience in Higher Education

Authors: Lawrence Ramon; Ramos Jason; Wang Kevin
Publication date: 29/12/2023

With the rapid evolution of Natural Language Processing (NLP), Large Language Models (LLMs) like ChatGPT have emerged as powerful tools capable of transforming various sectors. Their vast knowledge base and dynamic interaction capabilities represent significant potential in improving education by operating as a personalized assistant. However, the possibility of generating incorrect, biased, or unhelpful answers is a key challenge to resolve when deploying LLMs in an education context. This work introduces an innovative architecture that combines the strengths of ChatGPT with a traditional information retrieval based chatbot framework to offer enhanced student support in higher education. Our empirical evaluations underscore the high promise of this approach.

Comment: To appear at INTED2024 - 18th annual International Technology, Education and Development Conferenc

arXiv.org e-Print Archive

An Efficient B-tree Implementation for Memory-Constrained Embedded Systems

Authors: Fazackerley Scott; Lawrence Ramon; Ould-Khessal Nadir
Publication date: 15/02/2023

Embedded devices collect and process significant amounts of data in a variety of applications including environmental monitoring, industrial automation and control, and other Internet of Things (IoT) applications. Storing data efficiently is critically important, especially when the device must perform local processing on the data. The most widely used data structure for high performance query and insert is the B-tree. However, existing implementations consume too much memory for small embedded devices and often rely on operating system support. This work presents an extremely memory efficient implementation of B-trees for embedded devices that functions on the smallest devices and does not require an operating system. Experimental results demonstrate that the B-tree implementation can run on devices with as little as 4 KB of RAM while efficiently processing thousands of records.

Comment: Published in the 19th International Conference on Embedded Systems, Cyber-physical Systems, and Applications (ESCS'21). Code is available at https://github.com/ubco-d

arXiv.org e-Print Archive

Student Mastery or AI Deception? Analyzing ChatGPT's Assessment Proficiency and Evaluating Detection Strategies

Authors: Akins Seth; Lawrence Ramon; Mohammed Abdallah; Wang Kevin
Publication date: 27/11/2023

Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment. Computer science requires practice to develop skills in problem solving and programming that are traditionally developed using assignments. Generative AI has the capability of completing these assignments for students with high accuracy, which dramatically increases the potential for academic integrity issues and students not achieving desired learning outcomes. This work investigates the performance of ChatGPT by evaluating it across three courses (CS1, CS2, databases). ChatGPT completes almost all introductory assessments perfectly. Existing detection methods, such as MOSS and JPlag (based on similarity metrics) and GPTzero (AI detection), have mixed success in identifying AI solutions. Evaluating instructors and teaching assistants using heuristics to distinguish between student and AI code shows that their detection is not sufficiently accurate. These observations emphasize the need for adapting assessments and improved detection methods.

Comment: 7 pages, Published in 2023 International Conference on Computational Science and Computational Intelligence Research Track on Education, IEEE CP

arXiv.org e-Print Archive