545 research outputs found

Typing Patterns and Authentication in Practical Programming Exams

Author: Ahadi Alireza
Klami Arto
Leinonen Juho
Longi Krista
Vihavainen Arto
Publication venue: ACM New York
Publication date: 11/07/2016

In traditional programming courses, students have usually been at least partly graded using pen-and-paper exams. One of the problems with such exams is that they only partially connect to the practice conducted within such courses. Testing students in a more practical environment has been constrained by the limited resources that are needed, for example, for authentication. In this work, we study whether students in a programming course can be identified in an exam setting based solely on their typing patterns. We replicate an earlier study that indicated that keystroke analysis can be used for identifying programmers. Then, we examine how a controlled machine examination setting affects the identification accuracy, i.e., whether students can be identified reliably in a machine exam based on typing profiles built with data from students' programming assignments from a course. Finally, we investigate the identification accuracy in an uncontrolled machine exam, where students can complete the exam at any time using any computer they want. Our results indicate that even though the identification accuracy deteriorates when identifying students in an exam, the accuracy is high enough to reliably identify students if the identification is not required to be exact, but the top k closest matches are regarded as correct. Peer reviewed

OPUS - University of Technology Sydney

Helsingin yliopiston digitaalinen arkisto
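
The top-k identification idea in the abstract of "Typing Patterns and Authentication in Practical Programming Exams" can be illustrated with a minimal sketch. It is not the authors' method: the digraph-latency profile, the mean absolute difference distance, and every function name below are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def build_profile(keystrokes):
    """Average latency (ms) per digraph from a list of (key, timestamp_ms) events."""
    latencies = defaultdict(list)
    for (k1, t1), (k2, t2) in zip(keystrokes, keystrokes[1:]):
        latencies[k1 + k2].append(t2 - t1)
    return {digraph: mean(values) for digraph, values in latencies.items()}


def distance(profile_a, profile_b):
    """Mean absolute latency difference over shared digraphs (inf if none are shared)."""
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        return float("inf")
    return sum(abs(profile_a[d] - profile_b[d]) for d in shared) / len(shared)


def top_k_matches(sample, enrolled, k):
    """Return the k enrolled student ids whose profiles are closest to the sample."""
    ranked = sorted(enrolled, key=lambda sid: distance(sample, enrolled[sid]))
    return ranked[:k]


def top_k_accuracy(samples, enrolled, k):
    """Fraction of exam samples whose true id is among the k closest enrolled profiles."""
    hits = sum(
        1
        for true_id, profile in samples.items()
        if true_id in top_k_matches(profile, enrolled, k)
    )
    return hits / len(samples)
```

Here `enrolled` would hold profiles built from course assignments and `samples` profiles built from exam data, both keyed by student id. Raising `k` relaxes the requirement from an exact match to "among the k closest", which is the trade-off the abstract describes.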

Preventing Keystroke Based Identification in Open Data Sets

Author: Hellas Arto
Ihantola Petri
Leinonen Juho
Publication venue: ACM
Publication date: 12/04/2017

Large-scale courses such as Massive Open Online Courses (MOOCs) can be a great data source for researchers. Ideally, the data gathered on such courses should be openly available to all researchers. Studies could be easily replicated and novel studies on existing data could be conducted. However, very fine-grained data such as source code snapshots can contain hidden identifiers. For example, distinct typing patterns that identify individuals can be extracted from such data. Hence, simply removing explicit identifiers such as names and student numbers is not sufficient to protect the privacy of the users who have supplied the data. At the same time, removing all keystroke information would decrease the value of the shared data significantly. In this work, we study how keystroke data from a programming context could be modified to prevent keystroke-latency-based identification whilst still retaining information that can be used, for example, to infer programming experience. We investigate the degree of anonymization required to render identification of students based on their typing patterns unreliable. Then, we study whether the modified keystroke data can still be used to infer the programming experience of the students, as a case study of whether the anonymized typing patterns have retained at least some informative value. We show that it is possible to modify data so that keystroke-latency-based identification is no longer accurate, but the programming experience of the students can still be inferred, i.e., the data still has value to researchers. In a broader context, our results indicate that information and anonymity are not necessarily mutually exclusive. Peer reviewed

Helsingin yliopiston digitaalinen arkisto
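
The abstract of "Preventing Keystroke Based Identification in Open Data Sets" does not say how the keystroke data is modified. As one hypothetical illustration of coarsening latencies, the sketch below rounds each inter-key latency to a multiple of a bucket width; the function name and the `bucket_ms` parameter are assumptions, not taken from the paper.

```python
def bucket_latencies(timestamps_ms, bucket_ms):
    """Rebuild timestamps after rounding each inter-key latency to a multiple of bucket_ms.

    Assumed anonymization step for illustration: a larger bucket_ms hides more of
    the individual typing rhythm but also discards more timing information.
    """
    if not timestamps_ms:
        return []
    coarse = [timestamps_ms[0]]
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        latency = cur - prev
        rounded = round(latency / bucket_ms) * bucket_ms
        coarse.append(coarse[-1] + rounded)
    return coarse
```

Widening the bucket is one way to explore the trade-off the abstract raises between preventing identification and keeping the data useful.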

A Study of Keystroke Data in Two Contexts : Written Language and Programming Language Influence Predictability of Learning Outcomes

Author: Edwards John
Hellas Arto
Leinonen Juho
Publication venue: ACM
Publication date: 26/02/2020

We study programming process data from two introductory programming courses. Between the course contexts, the programming languages differ, the teaching approaches differ, and the spoken languages differ. In both courses, students' keystroke data -- timestamps and the pressed keys -- are recorded as students work on programming assignments. We study how the keystroke data differs between the contexts, and whether research on predicting course outcomes using keystroke latencies generalizes to other contexts. Our results show that there are differences between the contexts in terms of frequently used keys, which can be partially explained by the differences between the spoken languages and the programming languages. Further, our results suggest that programming process data that can be collected non-intrusively in situ can be used for predicting course outcomes in multiple contexts. The predictive power, however, varies between contexts, possibly because the frequently used keys differ between programming languages and spoken languages. Thus, context-specific fine-tuning of predictive models may be needed. Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Programming Versus Natural Language: On the Effect of Context on Typing in CS1

Author: Birthare Chetan
Edwards John
Hellas Arto
Leinonen Juho
Zavgorodniaia Albina
Publication venue: ACM
Publication date: 10/08/2020

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

The influence of conception paradigms on data protection in E-Learning platforms: a case study

Author: De Vos Nathan
Garcia-Alfaro Joaquin
Kiennert Christophe
Knockaert Manon
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

The wide adoption of virtual learning environments such as Moodle in numerous universities illustrates the growing trend of e-learning development and diffusion. These e-learning environments alter the relationship between the students and academic knowledge and learning processes, considerably stimulating the students' autonomy by making most of the course material freely available at any time, while progressively replacing physical student-teacher interactions with virtual ones. Recent advances, as proposed in the TeSLA project, even introduce an e-assessment environment. This entire virtual learning framework raises new concerns in terms of privacy, given that such environments are potentially able to track the students, profile their habits, and retrieve personal data. In this paper, we analyze the influence of conception paradigms of e-learning platforms on personal data protection, based on a classification of these platforms in two antagonistic approaches. We illustrate our analysis with a case study of the TeSLA project and examine how the design choices impact the efficiency and legal compliance of personal data protection means. We finally propose alternative designs that could lead to significant improvements in this matter.

Biometric Systems

Publication venue: IntechOpen
Publication date: 20/04/2021

Biometric authentication has been widely used for access control and security systems over the past few years. The purpose of this book is to provide readers with the life cycle of different biometric authentication systems, from their design and development to qualification and final application. The major systems discussed in this book include fingerprint identification, face recognition, iris segmentation and classification, signature verification, and other miscellaneous systems covering management policies of biometrics, reliability measures, pressure-based typing and signature verification, bio-chemical systems, and behavioral characteristics. In summary, this book provides students and researchers with different approaches to developing biometric authentication systems and at the same time includes state-of-the-art approaches in their design and development. The approaches have been thoroughly tested on standard databases and in real-world applications.

Choosing Code Segments to Exclude from Code Similarity Detection

Author: Abelson H
Allyson França B
Cole Jason R
Dijkstra Edsger Wybe
Hage Jurriaan
Karnalim Oscar
Le Nguyen Thanh Tri
Mann Samuel
Misi Marko J
Myers Trina
Papert Seymour
Prechelt Lutz
Sheard Judy
Zhang Michael
Publication venue: ACM
Publication date: 01/06/2020

When student programs are compared for similarity as a step in the detection of academic misconduct, certain segments of code are always sure to be similar but are no cause for suspicion. Some of these segments are boilerplate code (e.g. public static void main(String[] args)) and some will be code that was provided to students as part of the assessment specification. This working group explores these and other types of code that are legitimately common in student assessments and can therefore be excluded from similarity checking. From their own institutions, working group members collected assessment submissions that together encompass a wide variety of assessment tasks in a wide variety of programming languages. The submissions were analysed to determine what sorts of code segment arose frequently in each assessment task. The group has found that common code can arise in programming assessment tasks when it is required for compilation purposes; when it reflects an intuitive way to undertake part or all of the task in question; when it can be legitimately copied from external sources; and when it has been suggested by people with whom many of the students have been in contact. A further finding is that the nature and size of the common code fragments vary with course level and with task complexity. An informal survey of programming educators confirms the group's findings and gives some reasons why various educators include code when setting programming assignments. Peer reviewed

Helsingin yliopiston digitaalinen arkisto
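
The effect of excluding legitimately common code, as discussed in the abstract of "Choosing Code Segments to Exclude from Code Similarity Detection", can be shown with a small sketch. The line-level Jaccard measure is a simple stand-in chosen for illustration, not the working group's procedure or that of any particular similarity tool, and all names below are hypothetical.

```python
def normalize(line):
    """Remove all whitespace so formatting differences do not affect matching."""
    return "".join(line.split())


def significant_lines(source, excluded):
    """Set of normalized, non-empty lines of source that are not in the excluded set."""
    lines = {normalize(line) for line in source.splitlines()}
    lines.discard("")
    return lines - excluded


def similarity(submission_a, submission_b, template=""):
    """Jaccard similarity of two submissions after dropping lines found in template."""
    excluded = {normalize(line) for line in template.splitlines()}
    lines_a = significant_lines(submission_a, excluded)
    lines_b = significant_lines(submission_b, excluded)
    if not lines_a and not lines_b:
        return 0.0
    return len(lines_a & lines_b) / len(lines_a | lines_b)
```

With a Java skeleton passed as `template`, two submissions that share only the class declaration and the `main` signature score 0.0 instead of being flagged for the boilerplate they were required to include.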

TLAD 2010 Proceedings: 8th international workshop on teaching, learning and assessment of databases (TLAD)

Publication date: 01/01/2010

This is the eighth in the series of highly successful international workshops on the Teaching, Learning and Assessment of Databases (TLAD 2010), which once again is held as a workshop of BNCOD 2010 - the 27th International Information Systems Conference. TLAD 2010 is held on the 28th June at the beautiful Dudhope Castle at Abertay University, just before BNCOD, and hopes to be just as successful as its predecessors. The teaching of databases is central to all Computing Science, Software Engineering, Information Systems and Information Technology courses, and this year, the workshop aims to continue the tradition of bringing together both database teachers and researchers, in order to share good learning, teaching and assessment practice and experience, and further the growing community amongst database academics. As well as attracting academics from the UK community, the workshop has also been successful in attracting academics from the wider international community, through serving on the programme committee, and attending and presenting papers. This year, the workshop includes an invited talk given by Richard Cooper (of the University of Glasgow), who will present a discussion and some results from the Database Disciplinary Commons, which was held in the UK over the academic year. Due to the healthy number of high-quality submissions this year, the workshop will also present seven peer-reviewed papers and six refereed poster papers. Of the seven presented papers, three will be presented as full papers and four as short papers. These papers and posters cover a number of themes, including: approaches to teaching databases, e.g. group-centered and problem-based learning; use of novel case studies, e.g. forensics and XML data; techniques and approaches for improving teaching and student learning processes; assessment techniques, e.g. peer review; methods for improving students' abilities to develop database queries and develop E-R diagrams; and e-learning platforms for supporting teaching and learning.