384 research outputs found
A computational academic integrity framework
The growing scope and changing nature of academic programmes challenge the integrity of traditional testing and examination protocols. The aim of this thesis is to introduce an alternative to the traditional approaches to academic integrity, bridging the anonymity gap and empowering instructors and academic administrators with new ways of maintaining academic integrity that preserve privacy, minimize disruption to the learning process, and promote accountability, accessibility and efficiency. This work aims to initiate a paradigm shift in academic integrity practices. Research in the area of learner identity and authorship assurance is important because the award of course credits to unverified entities is detrimental to institutional credibility and public safety. This thesis builds upon the notion of learner identity consisting of two distinct layers (a physical layer and a behavioural layer), where the criteria of identity and authorship must both be confirmed to maintain a reasonable level of academic integrity. To pursue this goal in an organized fashion, the thesis is divided into three sections, each addressing the problem from one of the following perspectives: (a) theoretical, (b) empirical, and (c) pragmatic.
Detecting Contract Cheating by using Stylometry and Keystroke Dynamics
As the education sector transitions to new, digitalized forms of teaching and conducting classes, new forms of evaluating students follow. Advances in technology make it possible to examine students remotely, either through online home exams or through longer written assessments completed away from the classroom. With these digitalized evaluation methods, traditional measures to counter cheating on exams, such as exam proctoring or exam aid controls, cannot always be applied.
This transition also opens up new avenues for academic dishonesty, such as contract cheating on remote exams or assessments. Contract cheating refers to a student having an obligatory exam, essay, or other assessment work completed by a third party on their behalf and then submitting it as if they had completed the work themselves.
This project aimed to investigate the feasibility of detecting whether contract cheating has taken place in an online exam. Three approaches to contract cheating detection were developed: one using stylometry, one using keystroke dynamics, and a third combining stylometry and keystroke dynamics. Three datasets were used in this research: one containing only text data, one containing only keystroke data, and a third containing both text and keystroke data. The stylometry approach was applied to the two datasets containing text, while the keystroke dynamics approach was applied to the two datasets containing keystroke data. The fusion approach was tested on the dataset containing both text and keystrokes. The keystroke dynamics method showed the best results: the system detected 98.4% of the cheating cases while wrongfully classifying only 1.7% of the non-cheating cases. The best results from the stylometry approach showed a detection rate of 95.1%, with 5.3% of non-cheaters wrongfully accused. Experiments were also conducted to see how many cheaters the methods could detect without wrongfully accusing any genuine exam attempts. The best results from these experiments came from an Aggregated Scores Fusion, which detected 97.4% of the cheating cases without wrongfully classifying any non-cheating attempts.
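The two figures reported for each method (share of cheating cases detected and share of genuine attempts wrongfully flagged) can be illustrated with a minimal sketch. It assumes each exam attempt has already been reduced to a single suspicion score where higher means more likely cheating; the function names, the threshold rule, and the toy scores are hypothetical and are not taken from the project.

```python
def rates_at_threshold(cheat_scores, genuine_scores, threshold):
    """Return (detection_rate, false_accusation_rate).

    Assumption: an attempt is flagged as cheating when score >= threshold.
    """
    detected = sum(s >= threshold for s in cheat_scores)
    accused = sum(s >= threshold for s in genuine_scores)
    return detected / len(cheat_scores), accused / len(genuine_scores)


def detection_rate_without_false_accusations(cheat_scores, genuine_scores):
    """Detection rate when only scores above every genuine score are flagged."""
    highest_genuine = max(genuine_scores)
    detected = sum(s > highest_genuine for s in cheat_scores)
    return detected / len(cheat_scores)


# Toy scores, invented for illustration only.
cheat = [0.9, 0.8, 0.75, 0.4]
genuine = [0.1, 0.2, 0.5, 0.3]

print(rates_at_threshold(cheat, genuine, 0.45))
print(detection_rate_without_false_accusations(cheat, genuine))
```

The second function mirrors the "no wrongful accusations" experiments: the threshold is pushed just above the highest genuine score, and the remaining detection rate is reported.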
Identification of Programmers from Typing Patterns
Being able to identify the user of a computer solely based on their typing patterns can lead to improvements in plagiarism detection, provide new opportunities for authentication, and enable novel guidance methods in tutoring systems. At the same time, if such identification is possible, new privacy and ethical concerns arise. In our work, we explore methods for identifying individuals from typing data captured by a programming environment as these individuals are learning to program. We compare the identification accuracy of automatically generated user profiles, ranging from the average amount of time that a user needs between keystrokes to the amount of time that it takes for the user to press specific pairs of keys (digraphs). We also explore the effect of data quantity and different acceptance thresholds on the identification accuracy, and analyze how the accuracy changes when identifying individuals across courses. Our results show that, while the identification accuracy varies depending on data quantity and the method, identification of users based on their programming data is possible. These results indicate that there is potential in using this method, for example, in identification of students taking exams, and that such data raises privacy concerns that should be addressed. Peer reviewed.
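A digraph-based profile of the kind described above can be sketched as follows. This is an illustrative sketch, not the authors' method: the event format (key, press time in milliseconds), the distance measure (mean absolute difference over shared digraphs), and the names `digraph_profile`, `profile_distance`, and `identify` are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def digraph_profile(events):
    """Map each consecutive key pair to its mean press-to-press latency.

    events: list of (key, press_time_ms) tuples in typing order (assumed format).
    """
    latencies = defaultdict(list)
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        latencies[(k1, k2)].append(t2 - t1)
    return {pair: mean(vals) for pair, vals in latencies.items()}


def profile_distance(a, b):
    """Mean absolute latency difference over digraphs present in both profiles."""
    shared = a.keys() & b.keys()
    if not shared:
        return float("inf")
    return mean(abs(a[p] - b[p]) for p in shared)


def identify(sample, known_profiles, threshold_ms):
    """Return the closest known user, or None if no profile is within the threshold."""
    best_user, best_dist = None, float("inf")
    for user, profile in known_profiles.items():
        d = profile_distance(sample, profile)
        if d < best_dist:
            best_user, best_dist = user, d
    return best_user if best_dist <= threshold_ms else None


# Toy data, invented for illustration only.
known = {
    "alice": digraph_profile([("i", 0), ("f", 100), (" ", 220)]),
    "bob": digraph_profile([("i", 0), ("f", 200), (" ", 450)]),
}
sample = digraph_profile([("i", 0), ("f", 110), (" ", 225)])
print(identify(sample, known, threshold_ms=30))
```

The `threshold_ms` parameter plays the role of an acceptance threshold: lowering it makes the sketch refuse to name anyone unless the match is close.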
Continuous User Authentication Using Multi-Modal Biometrics
It is commonly acknowledged that mobile devices now form an integral part of an individual's everyday life. Modern mobile handheld devices are capable of providing a wide range of services and applications over multiple networks. With this increasing capability and accessibility, they introduce additional demands in terms of security.
This thesis explores the need for authentication on mobile devices and proposes a novel mechanism to improve the current techniques. The research begins with an intensive review of mobile technologies and the current security challenges that mobile devices experience, to illustrate the imperative of authentication on mobile devices. The research then highlights the existing authentication mechanisms and their wide range of weaknesses. To this end, biometric approaches are identified as an appropriate solution and an opportunity for security to be maintained beyond point-of-entry. Indeed, by utilising behavioural biometric techniques, authentication can be performed in a continuous and transparent fashion.
This research investigated three behavioural biometric techniques based on SMS texting activities and messages, looking to apply these techniques as a multi-modal biometric authentication method for mobile devices. The results showed that linguistic profiling, keystroke dynamics and behaviour profiling can be used to discriminate users with overall Equal Error Rates (EER) of 12.8%, 20.8% and 9.2% respectively. The results also showed clearly that a combination of biometrics classifies better than any single biometric technique, achieving an EER of 3.3%. Based on these findings, a novel architecture of multi-modal biometric authentication on mobile devices is proposed. The framework is able to provide robust, continuous and transparent authentication in standalone and server-client modes regardless of mobile hardware configuration. The framework is able to continuously maintain the security status of the device. With a high security status, users are permitted to access sensitive services and data. With a low security status, on the other hand, users are required to re-authenticate before accessing sensitive services or data.
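For readers unfamiliar with the Equal Error Rate, the following sketch shows one simple way to approximate it from two lists of match scores. It assumes higher scores mean "more likely the genuine user" and scans only the observed scores as candidate thresholds; the function name and toy scores are hypothetical, and the thesis may compute EER differently.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by scanning observed scores as thresholds.

    Assumption: a sample is accepted when score >= threshold.
    FRR = share of genuine samples rejected, FAR = share of impostors accepted.
    The EER is taken at the threshold where FAR and FRR are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(genuine, impostor))
```

A lower EER means the genuine and impostor score distributions overlap less, which is why the reported drop from single-technique EERs to 3.3% for the combination indicates better separation.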
Keystroke dynamics as signal for shallow syntactic parsing
Keystroke dynamics have been extensively used in psycholinguistic and writing
research to gain insights into cognitive processing. But do keystroke logs
contain actual signal that can be used to learn better natural language
processing models?
We postulate that keystroke dynamics contain information about syntactic
structure that can inform shallow syntactic parsing. To test this hypothesis,
we explore labels derived from keystroke logs as an auxiliary task in a multi-task
bidirectional Long Short-Term Memory (bi-LSTM). Our approach shows promising
results on two shallow syntactic parsing tasks, chunking and CCG supertagging.
Our model is simple, has the advantage that data can come from distinct
sources, and produces models that are significantly better than models trained
on the text annotations alone.
Comment: In COLING 201
Privacy versus Information in Keystroke Latency Data
The computer science education research field studies how students learn computer science related concepts such as programming and algorithms. One of the major goals of the field is to help students learn CS concepts that are often difficult to grasp because students rarely encounter them in primary or secondary education. In order to help struggling students, information on the learning process of students has to be collected. In many introductory programming courses process data is automatically collected in the form of source code snapshots. Source code snapshots usually include at least the source code of the student's program and a timestamp. Studies ranging from identifying at-risk students to inferring programming experience and topic knowledge have been conducted using source code snapshots.
However, replicating source code snapshot-based studies is currently hard as data is rarely shared due to privacy concerns. Source code snapshot data often includes many attributes that can be used for identification, for example the name of the student or the student number. There can even be hidden identifiers in the data that allow identification after the obvious identifiers are removed. For example, keystroke data from source code snapshots can be used for identification based on the distinct typing profiles of students. Hence, simply removing explicit identifiers such as names and student numbers is not enough to protect the privacy of the users who have supplied the data. At the same time, removing all keystroke data would decrease the value of the data significantly and possibly preclude replication studies.
In this work, we investigate how keystroke data from a programming context could be modified to prevent keystroke latency-based identification whilst still retaining valuable information in the data. This study is the first step in enabling the sharing of anonymized source code snapshots. We investigate the degree of anonymization required to make identification of students based on their typing patterns unreliable. Then, we study whether the modified keystroke data can still be used to infer the programming experience of the students, as a case study of whether the anonymized typing patterns have retained at least some informative value. We show that it is possible to modify data so that keystroke latency-based identification is no longer accurate, but the programming experience of the students can still be inferred, i.e. the data still has value to researchers.
Biometrics
Biometrics - Unique and Diverse Applications in Nature, Science, and Technology provides a unique sampling of the diverse ways in which biometrics is integrated into our lives and our technology. From time immemorial, we as humans have been intrigued by, perplexed by, and entertained by observing and analyzing ourselves and the natural world around us. Science and technology have evolved to a point where we can empirically record a measure of a biological or behavioral feature and use it for recognizing patterns, trends, and/or discrete phenomena, such as individuals, and this is what biometrics is all about. This book is about understanding some of the ways in which we use biometrics and for what specific purposes.
Ranking to Learn and Learning to Rank: On the Role of Ranking in Pattern Recognition Applications
The last decade has seen a revolution in the theory and application of
machine learning and pattern recognition. Through these advancements, variable
ranking has emerged as an active and growing research area and it is now
beginning to be applied to many new problems. The rationale behind this fact is
that many pattern recognition problems are by nature ranking problems. The main
objective of a ranking algorithm is to sort objects according to some criteria,
so that the most relevant items appear early in the produced result list.
Ranking methods can be analyzed from two different methodological perspectives:
ranking to learn and learning to rank. The former aims at studying methods and
techniques to sort objects for improving the accuracy of a machine learning
model. Enhancing a model's performance can be challenging at times. For example,
in pattern classification tasks, different data representations can complicate
and hide the different explanatory factors of variation behind the data. In
particular, hand-crafted features contain many cues that are either redundant
or irrelevant, which reduces the overall accuracy of the classifier.
In such cases, feature selection is used: by producing ranked lists of
features, it helps to filter out the unwanted information. Moreover, in real-time
systems (e.g., visual trackers) ranking approaches are used as optimization
procedures which improve the robustness of the system that deals with the high
variability of the image streams that change over time. Conversely,
learning to rank is necessary in the construction of ranking models for
information retrieval, biometric authentication, re-identification, and
recommender systems. In this context, the ranking model's purpose is to sort
objects according to their degrees of relevance, importance, or preference as
defined in the specific application.
Comment: European PhD Thesis. arXiv admin note: text overlap with
arXiv:1601.06615, arXiv:1505.06821, arXiv:1704.02665 by other authors
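The "ranking to learn" idea in this abstract, ranking features so that redundant or irrelevant ones can be filtered out, can be illustrated with a minimal sketch. The scoring rule used here (gap between class means divided by the summed class spreads, for binary labels) is an assumption chosen for simplicity and is not the method developed in the thesis; `rank_features` and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def rank_features(rows, labels):
    """Return feature indices sorted from most to least discriminative.

    rows: list of equal-length numeric feature vectors.
    labels: list of 0/1 class labels, one per row (binary case assumed).
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-9  # guard against division by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data, invented for illustration: feature 0 separates the classes well,
# feature 1 only slightly, feature 2 not at all.
rows = [[1.0, 5.0, 0.3], [1.2, 5.1, 0.9], [3.0, 5.0, 0.4], [3.1, 4.9, 0.8]]
labels = [0, 0, 1, 1]

ranking = rank_features(rows, labels)
top_two = ranking[:2]  # keep only the two highest-ranked features
print(ranking, top_two)
```

Keeping only the top-ranked features is the filtering step; a classifier would then be trained on those columns alone.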