1,146 research outputs found

    A Study of Keystroke Data in Two Contexts : Written Language and Programming Language Influence Predictability of Learning Outcomes

    We study programming process data from two introductory programming courses. The two course contexts differ in programming language, teaching approach, and spoken language. In both courses, students' keystroke data (timestamps and the pressed keys) are recorded as students work on programming assignments. We study how the keystroke data differ between the contexts, and whether research on predicting course outcomes using keystroke latencies generalizes to other contexts. Our results show that there are differences between the contexts in terms of frequently used keys, which can be partially explained by the differences between the spoken languages and the programming languages. Further, our results suggest that programming process data that can be collected non-intrusively in situ can be used for predicting course outcomes in multiple contexts. The predictive power, however, varies between contexts, possibly because the frequently used keys differ between programming languages and spoken languages. Thus, context-specific fine-tuning of predictive models may be needed. Peer reviewe

    Keystroke Dynamics as Part of Lifelogging

    In this paper we present the case for including keystroke dynamics in lifelogging. We describe how we have used a simple keystroke logging application called Loggerman to create a dataset of longitudinal keystroke timing data spanning a period of more than 6 months for 4 participants. We perform a detailed analysis of this data by examining the timing information associated with bigrams, or pairs of adjacently typed alphabetic characters. We show that for some participants there is very little day-on-day variation in keystroke timing among the top-200 bigrams, while for others there is a lot, and that this correlates with the amount of typing each participant does on a daily basis. We explore how daily variations could correlate with the sleep score from the previous night but find no significant relationship between the two. Finally, we describe the public release of this data as well as a series of pointers for future work, including correlating keystroke dynamics with mood and fatigue during the day. Comment: Accepted to 27th International Conference on Multimedia Modeling, Prague, Czech Republic, June 202
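
    The bigram timing analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the input format (a list of `(timestamp_ms, char)` pairs in typing order), the function name `bigram_mean_latencies`, and the use of press-to-press time as the bigram timing are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def bigram_mean_latencies(keystrokes):
    """Return the mean press-to-press latency (ms) for each alphabetic bigram.

    `keystrokes` is an assumed format: a list of (timestamp_ms, char) pairs
    in the order the keys were typed.
    """
    samples = defaultdict(list)
    # Pair every keystroke with the one typed immediately after it.
    for (t1, c1), (t2, c2) in zip(keystrokes, keystrokes[1:]):
        # Only adjacent alphabetic characters count as a bigram.
        if c1.isalpha() and c2.isalpha():
            samples[(c1 + c2).lower()].append(t2 - t1)
    return {bigram: mean(times) for bigram, times in samples.items()}


if __name__ == "__main__":
    typed = [(0, "t"), (120, "h"), (200, "e"), (300, " "), (400, "t"), (500, "h")]
    print(bigram_mean_latencies(typed))
```

    Computing one such table per day and comparing the tables across days would be one way to look at the day-on-day variation the abstract mentions.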

    Continuous multibiometric authentication for online exam with machine learning

    Multibiometric authentication has received great attention over the past decades with the growing demand for robust authentication systems. A continuous authentication system verifies a user continuously after the person has logged in, in order to prevent impersonation by intruders. In this study, we propose a continuous multibiometric authentication system for identifying a person during an online exam using two modalities, face recognition and keystrokes. Each modality is processed separately to generate matching scores, and fusion is performed at the score level to improve accuracy. The EigenFace approach and a support vector machine (SVM) are applied to facial recognition and keystroke dynamics, respectively. The matching score calculated from each modality is combined using the classification by the decision tree with the weighted sum after the score is split into three zones of interes
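
    Score-level fusion with a weighted sum can be sketched as follows. This is an illustration under stated assumptions, not the system from the paper: the scores are assumed to be already normalized to [0, 1], the weight `face_weight` and the thresholds `low` and `high` are hypothetical values, and reading the "three zones" as reject, uncertain, and accept is a guess. The decision tree step from the abstract is not reproduced.

```python
def fuse_scores(face_score, keystroke_score, face_weight=0.6):
    """Combine two matching scores in [0, 1] with a weighted sum.

    face_weight is a hypothetical value, not one reported in the abstract.
    """
    for value in (face_score, keystroke_score, face_weight):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores and weight must lie in [0, 1]")
    return face_weight * face_score + (1.0 - face_weight) * keystroke_score


def zone(fused_score, low=0.4, high=0.7):
    """Map a fused score to one of three zones (thresholds are assumptions)."""
    if fused_score >= high:
        return "accept"
    if fused_score < low:
        return "reject"
    return "uncertain"


if __name__ == "__main__":
    fused = fuse_scores(face_score=0.9, keystroke_score=0.5)
    print(fused, zone(fused))
```

    In a continuous setting, a score falling in the middle zone could trigger another verification round instead of an immediate decision; that policy is likewise an assumption here.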

    An investigation of the predictability of the Brazilian three-modal hand-based behavioural biometric: a feature selection and feature-fusion approach

    Abstract: New security systems, methods or techniques need to have their performance evaluated in conditions that closely resemble a real-life situation. The effectiveness with which individual identity can be predicted in different scenarios can benefit from seeking a broad base of identity evidence. Many approaches to the implementation of biometric-based identification systems are possible, and different configurations are likely to generate significantly different operational characteristics. The choice of implementation structure is, therefore, very dependent on the performance criteria that are most important in any particular task scenario. The issue of improving performance can be addressed in many ways, but system configurations based on integrating different information sources are widely adopted in order to achieve this. Thus, understanding how each information source can influence performance is very important. The use of similar modalities may imply that we can use the same features. However, there is no indication that very similar basic biometrics (such as keyboard and touch keystroke dynamics) will perform well using the same set of features. In this paper, we evaluate the merits of using a three-modal hand-based biometric database for user prediction, focusing on feature selection as the main investigation point. To the best of our knowledge, this is the first thought-out analysis of a database with three modalities that were collected from the same users, containing keyboard keystroke, touch keystroke and handwritten signature. First, we investigate how the keystroke modalities perform, and then we add the signature in order to understand whether there is any improvement in the results. We have used a wide range of techniques for feature selection, including filters and wrappers (genetic algorithms), and we have validated our findings using a clustering technique

    Analysis of Cloud Based Keystroke Dynamics for Behavioral Biometrics Using Multiclass Machine Learning

    With the rapid proliferation of interconnected devices and the exponential growth of data stored in the cloud, the potential attack surface for cybercriminals expands significantly. Behavioral biometrics provide an additional layer of security by enabling continuous authentication and real-time monitoring. Their continuous and dynamic nature offers enhanced security, as they analyze an individual's unique behavioral patterns in real time. In this study, we utilized a dataset consisting of 90 users' attempts to type the 11-character string 'Exponential' eight times. Each attempt was recorded in the cloud with timestamps for key press and release events, aligned with the initial key press. The objective was to explore the potential of keystroke dynamics for user authentication. Various features were extracted from the dataset, categorized into tiers. Tier-0 features included key-press time and key-release time, while Tier-1 derived features encompassed durations, latencies, and digraphs. Additionally, Tier-2 statistical measures such as maximum, minimum, and mean values were calculated. The performance of three popular multiclass machine learning models, namely Decision Tree, Multi-layer Perceptron, and LightGBM, was evaluated using these features. The results indicated that incorporating Tier-1 and Tier-2 features significantly improved the models' performance compared to relying solely on Tier-0 features. The inclusion of Tier-1 and Tier-2 features allows the models to capture more nuanced patterns and relationships in the keystroke data. While Decision Trees provide a baseline, Multi-layer Perceptron and LightGBM outperform them by effectively capturing complex relationships. In particular, LightGBM excels in leveraging information from all features, resulting in the highest level of explanatory power and prediction accuracy. This highlights the importance of capturing both local and higher-level patterns in keystroke data to accurately authenticate users
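
    The tiered feature extraction described in this abstract might look roughly like the sketch below. The abstract does not define the features precisely, so the following are assumptions: each attempt is a list of `(key, press_time, release_time)` tuples, a duration is release minus press for one key, a latency is the gap from one key's release to the next key's press, and a digraph time is the press-to-press interval. The function name `tiered_features` is hypothetical.

```python
from statistics import mean


def tiered_features(events):
    """Compute Tier-0, Tier-1 and Tier-2 features for one typing attempt.

    `events` is an assumed format: (key, press_time, release_time) tuples
    ordered by press time. Feature definitions are illustrative assumptions.
    """
    if not events:
        raise ValueError("need at least one key event")
    first_press = events[0][1]
    # Tier-0: raw press/release times, aligned with the initial key press.
    tier0 = [(key, press - first_press, release - first_press)
             for key, press, release in events]
    pairs = list(zip(tier0, tier0[1:]))
    # Tier-1: features derived from the aligned times.
    tier1 = {
        "durations": [release - press for _, press, release in tier0],
        "latencies": [nxt[1] - cur[2] for cur, nxt in pairs],  # release to next press
        "digraphs": [nxt[1] - cur[1] for cur, nxt in pairs],   # press to next press
    }
    # Tier-2: summary statistics over each non-empty Tier-1 feature list.
    tier2 = {
        name: {"min": min(values), "max": max(values), "mean": mean(values)}
        for name, values in tier1.items() if values
    }
    return tier0, tier1, tier2


if __name__ == "__main__":
    attempt = [("E", 100, 180), ("x", 250, 310), ("p", 400, 470)]
    for tier in tiered_features(attempt):
        print(tier)
```

    Flattening the three tiers into a single numeric vector per attempt would give the kind of input a multiclass classifier expects; how the authors actually assembled their feature vectors is not stated in the abstract.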

    Age Detection Through Keystroke Dynamics From User Authentication Failures

    In this paper, an incident response approach is proposed for handling detections of authentication failures in systems that employ dynamic biometric authentication, and more specifically keystroke user recognition. The main component of the approach is a multilayer perceptron focusing on the age classification of a user. Empirical findings show that the classifier can detect the age of the subject with a probability that differs markedly from uniform random guessing, making the proposed method suitable for providing supporting yet circumstantial evidence during e-discovery