17 research outputs found

    Predicting Authorship and Author Traits from Keystroke Dynamics

    A Study of Keystroke Data in Two Contexts : Written Language and Programming Language Influence Predictability of Learning Outcomes

    We study programming process data from two introductory programming courses. Between the course contexts, the programming languages differ, the teaching approaches differ, and the spoken languages differ. In both courses, students' keystroke data -- timestamps and the pressed keys -- are recorded as students work on programming assignments. We study how the keystroke data differs between the contexts, and whether research on predicting course outcomes using keystroke latencies generalizes to other contexts. Our results show that there are differences between the contexts in terms of frequently used keys, which can be partially explained by the differences between the spoken languages and the programming languages. Further, our results suggest that programming process data that can be collected non-intrusively in situ can be used for predicting course outcomes in multiple contexts. The predictive power, however, varies between contexts, possibly because the frequently used keys differ between programming languages and spoken languages. Thus, context-specific fine-tuning of predictive models may be needed. Peer reviewed.

    Utilizing Linguistic Context To Improve Individual and Cohort Identification in Typed Text

    The process of producing written text is complex and constrained by pressures that range from physical to psychological. In a series of three sets of experiments, this thesis demonstrates the effects of linguistic context on the timing patterns of the production of keystrokes. We elucidate the effect of linguistic context at three different levels of granularity: The first set of experiments illustrates how the nontraditional syntax of a single linguistic construct, the multi-word expression, can create significant changes in keystroke production patterns. This set of experiments is followed by a set of experiments that test the hypothesis on the entire linguistic output of an individual. By taking into account linguistic context, we are able to create more informative feature-sets, and utilize these to improve the accuracy of keystroke dynamics-based user authentication. Finally, we extend our findings to entire populations, or demographic cohorts. We show that typing patterns can be used to predict a group's gender, native language and dominant hand. In addition, keystroke patterns can shed light on the cognitive complexity of a task that a typist is engaged in. The findings of these experiments have far-reaching implications for linguists, cognitive scientists, computer security researchers and social scientists.

    Real-world keystroke dynamics are a potentially valid biomarker for clinical disability in multiple sclerosis

    Background: Clinical measures in multiple sclerosis (MS) face limitations that may be overcome by utilising smartphone keyboard interactions acquired continuously and remotely during regular typing. Objective: The aim of this study was to determine the reliability and validity of keystroke dynamics to assess clinical aspects of MS. Methods: In total, 102 MS patients and 24 controls were included in this observational study. Keyboard interactions were obtained with the Neurokeys keyboard app. Eight timing-related keystroke features were assessed for reliability with intraclass correlation coefficients (ICCs); construct validity by analysing group differences (in fatigue, gadolinium-enhancing lesions on magnetic resonance imaging (MRI), and patients vs controls); and concurrent validity by correlating with disability measures. Results: Reliability was moderate in two (ICC = 0.601 and 0.742) and good to excellent in the remaining six features (ICC = 0.760–0.965). Patients had significantly higher keystroke latencies than controls. Latency between key presses correlated the highest with Expanded Disability Status Scale (r = 0.407) and latency between key releases with Nine-Hole Peg Test and Symbol Digit Modalities Test (ρ = 0.503 and r = −0.553, respectively), ps < 0.001. Conclusion: Keystroke dynamics were reliable, distinguished patients and controls, and were associated with clinical disability measures. Consequently, keystroke dynamics are a promising valid surrogate marker for clinical disability in MS.

    On the Inference of Soft Biometrics from Typing Patterns Collected in a Multi-device Environment

    In this paper, we study the inference of gender, major/minor (computer science, non-computer science), typing style, age, and height from the typing patterns collected from 117 individuals in a multi-device environment. The inference of the first three identifiers was considered as classification tasks, while the rest as regression tasks. For classification tasks, we benchmark the performance of six classical machine learning (ML) and four deep learning (DL) classifiers. On the other hand, for regression tasks, we evaluated three ML and four DL-based regressors. The overall experiment consisted of two text-entry (free and fixed) and four device (Desktop, Tablet, Phone, and Combined) configurations. The best arrangements achieved accuracies of 96.15%, 93.02%, and 87.80% for typing style, gender, and major/minor, respectively, and mean absolute errors of 1.77 years and 2.65 inches for age and height, respectively. The results are promising considering the variety of application scenarios that we have listed in this work. Comment: The first two authors contributed equally. The code is available upon request. Please contact the last author.

    User Attribution Through Keystroke Dynamics-Based Author Age Estimation

    Keystroke dynamics analysis has often been used in user authentication. In this work, it is used to classify users according to their age. The authors have extended their previous research in which they managed to identify the age group that a user belongs to with an accuracy of 66.1%. The main changes made were the use of a larger dataset, which resulted from a new volunteer recording phase, the exploitation of more keystroke dynamics features, and the use of a procedure for selecting those features that can best distinguish users according to their age. Five machine learning models were used for the classification, and their performance in relation to the number of features involved was tested. As a result of these changes in the research method, an improvement in the performance of the proposed system has been achieved. The accuracy of the improved system is 89.7%.

    Social and Linguistic Behavior and its Correlation to Trait Empathy

    A growing body of research exploits social media behaviors to gauge psychological characteristics, though trait empathy has received little attention. Because of its intimate link to the ability to relate to others, our research aims to predict participants’ levels of empathy, given their textual and friending behaviors on Facebook. Using Poisson regression, we compared the variance explained in Davis’ Interpersonal Reactivity Index (IRI) scores on four constructs (empathic concern, personal distress, fantasy, perspective taking), by two classes of variables: 1) post content and 2) linguistic style. Our study lays the groundwork for a greater understanding of empathy’s role in facilitating interactions on social media.

    In a split second : Handwriting pauses in typical and struggling writers

    Funding: This research was supported by Spanish grants 2015ACUP 00175 and PID2019-108791GA-I00, awarded to NS. A two-second threshold has been typically used when analyzing the writing processes. However, there is only a weak empirical basis to claim that specific average numbers and durations of pauses may be associated with specific writing processes. We focused on handwriting execution pauses, because immature writers are known to struggle with transcription skills. We aimed to provide an evidence-based account of the average number and duration of handwriting pauses in the mid-Primary grades and to identify process-level markers of writing difficulties. Eighty 3rd and 5th graders, with and without writing difficulties, participated in the study. We examined pauses in a handwriting-only task, to be able to isolate those which could only be attributed to handwriting processes. Letter features were considered, as well as children's handwriting fluency level. The average duration of handwriting pauses was around 400 ms, in line with assumptions that transcription pauses would fall under the 2,000 ms threshold. We found that 3rd graders made more and longer pauses than 5th graders. Struggling writers made a similar number of pauses across grades to typically developing children, although their pauses were significantly longer, even after controlling for the effect of handwriting fluency. Our findings provide an evidence-based account of the duration of handwriting pauses. They also suggest that children need fewer and shorter handwriting pauses as they progress in automatizing transcription. However, some young writers struggle with letter formation even after 3 to 5 years of instruction.