    Preventing Keystroke Based Identification in Open Data Sets

    Large-scale courses such as Massive Open Online Courses (MOOCs) can be a great data source for researchers. Ideally, the data gathered on such courses should be openly available to all researchers: studies could be easily replicated, and novel studies could be conducted on existing data. However, very fine-grained data such as source code snapshots can contain hidden identifiers. For example, distinct typing patterns that identify individuals can be extracted from such data. Hence, simply removing explicit identifiers such as names and student numbers is not sufficient to protect the privacy of the users who have supplied the data. At the same time, removing all keystroke information would significantly decrease the value of the shared data. In this work, we study how keystroke data from a programming context could be modified to prevent keystroke-latency-based identification whilst still retaining information that can be used to, e.g., infer programming experience. We investigate the degree of anonymization required to render identification of students based on their typing patterns unreliable. Then, we study whether the modified keystroke data can still be used to infer the programming experience of the students, as a case study of whether the anonymized typing patterns retain at least some informative value. We show that it is possible to modify the data so that keystroke-latency-based identification is no longer accurate, while the programming experience of the students can still be inferred, i.e. the data still has value to researchers. In a broader context, our results indicate that information and anonymity are not necessarily mutually exclusive.
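    The abstract does not spell out how the latencies are modified. As a minimal illustrative sketch (the jitter-plus-binning scheme and all names and parameters below are assumptions, not the authors' method), one way to blunt latency-based identification while keeping coarse speed statistics is:

        import random

        def anonymize_latencies(latencies_ms, bin_ms=100, jitter_ms=50, seed=None):
            """Perturb and coarsen inter-keystroke intervals (milliseconds).

            Fine-grained rhythm, which can identify a typist, is destroyed,
            while coarse aggregates such as mean latency survive and can
            still serve as a rough proxy for typing fluency or experience.
            """
            rng = random.Random(seed)
            result = []
            for latency in latencies_ms:
                noisy = latency + rng.uniform(-jitter_ms, jitter_ms)
                result.append(max(0, round(noisy / bin_ms) * bin_ms))
            return result

        print(anonymize_latencies([142, 97, 388, 120, 205], seed=42))

    The trade-off studied in the paper corresponds to the bin_ms and jitter_ms parameters here: coarser bins and stronger jitter give more anonymity but erode the remaining informative value.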

    Effects of Automated Interventions in Programming Assignments: Evidence from a Field Experiment

    A typical problem in MOOCs is that course instructors have no opportunity to individually support students in overcoming their problems and misconceptions. This paper presents the results of automatically intervening when students struggle during programming exercises, offering peer feedback and tailored bonus exercises. To improve learning success, we do not want to abolish instructionally desirable trial and error, but rather reduce extensive struggle and demotivation. We therefore developed adaptive, automatic just-in-time interventions that encourage students to ask for help if they require considerably more than the average working time to solve an exercise. Additionally, we offered students bonus exercises tailored to their individual weaknesses. The approach was evaluated in a live course with over 5,000 active students, via a survey and metrics gathered alongside it. Results show that we can increase calls for help by up to 66% and lower the time students dwell on a problem before taking action. Lessons from the experiments can further be used to pinpoint course material that needs improvement and to tailor content to a specific audience.
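    The abstract leaves the trigger unspecified beyond "considerably more than average working time"; a minimal sketch of such a just-in-time trigger (the threshold factor and all names below are illustrative assumptions, not the paper's implementation) could look like:

        from dataclasses import dataclass

        @dataclass
        class ExerciseStats:
            mean_seconds: float  # average working time across all students

        def should_intervene(student_seconds: float, stats: ExerciseStats,
                             factor: float = 2.0) -> bool:
            """Offer help once a student has spent considerably more than
            the average working time on an exercise (factor is assumed)."""
            return student_seconds > factor * stats.mean_seconds

        stats = ExerciseStats(mean_seconds=600)  # e.g. 10 minutes on average
        if should_intervene(1500, stats):
            print("Intervene: offer help, peer feedback, or a bonus exercise")

    In practice the threshold would be tuned per exercise, since working-time distributions differ widely between easy and hard tasks.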

    Privacy-Protecting Techniques for Behavioral Data: A Survey

    Our behavior (the way we talk, walk, or think) is unique and can be used as a biometric trait. It also correlates with sensitive attributes such as emotions. Hence, techniques to protect individuals' privacy against unwanted inferences are required. To consolidate knowledge in this area, we systematically reviewed applicable anonymization techniques. We taxonomize and compare existing solutions regarding privacy goals, conceptual operation, advantages, and limitations. Our analysis shows that some behavioral traits (e.g., voice) have received much attention, while others (e.g., eye gaze, brainwaves) are mostly neglected. We also find that the evaluation methodology of behavioral anonymization techniques can be further improved.
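    One common family of techniques in this literature is noise perturbation of extracted behavioral features. The sketch below is purely illustrative (the feature vector, Laplace scale, and function names are assumptions, not drawn from the survey):

        import numpy as np

        def perturb_features(features, scale=0.1, rng=None):
            """Add Laplace noise to a behavioral feature vector (e.g. gait
            or gaze statistics) to obscure individual-specific detail.

            Larger scale means stronger protection but lower utility.
            """
            rng = rng or np.random.default_rng()
            noise = rng.laplace(loc=0.0, scale=scale, size=len(features))
            return np.asarray(features, dtype=float) + noise

        # Example: blur a hypothetical 4-dimensional gaze feature vector.
        print(perturb_features([0.31, 1.20, 0.05, 2.40], scale=0.2))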