253 research outputs found

    BeCAPTCHA: Behavioral bot detection using touchscreen and mobile sensors benchmarked on HuMIdb

    In this paper we study the suitability of a new generation of CAPTCHA methods based on smartphone interactions. The heterogeneous flow of data generated during the interaction with the smartphones can be used to model human behavior when interacting with the technology and improve bot detection algorithms. For this, we propose BeCAPTCHA, a CAPTCHA method based on the analysis of the touchscreen information obtained during a single drag and drop task in combination with the accelerometer data. The goal of BeCAPTCHA is to determine whether the drag and drop task was performed by a human or a bot. We evaluate the method by generating fake samples synthesized with Generative Adversarial Neural Networks and handcrafted methods. Our results suggest the potential of mobile sensors to characterize human behavior and develop a new generation of CAPTCHAs. The experiments are evaluated with HuMIdb (Human Mobile Interaction database), a novel multimodal mobile database that comprises 14 mobile sensors acquired from 600 users. HuMIdb is freely available to the research community. This work has been supported by projects: PRIMA, Spain (H2020-MSCA-ITN-2019-860315), TRESPASS-ETN, Spain (H2020-MSCA-ITN-2019-860813), BIBECA RTI2018-101248-B-I00 (MINECO/FEDER), and BioGuard, Spain (Ayudas Fundación BBVA a Equipos de Investigación Científica 2017). Spanish Patent Application P20203006

    Vulnerability analysis of cyber-behavioral biometric authentication

    Research on cyber-behavioral biometric authentication has traditionally assumed naïve (or zero-effort) impostors who make no attempt to generate sophisticated forgeries of biometric samples. Given the plethora of adversarial technologies on the Internet, it is questionable whether the zero-effort threat model provides a realistic estimate of how these authentication systems would perform in the face of adversity. To better evaluate the efficiency of these authentication systems, there is a need for research on algorithmic attacks which simulate state-of-the-art threats. To tackle this problem, we took the case of keystroke and touch-based authentication and developed a new family of algorithmic attacks which leverage the intrinsic instability and variability exhibited by users' behavioral biometric patterns. For both fixed-text (or password-based) keystroke and continuous touch-based authentication, we: 1) Used a wide range of pattern analysis and statistical techniques to examine large repositories of biometrics data for weaknesses that could be exploited by adversaries to break these systems, 2) Designed algorithmic attacks whose mechanisms hinge on the discovered weaknesses, and 3) Rigorously analyzed the impact of the attacks on the best verification algorithms in the respective research domains. When launched against three high performance password-based keystroke verification systems, our attacks increased the mean Equal Error Rates (EERs) of the systems by between 28.6% and 84.4% relative to the traditional zero-effort attack. For the touch-based authentication system, the attacks performed even better, as they increased the system's mean EER by between 338.8% and 1535.6% depending on parameters such as the failure-to-enroll threshold and the type of touch gesture subjected to attack.
For both keystroke and touch-based authentication, we found that there was a small proportion of users who saw considerably greater performance degradation than others as a result of the attack. There was also a subset of users who were completely immune to the attacks. Our work exposes a previously unexplored weakness of keystroke and touch-based authentication and opens the door to the design of behavioral biometric systems which are resistant to statistical attacks
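
The abstract above reports results as relative increases in Equal Error Rate. As a rough illustration only (this is not the authors' evaluation code), the sketch below shows one common way to approximate an EER from verification scores and to express an attack's effect as a relative increase. It assumes similarity scores where higher means "more likely genuine"; the function names and the toy scores are hypothetical.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumes similarity scores: a sample is accepted when score >= threshold.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best = None
    for t in thresholds:
        # False rejection rate: genuine samples scoring below the threshold.
        frr = sum(g < t for g in genuine) / len(genuine)
        # False acceptance rate: impostor samples scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Keep the operating point where FAR and FRR are closest.
            best = (gap, (far + frr) / 2.0)
    return best[1]


def relative_increase(baseline_eer, attack_eer):
    """Relative EER increase in percent, compared with a baseline."""
    return 100.0 * (attack_eer - baseline_eer) / baseline_eer


# Toy scores, invented for illustration (not data from the paper).
genuine_scores = [0.2, 0.6, 0.7, 0.8]
zero_effort_scores = [0.1, 0.3, 0.4, 0.65]
print(equal_error_rate(genuine_scores, zero_effort_scores))
```

Comparing the EER obtained with zero-effort impostor scores against the EER obtained with attack-generated scores, via `relative_increase`, gives a figure of the same kind as the percentages quoted in the abstract.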

    My Behavior is my Privacy & Secure Password !

    Many studies propose strong user authentication based on biometric modalities. However, they often assume a trusted component, are modality-dependent, use only one biometric modality, are reversible, or do not enable the service to adapt the security on-the-fly. A recent work [1] introduced the concept of Personal Identity Code Respecting Privacy (PICRP), a non-cryptographic and non-reversible signature computed from any arbitrary information. In this paper, we extend this concept with the use of Keystroke Dynamics, IP and GPS geo-location by optimizing the pre-processing and merging of collected information. We demonstrate the performance of the proposed approach through experimental results and we present an example of its usage

    Adversarial Attacks on Remote User Authentication Using Behavioural Mouse Dynamics

    Mouse dynamics is a potential means of authenticating users. Typically, the authentication process is based on classical machine learning techniques, but recently, deep learning techniques have been introduced for this purpose. Although prior research has demonstrated how machine learning and deep learning algorithms can be bypassed by carefully crafted adversarial samples, there has been very little research performed on the topic of behavioural biometrics in the adversarial domain. In an attempt to address this gap, we built a set of attacks, which are applications of several generative approaches, to construct adversarial mouse trajectories that bypass authentication models. These generated mouse sequences will serve as the adversarial samples in the context of our experiments. We also present an analysis of the attack approaches we explored, explaining their limitations. In contrast to previous work, we consider the attacks in a more realistic and challenging setting in which an attacker has access to recorded user data but does not have access to the authentication model or its outputs. We explore three different attack strategies: 1) statistics-based, 2) imitation-based, and 3) surrogate-based; we show that they are able to evade the functionality of the authentication models, thereby impacting their robustness adversely. We show that imitation-based attacks often perform better than surrogate-based attacks, unless, however, the attacker can guess the architecture of the authentication model. In such cases, we propose a potential detection mechanism against surrogate-based attacks. Comment: Accepted in 2019 International Joint Conference on Neural Networks (IJCNN). Update of DO

    2023 SDSU Data Science Symposium Presentation Abstracts

    This document contains abstracts for presentations and posters at the 2023 SDSU Data Science Symposium

    USER AUTHENTICATION ACROSS DEVICES, MODALITIES AND REPRESENTATION: BEHAVIORAL BIOMETRIC METHODS

    Biometrics eliminate the need for a person to remember and reproduce complex secret information or carry additional hardware in order to authenticate oneself. Behavioral biometrics is a branch of biometrics that focuses on using a person's behavior or way of doing a task as a means of authentication. These tasks can be any common, day-to-day tasks like walking, sleeping, talking, typing and so on. As interactions with computers and other smart devices like phones and tablets have become an essential part of modern life, a person's style of interaction with them can be used as a powerful means of behavioral biometrics. In this dissertation, we present insights from the analysis of our proposed set of context-sensitive or word-specific keystroke features on desktop, tablet and phone. We show that the conventional features are not highly discriminatory on desktops and are only marginally better on hand-held devices for user identification. By using information about the context, our proposed word-specific features offer superior discrimination among users on all devices. Classifiers built using our proposed features perform user identification with high accuracies in the range of 90% to 97%, and average precision and recall values of 0.914 and 0.901 respectively. Analysis of the word-based impact factors reveals that four- or five-character words, words with about 50% vowels, and those that are ranked higher on the frequency lists might give better results for the extraction and use of the proposed features for user identification. We also examine a large umbrella of behavioral biometric data, such as keystroke latencies, gait and swipe data on desktop, phone and tablet, for the assumption of an underlying normal distribution, which is common in many research works. Using suitable nonparametric normality tests (Lilliefors test and Shapiro-Wilk test) we show that a majority of the features from all activities and all devices do not follow a normal distribution.
In most cases fewer than 25% of the samples that were tested had p-values > 0.05. We discuss alternate solutions to address the non-normality in behavioral biometric data. Openly available datasets did not provide the wide range of modalities and activities required for our research. Therefore, we have collected and shared an open access, large benchmark dataset for behavioral biometrics on IEEE DataPort. We describe the collection and analysis of our Syracuse University and Assured Information Security - Behavioral Biometrics Multi-device and multi-Activity data from Same users (SU-AIS BB-MAS) dataset, which is an open access dataset on IEEE DataPort with data from 117 subjects for typing (both fixed and free text), gait (walking, upstairs and downstairs) and touch on desktop, tablet and phone. The dataset consists of a total of about 3.5 million keystroke events, 57.1 million data points each for accelerometer and gyroscope, and 1.7 million data points for swipes, and is listed as one of the most popular datasets on the portal (through IEEE emails to all members on 05/13/2020 and 07/21/2020). We also show that keystroke dynamics (KD) on a desktop can be used to classify the type of activity, either benign or adversarial, that a text sample originates from. We show the inefficiencies of popular temporal features for this task. With our proposed set of 14 features we achieve high accuracies (93% to 97%) and low Type 1 and Type 2 errors (3% to 8%) in classifying text samples of different sizes. We also present exploratory research in (a) authenticating users through musical notes generated by mapping their keystroke latencies to music and (b) authenticating users through the relationship between their keystroke latencies on multiple devices

    Securing Cloud Storage by Transparent Biometric Cryptography

    With the capability of storing huge volumes of data over the Internet, cloud storage has become a popular and desirable service for individuals and enterprises. The security issues, nevertheless, have been the subject of intense debate within the cloud community. Significant attacks can take place, the most common being the guessing of (poor) passwords. Given weaknesses in verification credentials, malicious attacks have happened across a variety of well-known storage services (e.g. Dropbox and Google Drive), resulting in loss of the privacy and confidentiality of files. Whilst today's use of third-party cryptographic applications can independently encrypt data, it arguably places a significant burden upon the user in terms of manually ciphering/deciphering each file and administering numerous keys in addition to the login password. The field of biometric cryptography applies biometric modalities within cryptography to produce robust bio-crypto keys without having to remember them. There are, nonetheless, still specific flaws associated with the security of the established bio-crypto key and its usability. Users currently must present their biometric modalities intrusively each time a file needs to be encrypted/decrypted, which makes usage cumbersome and inconvenient. Transparent biometrics seeks to eliminate the explicit interaction for verification and thereby remove the user inconvenience. However, the application of transparent biometrics within bio-cryptography can increase the variability of the biometric sample, leading to further challenges in reproducing the bio-crypto key. An innovative bio-cryptographic approach is developed to non-intrusively encrypt/decrypt data by a bio-crypto key established from transparent biometrics on the fly, without storing it anywhere, using a backpropagation neural network.
This approach seeks to handle the shortcomings of the password login, and concurrently removes the usability issues of the third-party cryptographic applications, thus enabling a more secure and usable user-oriented level of encryption to reinforce the security controls within cloud-based storage. The challenge lies in the ability of the bio-cryptographic approach to generate a reproducible bio-crypto key from selected transparent biometric modalities, including fingerprint, face and keystrokes, which are inherently noisier than their traditional counterparts. Accordingly, sets of experiments using functional and practical datasets reflecting a transparent and unconstrained sample collection are conducted to determine the reliability of creating a non-intrusive and repeatable bio-crypto key of 256-bit length. With numerous samples being acquired in a non-intrusive fashion, the system would be able to spontaneously capture 6 samples within a one-minute window. There is then a possibility to trade off the false rejection against the false acceptance to tackle the high error, as long as the correct key can be generated via at least one successful sample. As such, the experiments demonstrate that a correct key can be generated for the genuine user once a minute and the average FAR was 0.9%, 0.06%, and 0.06% for fingerprint, face, and keystrokes respectively. To further reinforce the effectiveness of the key generation approach, other sets of experiments are also implemented to determine what impact the multibiometric approach would have upon the performance at the feature phase versus the matching phase. Holistically, the multibiometric key generation approach demonstrates superiority in generating the 256-bit bio-crypto key in comparison with the single biometric approach.
In particular, the feature-level fusion outperforms the matching-level fusion at producing the valid correct key while limiting illegitimate attempts to compromise it (0.02% FAR overall). Accordingly, the thesis proposes an innovative bio-cryptosystem architecture by which cloud-independent encryption is provided to protect the users' personal data in a more reliable and usable fashion using non-intrusive multimodal biometrics. Higher Committee of Education Development in Iraq (HCED
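
The trade-off described in the abstract above (several transparent samples per minute, with the key regenerated if at least one sample succeeds) can be made concrete with a little arithmetic. The sketch below is an illustration under a strong simplifying assumption that is not stated in the thesis: attempts within a window are treated as statistically independent. The function name and the example rates are hypothetical and are not the thesis's reported figures.

```python
def window_rates(frr, far, n=6):
    """Per-window error rates when any one of n attempts may succeed.

    Assumes the n attempts are independent (a simplification).
    frr: per-sample false rejection rate of the genuine user.
    far: per-sample false acceptance rate of an impostor.
    """
    # The genuine user fails the window only if all n samples are rejected.
    window_frr = frr ** n
    # An impostor succeeds if at least one of n samples is accepted.
    window_far = 1.0 - (1.0 - far) ** n
    return window_frr, window_far


# Illustrative numbers only: a high per-sample FRR shrinks quickly over
# six attempts, while the per-window FAR grows roughly n-fold.
print(window_rates(0.5, 0.001, n=6))
```

Under this assumption, a system can tolerate a high per-sample false rejection rate, provided the per-sample false acceptance rate is kept low enough to absorb the roughly n-fold increase across the window.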

    Ranking to Learn and Learning to Rank: On the Role of Ranking in Pattern Recognition Applications

    The last decade has seen a revolution in the theory and application of machine learning and pattern recognition. Through these advancements, variable ranking has emerged as an active and growing research area and it is now beginning to be applied to many new problems. The rationale behind this fact is that many pattern recognition problems are by nature ranking problems. The main objective of a ranking algorithm is to sort objects according to some criteria, so that the most relevant items appear early in the produced result list. Ranking methods can be analyzed from two different methodological perspectives: ranking to learn and learning to rank. The former aims at studying methods and techniques to sort objects for improving the accuracy of a machine learning model. Enhancing a model's performance can be challenging at times. For example, in pattern classification tasks, different data representations can complicate and hide the different explanatory factors of variation behind the data. In particular, hand-crafted features contain many cues that are either redundant or irrelevant, which reduces the overall accuracy of the classifier. In such a case feature selection is used, which, by producing ranked lists of features, helps to filter out the unwanted information. Moreover, in real-time systems (e.g., visual trackers) ranking approaches are used as optimization procedures which improve the robustness of a system that deals with the high variability of image streams that change over time. Conversely, learning to rank is necessary in the construction of ranking models for information retrieval, biometric authentication, re-identification, and recommender systems. In this context, the ranking model's purpose is to sort objects according to their degrees of relevance, importance, or preference as defined in the specific application. Comment: European PhD Thesis.
arXiv admin note: text overlap with arXiv:1601.06615, arXiv:1505.06821, arXiv:1704.02665 by other author