
    PATH: Person Authentication using Trace Histories

    In this paper, the problem of Active Authentication using trace histories is addressed. Specifically, the task is to perform user verification on mobile devices using the historical location traces of the user as a function of time. Modeling human movement as a Markov process, a modified Hidden Markov Model (HMM)-based solution is proposed. The proposed method, the Marginally Smoothed HMM (MSHMM), utilizes the marginal probabilities of the location and timing information of the observations to smooth the emission probabilities during training. Hence, it can efficiently handle unforeseen observations during the test phase. The verification performance of this method is compared to a sequence matching (SM) method, a Markov Chain-based method (MC) and an HMM with basic Laplace smoothing (HMM-lap). Experimental results using the location information of the UMD Active Authentication Dataset-02 (UMDAA02) and the GeoLife dataset are presented. The proposed MSHMM method outperforms the compared methods in terms of equal error rate (EER). Additionally, the effects of different parameters on the proposed method are discussed.
    Comment: 8 pages, 9 figures. Best Paper award at IEEE UEMCON 201
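    To make the smoothing idea concrete, here is a minimal sketch of marginal smoothing applied to an HMM emission matrix: each state's empirical emission distribution is blended with the marginal distribution of observations pooled over all states, so observations never seen for a state retain nonzero probability. This is an illustrative reconstruction, not the paper's exact MSHMM update; the blend weight `alpha` is an assumed knob.

```python
import numpy as np

def marginally_smoothed_emissions(counts, alpha=0.1):
    """Smooth an emission count matrix (states x observations) by blending
    each state's empirical distribution with the marginal observation
    distribution, so observations unseen for a state keep nonzero mass.

    Generic marginal-smoothing sketch, not the paper's exact MSHMM update;
    `alpha` (blend weight) is an assumed parameter.
    """
    counts = np.asarray(counts, dtype=float)
    row_tot = counts.sum(axis=1, keepdims=True)           # per-state totals
    per_state = counts / np.where(row_tot == 0, 1, row_tot)
    marginal = counts.sum(axis=0) / counts.sum()          # pooled over states
    return (1 - alpha) * per_state + alpha * marginal     # rows with data sum to 1

# Toy example: 2 hidden states (e.g. "home", "work"), 3 observed locations.
counts = np.array([[30, 0, 2],    # location 2 never emitted from state 0
                   [1, 25, 4]])
B = marginally_smoothed_emissions(counts)
print(B)   # no zero entries, so unforeseen test observations stay scorable
```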

    Exploiting Wavelet and Prosody-related Features for the Detection of Voice Disorders

    An approach for the detection of voice disorders exploiting wavelet and prosody-related properties of speech is presented in this paper. Based on the normalized energy content of the Discrete Wavelet Transform (DWT) coefficients over all voice frames, several statistical measures are first determined. Then, prosody-related voice properties, such as mean pitch, jitter and shimmer, are utilized to compute similar statistical measures over all the frames. The set of statistical measures of the normalized energy content of the DWT coefficients is combined with the set of statistical measures of the extracted prosody-related voice properties to form a feature vector used in both the training and testing phases. Two categories of voice samples, healthy and disordered, are considered, formulating the problem as a two-class classification task. Finally, a Euclidean distance-based classifier operates on the feature vector to detect disordered voices. A number of simulations are carried out, and it is shown that the statistical analysis based on wavelet and prosody-related properties can effectively detect a variety of voice disorders from a mixture of healthy and disordered voices.
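    As an illustration of this feature pipeline, the sketch below computes normalized DWT sub-band energies per frame with PyWavelets, aggregates simple statistics over frames, and classifies with a nearest-centroid Euclidean rule. The wavelet family (`db4`), decomposition level, and the use of class centroids are assumptions made for the sketch; the prosody extractors (pitch, jitter, shimmer) are omitted.

```python
import numpy as np
import pywt

def dwt_energy_features(frame, wavelet="db4", level=4):
    """Normalized energy per DWT sub-band of one voice frame (a sketch of
    the wavelet features; wavelet/level choices are assumptions)."""
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    return energies / (energies.sum() + 1e-12)

def utterance_features(frames):
    """Mean and std of per-frame sub-band energies over all frames,
    mirroring the 'statistical measures over all frames' idea. In the
    paper these are concatenated with prosody statistics (mean pitch,
    jitter, shimmer), which are omitted here for brevity."""
    E = np.vstack([dwt_energy_features(f) for f in frames])
    return np.concatenate([E.mean(axis=0), E.std(axis=0)])

def nearest_centroid_predict(x, centroid_healthy, centroid_disordered):
    """Euclidean-distance classifier: pick the closer class centroid."""
    d_h = np.linalg.norm(x - centroid_healthy)
    d_d = np.linalg.norm(x - centroid_disordered)
    return "healthy" if d_h < d_d else "disordered"

# Toy demo with random vectors standing in for windowed speech frames.
rng = np.random.default_rng(1)
frames = [rng.standard_normal(256) for _ in range(30)]
x = utterance_features(frames)
```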

    Multi-modal Active Authentication of Smartphone Users

    With the increasing usage of smartphones not only as communication devices but also as the port of entry for a wide variety of user accounts at different information-sensitivity levels, the need for hassle-free authentication is on the rise. Going beyond the traditional one-time authentication concept, active authentication (AA) schemes are emerging that authenticate users periodically in the background without the need for any user interaction. The purpose of this research is to explore different aspects of the AA problem and develop viable solutions by extracting unique biometric traits of the user from the wide variety of usage data obtained from smartphone sensors. The key aspects of our research are the development of different components of user verification algorithms based on (a) face images from the front camera and (b) data from modalities other than the face.

    Since generic face detection algorithms do not perform well in the mobile domain due to the significant presence of occluded and partially visible faces, we propose a facial segment-based face detection technique to handle the challenge of partial faces in the mobile domain. We have developed three increasingly accurate proposal-based face detection methods, namely the Facial Segment-based Face Detector (FSFD), SegFace and DeepSegFace, which perform binary classification on the results of a novel proposal generator that utilizes facial segments to obtain face proposals. We also propose the Deep Regression-based User Image Detector (DRUID) network, which shifts from the classification to the regression paradigm to avoid the need for proposal generation and thereby achieves better processing speed and accuracy. DeepSegFace and DRUID have unique network architectures with customized loss functions and utilize a novel data augmentation scheme to train on a relatively small amount of data. The proposed methods, especially DRUID, show superior performance over other state-of-the-art face detectors in terms of precision-recall and ROC curves on two mobile face datasets.

    We extended the concept of facial segments to facial attribute detection for partially visible faces, a topic rarely addressed in the literature. We developed a deep convolutional neural network-based method named Segment-wise, Partial, Localized Inference in Training Facial Attribute Classification Ensembles (SPLITFACE) to detect attributes reliably from partially occluded faces. Taking several facial segments and the full face as input, SPLITFACE takes a data-driven approach to determine which attributes are localized in which facial segments. The unique architecture of the network allows each attribute to be predicted by multiple segments, which permits the implementation of committee machine techniques for combining local and global decisions to boost performance. Our evaluations on the full CelebA and LFWA datasets and their modified partial-visibility versions show that SPLITFACE significantly outperforms other recent attribute detection methods, especially for partial faces and in cross-domain experiments.

    We also explored the potential of two less popular modalities, location history and application usage, for active authentication. Aiming to discover the pattern of life of a user, we processed the location traces into separate state-space models for each user and developed the Marginally Smoothed Hidden Markov Model (MSHMM) algorithm to authenticate the current user based on the most recent sequence of observations. The method takes into consideration the sparsity of the available data, the transition phases between states, the timing information and the unforeseen states. We looked deeper into the impact of unforeseen and unknown states in another research work, where we evaluated the feasibility of application-usage behavior as a potential solution to the active authentication problem. Our experiments show that it is essential to take unforeseen states into account when designing an authentication system with sparse data, and marginal-smoothing techniques are very useful in this regard. We conclude this dissertation with a description of some ongoing efforts and future directions of research related to the topics discussed, in addition to a summary of all the contributions and impacts of this research work.
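    As one concrete illustration of the committee-machine idea mentioned above, the sketch below averages sigmoid attribute scores over whichever facial segments are visible. It is a deliberately simple fusion rule for illustration, not SPLITFACE's learned, data-driven segment-to-attribute assignment.

```python
import numpy as np

def committee_attribute_scores(segment_logits, segment_visible):
    """Combine per-segment attribute logits with a simple committee
    machine: average sigmoid scores over the segments that are visible.

    segment_logits: (num_segments, num_attributes) raw scores
    segment_visible: (num_segments,) boolean mask of detected segments

    Illustrative averaging committee; the dissertation's data-driven
    segment-to-attribute assignment is abstracted away here.
    """
    probs = 1.0 / (1.0 + np.exp(-segment_logits))   # sigmoid per attribute
    visible = probs[segment_visible]                # keep detected segments
    return visible.mean(axis=0)                     # per-attribute consensus

# Toy example: 3 segments (eyes, nose, mouth) voting on 2 attributes.
logits = np.array([[2.0, -1.0], [1.5, -0.5], [0.2, 3.0]])
mask = np.array([True, True, False])   # mouth occluded in a partial face
print(committee_attribute_scores(logits, mask))
```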

    Active User Authentication for Smartphones: A Challenge Data Set and Benchmark Results

    In this paper, automated user verification techniques for smartphones are investigated. A unique non-commercial dataset, the University of Maryland Active Authentication Dataset 02 (UMDAA-02), is introduced for multi-modal user authentication research. The paper focuses on three sensors - front camera, touch sensor and location service - while providing a general description of other modalities. Benchmark results for face detection, face verification, touch-based user identification and location-based next-place prediction are presented, which indicate that more robust methods fine-tuned to the mobile platform are needed to achieve satisfactory verification accuracy. The dataset will be made available to the research community to promote additional research.
    Comment: 8 pages, 12 figures, 6 tables. Best poster award at BTAS 201
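    Since verification benchmarks like these are typically reported via the equal error rate, here is a minimal sketch of estimating an EER from genuine and impostor score sets by sweeping a threshold. It is generic evaluation code, not part of the UMDAA-02 release.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """EER: the operating point where false accept rate (FAR) equals
    false reject rate (FRR). Minimal threshold sweep, shown only to make
    the reported metric concrete."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best = (1.0, 0.0)                          # (|FAR - FRR|, EER estimate)
    for t in thresholds:
        far = np.mean(impostor_scores >= t)    # impostors wrongly accepted
        frr = np.mean(genuine_scores < t)      # genuine users wrongly rejected
        gap = abs(far - frr)
        if gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]

# Synthetic similarity scores just to exercise the function.
rng = np.random.default_rng(0)
gen = rng.normal(2.0, 1.0, 1000)   # genuine-user scores
imp = rng.normal(0.0, 1.0, 1000)   # impostor scores
print(f"EER ~ {equal_error_rate(gen, imp):.3f}")
```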

    MonoSelfRecon: Purely Self-Supervised Explicit Generalizable 3D Reconstruction of Indoor Scenes from Monocular RGB Views

    Current monocular 3D scene reconstruction (3DR) works are either fully supervised, not generalizable, or implicit in their 3D representation. We propose a novel framework, MonoSelfRecon, that for the first time achieves explicit 3D mesh reconstruction of generalizable indoor scenes from monocular RGB views with pure self-supervision on voxel-SDF (signed distance function). MonoSelfRecon follows an autoencoder-based architecture and decodes voxel-SDF together with a generalizable Neural Radiance Field (NeRF), which is used to guide the voxel-SDF in self-supervision. We propose novel self-supervised losses, which not only support pure self-supervision but can also be used together with supervised signals to further boost supervised training. Our experiments show that MonoSelfRecon trained with pure self-supervision outperforms the current best self-supervised indoor depth estimation models and is comparable to 3DR models trained with full supervision from depth annotations. MonoSelfRecon is not restricted to a specific model design and can be applied to any model with voxel-SDF for purely self-supervised training.
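    To give a flavor of how a NeRF branch can guide a voxel-SDF branch, the sketch below penalizes the gap between each ray's SDF zero-crossing depth and the NeRF-rendered depth. This is a generic consistency loss written for illustration under assumed tensor shapes; the paper's actual self-supervised losses differ in form and number.

```python
import torch

def sdf_nerf_depth_loss(sdf_vals, depths, nerf_depth):
    """L1 consistency between the depth of the SDF zero crossing along
    each ray and the depth rendered by a NeRF branch.

    sdf_vals, depths: (rays, samples) tensors, depths ascending per ray.
    nerf_depth: (rays,) depth from NeRF volume rendering.

    Generic stand-in for "NeRF guiding voxel-SDF", not the paper's loss.
    """
    flip = (sdf_vals[:, :-1] > 0) & (sdf_vals[:, 1:] <= 0)  # +/- crossings
    hit = flip.any(dim=1)                       # rays that hit a surface
    idx = torch.argmax(flip.int(), dim=1)       # first crossing per ray
    rows = torch.arange(sdf_vals.shape[0])
    s0, s1 = sdf_vals[rows, idx], sdf_vals[rows, idx + 1]
    d0, d1 = depths[rows, idx], depths[rows, idx + 1]
    w = s0 / (s0 - s1 + 1e-8)                   # linear interpolation weight
    sdf_depth = d0 + w * (d1 - d0)              # surface depth from the SDF
    return torch.abs(sdf_depth[hit] - nerf_depth[hit]).mean()
```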

    V2CE: Video to Continuous Events Simulator

    Dynamic Vision Sensor (DVS)-based solutions have recently garnered significant interest across various computer vision tasks, offering notable benefits in terms of dynamic range, temporal resolution, and inference speed. However, as a relatively nascent vision sensor compared to Active Pixel Sensor (APS) devices such as RGB cameras, DVS suffers from a dearth of ample labeled datasets. Prior efforts to convert APS data into events often grapple with issues such as a considerable domain shift from real events, the absence of quantified validation, and layering problems within the time axis. In this paper, we present a novel method for video-to-events stream conversion from multiple perspectives, considering the specific characteristics of DVS. A series of carefully designed losses significantly enhances the quality of the generated event voxels. We also propose a novel local dynamic-aware timestamp inference strategy to accurately recover event timestamps from event voxels in a continuous fashion and eliminate the temporal layering problem. Results from rigorous validation through quantified metrics at all stages of the pipeline establish our method as the current state-of-the-art (SOTA).
    Comment: 6 pages, 7 figures
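    For readers unfamiliar with APS-to-event conversion, the baseline sketch below generates events by thresholding per-pixel log-intensity changes between frames, in the spirit of classic simulators. V2CE's learned, loss-driven pipeline and its timestamp inference go well beyond this naive scheme; the threshold value and grayscale input format are assumptions.

```python
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2):
    """Naive video-to-events conversion: emit an event whenever a pixel's
    log intensity drifts past a contrast threshold since the last event.

    frames: (T, H, W) grayscale in [0, 1]; timestamps: (T,) seconds.
    Returns a list of (t, x, y, polarity) tuples. A real simulator emits
    multiple events per pixel and interpolates timestamps between frames;
    both are omitted here for brevity.
    """
    log_ref = np.log(frames[0] + 1e-4)      # per-pixel reference intensity
    events = []
    for k in range(1, len(frames)):
        log_cur = np.log(frames[k] + 1e-4)
        diff = log_cur - log_ref
        for pol, mask in ((+1, diff >= threshold), (-1, diff <= -threshold)):
            ys, xs = np.nonzero(mask)
            events.extend((timestamps[k], x, y, pol) for x, y in zip(xs, ys))
            log_ref[mask] = log_cur[mask]   # reset reference where events fired
    return events
```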