PATH: Person Authentication using Trace Histories
In this paper, the problem of Active Authentication using trace
histories is addressed. Specifically, the task is to perform user verification
on mobile devices using historical location traces of the user as a function of
time. Modeling the movement of a human as a Markov process, a modified
Hidden Markov Model (HMM)-based solution is proposed. The proposed method,
namely the Marginally Smoothed HMM (MSHMM), utilizes the marginal probabilities
of location and timing information of the observations to smooth out the
emission probabilities during training. Hence, it can efficiently handle
unforeseen observations during the test phase. The verification performance of
this method is compared to a sequence matching (SM) method, a Markov
Chain-based method (MC), and an HMM with basic Laplace smoothing (HMM-lap).
Experimental results using the location information of the UMD Active
Authentication Dataset-02 (UMDAA02) and the GeoLife dataset are presented. The
proposed MSHMM method outperforms the compared methods in terms of equal error
rate (EER). Additionally, the effects of different parameters on the proposed
method are discussed.
Comment: 8 pages, 9 figures. Best Paper award at IEEE UEMCON 201
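As a rough illustration of the smoothing idea behind MSHMM (a minimal sketch, not the paper's exact formulation; the function name and the `alpha` blending weight are illustrative assumptions), emission probabilities can be smoothed toward the marginal observation distribution rather than toward a uniform one, so that an unforeseen (state, location) pair still receives probability mass proportional to how common that location is overall:

```python
import numpy as np

def marginally_smoothed_emissions(counts, alpha=0.1):
    """Build an emission matrix B[state, obs] from raw co-occurrence
    counts, smoothing each row toward the marginal distribution of
    observations instead of toward a uniform distribution (as plain
    Laplace smoothing would). Unseen (state, obs) pairs thus inherit
    mass proportional to how common the observation is overall."""
    counts = np.asarray(counts, dtype=float)
    marginal = counts.sum(axis=0)          # observation frequency over all states
    marginal = marginal / marginal.sum()   # P(obs), the smoothing target
    smoothed = counts + alpha * marginal   # blend counts with scaled marginal
    return smoothed / smoothed.sum(axis=1, keepdims=True)

# toy example: 2 states, 3 observed locations;
# state 0 never emits obs 1, state 1 never emits obs 2
counts = [[5, 0, 1],
          [2, 3, 0]]
B = marginally_smoothed_emissions(counts)
assert (B > 0).all()                       # every pair now has nonzero probability
assert np.allclose(B.sum(axis=1), 1.0)     # rows remain valid distributions
```

Compared with plain Laplace smoothing, which adds the same constant to every cell, this assigns a frequent-but-unseen observation a higher smoothed probability than a rare one.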
Exploiting Wavelet and Prosody-related Features for the Detection of Voice Disorders
An approach for the detection of voice disorders exploiting wavelet and prosody-related properties of speech is presented in this paper. Based on the normalized energy contents of the Discrete Wavelet Transform (DWT) coefficients over all voice frames, several statistical measures are first determined. Then, prosody-related voice properties, such as mean pitch, jitter and shimmer, are used to compute similar statistical measures over all the frames. The statistical measures of the normalized DWT energy contents are combined with those of the extracted prosody-related voice properties to form a feature vector used in both the training and testing phases. Two categories of voice samples, namely healthy and disordered, are considered, so the proposed method formulates the task as a two-class problem. Finally, a Euclidean distance-based classifier operates on the feature vector to detect disordered voices. A number of simulations are carried out, and it is shown that the statistical analysis based on wavelet and prosody-related properties can effectively detect a variety of voice disorders from a mixture of healthy and disordered voices.
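A minimal sketch of this feature pipeline, under simplifying assumptions (a single-level Haar DWT on a single frame, with the prosody values passed in directly, whereas the paper aggregates multi-level statistics over all frames):

```python
import numpy as np

def haar_dwt(x):
    """One level of a Haar DWT: approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                       # pad to even length
        x = np.append(x, 0.0)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def frame_features(frame, pitch, jitter, shimmer):
    """Per-frame feature vector: normalized DWT subband energies
    concatenated with prosody-related measurements."""
    a, d = haar_dwt(frame)
    e = np.array([np.sum(a**2), np.sum(d**2)])
    e = e / (e.sum() + 1e-12)            # normalized energy contents
    return np.concatenate([e, [pitch, jitter, shimmer]])

def classify(feat, centroid_healthy, centroid_disordered):
    """Minimum Euclidean distance to the two class centroids."""
    dh = np.linalg.norm(feat - centroid_healthy)
    dd = np.linalg.norm(feat - centroid_disordered)
    return "healthy" if dh <= dd else "disordered"
```

In a real system the centroids would be the mean feature vectors of labeled healthy and disordered training samples, and the raw features would typically be standardized before computing Euclidean distances.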
Multi-modal Active Authentication of Smartphone Users
With the increasing usage of smartphones not only as communication devices but also as the port of entry for a wide variety of user accounts at different information sensitivity levels, the need for hassle-free authentication is on the rise. Going beyond the traditional one-time authentication concept, active authentication (AA) schemes are emerging which authenticate users periodically in the background without the need for any user interaction. The purpose of this research is to explore different aspects of the AA problem and develop viable solutions by extracting unique biometric traits of the user from the wide variety of usage data obtained from smartphone sensors. The key aspects of our research are the development of different components of user verification algorithms based on (a) face images from the front camera and (b) data from modalities other than the face.
Since generic face detection algorithms do not perform very well in the mobile domain due to a significant presence of occluded and partially visible faces, we propose a facial segment-based face detection technique to handle the challenge of partial faces in the mobile domain. We have developed three increasingly accurate proposal-based face detection methods, namely the Facial Segment-based Face Detector (FSFD), SegFace and DeepSegFace, which perform binary classification on the results of a novel proposal generator that utilizes facial segments to obtain face proposals. We also propose the Deep Regression-based User Image Detector (DRUID) network, which shifts from the classification to the regression paradigm to avoid the need for proposal generation and thereby achieves better processing speed and accuracy. DeepSegFace and DRUID have unique network architectures with customized loss functions and utilize a novel data augmentation scheme to train on a relatively small amount of data. The proposed methods, especially DRUID, show superior performance over other state-of-the-art face detectors in terms of precision-recall and ROC curve on two mobile face datasets.
We extended the concept of facial segments to facial attribute detection for partially visible faces, a topic rarely addressed in the literature. We developed a deep convolutional neural network-based method named Segment-wise, Partial, Localized Inference in Training Facial Attribute Classification Ensembles (SPLITFACE) to detect attributes reliably from partially occluded faces. Taking several facial segments and the full face as input, SPLITFACE takes a data-driven approach to determine which attributes are localized in which facial segments. The unique architecture of the network allows each attribute to be predicted by multiple segments, which permits the implementation of committee machine techniques for combining local and global decisions to boost performance. Our evaluations on the full CelebA and LFWA datasets and their modified partial-visibility versions show that SPLITFACE significantly outperforms other recent attribute detection methods, especially for partial faces and for cross-domain experiments.
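The committee-machine fusion mentioned above can be sketched as a masked mean over per-segment attribute scores (an illustrative fusion rule only; SPLITFACE learns the segment-to-attribute assignment from data, and the score and visibility arrays here are hypothetical):

```python
import numpy as np

def committee_vote(segment_scores, visibility):
    """Fuse per-segment attribute scores committee-machine style:
    each attribute is predicted only by the segments that cover it,
    and the final score is the mean over those voters.

    segment_scores: (n_segments, n_attributes) scores in [0, 1]
    visibility:     (n_segments, n_attributes) 1 if the segment votes
    """
    scores = np.asarray(segment_scores, dtype=float)
    mask = np.asarray(visibility, dtype=float)
    votes = mask.sum(axis=0)
    votes[votes == 0] = 1.0              # avoid division by zero
    return (scores * mask).sum(axis=0) / votes

# two segments voting on two attributes; segment 0 cannot see attribute 1
fused = committee_vote([[0.9, 0.2], [0.5, 0.8]],
                       [[1, 0], [1, 1]])
```

Restricting each attribute to the segments that actually cover it is what lets an occluded region abstain instead of dragging the global decision down.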
We also explored the potential of two less popular modalities, namely location history and application usage, for active authentication. Aiming to discover the pattern of life of a user, we processed the location traces into separate state space models for each user and developed the Marginally Smoothed Hidden Markov Model (MSHMM) algorithm to authenticate the current user based on the most recent sequence of observations. The method takes into consideration the sparsity of the available data, the transition phases between states, the timing information, and the unforeseen states. We looked deeper into the impact of unforeseen and unknown states in another research work, where we evaluated the feasibility of application usage behavior of the users as a potential solution to the active authentication problem. Our experiments show that it is essential to take unforeseen states into account when designing an authentication system with sparse data, and marginal-smoothing techniques are very useful in this regard.
We conclude this dissertation with a description of some ongoing efforts and future directions of research related to the topics discussed, in addition to a summary of all the contributions and impacts of this research work.
Active User Authentication for Smartphones: A Challenge Data Set and Benchmark Results
In this paper, automated user verification techniques for smartphones are
investigated. A unique non-commercial dataset, the University of Maryland
Active Authentication Dataset 02 (UMDAA-02) for multi-modal user authentication
research is introduced. This paper focuses on three sensors - front camera,
touch sensor and location service while providing a general description for
other modalities. Benchmark results for face detection, face verification,
touch-based user identification and location-based next-place prediction are
presented, which indicate that more robust methods fine-tuned to the mobile
platform are needed to achieve satisfactory verification accuracy. The dataset
will be made available to the research community for promoting additional
research.
Comment: 8 pages, 12 figures, 6 tables. Best poster award at BTAS 201
MonoSelfRecon: Purely Self-Supervised Explicit Generalizable 3D Reconstruction of Indoor Scenes from Monocular RGB Views
Current monocular 3D scene reconstruction (3DR) works are either
fully-supervised, or not generalizable, or implicit in 3D representation. We
propose a novel framework, MonoSelfRecon, which for the first time achieves
explicit 3D mesh reconstruction for generalizable indoor scenes with monocular
RGB views through pure self-supervision on voxel-SDF (signed distance function).
MonoSelfRecon follows an autoencoder-based architecture and decodes both
voxel-SDF and a generalizable Neural Radiance Field (NeRF), which is used to
guide the voxel-SDF in self-supervision. We propose novel self-supervised
losses, which not only support pure self-supervision but can also be combined
with supervised signals to further boost supervised training. Our experiments
show that MonoSelfRecon trained in pure self-supervision outperforms the
current best self-supervised indoor depth estimation models and is comparable
to 3DR models trained with full supervision on depth annotations. MonoSelfRecon
is not restricted to a specific model design and can be applied to any model
that uses voxel-SDF in a purely self-supervised manner.
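To give a concrete sense of how an SDF encodes explicit geometry (an illustrative sketch only: an analytic sphere stands in for a learned voxel-SDF grid, and the sphere-tracing routine below is not part of MonoSelfRecon, which extracts meshes from its voxel-SDF), depth along a camera ray can be recovered by marching until the signed distance reaches zero:

```python
import numpy as np

def sphere_sdf(p):
    """Analytic SDF of a unit sphere centered at (0, 0, 3);
    stands in for a learned voxel-SDF grid."""
    return np.linalg.norm(p - np.array([0.0, 0.0, 3.0])) - 1.0

def sphere_trace(origin, direction, sdf, t_max=10.0, eps=1e-4):
    """March a ray through the SDF until the surface (SDF ~ 0) is hit;
    the returned t is the depth along the ray. Each step advances by
    the signed distance, which is a safe step size by definition."""
    t = 0.0
    for _ in range(128):
        d = sdf(origin + t * direction)
        if d < eps:
            return t
        t += d
        if t > t_max:
            break
    return None  # ray missed the surface

# a ray from the origin straight down +z hits the sphere at depth 2
depth = sphere_trace(np.zeros(3), np.array([0.0, 0.0, 1.0]), sphere_sdf)
```

Depth maps rendered this way from an SDF can be compared against depth from another branch (such as a NeRF) or against photometric warping between views, which is the general shape of an SDF self-supervision signal.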
V2CE: Video to Continuous Events Simulator
Dynamic Vision Sensor (DVS)-based solutions have recently garnered
significant interest across various computer vision tasks, offering notable
benefits in terms of dynamic range, temporal resolution, and inference speed.
However, as a relatively nascent vision sensor compared to Active Pixel Sensor
(APS) devices such as RGB cameras, DVS suffers from a dearth of ample labeled
datasets. Prior efforts to convert APS data into events often grapple with
issues such as a considerable domain shift from real events, the absence of
quantified validation, and layering problems within the time axis. In this
paper, we present a novel method for video-to-events stream conversion from
multiple perspectives, considering the specific characteristics of DVS. A
series of carefully designed losses helps enhance the quality of generated
event voxels significantly. We also propose a novel local dynamic-aware
timestamp inference strategy to accurately recover event timestamps from event
voxels in a continuous fashion and eliminate the temporal layering problem.
Results from rigorous validation through quantified metrics at all stages of
the pipeline establish our method as the current state-of-the-art (SOTA).
Comment: 6 pages, 7 figures
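For context, the baseline video-to-events conversion that such simulators build on can be sketched with the standard DVS pixel model (a simplified, assumed implementation; V2CE's learned losses and continuous timestamp inference go well beyond this): a pixel emits an event whenever its log intensity drifts past a contrast threshold since its last event.

```python
import numpy as np

def frames_to_events(frames, timestamps, threshold=0.2, eps=1e-6):
    """Naive video-to-events conversion using the standard DVS model.
    A pixel fires an event whenever its log intensity changes by more
    than `threshold` relative to the reference stored at that pixel's
    last event. Returns a list of (t, y, x, polarity) tuples."""
    log_ref = np.log(frames[0].astype(float) + eps)
    events = []
    for f, t in zip(frames[1:], timestamps[1:]):
        log_f = np.log(f.astype(float) + eps)
        diff = log_f - log_ref
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            pol = 1 if diff[y, x] > 0 else -1
            events.append((t, y, x, pol))
            log_ref[y, x] = log_f[y, x]  # reset reference at fired pixels
    return events
```

This per-frame model is exactly where the layering problem described above comes from: all events triggered by one frame share its timestamp, which is the artifact that a continuous timestamp-inference strategy is designed to remove.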