1,249 research outputs found

    Deep face tracking and parsing in the wild

    Get PDF
    Face analysis has been a long-standing research direction in the field of computer vision and pattern recognition. A complete face analysis system involves solving several tasks including face detection, face tracking, face parsing, and face recognition. Recently, the performance of methods in all tasks has significantly improved thanks to the employment of Deep Convolutional Neural Networks (DCNNs). However, existing face analysis algorithms mainly focus on solving facial images captured in the constrained laboratory environment, and their performance on real-world images has remained less explored. Compared with the lab environment, the in-the-wild settings involve greater diversity in face sizes, poses, facial expressions, background clutters, lighting conditions and imaging quality. This thesis investigates two fundamental tasks in face analysis under in-the-wild settings: face tracking and face parsing. Both tasks serve as important prerequisites for downstream face analysis applications. However, in-the-wild datasets remain scarce in both fields and models have not been rigorously evaluated in such settings. In this thesis, we aim to bridge that gap of lacking in-the-wild data, evaluate existing methods in these settings, and develop accurate, robust and efficient deep learning-based methods for the two tasks. For face tracking in the wild, we introduce the first in-the-wild face tracking dataset, MobiFace, that consists of 80 videos captured by mobile phones during mobile live-streaming. The environment of the live-streaming performance is fully unconstrained and the interactions between users and mobile phones are natural and spontaneous. Next, we evaluate existing tracking methods, including generic object trackers and dedicated face trackers. The results show that MobiFace represent unique challenges in face tracking in the wild and cannot be readily solved by existing methods. Finally, we present a DCNN-based framework, FT-RCNN, that significantly outperforms other methods in face tracking in the wild. For face parsing in the wild, we introduce the first large-scale in-the-wild face dataset, iBugMask, that contains 21, 866 training images and 1, 000 testing images. Unlike existing datasets, the images in iBugMask are captured in the fully unconstrained environment and are not cropped or preprocessed of any kind. Manually annotated per-pixel labels for eleven facial regions are provided for each target face. Next, we benchmark existing parsing methods and the results show that iBugMask is extremely challenging for all methods. By rigorous benchmarking, we observe that the pre-processing of facial images with bounding boxes in face parsing in the wild introduces bias. When cropping the face with a bounding box, a cropping margin has to be hand-picked. If face alignment is used, fiducial landmarks are required and a predefined alignment template has to be selected. These additional hyper-parameters have to be carefully considered and can have a significant impact on the face parsing performance. To solve this, we propose Region-of-Interest (RoI) Tanh-polar transform that warps the whole image to a fixed-sized representation. Moreover, the RoI Tanh-polar transform is differentiable and allows for rotation equivariance in 1 DCNNs. We show that when coupled with a simple Fully Convolutional Network, our RoI Tanh-polar transformer Network has achieved state-of-the-art results on face parsing in the wild. This thesis contributes towards in-the-wild face tracking and face parsing by providing novel datasets and proposing effective frameworks. Both tasks can benefit real-world downstream applications such as facial age estimation, facial expression recognition and lip-reading. The proposed RoI Tanh-polar transform also provides a new perspective in how to preprocess the face images and make the DCNNs truly end-to-end for real-world face analysis applications.Open Acces

    MobiFace: A Novel Dataset for Mobile Face Tracking in the Wild

    Full text link
    Face tracking serves as the crucial initial step in mobile applications trying to analyse target faces over time in mobile settings. However, this problem has received little attention, mainly due to the scarcity of dedicated face tracking benchmarks. In this work, we introduce MobiFace, the first dataset for single face tracking in mobile situations. It consists of 80 unedited live-streaming mobile videos captured by 70 different smartphone users in fully unconstrained environments. Over 95K95K bounding boxes are manually labelled. The videos are carefully selected to cover typical smartphone usage. The videos are also annotated with 14 attributes, including 6 newly proposed attributes and 8 commonly seen in object tracking. 36 state-of-the-art trackers, including facial landmark trackers, generic object trackers and trackers that we have fine-tuned or improved, are evaluated. The results suggest that mobile face tracking cannot be solved through existing approaches. In addition, we show that fine-tuning on the MobiFace training data significantly boosts the performance of deep learning-based trackers, suggesting that MobiFace captures the unique characteristics of mobile face tracking. Our goal is to offer the community a diverse dataset to enable the design and evaluation of mobile face trackers. The dataset, annotations and the evaluation server will be on \url{https://mobiface.github.io/}.Comment: To appear on The 14th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2019

    Differential-Series-Fed Dual-Polarized Traveling-Wave Array for Full-Duplex Applications

    Get PDF

    Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph

    Full text link
    Business Intelligence (BI) is crucial in modern enterprises and billion-dollar business. Traditionally, technical experts like database administrators would manually prepare BI-models (e.g., in star or snowflake schemas) that join tables in data warehouses, before less-technical business users can run analytics using end-user dashboarding tools. However, the popularity of self-service BI (e.g., Tableau and Power-BI) in recent years creates a strong demand for less technical end-users to build BI-models themselves. We develop an Auto-BI system that can accurately predict BI models given a set of input tables, using a principled graph-based optimization problem we propose called \textit{k-Min-Cost-Arborescence} (k-MCA), which holistically considers both local join prediction and global schema-graph structures, leveraging a graph-theoretical structure called \textit{arborescence}. While we prove k-MCA is intractable and inapproximate in general, we develop novel algorithms that can solve k-MCA optimally, which is shown to be efficient in practice with sub-second latency and can scale to the largest BI-models we encounter (with close to 100 tables). Auto-BI is rigorously evaluated on a unique dataset with over 100K real BI models we harvested, as well as on 4 popular TPC benchmarks. It is shown to be both efficient and accurate, achieving over 0.9 F1-score on both real and synthetic benchmarks.Comment: full version of a paper accepted to VLDB 202
    • …
    corecore