7 research outputs found

    Grid Loss: Detecting Occluded Faces

    Detection of partially occluded objects is a challenging computer vision problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of the detection window are occluded, since not every sub-part of the window is discriminative on its own. To address this issue, we propose a novel loss layer for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a convolution layer independently rather than over the whole feature map. This results in parts being more discriminative on their own, enabling the detector to recover if the detection window is partially occluded. By mapping our loss layer back to a regular fully connected layer, no additional computational cost is incurred at runtime compared to standard CNNs. We demonstrate our method for face detection on several public face detection benchmarks and show that our method outperforms regular CNNs, is suitable for real-time applications and achieves state-of-the-art performance. Comment: accepted to ECCV 201
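
The idea in the abstract above (a loss computed on sub-blocks independently, then folded back into one fully connected layer) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the hinge loss, the non-overlapping square blocks, the extra whole-map term, and the helper names `split_blocks`, `grid_loss`, and `fused_score`.

```python
def split_blocks(fmap, block):
    """Split a 2-D feature map into non-overlapping block x block tiles.

    Assumes (for this sketch) that both dimensions are divisible by `block`.
    Each tile is returned as a flat list of values.
    """
    rows, cols = len(fmap), len(fmap[0])
    tiles = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tiles.append([fmap[i][j]
                          for i in range(r, r + block)
                          for j in range(c, c + block)])
    return tiles


def hinge(score, label):
    """Hinge loss for a label in {-1, +1} (an assumed choice of loss)."""
    return max(0.0, 1.0 - label * score)


def grid_loss(fmap, label, block_weights, block_biases, block):
    """Whole-map hinge loss plus one hinge loss per block.

    Each block has its own linear classifier, so every block is pushed
    to be discriminative on its own.
    """
    tiles = split_blocks(fmap, block)
    scores = [sum(w * x for w, x in zip(ws, tile)) + b
              for ws, tile, b in zip(block_weights, tiles, block_biases)]
    whole_term = hinge(sum(scores), label)
    block_term = sum(hinge(s, label) for s in scores)
    return whole_term + block_term, scores


def fused_score(fmap, block_weights, block_biases, block):
    """Score from a single linear layer built by concatenating block weights.

    This equals the sum of the per-block scores, so inference needs only
    one fully connected layer.
    """
    flat_x = [x for tile in split_blocks(fmap, block) for x in tile]
    flat_w = [w for ws in block_weights for w in ws]
    return sum(w * x for w, x in zip(flat_w, flat_x)) + sum(block_biases)


# A 4x4 map of ones with the top-left 2x2 block "occluded" (set to zero).
occluded = [[0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [1.0, 1.0, 1.0, 1.0],
            [1.0, 1.0, 1.0, 1.0]]
weights = [[0.5] * 4 for _ in range(4)]
biases = [0.0] * 4
loss, scores = grid_loss(occluded, +1, weights, biases, block=2)
```

In this toy case the three visible blocks still score confidently, so the fused score stays positive even though one block contributes nothing. Only the occluded block adds to the loss.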

    S³FD: Single Shot Scale-invariant Face Detector

    This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S³FD), which performs strongly on faces of various scales with a single deep neural network, especially on small faces. Specifically, we try to solve the common problem that anchor-based detectors deteriorate dramatically as the objects become smaller. We make contributions in the following three aspects: 1) proposing a scale-equitable face detection framework to handle different scales of faces well. We tile anchors on a wide range of layers to ensure that all scales of faces have enough features for detection. Besides, we design anchor scales based on the effective receptive field and a proposed equal proportion interval principle; 2) improving the recall rate of small faces by a scale compensation anchor matching strategy; 3) reducing the false positive rate of small faces via a max-out background label. As a consequence, our method achieves state-of-the-art detection performance on all the common face detection benchmarks, including the AFW, PASCAL face, FDDB and WIDER FACE datasets, and can run at 36 FPS on an Nvidia Titan X (Pascal) for VGA-resolution images. Comment: Accepted by ICCV 2017 + its supplementary materials; Updated the latest results on WIDER FAC

    Online learning and detection of faces with low human supervision

    The final publication is available at link.springer.com. We present an efficient, online, and interactive approach for computing a classifier, called Wild Lady Ferns (WiLFs), for face learning and detection with little human supervision. More precisely, on the one hand, WiLFs combine online boosting and extremely randomized trees (Random Ferns) to progressively compute an efficient and discriminative classifier. On the other hand, WiLFs use an interactive human-machine approach that combines two complementary learning strategies to considerably reduce the degree of human supervision during learning. The first strategy is query-by-boosting active learning, which requests human assistance on difficult samples as a function of the classifier confidence. The second strategy is memory-based learning, which uses ¿ Exemplar-based Nearest Neighbors (¿ENN) to assist the classifier automatically. A pre-trained Convolutional Neural Network (CNN) is used to perform ¿ENN with high-level feature descriptors. The proposed approach is therefore fast (WiLFs run at 1 FPS using code that is not fully optimized), accurate (we obtain detection rates over 82% on complex datasets), and labor-saving (human assistance percentages of less than 20%). As a byproduct, we demonstrate that WiLFs also perform semi-automatic annotation during learning: while the classifier is being computed, WiLFs discover face instances in input images, which are subsequently used to train the classifier online. The advantages of our approach are demonstrated on synthetic and publicly available databases, showing detection rates comparable to offline approaches that require larger amounts of handmade training data. Peer Reviewed. Postprint (author's final draft

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second authorshi

    Head Movement Detection Based on Bounding Box Analysis of Digital Images on Raspberry Pi (Deteksi Gerakan Kepala Berdasarkan Analisis Bounding Box Pada Citra Digital Berbasis Raspberry Pi)

    Physical disability is an impairment of the body that can limit the function of one or more limbs and impair a person's motor abilities. Physical disabilities are divided into several categories: leg impairment, hand impairment, and combined leg and hand impairment. When both sets of limbs are impaired, it becomes difficult to control things independently, for example operating an electric wheelchair or selecting a menu on a monitor. Research is therefore needed to help people with physical disabilities of the legs and hands control such devices more easily. The system built in this work detects head movement based on bounding box analysis for people with physical disabilities. This study uses a Logitech C310 camera placed directly in front of the user's head. The camera captures the user's head movements, and the captured frames are sent to a Raspberry Pi, where image processing is used to detect skin color and head movement. The next step is to classify the user's head movement. The classification analyzes the count of pixels representing the object in each quadrant of the bounding box, which contains 4 quadrants. The output of the system is LED control. In testing, skin color detection worked well for facial skin at distances of 50 cm, 75 cm, 100 cm, and 125 cm in the morning and at midday, and at 50 cm and 75 cm at night. For head movement detection, the best results were obtained at a distance of 50 cm in the morning, at midday, and at night, with an overall accuracy of 90.62%. The average computation time per movement was 59.87 ms for right, 57.64 ms for left, 55.72 ms for upright, and 44.62 ms for looking down. The accuracy of system integration with the hardware (LED) was 100%.
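
The quadrant-based classification mentioned in the abstract above (counting object pixels in each of the 4 quadrants of the bounding box) could look roughly like the sketch below. The abstract does not give the decision rule, so the rule in `classify`, the `ratio` threshold, the binary-mask input, and both function names are hypothetical. "left" and "right" here refer to the image, not the user.

```python
def quadrant_counts(mask):
    """Count object pixels (truthy values) in the four quadrants of the
    mask's tight bounding box. Returns None if the mask is empty."""
    points = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not points:
        return None
    top = min(r for r, _ in points)
    bottom = max(r for r, _ in points)
    left = min(c for _, c in points)
    right = max(c for _, c in points)
    mid_r = (top + bottom + 1) // 2   # first row of the lower half
    mid_c = (left + right + 1) // 2   # first column of the right half
    counts = {"top_left": 0, "top_right": 0,
              "bottom_left": 0, "bottom_right": 0}
    for r, c in points:
        vert = "top" if r < mid_r else "bottom"
        horiz = "left" if c < mid_c else "right"
        counts[f"{vert}_{horiz}"] += 1
    return counts


def classify(counts, ratio=1.5):
    """Hypothetical rule: pick the side that clearly dominates in pixels."""
    left = counts["top_left"] + counts["bottom_left"]
    right = counts["top_right"] + counts["bottom_right"]
    top = counts["top_left"] + counts["top_right"]
    bottom = counts["bottom_left"] + counts["bottom_right"]
    if left > ratio * right:
        return "left"
    if right > ratio * left:
        return "right"
    if bottom > ratio * top:
        return "down"
    return "upright"


# Toy skin mask with most object pixels on the image-left side.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [1, 1, 0, 0],
        [1, 1, 0, 0]]
movement = classify(quadrant_counts(mask))
```

A ratio test like this avoids fixed pixel thresholds, which would change with the user's distance from the camera. A real system would need to tune the threshold on recorded data.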