Grid Loss: Detecting Occluded Faces
Detection of partially occluded objects is a challenging computer vision
problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of
the detection window are occluded, since not every sub-part of the window is
discriminative on its own. To address this issue, we propose a novel loss layer
for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a
convolution layer independently rather than over the whole feature map. This
results in parts being more discriminative on their own, enabling the detector
to recover if the detection window is partially occluded. By mapping our loss
layer back to a regular fully connected layer, no additional computational cost
is incurred at runtime compared to standard CNNs. We demonstrate our method for
face detection on several public face detection benchmarks and show that our
method outperforms regular CNNs, is suitable for real-time applications, and
achieves state-of-the-art performance.
Comment: accepted to ECCV 201
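The sub-block idea in the abstract above can be illustrated with a minimal sketch. It assumes a single-channel feature map, a hinge loss with unit margin, a linear classifier without bias terms, and a hypothetical weighting factor `lam`; none of these details are given in the abstract. It also shows why the per-block classifiers can be folded into one fully connected layer: the block scores sum to the score of a single weight map over the whole window.

```python
import numpy as np

def block_scores(features, weights, grid=(2, 2)):
    """Score each sub-block of the feature map with its own slice of the weights."""
    row_groups = np.array_split(np.arange(features.shape[0]), grid[0])
    col_groups = np.array_split(np.arange(features.shape[1]), grid[1])
    return np.array([
        np.sum(features[np.ix_(r, c)] * weights[np.ix_(r, c)])
        for r in row_groups
        for c in col_groups
    ])

def grid_loss(features, weights, label, grid=(2, 2), lam=1.0):
    """Hinge loss on the whole window plus hinge losses on every sub-block.

    label is +1 (face) or -1 (background); lam is an assumed trade-off weight.
    """
    scores = block_scores(features, weights, grid)
    whole_score = scores.sum()  # identical to np.sum(features * weights)
    whole_loss = max(0.0, 1.0 - label * whole_score)
    per_block_loss = np.maximum(0.0, 1.0 - label * scores).sum()
    return whole_loss + lam * per_block_loss
```

At test time only `np.sum(features * weights)` would be needed in this sketch, which is one ordinary fully connected layer over the window.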
SFD: Single Shot Scale-invariant Face Detector
This paper presents a real-time face detector, named Single Shot
Scale-invariant Face Detector (SFD), which performs well across a wide range
of face scales with a single deep neural network, especially on small faces.
Specifically, we try to solve the common problem that anchor-based detectors
deteriorate dramatically as the objects become smaller. We make contributions
in the following three aspects: 1) proposing a scale-equitable face detection
framework to handle different scales of faces well. We tile anchors on a wide
range of layers to ensure that all scales of faces have enough features for
detection. Besides, we design anchor scales based on the effective receptive
field and a proposed equal proportion interval principle; 2) improving the
recall rate of small faces by a scale compensation anchor matching strategy; 3)
reducing the false positive rate of small faces via a max-out background label.
As a consequence, our method achieves state-of-the-art detection performance on
all the common face detection benchmarks, including the AFW, PASCAL face, FDDB
and WIDER FACE datasets, and can run at 36 FPS on an NVIDIA Titan X (Pascal) for
VGA-resolution images.
Comment: Accepted by ICCV 2017 + its supplementary materials; Updated the
latest results on WIDER FAC
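The abstract names a max-out background label but does not describe its mechanics. One plausible reading, sketched below purely as an assumption, is that several background scores are predicted per anchor and only the largest one competes with the face score. The function name and the number of background scores are hypothetical.

```python
import numpy as np

def max_out_background(bg_scores, face_score):
    """Keep the largest background score, then apply a two-way softmax.

    Returns [p_background, p_face]. This is an assumed interpretation of
    "max-out background label", not a confirmed description of SFD.
    """
    logits = np.array([np.max(bg_scores), face_score], dtype=float)
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()
```

Under this reading, a face is reported only if its score beats every background score, which would tend to suppress false positives on small anchors.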
Online learning and detection of faces with low human supervision
The final publication is available at link.springer.com
We present an efficient, online, and interactive approach for computing a classifier, called Wild Lady Ferns (WiLFs), for face learning and detection with little human supervision. On the one hand, WiLFs combine online boosting and extremely randomized trees (Random Ferns) to progressively compute an efficient and discriminative classifier. On the other hand, WiLFs use an interactive human-machine approach that combines two complementary learning strategies to considerably reduce the degree of human supervision during learning. The first strategy is query-by-boosting active learning, which requests human assistance on difficult samples according to the classifier confidence. The second strategy is memory-based learning, which uses k Exemplar-based Nearest Neighbors (kENN) to assist the classifier automatically. A pre-trained Convolutional Neural Network (CNN) is used to perform kENN with high-level feature descriptors. The proposed approach is therefore fast (WiLFs run at 1 FPS using code that is not fully optimized), accurate (we obtain detection rates over 82% on complex datasets), and labor-saving (human assistance percentages of less than 20%).
As a byproduct, we demonstrate that WiLFs also perform semi-automatic annotation during learning: while the classifier is being computed, WiLFs discover face instances in the input images, which are subsequently used to train the classifier online. The advantages of our approach are demonstrated on synthetic and publicly available databases, showing detection rates comparable to offline approaches that require larger amounts of hand-labeled training data.
Peer Reviewed. Postprint (author's final draft
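The confidence-driven query strategy described above can be sketched as follows. This is a minimal illustration, assuming a signed confidence score with a decision threshold at zero and a hypothetical uncertainty band `margin`; the actual query criterion is not specified in the abstract.

```python
def needs_human_label(confidence, threshold=0.0, margin=0.2):
    """True when the confidence lies in the uncertain band around the threshold."""
    return abs(confidence - threshold) < margin

def route_samples(confidences, threshold=0.0, margin=0.2):
    """Split sample indices into human queries and automatically handled samples."""
    queries, automatic = [], []
    for index, confidence in enumerate(confidences):
        if needs_human_label(confidence, threshold, margin):
            queries.append(index)    # difficult sample: ask the human
        else:
            automatic.append(index)  # confident sample: no assistance needed
    return queries, automatic
```

For example, `route_samples([0.9, 0.05, -0.7, -0.1])` sends samples 1 and 3 to the human and leaves samples 0 and 2 to the classifier.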
A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"
Recently, technologies such as face detection, facial landmark localisation
and face recognition and verification have matured enough to provide effective
and efficient solutions for imagery captured under arbitrary conditions
(referred to as "in-the-wild"). This is partially attributed to the fact that
comprehensive "in-the-wild" benchmarks have been developed for face detection,
landmark localisation and recognition/verification. A very important technology
that has not been thoroughly evaluated yet is deformable face tracking
"in-the-wild". Until now, the performance has mainly been assessed
qualitatively by visually assessing the result of a deformable face tracking
technology on short videos. In this paper, we perform the first, to the best of
our knowledge, thorough evaluation of state-of-the-art deformable face tracking
pipelines using the recently introduced 300VW benchmark. We evaluate many
different architectures focusing mainly on the task of on-line deformable face
tracking. In particular, we compare the following general strategies: (a)
generic face detection plus generic facial landmark localisation, (b) generic
model free tracking plus generic facial landmark localisation, as well as (c)
hybrid approaches using state-of-the-art face detection, model free tracking
and facial landmark localisation technologies. Our evaluation reveals future
avenues for further research on the topic.
Comment: E. Antonakos and P. Snape contributed equally and have joint second
authorshi
Head Movement Detection Based on Bounding Box Analysis of Digital Images on Raspberry Pi
Physical disability is an impairment of the body that can limit the function
of one or more limbs and also interfere with a person's motor abilities.
Physical disabilities are divided into several categories: leg impairment,
hand impairment, and combined leg and hand impairment. When both sets of limbs
are impaired, it becomes difficult to control things independently, for
example operating an electric wheelchair or selecting a menu on a monitor.
Research is therefore needed to help people with physical disabilities of the
legs and hands control such devices more easily. The system built in this work
detects head movement based on bounding box analysis for people with physical
disabilities. The study uses a Logitech C310 camera placed directly in front
of the user's head. The camera captures the user's head movement, and the
captured frames are sent to a Raspberry Pi, where image processing is used to
detect skin color and head movement. The next step is to classify the user's
head movement. The classification analyzes the number of pixels that represent
the object in each quadrant of the bounding box, which is divided into 4
quadrants. The output of the system is LED control. In testing, skin color
detection worked well for facial skin at distances of 50 cm, 75 cm, 100 cm,
and 125 cm in the morning and at midday, and at 50 cm and 75 cm at night. For
head movement detection, the best results were obtained at a distance of 50 cm
in the morning, at midday, and at night, with an overall accuracy of 90.62%.
The average computation time per movement was 59.87 ms for right, 57.64 ms for
left, 55.72 ms for upright, and 44.62 ms for head down. The accuracy of the
system's integration with the hardware (LED) was 100%
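The quadrant-based classification described in this abstract can be sketched as below. The sketch assumes a boolean skin mask and a bounding box given as (top, left, bottom, right) with exclusive bottom and right edges. The decision rules and the `ratio` parameter are hypothetical, since the abstract does not state how the quadrant counts map to a movement, and "left" and "right" here refer to image coordinates.

```python
import numpy as np

def quadrant_counts(mask, box):
    """Count skin pixels in the four quadrants of a bounding box.

    Returns (top_left, top_right, bottom_left, bottom_right).
    """
    top, left, bottom, right = box
    mid_row = (top + bottom) // 2
    mid_col = (left + right) // 2
    return (
        int(mask[top:mid_row, left:mid_col].sum()),
        int(mask[top:mid_row, mid_col:right].sum()),
        int(mask[mid_row:bottom, left:mid_col].sum()),
        int(mask[mid_row:bottom, mid_col:right].sum()),
    )

def classify_head(counts, ratio=1.5):
    """Map quadrant counts to a movement label using assumed imbalance rules."""
    top_left, top_right, bottom_left, bottom_right = counts
    left_half = top_left + bottom_left
    right_half = top_right + bottom_right
    upper_half = top_left + top_right
    lower_half = bottom_left + bottom_right
    if right_half > ratio * left_half:
        return "right"
    if left_half > ratio * right_half:
        return "left"
    if lower_half > ratio * upper_half:
        return "down"
    return "upright"
```

A mask concentrated in the right half of the box is labeled "right" under these rules, while an evenly filled box is labeled "upright".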