12 research outputs found

    Modification of the AdaBoost-based Detector for Partially Occluded Faces

    While face detection seems a solved problem under general conditions, most state-of-the-art systems degrade rapidly when faces are partially occluded by other objects. This paper presents a method for detecting partially occluded faces by modifying the AdaBoost-based face detector. Our basic idea is that the weak classifiers in the AdaBoost-based face detector, each corresponding to a Haar-like feature, inherently form a patch-based model. Therefore, one can divide the whole face region into multiple patches and map those weak classifiers to the patches. The weak classifiers belonging to each patch are re-formed into a new classifier that determines whether the patch is a valid face patch, that is, one without occlusion. Finally, we combine all of the valid face patches, assigning different weights to the patches, to make the final decision on whether the input subwindow is a face. The experimental results show that the proposed method is promising for the detection of occluded faces.
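
    The patch-based decision described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `WeakClassifier` fields, the per-patch threshold, and the final voting threshold are all hypothetical choices made here for clarity.

```python
from dataclasses import dataclass


@dataclass
class WeakClassifier:
    patch: int    # index of the face patch the Haar-like feature is mapped to (assumed mapping)
    alpha: float  # AdaBoost weight of this weak classifier
    vote: int     # +1 or -1 output on the current subwindow


def patch_scores(weak, n_patches):
    """Sum the weighted votes of the weak classifiers mapped to each patch."""
    scores = [0.0] * n_patches
    for w in weak:
        scores[w.patch] += w.alpha * w.vote
    return scores


def detect_face(weak, n_patches, patch_weights,
                patch_threshold=0.0, face_threshold=0.5):
    """Return (is_face, valid_patches).

    A patch counts as valid (not occluded) when its score exceeds
    patch_threshold. The subwindow is accepted as a face when the
    weighted share of valid patches reaches face_threshold.
    Both thresholds are illustrative assumptions.
    """
    scores = patch_scores(weak, n_patches)
    valid = [s > patch_threshold for s in scores]
    total = sum(patch_weights)
    support = sum(w for w, v in zip(patch_weights, valid) if v)
    return support / total >= face_threshold, valid
```

    With this structure, an occluded patch simply contributes no support, so a face can still be accepted when enough of the remaining patches look valid.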

    A Deep Cascade Model for Multi-Document Reading Comprehension

    A fundamental trade-off between effectiveness and efficiency needs to be balanced when designing an online question answering system. Effectiveness comes from sophisticated functions such as extractive machine reading comprehension (MRC), while efficiency is obtained from improvements in preliminary retrieval components such as candidate document selection and paragraph ranking. Given the complexity of the real-world multi-document MRC scenario, it is difficult to jointly optimize both in an end-to-end system. To address this problem, we develop a novel deep cascade learning model, which progressively evolves from the document-level and paragraph-level ranking of candidate texts to more precise answer extraction with machine reading comprehension. Specifically, irrelevant documents and paragraphs are first filtered out with simple functions for efficiency. Then we jointly train three modules on the remaining texts to better track the answer: document extraction, paragraph extraction, and answer extraction. Experimental results show that the proposed method outperforms the previous state-of-the-art methods on two large-scale multi-document benchmark datasets, i.e., TriviaQA and DuReader. In addition, our online system can stably serve typical scenarios with millions of daily requests in less than 50 ms. Comment: Accepted at AAAI 201
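
    The coarse-to-fine filtering idea in this abstract can be illustrated with a small sketch. It is not the paper's model: the word-overlap score stands in for the "simple functions", `extract_answer` is a hypothetical placeholder for the expensive reader, and the jointly trained modules are not represented at all.

```python
def overlap(question, text):
    """Cheap relevance score: fraction of question words found in the text."""
    q = set(question.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def cascade(question, documents, extract_answer, doc_keep=2, para_keep=2):
    """Run a three-stage cascade over documents given as lists of paragraphs.

    Stage 1 keeps the doc_keep best documents by the cheap score.
    Stage 2 keeps the para_keep best paragraphs from those documents.
    Stage 3 calls the (assumed expensive) extract_answer only on survivors.
    """
    ranked_docs = sorted(
        documents,
        key=lambda d: overlap(question, " ".join(d)),
        reverse=True,
    )[:doc_keep]
    paragraphs = [p for d in ranked_docs for p in d]
    ranked_paras = sorted(
        paragraphs,
        key=lambda p: overlap(question, p),
        reverse=True,
    )[:para_keep]
    return [extract_answer(question, p) for p in ranked_paras]
```

    The point of the design is that the costly reader runs on only a handful of paragraphs, however many documents were retrieved.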

    Enhanced face detection framework based on skin color and false alarm rejection

    Fast and precise face detection is a challenging task in computer vision. Human face detection plays an essential role in the first stage of face processing applications such as recognition, tracking, and image database management. In these applications, faces often occupy only a minor part of images that contain variations such as different illumination, pose, and occlusion. These variations can noticeably decrease the face detection rate. Detection time is also an important factor, especially in real-time systems. Most existing face detection approaches are not accurate because they cannot handle unstructured images with large appearance variations and can only detect human faces under one particular variation. Existing face detection frameworks need enhancement to detect human faces under the stated variations, improving detection rate and reducing detection time. In this study, an enhanced face detection framework is proposed that improves detection rate using skin color and adds a validity process. A preliminary segmentation of input images based on skin color can significantly reduce the search space and accelerate human face detection. The main detection process is based on Haar-like features and the AdaBoost algorithm. A validity process is introduced to reject non-face objects that may be selected during face detection. The validity process is based on a two-stage Extended Local Binary Patterns. Experimental results on the CMU-MIT and Caltech 10000 datasets, over a wide range of facial variations in color, position, scale, and lighting conditions, indicated a successful face detection rate. In conclusion, the proposed enhanced face detection framework achieved a high detection rate and reduced overall detection time in color images with varying lighting conditions and different poses.
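
    The skin-color pre-segmentation step can be sketched as below. The abstract does not say which color model the study uses, so this example assumes a commonly cited fixed-threshold RGB rule. The thresholds and the bounding-box helper are illustrative, not the authors' method.

```python
def is_skin(r, g, b):
    """Fixed-threshold RGB skin test (an assumed rule, not taken from the study)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_bounding_box(image):
    """Return (top, left, bottom, right), inclusive, around all skin pixels.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Returns None when no pixel passes the skin test, in which case a
    detector could skip the image entirely.
    """
    coords = [
        (y, x)
        for y, row in enumerate(image)
        for x, (r, g, b) in enumerate(row)
        if is_skin(r, g, b)
    ]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))
```

    A detector based on Haar-like features would then scan only the returned region rather than the full image, which is where the reduction in search space comes from.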

    Vision system for detecting people in motion

    Internship carried out at Acronym-I.T. and supervised by Eng. Carlos Silva. Integrated master's thesis. Informatics and Computing Engineering. Universidade do Porto. Faculdade de Engenharia. 201

    Tracking the emergence of visual recognition through multivariate approaches

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 121-130). The visual system is a dynamic entity whose response properties depend on context and experience. In this thesis, I examine how the brain changes as we learn to see: what changes occur during the onset of recognition, in the mature visual system on the one hand, and in a developmentally nascent one, on the other? Working with normal adults, I focus on the processes that underlie the interpretation of images as meaningful entities. This interpretation is greatly facilitated by prior information about a stimulus. What are the neural sites that exhibit experience-dependent changes? Using multivariate decoding techniques, I find pervasive evidence of such changes throughout the visual system. Critically, cortical regions previously implicated in such learning are not the same loci as sites of increased information. Examining the temporal mechanisms of recognition, I identify the perceptual state transitions corresponding to the onset of meaning in an observed image. Furthermore, decoding techniques reveal the flow of information during this 'eureka moment.' I find feedback processing when a degraded image is first meaningfully interpreted, and then a rapid transition into feed-forward processing for more coherent images. Complementing the studies with mature subjects, my work with developmentally nascent observers explores the genesis of visual interpretation. What neural changes accompany the earliest stages of visual learning? I show that children treated for congenital blindness exhibit significant cortical re-organization after sight onset, in contrast to the classical notion of a critical period for visual plasticity. The specific kind of reorganization suggests that visual experience enhances information coding efficiency in visual cortex. Additionally, I present evidence of rapid development of functionally specialized cortical regions. Overall, the thesis presents two complementary perspectives on the genesis of visual meaning. The results help advance our understanding of how short-term experience, as well as developmental history, shapes our interpretation of the complex visual world. By Scott Gorlin. Ph.D.