8 research outputs found

    Who are you? - real-time person identification

    This paper presents a system for person identification that uses concise statistical models of facial features in a real-time realisation of the cast identification system of Everingham et al. [7]. Our system integrates the cascaded face detector of Viola and Jones with a kernel-based regressor for face tracking, which is trained on-line when new people are detected in the video stream. A pictorial model is used to compute the locations of facial features, which form a descriptor of the person’s face. When sufficient samples are collected, identification is performed using a random-ferns classifier by marginalising over the facial features. This confers robustness to localisation errors and occlusions, while enabling a real-time search of the database. These four different processes communicate within a real-time framework capable of tracking and identifying up to 5 people in real-time on a standard dual-core 1.86GHz machine.

    "'Who are you?' - Learning person specific classifiers from video"

    We investigate the problem of automatically labelling faces of characters in TV or movie material with their names, using only weak supervision from automatically aligned subtitle and script text. Our previous work (Everingham et al. [8]) demonstrated promising results on the task, but the coverage of the method (proportion of video labelled) and generalization was limited by a restriction to frontal faces and nearest neighbour classification. In this paper we build on that method, extending the coverage greatly by the detection and recognition of characters in profile views. In addition, we make the following contributions: (i) seamless tracking, integration and recognition of profile and frontal detections, and (ii) a character-specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters. We report results on seven episodes of the TV series “Buffy the Vampire Slayer”, demonstrating significantly increased coverage and performance with respect to previous methods on this material.

    Taking the bite out of automated naming of characters in TV video

    We investigate the problem of automatically labelling appearances of characters in TV or film material with their names. This is tremendously challenging due to the huge variation in imaged appearance of each character and the weakness and ambiguity of available annotation. However, we demonstrate that high precision can be achieved by combining multiple sources of information, both visual and textual. The principal novelties that we introduce are: (i) automatic generation of time stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying when characters are speaking. In addition, we incorporate complementary cues of face matching and clothing matching to propose common annotations for face tracks, and consider choices of classifier which can potentially correct errors made in the automatic extraction of training data from the weak textual annotation. Results are presented on episodes of the TV series “Buffy the Vampire Slayer”.

    Recent Trends in Computing

    A huge amount of video data is generated every day, driven by the enormous growth of security and surveillance systems. It is immensely challenging for researchers to search for and retrieve an accurate human face of interest from video at high speed. The proposed work is motivated by this concern. Searching, browsing, and retrieving human faces of interest from video databases is expected to be in demand for several applications. This paper proposes a novel algorithm for human face retrieval from video databases based on a holistic approach. The Viola and Jones frontal face detector detects the face region. The next stage is face extraction, which provides the input for grouping individual faces. Each individual group of faces is converted into a single normalized mean face using PCA. The final face group contains a single face for each person who appears in the video. After pre-processing of the normalized faces, recognition is performed on the basis of a query face image.

    Interactive person re-identification in TV series

    Agrupamento de faces em vídeos digitais (Face clustering in digital videos)

    Human faces are some of the most important entities frequently encountered in videos. As a result of the currently high volume of digital video production and consumption, both personal and professional, automatic extraction of relevant information from those videos has become an active research topic. Many efforts in this area have focused on the use of face clustering and recognition to aid the process of annotating faces in videos. However, current face clustering algorithms are not robust to the variations in appearance that the same face may undergo due to typical changes in acquisition scenarios. Hence, this thesis proposes a novel approach to the problem of face clustering in digital videos which achieves superior performance (in terms of clustering quality and computational cost) in comparison to the state of the art, using reference video databases from the literature. The approach was proposed after a systematic literature review and experimental evaluations, and consists of the following modules: preprocessing, face detection, tracking, feature extraction, clustering, temporal similarity analysis, and spatial reclustering. The proposed approach for face clustering achieved the planned objectives, obtaining better results (according to different metrics) than the methods evaluated on the YouTube Celebrities video dataset (KIM et al., 2008) and the SAIVT-Bnews video dataset (GHAEMMAGHAMI, DEAN and SRIDHARAN, 2013).

    Real-time person re-identification for interactive environments

    The work presented in this thesis was motivated by a vision of the future in which intelligent environments in public spaces, such as galleries and museums, deliver useful and personalised services to people via natural interaction, that is, without the need for people to provide explicit instructions via tangible interfaces. Delivering the right services to the right people requires a means of biometrically identifying individuals and then re-identifying them as they move freely through the environment. Delivering the service they desire requires sensing their context, for example, sensing their location or proximity to resources. This thesis presents both a context-aware system and a person re-identification method. A tabletop display was designed and prototyped with an infrared person-sensing context function. In experimental evaluation it exhibited tracking performance comparable to other more complex systems. A real-time, viewpoint-invariant person re-identification method is proposed based on a novel set of Viewpoint Invariant Multi-modal (ViMM) feature descriptors collected from depth-sensing cameras. The method uses colour and a combination of anthropometric properties logged as a function of body orientation. A neural network classifier is used to perform re-identification.