    Pooling Faces: Template based Face Recognition with Pooled Face Images

    We propose a novel approach to template based face recognition. Our dual goal is to both increase recognition accuracy and reduce the computational and storage costs of template matching. To do this, we leverage an approach which has proven effective in many other domains but, to our knowledge, has never been fully explored for face images: average pooling of face photos. We show how (and why!) the space of a template's images can be partitioned and then pooled based on image quality and head pose, and the effect this has on accuracy and template size. We perform extensive tests on the IJB-A and Janus CS2 template based face identification and verification benchmarks. These show that not only does our approach outperform published state of the art despite requiring far fewer cross template comparisons, but also, surprisingly, that image pooling performs on par with deep feature pooling. Comment: Appeared in the IEEE Computer Society Workshop on Biometrics, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June, 201
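As a rough illustration of the partition-then-pool idea described in this abstract, the sketch below groups a template's images by a coarse pose bin and a quality level, then averages each group pixel-wise. It is a minimal sketch, not the authors' pipeline: the 30-degree bin width, the integer quality levels, the `pose_bin`, `average_pool`, and `pool_template` names, and the assumption that images are already aligned and equally sized are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def pose_bin(yaw_degrees: float, bin_width: float = 30.0) -> int:
    """Map a yaw angle to a coarse pose bin (the bin width is an assumed value)."""
    return int(abs(yaw_degrees) // bin_width)


def average_pool(images: List[Image]) -> Image:
    """Pixel-wise mean of equally sized, already aligned images."""
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / len(images) for c in range(cols)]
        for r in range(rows)
    ]


def pool_template(
    template: List[Tuple[Image, float, int]]
) -> Dict[Tuple[int, int], Image]:
    """Group (image, yaw, quality_level) triples by (pose bin, quality level)
    and replace each group with its average image."""
    groups: Dict[Tuple[int, int], List[Image]] = defaultdict(list)
    for image, yaw, quality in template:
        groups[(pose_bin(yaw), quality)].append(image)
    return {key: average_pool(imgs) for key, imgs in groups.items()}


# Hypothetical template: two near-frontal images and one profile image.
template = [
    ([[0.0, 2.0], [4.0, 6.0]], 5.0, 1),
    ([[2.0, 4.0], [6.0, 8.0]], -10.0, 1),
    ([[1.0, 1.0], [1.0, 1.0]], 50.0, 0),
]
pooled = pool_template(template)
print(len(template), "images pooled into", len(pooled), "representatives")
```

Pooling shrinks the template from one entry per image to one entry per occupied bin, which is where the reduction in cross-template comparisons mentioned in the abstract would come from.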

    Entity-centric knowledge discovery for idiosyncratic domains

    Technical and scientific knowledge is produced at an ever-accelerating pace, which makes it increasingly difficult to organize or process automatically, e.g., when searching for relevant prior work. Knowledge can today be produced in both unstructured (plain text) and structured (metadata or linked data) forms. However, unstructured content is still the dominant form used to represent scientific knowledge. In order to facilitate the extraction and discovery of relevant content, new automated and scalable methods for processing, structuring and organizing scientific knowledge are called for. In this context, a number of applications are emerging, ranging from Named Entity Recognition (NER) and Entity Linking tools for scientific papers to specific platforms leveraging information extraction techniques to organize scientific knowledge. In this thesis, we tackle the tasks of Entity Recognition, Disambiguation and Linking in idiosyncratic domains with an emphasis on scientific literature. Furthermore, we study the related task of co-reference resolution with a specific focus on named entities. We start by exploring Named Entity Recognition, a task that aims to identify the boundaries of named entities in textual content. We propose a new method to generate candidate named entities based on n-gram collocation statistics and design several entity recognition features to further classify them. In addition, we show how external knowledge bases (either domain-specific like DBLP or generic like DBPedia) can be leveraged to improve the effectiveness of NER for idiosyncratic domains. Subsequently, we move to Entity Disambiguation, which is typically performed after entity recognition in order to link an entity to a knowledge base. We propose novel semi-supervised methods for word disambiguation leveraging the structure of a community-based ontology of scientific concepts. Our approach exploits the graph structure that connects different terms and their definitions to automatically identify the correct sense that was originally picked by the authors of a scientific publication. We then turn to co-reference resolution, a task aiming at identifying entities that appear in various forms throughout the text. We propose an approach to type entities leveraging an inverted index built on top of a knowledge base, and to subsequently re-assign entities based on the semantic relatedness of the introduced types. Finally, we describe an application whose goal is to help researchers discover and manage scientific publications. We focus on the problem of selecting relevant tags to organize collections of research papers in that context. We experimentally demonstrate that the use of a community-authored ontology, together with information about the position of the concepts in the documents, significantly increases the precision of tag selection over standard methods.
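The candidate-generation step mentioned in this abstract (n-gram collocation statistics) can be illustrated with a small sketch. The abstract does not say which statistic the thesis uses; pointwise mutual information (PMI) over bigrams is swapped in here as one common choice, and the function names and thresholds are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def bigram_pmi(tokens: List[str]) -> Dict[Bigram, float]:
    """PMI (base 2) of each adjacent word pair, estimated from raw counts."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores: Dict[Bigram, float] = {}
    for (w1, w2), count in bigrams.items():
        p_joint = count / n_bi
        p_independent = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scores[(w1, w2)] = math.log2(p_joint / p_independent)
    return scores


def candidate_entities(
    tokens: List[str], min_count: int = 2, min_pmi: float = 1.0
) -> List[Bigram]:
    """Keep bigrams that are both frequent and strongly associated.
    Both thresholds are illustrative assumptions."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = bigram_pmi(tokens)
    return sorted(
        bg for bg, score in scores.items()
        if bigrams[bg] >= min_count and score >= min_pmi
    )


text = "we use entity linking and entity linking helps when we link an entity"
print(candidate_entities(text.split()))
```

In a fuller pipeline, the surviving candidates would then be passed to a classifier using the entity recognition features the abstract refers to.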

    Neural networks application to divergence-based passive ranging

    The purpose of this report is to summarize the state of knowledge and outline the planned work on a divergence-based, neural network approach to the problem of passive ranging derived from optical flow. Work in this and closely related areas is reviewed in order to provide the necessary background for further developments. New ideas about devising a monocular passive-ranging system are then introduced. It is shown that image-plane divergence is independent of image-plane location with respect to the focus of expansion and of camera maneuvers, because it directly measures the object's expansion which, in turn, is related to the time-to-collision. Thus, a divergence-based method has the potential of providing a reliable range estimate, complementing other monocular passive-ranging methods which encounter difficulties in image areas close to the focus of expansion. Image-plane divergence can be thought of as a spatial/temporal pattern. A neural network realization was chosen for this task because neural networks have generally performed well in various other pattern recognition applications. The main goal of this work is to teach a neural network to derive the divergence from the imagery.
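The link between divergence and time-to-collision claimed in this abstract can be seen in the simplest textbook case. The derivation below is a sketch under stated assumptions, not the report's own formulation: the camera translates along its optical axis toward a fronto-parallel surface at depth $Z$ with closing speed $V = -\dot{Z}$, image coordinates $(x, y)$ are measured from the focus of expansion, and $f$ is the focal length.

```latex
% Perspective projection of a scene point (X, Y, Z), with X and Y fixed:
x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}

% Image velocity (optical flow), using \dot{Z} = -V and \tau = Z / V:
u = \dot{x} = -\frac{fX}{Z^{2}}\,\dot{Z} = \frac{x}{\tau}, \qquad
v = \dot{y} = -\frac{fY}{Z^{2}}\,\dot{Z} = \frac{y}{\tau}

% Divergence of the flow field:
\nabla \cdot (u, v) = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = \frac{2}{\tau}
```

Under these assumptions the divergence equals $2/\tau$ at every image location, including near the focus of expansion where the flow magnitude itself vanishes, which is consistent with the location independence stated in the abstract.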

    Machine Conscious Architecture for State Exploitation and Decision Making

    This research addressed a critical limitation in the area of computational intelligence by developing a general purpose architecture for information processing and decision making. Traditional computational intelligence methods are best suited for well-defined problems with extensive, long-term knowledge of the environmental and operational conditions the system will encounter during operation. These traditional approaches typically generate quick answers (i.e., reflexive responses) using pattern recognition methods. Most pattern recognition techniques are static processes which consist of a predefined series of computations. For these pattern recognition approaches to be effective, training data is required from all anticipated environments and operating conditions. The proposed framework, Conscious Architecture for State Exploitation (CASE), is a general purpose architecture designed to mimic key characteristics of human information processing. CASE combines low- and high-level cognitive processes into a common framework to enable goal-based decision making. The CASE approach is to generate artificial phenomenal states (i.e., generate qualia = consciousness) into a shared computational process to enhance goal-based decision making and adaptation. That is, this approach allows for the appropriate decision and corresponding adaptive behavior as the goals and environmental factors change. To demonstrate the engineering advantages of CASE, it was used in an airframe application to autonomously monitor the integrity of a flight critical structural component. In this demonstration, CASE automatically generated a timely maintenance recommendation when unacceptable cracking was detected. Over the lifetime of the investigated component, operational availability increased by a minimum of 10.7%, operational cost decreased by 79%, and maintenance intervals (i.e., MTBM) increased by a minimum of 900%

    Ontology driven voice-based interaction in mobile environment

    The paper presents a new approach to spoken dialogue handling in a mobile environment. The goal of our project is to allow the user to retrieve information from a knowledge base defined by an ontology, using speech in a mobile environment. This environment has specific features that should be taken into account when speech recognition and synthesis are performed. On the one hand, it limits the size of the language that can be understood by speech recognizers. On the other hand, it allows us to use information about the user context. Our approach is to use the knowledge and the user context to allow the user to speak freely to the system. Our research has been performed in the framework of the EU-funded project MUMMY, which targets the use of mobile devices on building sites. This determines our approach to the problem. The main issue is the user context in which the interaction takes place. As the application (a construction site) is rather specific, it is possible to use the knowledge related to this particular application during the speech recognition process. Up to now, voice-based user interfaces have relied on various techniques that usually impose constraints limiting the communication context to a strictly predefined application domain. The main idea behind our solution is the use of an ontology that represents the knowledge related to our particular application in a specific user context. The knowledge acquired from the ontology allows the user to communicate in a mobile environment, as the analysis of user input is greatly simplified. The crucial step in our solution was the design of a system architecture that allows the system to access the knowledge in the ontology and use it to enhance the recognition process. The model of the environment in which the recognition process is performed has several parts: the domain ontology (construction sites in general); an instance of the domain ontology (a specific construction site); and the conversation history together with the specific user context (location, type of mobile device, etc.). The key part of the model is the access mechanism, which extracts particular knowledge in a specific context. This access mechanism is controlled by a dialogue automaton that controls the course of the dialogue. The acquired knowledge is used in the speech recognizer to generate a specific grammar that defines the possible speech inputs at a particular moment of the dialogue. In the next state, the ontology is accessed again in a different context, producing a grammar that defines the new possible inputs. The same access mechanism is also used to produce a natural-language response to the user's input. A pilot implementation of the voice-based user interface system exists; it has been tested in various situations, and the results obtained are very encouraging.
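The per-state grammar generation described in this abstract can be sketched as a toy dialogue automaton whose current state selects which part of an ontology is exposed to the recognizer. Everything in this sketch is hypothetical: the `ONTOLOGY` contents, the "show ..." phrase template, and the `grammar_for_state` and `advance` functions are illustrative stand-ins, not the MUMMY system's interfaces.

```python
from typing import Dict, List

# Hypothetical toy ontology: each concept maps to the concepts reachable from it.
ONTOLOGY: Dict[str, List[str]] = {
    "site": ["crane", "scaffold"],
    "crane": ["load limit", "location"],
    "scaffold": ["inspection date"],
}


def grammar_for_state(ontology: Dict[str, List[str]], state: str) -> List[str]:
    """Return the only phrases the recognizer needs to accept in this state."""
    return [f"show {concept}" for concept in ontology.get(state, [])]


def advance(ontology: Dict[str, List[str]], state: str, utterance: str) -> str:
    """Move to the concept named in the utterance if the current grammar
    allows it; otherwise stay in the same dialogue state."""
    for concept in ontology.get(state, []):
        if utterance == f"show {concept}":
            return concept
    return state


state = "site"
print(grammar_for_state(ONTOLOGY, state))
state = advance(ONTOLOGY, state, "show crane")
print(grammar_for_state(ONTOLOGY, state))
```

Because each state exposes only a handful of phrases, the recognizer's search space stays small, which matches the abstract's point that ontology-derived context simplifies the analysis of user input.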

    Reasoning with visual knowledge in an object recognition system

    The impact of artificial intelligence on computer vision has provided various perspectives and approaches to solving problems of the human visual system. Some of the symbolic processing and knowledge-based techniques implemented in vision systems represent a meaningful extension to the low-level, algorithmic processing which has been emphasized since the advent of the computer vision field. The higher-level processes attempt to capture the essence of visual cognition, specifically by encompassing a model of the visual world and the reasoning processes that manipulate this stored visual knowledge and environmental cues. This thesis includes a discussion of existing computer vision systems surveyed from a high-level perspective. The goal of this thesis is to develop a high-level inference system that implements reasoning processes and utilizes a visual memory model to achieve object recognition in a specific domain. The focus is on symbolically representing and reasoning with high-level knowledge using a frame-based approach. The organization and structuring of domain knowledge, reasoning processes and control and search strategies are emphasized. The implementation utilizes a frame package written in Prolog