
    Analysis of a biologically-inspired system for real-time object recognition

    We present a biologically-inspired system for real-time, feed-forward object recognition in cluttered scenes. Our system uses a vocabulary of very sparse features that are shared between and within different object models. To detect objects in a novel scene, these features are located in the image, and each detected feature votes for all objects that are consistent with its presence. Because features are shared between object models, our approach scales to large object databases better than traditional methods. To demonstrate the utility of this approach, we train our system to recognize any of 50 objects in everyday cluttered scenes with substantial occlusion. Without further optimization, we also demonstrate near-perfect recognition on a standard 3-D recognition problem. Our system has an interpretation as a sparsely connected feed-forward neural network, making it a viable model for fast, feed-forward object recognition in the primate visual system.
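    The voting scheme described in this abstract reduces to a simple tally once a shared feature vocabulary exists. The Python sketch below is an illustration only; the feature names, the FEATURE_TO_OBJECTS table, and the min_votes threshold are invented placeholders, not the paper's actual vocabulary or decision rule.

        # Illustrative sketch of shared-feature voting; data structures are hypothetical.
        from collections import defaultdict

        # Hypothetical shared vocabulary: each sparse feature maps to the set of
        # object models consistent with its presence.
        FEATURE_TO_OBJECTS = {
            "edge_17": {"mug", "stapler"},
            "corner_03": {"mug"},
            "blob_42": {"stapler", "phone"},
        }

        def vote_for_objects(detected_features, feature_to_objects, min_votes=2):
            """Give one vote per detected feature to every consistent object,
            then keep objects whose vote count reaches min_votes."""
            votes = defaultdict(int)
            for feat in detected_features:
                for obj in feature_to_objects.get(feat, ()):
                    votes[obj] += 1
            return {obj: n for obj, n in votes.items() if n >= min_votes}

        # Features located in a novel scene (assumed output of some detector):
        print(vote_for_objects(["edge_17", "corner_03", "blob_42"], FEATURE_TO_OBJECTS))
        # -> {'mug': 2, 'stapler': 2}

    Because each feature's table entry is shared across models, adding a new object only extends existing entries rather than adding a new detector, which reflects the scalability argument the abstract makes.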

    Multiple-Cue Object Recognition for Interactionable Objects

    Category-level object recognition is a fundamental capability for robots that assist humans in useful tasks. Numerous vision-based object recognition systems yield fast and accurate results in constrained environments. However, by depending on visual cues alone, these techniques are susceptible to variations in object size, lighting, rotation, and pose, none of which can be avoided in real video data. Thus, the task of object recognition remains very challenging. My thesis work builds upon the fact that robots can observe humans interacting with the objects in their environment. We refer to the set of objects that can be involved in such interaction as 'interactionable' objects. The interaction of humans with 'interactionable' objects provides numerous non-visual cues to the identity of those objects. In this thesis, I introduce a flexible object recognition approach called Multiple-Cue Object Recognition (MCOR) that can use multiple cues of any predefined type, whether they are cues intrinsic to the object or provided by observation of a human. In pursuit of this goal, the thesis provides several contributions: a representation for the multiple cues, including an object definition that allows for the flexible addition of these cues; weights that reflect the varying strength of association between a particular cue and a particular object, represented using a probabilistic relational model, as well as object displacement values for localizing the information in an image; tools for defining visual features, segmentation, tracking, and the values for the non-visual cues; and an object recognition algorithm for the incremental discrimination of potential object categories. We evaluate these contributions through several methods: simulation, which demonstrates the learning of weights and recognition based on an analytical model; an analytical model, which demonstrates the robustness of the MCOR framework; and recognition results on real video data from several datasets, including video taken from a humanoid robot (Sony QRIO), video captured in a meeting setting, scripted scenarios from outside universities, and unscripted TV cooking footage. Using these datasets, we demonstrate the basic features of the MCOR algorithm, including its ability to use multiple cues of different types, and its applicability to an outside dataset. We show that MCOR achieves better recognition results than vision-only recognition systems, and that performance improves further as more cue types are added.
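    As a rough illustration of the incremental discrimination step mentioned above, the sketch below accumulates cue-object association weights as cues arrive and re-ranks the candidate objects after each one. The cue names and the CUE_WEIGHTS values are made-up placeholders, not weights learned by the thesis' probabilistic relational model.

        # Minimal sketch of weighted multi-cue accumulation (hypothetical weights).
        from collections import defaultdict

        # Hypothetical cue-to-object association weights (visual and non-visual cues).
        CUE_WEIGHTS = {
            ("red_blob", "coke_can"): 0.6,
            ("drinking_gesture", "coke_can"): 0.9,
            ("drinking_gesture", "mug"): 0.8,
            ("speech_cup", "mug"): 1.0,
        }

        def incremental_ranking(observed_cues, cue_weights):
            """Accumulate association weights for each candidate object as cues
            arrive, yielding the ranked candidates after every new cue."""
            scores = defaultdict(float)
            for cue in observed_cues:
                for (c, obj), w in cue_weights.items():
                    if c == cue:
                        scores[obj] += w
                yield sorted(scores.items(), key=lambda kv: -kv[1])

        for ranking in incremental_ranking(["red_blob", "drinking_gesture", "speech_cup"], CUE_WEIGHTS):
            print(ranking)
        # In this toy run the leading candidate changes once the non-visual cues arrive.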

    Towards Using Multiple Cues for Robust Object Recognition

    A robot's ability to assist humans in a variety of tasks, e.g. in search and rescue or in a household, depends heavily on the robot's reliable recognition of the objects in its environment. Numerous approaches attempt to recognize objects based only on the robot's vision. However, the same type of object can have very different visual appearances in shape, size, pose, and color. Although such approaches are widely studied with relative success, the general object recognition task remains very challenging. We build our work upon the fact that robots can observe humans interacting with the objects in their environment, and that these interactions provide numerous non-visual cues to those objects' identities. We investigate a flexible object recognition approach that can use multiple cues of any type, whether they are visual cues intrinsic to the object or cues provided by observation of a human. A challenging issue is that different cues can have different weights in their association with an object definition, and these weights need to be taken into account during recognition. In this paper, we contribute a probabilistic relational representation of the cue weights and an object recognition algorithm that can flexibly combine multiple cues of any type to robustly recognize objects. We show illustrative results of our implemented approach, which uses visual, activity, gesture, and speech cues, provided by machine or human, to recognize objects more robustly than when using only a single cue.
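    One simple way to picture the combination step is as a fusion of per-cue evidence into a single normalized belief over object labels. The sketch below is only an illustration under a naive independence assumption; the cue values and the 0.1 floor for unseen objects are invented and do not reproduce the paper's probabilistic relational representation.

        # Toy fusion of heterogeneous cues into a normalized belief over objects.
        def combine_cues(cue_likelihoods, objects):
            """Multiply per-cue likelihoods (naive independence assumption)
            and normalize to obtain a belief over the candidate objects."""
            belief = {obj: 1.0 for obj in objects}
            for likelihood in cue_likelihoods:               # one dict per observed cue
                for obj in objects:
                    belief[obj] *= likelihood.get(obj, 0.1)  # small floor for unseen objects
            total = sum(belief.values()) or 1.0
            return {obj: p / total for obj, p in belief.items()}

        visual  = {"mug": 0.5, "bowl": 0.5}   # ambiguous visual cue
        gesture = {"mug": 0.8, "bowl": 0.3}   # drinking gesture observed
        speech  = {"mug": 0.9}                # the human says "mug"
        print(combine_cues([visual, gesture, speech], ["mug", "bowl"]))
        # -> the belief concentrates on "mug" once the non-visual cues are added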

    Simulation and Weights of Multiple Cues for Robust Object Recognition

    Reliable recognition of objects is an important capability in the progress toward getting agents to accomplish and assist in a variety of useful tasks, such as search and rescue or office assistance. Numerous approaches attempt to recognize objects based only on the robot's vision. However, the same type of object can have very different visual appearances in shape, size, pose, and color. Although such approaches are widely studied with relative success, the general object recognition task remains difficult. In previous work, we introduced MCOR (Multiple-Cue Object Recognition), a flexible object recognition approach that can use multiple cues of any type, whether they are visual cues intrinsic to the object or cues provided by observation of a human. As part of the framework, weights reflect the variation in the strength of the association between a particular cue and an object. In this paper, we demonstrate how the probabilistic relational framework used to determine the weights can be applied in complex scenarios with numerous objects, cues, and relationships between them. We develop a simulator that generates these complex scenarios using cues based on real recognition systems.
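    The kind of scenario such a simulator has to produce can be pictured as sampling a set of objects and then emitting a noisy subset of the cues associated with each. The sketch below is a toy version of that idea; the object-cue table, the detection rate, and the function names are invented for illustration and are not the simulator described in the paper.

        # Toy scenario generator: sample objects, then emit their cues with noise.
        import random

        OBJECT_CUES = {
            "mug":     ["handle_shape", "drinking_gesture", "speech_mug"],
            "phone":   ["rect_shape", "talking_gesture", "ring_sound"],
            "stapler": ["rect_shape", "pressing_gesture"],
        }

        def generate_scenario(n_objects=3, detection_rate=0.8, seed=None):
            """Sample objects for a scene and emit the cues a recognizer would
            observe, dropping each cue with probability 1 - detection_rate."""
            rng = random.Random(seed)
            scene = rng.choices(list(OBJECT_CUES), k=n_objects)
            observed = [(obj, cue)
                        for obj in scene
                        for cue in OBJECT_CUES[obj]
                        if rng.random() < detection_rate]
            return scene, observed

        scene, cues = generate_scenario(seed=42)
        print("ground truth:", scene)
        print("observed cues:", cues)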