Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework that highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and, thus, the constraints imposed on the type of video that each technique is able to address. Making these hypotheses and constraints explicit renders the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task to date. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables
The scene superiority effect: object recognition in the context of natural scenes
Four experiments investigate the effect of background scene semantics on object recognition. Although past research has found that semantically consistent scene backgrounds can facilitate recognition of a target object, these claims have been challenged as the result of post-perceptual response bias rather than of the perceptual processes of object recognition itself. The current study takes advantage of a paradigm from linguistic processing known as the Word Superiority Effect: humans can better discriminate letters (e.g., D vs. K) in the context of a word (WORD vs. WORK) than in a non-word context (e.g., WROD vs. WROK), even when the context is non-predictive of the target identity. We apply this paradigm to objects in natural scenes, having subjects discriminate between objects presented in scene contexts. Because the target objects were equally semantically consistent with any given scene and could appear in either semantically consistent or inconsistent contexts with equal probability, response bias could not lead to an apparent improvement in object recognition. The current study found a benefit to object recognition from semantically consistent backgrounds, and the effect appeared to be modulated by awareness of background scene semantics.
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research, 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) further weight is given to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
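The reported statistic F(1,4)=2.565, p=0.185 can be sanity-checked without any statistics library: with one numerator degree of freedom, F = t², so the F-test p-value equals the two-tailed p-value of a t(4) statistic, and the t CDF with 4 degrees of freedom has a closed form. The sketch below is illustrative only, not from the paper:

```python
# Sanity check of a reported F(1,4) p-value using the closed-form t(4) CDF.
# For nu = 4:  P(T <= t) = 1/2 + (3/4)s - (1/4)s^3,  with s = t / sqrt(4 + t^2),
# so the two-tailed p-value is  p = 1 - (3/2)s + (1/2)s^3  with t = sqrt(F).
import math

def f_test_pvalue_1_4(F):
    """p-value of an F statistic with (1, 4) degrees of freedom."""
    t = math.sqrt(F)                  # F(1, nu) = t(nu)^2
    s = t / math.sqrt(4.0 + t * t)
    return 1.0 - 1.5 * s + 0.5 * s ** 3

p = f_test_pvalue_1_4(2.565)          # ~0.1845, matching the reported p = 0.185
```

The closed form exists for any even number of denominator degrees of freedom; here it confirms the abstract's value to three decimal places.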
The role of HG in the analysis of temporal iteration and interaural correlation
Detecting Semantic Parts on Partially Occluded Objects
In this paper, we address the task of detecting semantic parts on partially occluded objects. We consider a scenario where the model is trained using non-occluded images but tested on occluded images. The motivation is that there is an infinite number of occlusion patterns in the real world, which cannot be fully covered in the training data. The models should therefore be inherently robust and adaptive to occlusions instead of fitting or learning the occlusion patterns in the training data. Our approach detects semantic parts by accumulating the confidence of local visual cues. Specifically, the method uses a simple voting scheme, based on log-likelihood ratio tests and spatial constraints, to combine the evidence of local cues. These cues are called visual concepts, which are derived by clustering the internal states of deep networks. We evaluate our voting scheme on the VehicleSemanticPart dataset with dense part annotations. We randomly place two, three or four irrelevant objects onto the target object to generate testing images with various occlusions. Experiments show that our algorithm outperforms several competitors in semantic part detection when occlusions are present.
Comment: Accepted to BMVC 2017 (13 pages, 3 figures)
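The core idea of the abstract, accumulating log-likelihood-ratio evidence from local cues under spatial constraints, can be illustrated with a toy sketch. This is not the paper's implementation; the `Cue` fields, the probabilities, and the grid are all hypothetical stand-ins for the learned visual concepts:

```python
# Toy sketch of log-likelihood-ratio voting for part detection (illustrative;
# not the paper's actual method or data structures).
import math
from dataclasses import dataclass

@dataclass
class Cue:
    position: tuple   # (x, y) cell where the visual concept fired
    p_fg: float       # P(cue fires | part present), estimated from training data
    p_bg: float       # P(cue fires | background)
    offset: tuple     # expected displacement from the cue to the part centre

def vote_for_part(cues, grid=(8, 8)):
    """Accumulate each cue's log-likelihood ratio at the location it predicts."""
    scores = [[0.0] * grid[1] for _ in range(grid[0])]
    for c in cues:
        llr = math.log(c.p_fg / c.p_bg)      # evidence carried by this cue
        x = c.position[0] + c.offset[0]      # spatial constraint: vote at the
        y = c.position[1] + c.offset[1]      # cell the cue predicts
        if 0 <= x < grid[0] and 0 <= y < grid[1]:
            scores[x][y] += llr
    return scores

cues = [
    Cue((2, 3), p_fg=0.6, p_bg=0.1, offset=(1, 0)),   # part-like cue
    Cue((3, 2), p_fg=0.5, p_bg=0.2, offset=(0, 1)),   # part-like cue
    Cue((7, 7), p_fg=0.1, p_bg=0.4, offset=(0, 0)),   # background-like cue
]
scores = vote_for_part(cues)
best = max((scores[i][j], (i, j)) for i in range(8) for j in range(8))
```

Because cues vote at displaced locations rather than where they fire, an occluder that suppresses some cues only removes part of the evidence; the surviving cues still concentrate their votes on the same cell, which is what makes this style of accumulation robust to partial occlusion.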
Associating object names with descriptions of shape that distinguish possible from impossible objects.
Five experiments examine the proposal that object names are closely linked to representations of global, 3D shape by comparing memory for simple line drawings of structurally possible and impossible novel objects. Objects were rendered impossible through local edge violations to global coherence (cf. Schacter, Cooper, & Delaney, 1990), and supplementary observations confirmed that the sets of possible and impossible objects were matched for their distinctiveness. Employing a test of explicit recognition memory, Experiment 1 confirmed that the possible and impossible objects were equally memorable. Experiments 2–4 demonstrated that adults learn names (single-syllable non-words presented as count nouns, e.g., "This is a dax") for possible objects more easily than for impossible objects, and an item-based analysis showed that this effect was unrelated to either the memorability or the distinctiveness of the individual objects. Experiment 3 indicated that the effects of object possibility on name learning were long term (spanning at least 2 months), implying that the cognitive processes being revealed can support the learning of object names in everyday life. Experiment 5 demonstrated that hearing someone else name an object at presentation improves recognition memory for possible objects, but not for impossible objects. Taken together, the results indicate that object names are closely linked to the descriptions of global, 3D shape that can be derived for structurally possible objects but not for structurally impossible objects. In addition, the results challenge the view that object decision and explicit recognition necessarily draw on separate memory systems, with only the former being supported by these descriptions of global object shape. It seems that recognition can also be supported by these descriptions, provided the original encoding conditions encourage their derivation. Hearing an object named at encoding appears to be just such a condition.
These observations are discussed in relation to the effects of naming in other visual tasks, and to the role of visual attention in object identification.