Computer vision
The field of computer vision is surveyed and assessed, key research issues are identified, and possibilities for a future vision system are discussed. The problems of describing two- and three-dimensional worlds are discussed. The representation of features such as texture, edges, curves, and corners is detailed. Recognition methods are described in which cross-correlation coefficients are maximized or numerical values for a set of features are measured. Object tracking is discussed in terms of the robust matching algorithms that must be devised. Stereo vision, camera control and calibration, and the hardware and systems architecture are also discussed.
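As a rough illustration of recognition by maximizing a cross-correlation coefficient, the sketch below slides a template over a one-dimensional signal and keeps the offset with the highest normalized coefficient. The function names `ncc` and `best_match` and the 1-D setting are assumptions made for brevity; the survey itself does not prescribe an implementation.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A constant window has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0

def best_match(signal, template):
    """Return (offset, score) of the window that maximizes the coefficient."""
    m = len(template)
    scores = [ncc(signal[i:i + m], template) for i in range(len(signal) - m + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

For images, the same idea applies with two-dimensional windows; only the indexing changes.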
Are object detection assessment criteria ready for maritime computer vision?
Maritime vessels equipped with visible and infrared cameras can complement
other conventional sensors for object detection. However, the application of
computer vision techniques in the maritime domain has received attention only
recently. The maritime environment poses its own unique requirements and
challenges. Assessing the quality of detections is a fundamental need in
computer vision. However, the conventional assessment metrics suitable for
ordinary object detection are deficient in the maritime setting. Thus, a large
body of related work in computer vision appears inapplicable to the maritime
setting at first sight. We discuss the problem of defining assessment metrics
suitable for maritime computer vision. We consider new bottom edge proximity
metrics as assessment metrics for maritime computer vision. These metrics
indicate that existing computer vision approaches are indeed promising for
maritime computer vision and can play a foundational role in this emerging
field.
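The abstract does not name the conventional metrics it has in mind. As an assumption, the sketch below shows intersection over union (IoU), a widely used overlap score for object detection, for boxes given as `(x1, y1, x2, y2)`. It is not the bottom edge proximity metric proposed in the paper, whose definition is not given here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```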
Human-Centered Computer Vision
Publisher's version (Open Access). Symposium on The Art and Science of Pattern Recognition
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
WarpedGANSpace: Finding non-linear RBF paths in GAN latent space
This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs, so as to provide an intuitive and easy way of controlling the underlying generative factors. In doing so, it addresses some of the limitations of state-of-the-art works, namely, a) that they discover directions that are independent of the latent code, i.e., paths that are linear, and b) that their evaluation relies either on visual inspection or on laborious human labeling. More specifically, we propose to learn non-linear warpings on the latent space, each one parametrized by a set of RBF-based latent space warping functions, and where each warping gives rise to a family of non-linear paths via the gradient of the function. Building on the work of Voynov and Babenko that discovers linear paths, we optimize the trainable parameters of the set of RBFs, so that images generated by codes along different paths are easily distinguishable by a discriminator network. This leads to easily distinguishable image transformations, such as pose and facial expressions in facial images. We show that linear paths can be derived as a special case of our method, and show experimentally that non-linear paths in the latent space lead to steeper, more disentangled and interpretable changes in the image space than in state-of-the-art methods, both qualitatively and quantitatively. We make the code and the pretrained models publicly available at: https://github.com/chi0tzp/WarpedGANSpace
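To make the idea of a path induced by the gradient of an RBF-based warping concrete, here is a minimal sketch. It assumes a Gaussian RBF sum f(z) = sum_i alpha_i * exp(-gamma_i * ||z - s_i||^2) and unit-normalized gradient steps of fixed size. The names `rbf_warp`, `rbf_warp_grad`, and `trace_path`, and the step rule, are illustrative assumptions and are not taken from the linked repository.

```python
import math

def rbf_warp(z, supports, alphas, gammas):
    """Evaluate f(z) = sum_i alpha_i * exp(-gamma_i * ||z - s_i||^2)."""
    total = 0.0
    for s, a, g in zip(supports, alphas, gammas):
        sq = sum((zi - si) ** 2 for zi, si in zip(z, s))
        total += a * math.exp(-g * sq)
    return total

def rbf_warp_grad(z, supports, alphas, gammas):
    """Analytic gradient of rbf_warp with respect to z."""
    grad = [0.0] * len(z)
    for s, a, g in zip(supports, alphas, gammas):
        diff = [zi - si for zi, si in zip(z, s)]
        sq = sum(d * d for d in diff)
        coeff = -2.0 * g * a * math.exp(-g * sq)
        for k, d in enumerate(diff):
            grad[k] += coeff * d
    return grad

def trace_path(z0, supports, alphas, gammas, step=0.1, n_steps=20):
    """Follow the normalized gradient from z0 (assumed step rule)."""
    z = list(z0)
    path = [list(z)]
    for _ in range(n_steps):
        g = rbf_warp_grad(z, supports, alphas, gammas)
        norm = math.sqrt(sum(x * x for x in g))
        if norm < 1e-12:  # flat region: no direction to follow
            break
        z = [zi + step * gi / norm for zi, gi in zip(z, g)]
        path.append(list(z))
    return path
```

Because the gradient depends on the current code `z`, the direction changes from step to step, which is what makes the path non-linear; a warping whose gradient is constant in `z` would reduce to a straight line.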
Investigating the Intelligibility of a Computer Vision System for Blind Users
Computer vision systems to help blind users are becoming increasingly common, yet often these systems are not intelligible. Our work investigates the intelligibility of a wearable computer vision system to help blind users locate and identify people in their vicinity. Providing a continuous stream of information, this system allows us to explore intelligibility through interaction and instructions, going beyond studies of intelligibility that focus on explaining a decision a computer vision system might make. In a study with 13 blind users, we explored whether varying instructions (either basic or enhanced) about how the system worked would change blind users’ experience of the system. We found that offering a more detailed set of instructions did not affect how successful users were in using the system or their perceived workload. We did, however, find evidence of significant differences in what they knew about the system, and they employed different, and potentially more effective, use strategies. Our findings have important implications for researchers and designers of computer vision systems for blind users, as well as more general implications for understanding what it means to make interactive computer vision systems intelligible.