A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e., deep networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep network widely used in computer vision:
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.
Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm)
Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
Modeling visual search not only offers an opportunity to predict the
usability of an interface before actually testing it on real users, but also
advances scientific understanding about human behavior. In this work, we first
conduct a set of analyses on a large-scale dataset of visual search tasks on
realistic webpages. We then present a deep neural network that learns to
predict the scannability of webpage content, i.e., how easy it is for a user to
find a specific target. Our model leverages both heuristic-based features such
as target size and unstructured features such as raw image pixels. This
approach allows us to model complex interactions that might be involved in a
realistic visual search task, which cannot be easily achieved by traditional
analytical models. We analyze the model behavior to offer our insights into how
the salience map learned by the model aligns with human intuition and how the
learned semantic representation of each target type relates to its visual
search performance.
Comment: The 2020 CHI Conference on Human Factors in Computing Systems
Enabling High-Level Machine Reasoning with Cognitive Neuro-Symbolic Systems
High-level reasoning can be defined as the capability to generalize over
knowledge acquired via experience, and to exhibit robust behavior in novel
situations. This form of reasoning is a basic skill in humans, who seamlessly
use it in a broad spectrum of tasks, from language communication to decision
making in complex situations. When it manifests itself in understanding and
manipulating the everyday world of objects and their interactions, we talk
about common sense or commonsense reasoning. State-of-the-art AI systems don't
possess such capability: for instance, Large Language Models have recently
become popular by demonstrating remarkable fluency in conversing with humans,
but they still make trivial mistakes when probed for commonsense competence; on
a different level, performance degradation outside training data prevents
self-driving vehicles from safely adapting to unseen scenarios, a serious and
unsolved problem that limits the adoption of such technology. In this paper, we
propose to enable high-level reasoning in AI systems by integrating cognitive
architectures with external neuro-symbolic components. We illustrate a hybrid
framework centered on ACT-R and we discuss the role of generative models in
recent and future applications.