19,545 research outputs found
Multi-Context Attention for Human Pose Estimation
In this paper, we propose to incorporate convolutional neural networks with a
multi-context attention mechanism into an end-to-end framework for human pose
estimation. We adopt stacked hourglass networks to generate attention maps from
features at multiple resolutions with various semantics. The Conditional Random
Field (CRF) is utilized to model the correlations among neighboring regions in
the attention map. We further combine the holistic attention model, which
focuses on the global consistency of the full human body, and the body part
attention model, which focuses on the detailed description for different body
parts. Hence our model has the ability to focus on different granularity from
local salient regions to global semantic-consistent spaces. Additionally, we
design novel Hourglass Residual Units (HRUs) to increase the receptive field of
the network. These units are extensions of residual units with a side branch
incorporating filters with larger receptive fields, hence features with various
scales are learned and combined within the HRUs. The effectiveness of the
proposed multi-context attention mechanism and the hourglass residual units is
evaluated on two widely used human pose estimation benchmarks. Our approach
outperforms all existing methods on both benchmarks over all the body parts.Comment: The first two authors contribute equally to this wor
Cognitive Deficit of Deep Learning in Numerosity
Subitizing, or the sense of small natural numbers, is an innate cognitive
function of humans and primates; it responds to visual stimuli prior to the
development of any symbolic skills, language or arithmetic. Given successes of
deep learning (DL) in tasks of visual intelligence and given the primitivity of
number sense, a tantalizing question is whether DL can comprehend numbers and
perform subitizing. But somewhat disappointingly, extensive experiments of the
type of cognitive psychology demonstrate that the examples-driven black box DL
cannot see through superficial variations in visual representations and distill
the abstract notion of natural number, a task that children perform with high
accuracy and confidence. The failure is apparently due to the learning method
not the CNN computational machinery itself. A recurrent neural network capable
of subitizing does exist, which we construct by encoding a mechanism of
mathematical morphology into the CNN convolutional kernels. Also, we
investigate, using subitizing as a test bed, the ways to aid the black box DL
by cognitive priors derived from human insight. Our findings are mixed and
interesting, pointing to both cognitive deficit of pure DL, and some measured
successes of boosting DL by predetermined cognitive implements. This case study
of DL in cognitive computing is meaningful for visual numerosity represents a
minimum level of human intelligence.Comment: Accepted for presentation at the AAAI-1
Assessment of algorithms for mitosis detection in breast cancer histopathology images
The proliferative activity of breast tumors, which is routinely estimated by counting of mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers. However, mitosis counting is laborious, subjective and may suffer from low inter-observer agreement. With the wider acceptance of whole slide images in pathology labs, automatic image analysis has been proposed as a potential solution for these issues.
In this paper, the results from the Assessment of Mitosis Detection Algorithms 2013 (AMIDA13) challenge are described. The challenge was based on a data set consisting of 12 training and 11 testing subjects, with more than one thousand annotated mitotic figures by multiple observers. Short descriptions and results from the evaluation of eleven methods are presented. The top performing method has an error rate that is comparable to the inter-observer agreement among pathologists
People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting
In this paper we propose a technique to adapt a convolutional neural network
(CNN) based object counter to additional visual domains and object types while
still preserving the original counting function. Domain-specific normalisation
and scaling operators are trained to allow the model to adjust to the
statistical distributions of the various visual domains. The developed
adaptation technique is used to produce a singular patch-based counting
regressor capable of counting various object types including people, vehicles,
cell nuclei and wildlife. As part of this study a challenging new cell counting
dataset in the context of tissue culture and patient diagnosis is constructed.
This new collection, referred to as the Dublin Cell Counting (DCC) dataset, is
the first of its kind to be made available to the wider computer vision
community. State-of-the-art object counting performance is achieved in both the
Shanghaitech (parts A and B) and Penguins datasets while competitive
performance is observed on the TRANCOS and Modified Bone Marrow (MBM) datasets,
all using a shared counting model.Comment: 10 page
Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
The primate visual system achieves remarkable visual object recognition
performance even in brief presentations and under changes to object exemplar,
geometric transformations, and background variation (a.k.a. core visual object
recognition). This remarkable performance is mediated by the representation
formed in inferior temporal (IT) cortex. In parallel, recent advances in
machine learning have led to ever higher performing models of object
recognition using artificial deep neural networks (DNNs). It remains unclear,
however, whether the representational performance of DNNs rivals that of the
brain. To accurately produce such a comparison, a major difficulty has been a
unifying metric that accounts for experimental limitations such as the amount
of noise, the number of neural recording sites, and the number trials, and
computational limitations such as the complexity of the decoding classifier and
the number of classifier training examples. In this work we perform a direct
comparison that corrects for these experimental limitations and computational
considerations. As part of our methodology, we propose an extension of "kernel
analysis" that measures the generalization accuracy as a function of
representational complexity. Our evaluations show that, unlike previous
bio-inspired models, the latest DNNs rival the representational performance of
IT cortex on this visual object recognition task. Furthermore, we show that
models that perform well on measures of representational performance also
perform well on measures of representational similarity to IT and on measures
of predicting individual IT multi-unit responses. Whether these DNNs rely on
computational mechanisms similar to the primate visual system is yet to be
determined, but, unlike all previous bio-inspired models, that possibility
cannot be ruled out merely on representational performance grounds.Comment: 35 pages, 12 figures, extends and expands upon arXiv:1301.353
Fractals in the Nervous System: conceptual Implications for Theoretical Neuroscience
This essay is presented with two principal objectives in mind: first, to
document the prevalence of fractals at all levels of the nervous system, giving
credence to the notion of their functional relevance; and second, to draw
attention to the as yet still unresolved issues of the detailed relationships
among power law scaling, self-similarity, and self-organized criticality. As
regards criticality, I will document that it has become a pivotal reference
point in Neurodynamics. Furthermore, I will emphasize the not yet fully
appreciated significance of allometric control processes. For dynamic fractals,
I will assemble reasons for attributing to them the capacity to adapt task
execution to contextual changes across a range of scales. The final Section
consists of general reflections on the implications of the reviewed data, and
identifies what appear to be issues of fundamental importance for future
research in the rapidly evolving topic of this review
A Binary Neural Shape Matcher using Johnson Counters and Chain Codes
In this paper, we introduce a neural network-based shape matching algorithm that uses Johnson Counter codes coupled with chain codes. Shape matching is a fundamental requirement in content-based image retrieval systems. Chain codes describe shapes using sequences of numbers. They are simple and flexible. We couple this power with the efficiency and flexibility of a binary associative-memory neural network. We focus on the implementation details of the algorithm when it is constructed using the neural network. We demonstrate how the binary associative-memory neural network can index and match chain codes where the chain code elements are represented by Johnson codes
- …