Optimum Selection of DNN Model and Framework for Edge Inference
This paper describes a methodology to select the optimum combination of deep neural network and software framework for visual inference on embedded systems. As a first step, benchmarking is required. In particular, we have benchmarked six popular network models running on four deep learning frameworks implemented on a low-cost embedded platform. Three key performance metrics have been measured and compared with the resulting 24 combinations: accuracy, throughput, and power consumption. Then, application-level specifications come into play. We propose a figure of merit enabling the evaluation of each network/framework pair in terms of relative importance of the aforementioned metrics for a targeted application. We prove through numerical analysis and meaningful graphical representations that only a reduced subset of the combinations must actually be considered for real deployment. Our approach can be extended to other networks, frameworks, and performance parameters, thus supporting system-level design decisions in the ever-changing ecosystem of embedded deep learning technology. Ministerio de Economía y Competitividad (TEC2015-66878-C3-1-R) Junta de Andalucía (TIC 2338-2013) European Union Horizon 2020 (Grant 765866
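The abstract above does not state the figure of merit itself, so the following is only a minimal sketch of one plausible form: a weighted sum of min-max normalized accuracy, throughput, and power, with power inverted because lower is better. The function name `rank_pairs`, the network and framework labels, and all numbers are hypothetical placeholders, not measurements from the paper.

```python
def rank_pairs(candidates, weights):
    """Rank (network, framework) pairs by a weighted, normalized score.

    Assumed form (not taken from the paper): each metric is min-max
    normalized across all candidates, power is inverted so that lower
    consumption scores higher, and the result is a weighted average.
    """
    keys = ("accuracy", "throughput", "power")
    lo = {k: min(m[k] for m in candidates.values()) for k in keys}
    hi = {k: max(m[k] for m in candidates.values()) for k in keys}

    def norm(k, v):
        if hi[k] == lo[k]:
            return 1.0  # metric does not discriminate between candidates
        x = (v - lo[k]) / (hi[k] - lo[k])
        return 1.0 - x if k == "power" else x

    total = sum(weights[k] for k in keys)
    scores = {
        pair: sum(weights[k] * norm(k, m[k]) for k in keys) / total
        for pair, m in candidates.items()
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder values for illustration only.
candidates = {
    ("net_a", "fw_x"): {"accuracy": 0.70, "throughput": 10.0, "power": 4.0},
    ("net_b", "fw_x"): {"accuracy": 0.80, "throughput": 4.0, "power": 5.0},
    ("net_a", "fw_y"): {"accuracy": 0.70, "throughput": 6.0, "power": 3.0},
}

# An application that cares mostly about power consumption.
power_first = {"accuracy": 0.2, "throughput": 0.2, "power": 0.6}
for pair, score in rank_pairs(candidates, power_first):
    print(pair, round(score, 3))
```

Changing the weights changes the ranking, which is the sense in which application-level specifications decide which of the benchmarked combinations are worth considering.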
Knowledge-rich Image Gist Understanding Beyond Literal Meaning
We investigate the problem of understanding the message (gist) conveyed by
images and their captions as found, for instance, on websites or news articles.
To this end, we propose a methodology to capture the meaning of image-caption
pairs on the basis of large amounts of machine-readable knowledge that has
previously been shown to be highly effective for text understanding. Our method
identifies the connotation of objects beyond their denotation: where most
approaches to image understanding focus on the denotation of objects, i.e.,
their literal meaning, our work addresses the identification of connotations,
i.e., iconic meanings of objects, to understand the message of images. We view
image understanding as the task of representing an image-caption pair on the
basis of a wide-coverage vocabulary of concepts such as the one provided by
Wikipedia, and cast gist detection as a concept-ranking problem with
image-caption pairs as queries. To enable a thorough investigation of the
problem of gist understanding, we produce a gold standard of over 300
image-caption pairs and over 8,000 gist annotations covering a wide variety of
topics at different levels of abstraction. We use this dataset to
experimentally benchmark the contribution of signals from heterogeneous
sources, namely image and text. The best result with a Mean Average Precision
(MAP) of 0.69 indicates that by combining both dimensions we are able to better
understand the meaning of our image-caption pairs than when using language or
vision information alone. We test the robustness of our gist detection approach
when receiving automatically generated input, i.e., using automatically
generated image tags or generated captions, and prove the feasibility of an
end-to-end automated process
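Since the abstract above casts gist detection as concept ranking and reports Mean Average Precision, a minimal sketch of how MAP is commonly computed may help. It assumes binary relevance (a concept is either a gold gist or not), which may be simpler than the paper's annotation scheme; the function names and toy queries are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two toy queries; each query stands in for one image-caption pair.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
print(mean_average_precision(runs))
```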
Are v1 simple cells optimized for visual occlusions? : A comparative study
Abstract: Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of "globular" receptive fields. This relatively new center-surround type of simple cell response is observed since reverse correlation is used in experimental studies. While high percentages of "globular" fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the here investigated linear model and optimal sparsity, only low proportions of "globular" fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of "globular" fields well. Our computational study, therefore, suggests that "globular" fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex.
Author Summary: The statistics of our visual world is dominated by occlusions. Almost every image processed by our brain consists of mutually occluding objects, animals and plants. Our visual cortex is optimized through evolution and throughout our lifespan for such stimuli. Yet, the standard computational models of primary visual processing do not consider occlusions. In this study, we ask what effects visual occlusions may have on predicted response properties of simple cells which are the first cortical processing units for images. Our results suggest that recently observed differences between experiments and predictions of the standard simple cell models can be attributed to occlusions. The most significant consequence of occlusions is the prediction of many cells sensitive to center-surround stimuli. Experimentally, large quantities of such cells are observed since new techniques (reverse correlation) are used. Without occlusions, they are only obtained for specific settings and none of the seminal studies (sparse coding, ICA) predicted such fields. In contrast, the new type of response naturally emerges as soon as occlusions are considered. In comparison with recent in vivo experiments we find that occlusive models are consistent with the high percentages of center-surround simple cells observed in macaque monkeys, ferrets and mice
On the 'Reality' of Observable Properties
This note contains some initial work on attempting to bring recent
developments in the foundations of quantum mechanics concerning the nature of
the wavefunction within the scope of more logical and structural methods. A
first step involves generalising and reformulating a criterion for the reality
of the wavefunction proposed by Harrigan & Spekkens, which was central to the
PBR theorem. The resulting criterion has several advantages, including the
avoidance of certain technical difficulties relating to sets of measure zero.
By considering the 'reality' not of the wavefunction but of the observable
properties of any ontological physical theory a novel characterisation of
non-locality and contextuality is found.
Secondly, a careful analysis of preparation independence, one of the key
assumptions of the PBR theorem, leads to an analogy with Bell locality, and
thence to a proposal to weaken it to an assumption of
`no-preparation-signalling' in analogy with no-signalling. This amounts to
introducing non-local correlations in the joint ontic state, which is, at
least, consistent with the Bell and Kochen-Specker theorems. The question of
whether the PBR result can be strengthened to hold under this relaxed
assumption is therefore posed.
Comment: 8 pages, re-written with new section
Gradings, Braidings, Representations, Paraparticles: some open problems
A long-term research proposal on the algebraic structure, the representations
and the possible applications of paraparticle algebras is structured in three
modules: The first part stems from an attempt to classify the inequivalent
gradings and braided group structures present in the various parastatistical
algebraic models. The second part of the proposal aims at refining and
utilizing a previously published methodology for the study of the Fock-like
representations of the parabosonic algebra, in such a way that it can also be
directly applied to the other parastatistics algebras. Finally, in the third
part, a couple of Hamiltonians are proposed, and their suitability for modeling
the radiation-matter interaction via a parastatistical algebraic model is
discussed.
Comment: 25 pages, some typos correcte
The prevalence of AGN feedback in massive galaxies at z~1
We use the optical--infrared imaging in the UKIDSS Ultra Deep Survey field,
in combination with the new deep radio map of Arumugam et al., to calculate the
distribution of radio luminosities among galaxies as a function of stellar mass
in two redshift bins across the interval 0.4<z<1.2. This is done with the use
of a new Bayesian method to classify stars and galaxies in surveys with
multi-band photometry, and to derive photometric redshifts and stellar masses
for those galaxies. We compare the distribution to that observed locally and
find agreement if we consider only objects believed to be weak-lined radio-loud
galaxies. Since the local distribution is believed to be the result of an
energy balance between radiative cooling of the gaseous halo and mechanical AGN
heating, we infer that this balance was also present as long ago as z~1. This
supports the existence of a direct link between the presence of a
low-luminosity ('hot-mode') radio-loud active galactic nucleus and the absence
of ongoing star formation.
Comment: 10 pages, MNRAS, in pres
Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks
In this paper we propose and investigate a novel nonlinear unit, called
L_p unit, for deep neural networks. The proposed L_p unit receives signals from
several projections of a subset of units in the layer below and computes a
normalized L_p norm. We notice two interesting interpretations of the L_p
unit. First, the proposed unit can be understood as a generalization of a
number of conventional pooling operators such as average, root-mean-square and
max pooling widely used in, for instance, convolutional neural networks (CNN),
HMAX models and neocognitrons. Furthermore, the L_p unit is, to a certain
degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013)
which achieved the state-of-the-art object recognition results on a number of
benchmark datasets. Secondly, we provide a geometrical interpretation of the
activation function based on which we argue that the L_p unit is more
efficient at representing complex, nonlinear separating boundaries. Each
L_p unit defines a superelliptic boundary, with its exact shape defined by the
order p. We claim that this makes it possible to model arbitrarily shaped,
curved boundaries more efficiently by combining a few L_p units of different
orders. This insight justifies the need for learning different orders for each
unit in the model. We empirically evaluate the proposed L_p units on a number
of datasets and show that multilayer perceptrons (MLP) consisting of the L_p
units achieve the state-of-the-art results on a number of benchmark datasets.
Furthermore, we evaluate the proposed L_p unit on the recently proposed deep
recurrent neural networks (RNN).
Comment: ECML/PKDD 201
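To illustrate the pooling interpretation in the abstract above, here is a minimal sketch of a normalized norm of order p over a fixed set of inputs. It assumes the simplest form, (mean of |x_i|^p)^(1/p), with no learned projections and a fixed rather than learned order; the function name `lp_unit` is hypothetical and the paper's actual unit may differ in detail.

```python
def lp_unit(inputs, p):
    """Normalized norm of order p: (mean(|x_i| ** p)) ** (1 / p).

    Assumption: inputs are the already-projected signals; the learned
    projections and the learned order described in the abstract are omitted.
    """
    n = len(inputs)
    return (sum(abs(v) ** p for v in inputs) / n) ** (1.0 / p)


signals = [1.0, -2.0, 3.0]
print(lp_unit(signals, 1))   # average of absolute values
print(lp_unit(signals, 2))   # root-mean-square
print(lp_unit(signals, 50))  # approaches the maximum absolute value as p grows
```

With p = 1 the unit reduces to average pooling of magnitudes, with p = 2 to root-mean-square pooling, and for large p it tends toward max pooling, which is the sense in which a single learned order interpolates between the conventional operators.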
Analysis of Vocal Disorders in a Feature Space
This paper provides a way to classify vocal disorders for clinical
applications. This goal is achieved by means of geometric signal separation in
a feature space. Typical quantities from chaos theory (like entropy,
correlation dimension and first Lyapunov exponent) and some conventional ones
(like autocorrelation and spectral factor) are analysed and evaluated, in order
to provide entries for the feature vectors. A way of quantifying the amount of
disorder is proposed by means of a healthy index that measures the distance of
a voice sample from the centre of mass of both healthy and sick clusters in the
feature space. A successful application of the geometrical signal separation is
reported, concerning distinction between normal and disordered phonation.
Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering
& Physic
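The abstract above says only that the healthy index is based on the distance of a voice sample from the centres of mass of the healthy and sick clusters; it does not give the formula. The sketch below is therefore an assumption: it uses Euclidean distance and the ratio d_sick / (d_healthy + d_sick), so that 1.0 means the sample sits on the healthy centroid and 0.0 on the sick one. All names and feature values are hypothetical.

```python
import math


def centroid(points):
    """Centre of mass of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def healthy_index(sample, healthy, sick):
    """Assumed index in [0, 1]: relative closeness to the healthy centroid."""
    d_healthy = math.dist(sample, centroid(healthy))
    d_sick = math.dist(sample, centroid(sick))
    if d_healthy + d_sick == 0.0:
        return 0.5  # both centroids coincide with the sample
    return d_sick / (d_healthy + d_sick)


# Placeholder two-dimensional feature vectors, for illustration only.
healthy_cluster = [[0.0, 0.0], [2.0, 0.0]]
sick_cluster = [[9.0, 0.0], [11.0, 0.0]]
print(healthy_index([3.0, 0.0], healthy_cluster, sick_cluster))
```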