8,839 research outputs found

    Optimum Selection of DNN Model and Framework for Edge Inference

    Get PDF
    This paper describes a methodology to select the optimum combination of deep neural network and software framework for visual inference on embedded systems. As a first step, benchmarking is required. In particular, we have benchmarked six popular network models running on four deep learning frameworks implemented on a low-cost embedded platform. Three key performance metrics have been measured and compared across the resulting 24 combinations: accuracy, throughput, and power consumption. Then, application-level specifications come into play. We propose a figure of merit enabling the evaluation of each network/framework pair in terms of the relative importance of the aforementioned metrics for a targeted application. We prove through numerical analysis and meaningful graphical representations that only a reduced subset of the combinations must actually be considered for real deployment. Our approach can be extended to other networks, frameworks, and performance parameters, thus supporting system-level design decisions in the ever-changing ecosystem of embedded deep learning technology. Funding: Ministerio de Economía y Competitividad (TEC2015-66878-C3-1-R); Junta de Andalucía (TIC 2338-2013); European Union Horizon 2020 (Grant 765866).
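
    A minimal sketch of how such a figure of merit could look, assuming a simple weighted sum of normalized metrics; the weights, the figure_of_merit helper and the benchmark numbers below are illustrative placeholders rather than the paper's actual formulation or measurements:

    # Hypothetical sketch of a weighted figure of merit for ranking
    # network/framework pairs; weights and benchmark values are placeholders.

    benchmarks = {
        # (network, framework): (accuracy, throughput_fps, power_watts)
        ("MobileNet", "FrameworkA"): (0.70, 12.0, 4.5),
        ("ResNet-50", "FrameworkA"): (0.76, 3.0, 6.0),
        ("MobileNet", "FrameworkB"): (0.70, 18.0, 5.0),
    }

    max_acc = max(m[0] for m in benchmarks.values())
    max_fps = max(m[1] for m in benchmarks.values())
    min_pwr = min(m[2] for m in benchmarks.values())

    def figure_of_merit(metrics, weights=(0.5, 0.3, 0.2)):
        """Weighted sum of normalized metrics; power is inverted so lower is better."""
        acc, fps, pwr = metrics
        normalized = (acc / max_acc, fps / max_fps, min_pwr / pwr)
        return sum(w * n for w, n in zip(weights, normalized))

    best_pair = max(benchmarks, key=lambda pair: figure_of_merit(benchmarks[pair]))
    print("Best pair for these weights:", best_pair)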

    Knowledge-rich Image Gist Understanding Beyond Literal Meaning

    Full text link
    We investigate the problem of understanding the message (gist) conveyed by images and their captions as found, for instance, on websites or news articles. To this end, we propose a methodology to capture the meaning of image-caption pairs on the basis of large amounts of machine-readable knowledge that has previously been shown to be highly effective for text understanding. Our method identifies the connotation of objects beyond their denotation: where most approaches to image understanding focus on the denotation of objects, i.e., their literal meaning, our work addresses the identification of connotations, i.e., iconic meanings of objects, to understand the message of images. We view image understanding as the task of representing an image-caption pair on the basis of a wide-coverage vocabulary of concepts such as the one provided by Wikipedia, and cast gist detection as a concept-ranking problem with image-caption pairs as queries. To enable a thorough investigation of the problem of gist understanding, we produce a gold standard of over 300 image-caption pairs and over 8,000 gist annotations covering a wide variety of topics at different levels of abstraction. We use this dataset to experimentally benchmark the contribution of signals from heterogeneous sources, namely image and text. The best result, with a Mean Average Precision (MAP) of 0.69, indicates that by combining both dimensions we are able to understand the meaning of our image-caption pairs better than when using language or vision information alone. We test the robustness of our gist detection approach when receiving automatically generated input, i.e., using automatically generated image tags or generated captions, and prove the feasibility of an end-to-end automated process.
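
    A minimal sketch of the concept-ranking evaluation described above, assuming the standard Mean Average Precision computation over ranked concept lists; the example queries, concepts and gold annotations are placeholders, not taken from the paper's dataset:

    # Sketch of scoring ranked gist concepts with Mean Average Precision (MAP);
    # the rankings and gold annotations below are illustrative placeholders.

    def average_precision(ranked_concepts, relevant):
        hits, score = 0, 0.0
        for rank, concept in enumerate(ranked_concepts, start=1):
            if concept in relevant:
                hits += 1
                score += hits / rank
        return score / len(relevant) if relevant else 0.0

    queries = [
        # (ranked concepts for one image-caption pair, gold gist annotations)
        (["peace", "dove", "bird", "war"], {"peace", "dove"}),
        (["flood", "boat", "disaster"], {"disaster"}),
    ]

    map_score = sum(average_precision(r, gold) for r, gold in queries) / len(queries)
    print(f"MAP = {map_score:.2f}")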

    Are V1 simple cells optimized for visual occlusions? A comparative study

    Get PDF
    Abstract: Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of ‘globular’ receptive fields. This relatively new center-surround type of simple cell response has been observed since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here with optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study therefore suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex.

    Author Summary: The statistics of our visual world are dominated by occlusions. Almost every image processed by our brain consists of mutually occluding objects, animals and plants. Our visual cortex is optimized through evolution and throughout our lifespan for such stimuli. Yet, the standard computational models of primary visual processing do not consider occlusions. In this study, we ask what effects visual occlusions may have on the predicted response properties of simple cells, which are the first cortical processing units for images. Our results suggest that recently observed differences between experiments and predictions of the standard simple cell models can be attributed to occlusions. The most significant consequence of occlusions is the prediction of many cells sensitive to center-surround stimuli. Experimentally, large quantities of such cells have been observed since new techniques (reverse correlation) came into use. Without occlusions, they are only obtained for specific settings, and none of the seminal studies (sparse coding, ICA) predicted such fields. In contrast, the new type of response naturally emerges as soon as occlusions are considered. In comparison with recent in vivo experiments, we find that occlusive models are consistent with the high percentages of center-surround simple cells observed in macaque monkeys, ferrets and mice.
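
    A toy contrast between the two component superposition assumptions discussed above, assuming a plain linear sum for the standard model and a pointwise maximum as a stand-in for occlusion; the actual generative models in the study also include sparsity priors and stimulus noise, which are omitted here:

    # Toy comparison of the superposition assumptions: the linear model sums
    # active components, the occlusive stand-in lets the strongest component
    # win at each pixel (pointwise max). Sparsity priors and noise are omitted.
    import numpy as np

    rng = np.random.default_rng(0)
    components = rng.random((3, 8, 8))        # three random 8x8 image components
    activations = np.array([1.0, 0.0, 1.0])   # which components are present

    linear_image = np.tensordot(activations, components, axes=1)                 # weighted sum
    occlusive_image = np.max(activations[:, None, None] * components, axis=0)    # pointwise max

    print("linear max pixel:", linear_image.max())
    print("occlusive max pixel:", occlusive_image.max())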

    On the 'Reality' of Observable Properties

    Full text link
    This note contains some initial work on attempting to bring recent developments in the foundations of quantum mechanics concerning the nature of the wavefunction within the scope of more logical and structural methods. A first step involves generalising and reformulating a criterion for the reality of the wavefunction proposed by Harrigan & Spekkens, which was central to the PBR theorem. The resulting criterion has several advantages, including the avoidance of certain technical difficulties relating to sets of measure zero. By considering the 'reality' not of the wavefunction but of the observable properties of any ontological physical theory a novel characterisation of non-locality and contextuality is found. Secondly, a careful analysis of preparation independence, one of the key assumptions of the PBR theorem, leads to an analogy with Bell locality, and thence to a proposal to weaken it to an assumption of 'no-preparation-signalling' in analogy with no-signalling. This amounts to introducing non-local correlations in the joint ontic state, which is, at least, consistent with the Bell and Kochen-Specker theorems. The question of whether the PBR result can be strengthened to hold under this relaxed assumption is therefore posed. Comment: 8 pages, re-written with new section

    Gradings, Braidings, Representations, Paraparticles: some open problems

    Full text link
    A long-term research proposal on the algebraic structure, the representations and the possible applications of paraparticle algebras is structured in three modules: The first part stems from an attempt to classify the inequivalent gradings and braided group structures present in the various parastatistical algebraic models. The second part of the proposal aims at refining and utilizing a previously published methodology for the study of the Fock-like representations of the parabosonic algebra, in such a way that it can also be directly applied to the other parastatistics algebras. Finally, in the third part, a pair of Hamiltonians is proposed, and their suitability for modeling the radiation-matter interaction via a parastatistical algebraic model is discussed. Comment: 25 pages, some typos corrected

    The prevalence of AGN feedback in massive galaxies at z~1

    Full text link
    We use the optical-infrared imaging in the UKIDSS Ultra Deep Survey field, in combination with the new deep radio map of Arumugam et al., to calculate the distribution of radio luminosities among galaxies as a function of stellar mass in two redshift bins across the interval 0.4<z<1.2. This is done with the use of a new Bayesian method to classify stars and galaxies in surveys with multi-band photometry, and to derive photometric redshifts and stellar masses for those galaxies. We compare the distribution to that observed locally and find agreement if we consider only objects believed to be weak-lined radio-loud galaxies. Since the local distribution is believed to be the result of an energy balance between radiative cooling of the gaseous halo and mechanical AGN heating, we infer that this balance was also present as long ago as z~1. This supports the existence of a direct link between the presence of a low-luminosity ('hot-mode') radio-loud active galactic nucleus and the absence of ongoing star formation. Comment: 10 pages, MNRAS, in press

    Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks

    Full text link
    In this paper we propose and investigate a novel nonlinear unit, called the L_p unit, for deep neural networks. The proposed L_p unit receives signals from several projections of a subset of units in the layer below and computes a normalized L_p norm. We notice two interesting interpretations of the L_p unit. First, the proposed unit can be understood as a generalization of a number of conventional pooling operators such as average, root-mean-square and max pooling widely used in, for instance, convolutional neural networks (CNN), HMAX models and neocognitrons. Furthermore, the L_p unit is, to a certain degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013), which achieved state-of-the-art object recognition results on a number of benchmark datasets. Second, we provide a geometrical interpretation of the activation function, based on which we argue that the L_p unit is more efficient at representing complex, nonlinear separating boundaries. Each L_p unit defines a superelliptic boundary, with its exact shape defined by the order p. We claim that this makes it possible to model arbitrarily shaped, curved boundaries more efficiently by combining a few L_p units of different orders. This insight justifies the need for learning a different order for each unit in the model. We empirically evaluate the proposed L_p units on a number of datasets and show that multilayer perceptrons (MLP) consisting of L_p units achieve state-of-the-art results on a number of benchmark datasets. Furthermore, we evaluate the proposed L_p unit on the recently proposed deep recurrent neural networks (RNN). Comment: ECML/PKDD 201
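
    A minimal sketch of the L_p unit's pooling behaviour, assuming a fixed order p and absolute-valued inputs; in the paper both the order and the projections feeding the unit are learned, so this only illustrates how the normalized L_p norm interpolates between average, root-mean-square and max pooling:

    # Minimal L_p pooling sketch: a normalized L_p norm over absolute-valued
    # inputs with a fixed order p (in the paper, p and the projections are learned).
    import numpy as np

    def lp_unit(inputs, p):
        x = np.abs(np.asarray(inputs, dtype=float))
        return np.mean(x ** p) ** (1.0 / p)

    x = [0.5, -2.0, 1.0, 3.0]
    print(lp_unit(x, 1))     # mean absolute value (average-like pooling)
    print(lp_unit(x, 2))     # root-mean-square pooling
    print(lp_unit(x, 100))   # approaches max pooling as p grows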

    Analysis of Vocal Disorders in a Feature Space

    Full text link
    This paper provides a way to classify vocal disorders for clinical applications. This goal is achieved by means of geometric signal separation in a feature space. Typical quantities from chaos theory (like entropy, correlation dimension and first Lyapunov exponent) and some conventional ones (like autocorrelation and spectral factor) are analysed and evaluated, in order to provide entries for the feature vectors. A way of quantifying the amount of disorder is proposed by means of a healthy index that measures the distance of a voice sample from the centre of mass of both healthy and sick clusters in the feature space. A successful application of the geometrical signal separation is reported, concerning the distinction between normal and disordered phonation. Comment: 12 pages, 3 figures, accepted for publication in Medical Engineering & Physics
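
    One plausible way to express such a healthy index, assuming it is a relative distance to the centres of mass of the healthy and pathological clusters; the exact definition in the paper may differ, and the feature vectors below are placeholders:

    # Hedged sketch of a healthy index based on distances to the centres of mass
    # of healthy and pathological clusters; feature vectors are placeholders
    # (e.g. two of the features mentioned above, such as entropy and spectral factor).
    import numpy as np

    healthy_cluster = np.array([[0.20, 1.10], [0.30, 0.90], [0.25, 1.00]])
    sick_cluster    = np.array([[0.80, 0.40], [0.90, 0.50], [0.85, 0.45]])

    centre_healthy = healthy_cluster.mean(axis=0)
    centre_sick = sick_cluster.mean(axis=0)

    def healthy_index(sample):
        d_healthy = np.linalg.norm(sample - centre_healthy)
        d_sick = np.linalg.norm(sample - centre_sick)
        return d_sick / (d_healthy + d_sick)  # near 1: healthy-like, near 0: disordered

    print(healthy_index(np.array([0.30, 1.00])))  # sample close to the healthy cluster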