
1,633 research outputs found

Low-Power CMOS Vision Sensor for Gaussian Pyramid Extraction

Author: Brea Sánchez Víctor Manuel
Cabello D.
Carmona Galán Ricardo
Fernández Berni Jorge
Rodríguez Vázquez Ángel Benito
Suárez Cambre Manuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

This paper introduces a CMOS vision sensor chip in a standard 0.18 μm CMOS technology for Gaussian pyramid extraction. The Gaussian pyramid provides computer vision algorithms with scale invariance, which permits having the same response regardless of the distance of the scene to the camera. The chip comprises 176×120 photosensors arranged into 88×60 processing elements (PEs). The Gaussian pyramid is generated with a double-Euler switched capacitor (SC) network. Every PE comprises four photodiodes, one 8 b single-slope analog-to-digital converter, one correlated double sampling circuit, and four state capacitors with their corresponding switches to implement the double-Euler SC network. Every PE occupies 44×44 μm². Measurements from the chip are presented to assess the accuracy of the generated Gaussian pyramid for visual tracking applications. Error levels are below 2% full-scale output, thus making the chip feasible for these applications. Also, energy cost is 26.5 nJ/px at 2.64 Mpx/s, thus outperforming conventional solutions of imager plus microprocessor unit.

Funding: Office of Naval Research, USA, N00014-14-1-0355; Ministerio de Economía y Competitividad, TEC2015-66878-C3-1-R, TEC2015-66878-C3-3-R; Junta de Andalucía, TIC 2338, EM2013/038, EM2014/01
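The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the stated numbers, not a result from the paper: it assumes each PE serves exactly four photodiodes and that average power is simply energy per pixel times pixel rate. The function and constant names are illustrative.

```python
# Figures quoted in the abstract above.
PHOTOSENSORS = (176, 120)      # photodiodes, width x height
PHOTODIODES_PER_PE = 4         # each PE groups four photodiodes
ENERGY_PER_PIXEL_J = 26.5e-9   # 26.5 nJ/px
THROUGHPUT_PX_PER_S = 2.64e6   # 2.64 Mpx/s


def processing_elements(sensors, per_pe):
    """Number of PEs if every PE serves `per_pe` photodiodes."""
    width, height = sensors
    return (width * height) // per_pe


def implied_power_w(energy_per_px_j, px_per_s):
    """Average power implied by an energy-per-pixel figure at a given pixel rate."""
    return energy_per_px_j * px_per_s


pe_count = processing_elements(PHOTOSENSORS, PHOTODIODES_PER_PE)
power_w = implied_power_w(ENERGY_PER_PIXEL_J, THROUGHPUT_PX_PER_S)
print(pe_count, "PEs")                    # matches the 88 x 60 array
print(round(power_w * 1e3, 2), "mW")      # implied average power
```

Under these assumptions the PE count agrees with the stated 88×60 array, and the implied average power is roughly 70 mW.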

Digital.CSIC

idUS. Depósito de Investigación Universidad de Sevilla

Gaussian Pyramid Extraction with a CMOS Vision Sensor

Author: Brea Sánchez Víctor Manuel
Cabello D.
Carmona Galán Ricardo
Fernández Berni Jorge
Rodríguez Vázquez Ángel Benito
Suárez Cambre Manuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Paper presented at the 2014 14th International Workshop on Cellular Nanoscale Networks and Their Applications (CNNA 2014), University of Notre Dame, United States, 29-31 July 2014.

This paper addresses a CMOS vision sensor with 176 × 120 pixels in standard 0.18 μm CMOS technology that computes the Gaussian pyramid. The Gaussian pyramid is extracted with a double-Euler switched-capacitor network, giving RMS errors below 1.2% of full-scale value. The chip provides a Gaussian pyramid of 3 octaves with 6 scales each with an energy cost of 26.5 nJ at 2.64 Mpx/s.

Funding: Gobierno de España ONR N000141410355 TEC2009-12686 MICINN; MINECO TEC2012-38921-C02 (FEDER); MINECO IPT-2011-1625-430000 IPC-20111009; Junta de Andalucía TIC 2338-2013; Xunta de Galicia EM2013/038 (FEDER); FEDER CN2012/151 GPC2013/04
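To make the "3 octaves with 6 scales each" structure concrete, here is a minimal software sketch of a Gaussian pyramid built by repeated explicit-Euler diffusion steps. It is an analogy only and does not model the chip's double-Euler switched-capacitor circuit. The diffusion rate, the number of steps per scale, the border handling and the take-every-second-pixel downsampling are all assumptions made for illustration.

```python
def diffuse(img, steps, rate=0.2):
    """Smooth `img` (a list of rows) with `steps` explicit-Euler diffusion steps.

    Each step moves every pixel toward its four neighbours; borders are
    replicated. Repeating the step approximates Gaussian filtering whose
    width grows with the number of steps (rate must stay <= 0.25).
    """
    h, w = len(img), len(img[0])
    for _ in range(steps):
        nxt = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                c = img[y][x]
                up = img[max(y - 1, 0)][x]
                down = img[min(y + 1, h - 1)][x]
                left = img[y][max(x - 1, 0)]
                right = img[y][min(x + 1, w - 1)]
                nxt[y][x] = c + rate * (up + down + left + right - 4 * c)
        img = nxt
    return img


def downsample(img):
    """Keep every second pixel in both directions to start the next octave."""
    return [row[::2] for row in img[::2]]


def gaussian_pyramid(img, octaves=3, scales=6, steps_per_scale=2):
    """Return `octaves` lists, each holding `scales` progressively smoother images."""
    pyramid = []
    current = [list(map(float, row)) for row in img]
    for _ in range(octaves):
        levels = []
        for _ in range(scales):
            current = diffuse(current, steps_per_scale)
            levels.append(current)
        pyramid.append(levels)
        current = downsample(current)
    return pyramid


# Toy 16x16 input with a single bright pixel.
image = [[0.0] * 16 for _ in range(16)]
image[8][8] = 255.0
pyr = gaussian_pyramid(image)
print([(len(o[0]), len(o[0][0])) for o in pyr])  # image size per octave
```

Within an octave the bright pixel spreads out while the total intensity is preserved. Each new octave starts from a half-resolution copy of the last scale of the previous one.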

idUS. Depósito de Investigación Universidad de Sevilla

Beyond Gaussian Pyramid: Multi-skip Feature Stacking for Action Recognition

Author: Hauptmann Alexander G.
Lan Zhenzhong
Li Xuanchong
Lin Ming
Raj Bhiksha
Publication date: 01/01/2015

Most state-of-the-art action feature extractors involve differential operators, which act as highpass filters and tend to attenuate low-frequency action information. This attenuation introduces bias to the resulting features and generates ill-conditioned feature matrices. The Gaussian Pyramid has been used as a feature enhancing technique that encodes scale-invariant characteristics into the feature space in an attempt to deal with this attenuation. However, at the core of the Gaussian Pyramid is a convolutional smoothing operation, which makes it incapable of generating new features at coarse scales. In order to address this problem, we propose a novel feature enhancing technique called Multi-skIp Feature Stacking (MIFS), which stacks features extracted using a family of differential filters parameterized with multiple time skips and encodes shift-invariance into the frequency space. MIFS compensates for information lost from using differential operators by recapturing information at coarse scales. This recaptured information allows us to match actions at different speeds and ranges of motion. We prove that MIFS enhances the learnability of differential-based features exponentially. The resulting feature matrices from MIFS have much smaller condition numbers and variances than those from conventional methods. Experimental results show significantly improved performance on challenging action recognition and event detection tasks. Specifically, our method exceeds the state of the art on the Hollywood2, UCF101 and UCF50 datasets and is comparable to the state of the art on the HMDB51 and Olympics Sports datasets. MIFS can also be used as a speedup strategy for feature extraction with minimal or no accuracy cost.
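The idea of a family of differential filters parameterized by time skips can be illustrated on a toy 1-D signal. This is a simplified sketch, not the authors' video pipeline: the helper names, the skip values (1, 2, 4) and the use of a plain difference `x[t + skip] - x[t]` as the "differential feature" are assumptions chosen for illustration.

```python
def skip_difference(signal, skip):
    """Differential feature at time skip `skip`: x[t + skip] - x[t]."""
    return [signal[t + skip] - signal[t] for t in range(len(signal) - skip)]


def multi_skip_stack(signal, skips=(1, 2, 4)):
    """Collect differential features computed at several time skips."""
    return {skip: skip_difference(signal, skip) for skip in skips}


# A slow ramp: consecutive samples differ very little.
slow_motion = [0.1 * t for t in range(9)]
stacked = multi_skip_stack(slow_motion)
for skip, feats in stacked.items():
    print(skip, [round(f, 2) for f in feats])
```

For the slow ramp, the skip-1 differences are small, while the larger skips give proportionally larger responses. This is the intuition behind recapturing slow, low-frequency motion at coarse time scales.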

arXiv.org e-Print Archive

CiteSeerX

Crossref

Multi-scale digital soil mapping with deep learning

Author: A Biswas
A. Biswas
A.-Xing Zhu
Bruno Lashermes
C Grinand
C-z Qin
CZ Qin
D Silver
DE Rumelhart
DG Lowe
F Rosenblatt
Hongfen Teng
I Rey-Otero
Jürgen Schmidhuber
K Schmidt
L Breiman
L Drăguţ
LW Zevenbergen
M Aitkenhead
MB Bodaghabadi
MP Smith
NE Huang
P Burt
R Grimm
R. M. Lark
R.A MacMillan
R.A. Viscarra Rossel
Raphael A. Viscarra Rossel
RHR Hahnloser
Ruth Kerry
T Behrens
T Lindeberg
Tony Lindeberg
TR Green
W Liu
WS McCulloch
XL Sun
Y Guo
Y Lecun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

We compared different methods of multi-scale terrain feature construction and their relative effectiveness for digital soil mapping (DSM) with a Deep Learning algorithm. The most common approach for multi-scale feature construction in DSM is to filter terrain attributes based on different neighborhood sizes; however, results can be difficult to interpret because the approach is affected by outliers. Alternatively, one can derive the terrain attributes on decomposed elevation data, but the resulting maps can have artefacts, rendering the approach undesirable. Here, we introduce ‘mixed scaling’, a new method that overcomes these issues and preserves the landscape features that are identifiable at different scales. The new method also extends the Gaussian pyramid by introducing additional intermediate scales. This minimizes the risk that the scales that are important for soil formation are not available in the model. In our extended implementation of the Gaussian pyramid, we tested four intermediate scales between any two consecutive octaves of the Gaussian pyramid and modelled the data with Deep Learning and Random Forests. We performed the experiments using three different datasets and show that mixed scaling with the extended Gaussian pyramid produced the best performing set of covariates and that modelling with Deep Learning produced the most accurate predictions, which on average were 4–7% more accurate compared to modelling with Random Forests.

Crossref

Publikationsserver der Universität Tübingen

espace@Curtin

CMOS-3D smart imager architectures for feature detection

Author: D. Cabello
G. Linan
J. Fernandez-Berni
Manuel Suarez
R. Carmona-Galan
Víctor M. Brea
Ángel Rodriguez-Vazquez
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

This paper reports a multi-layered smart image sensor architecture for feature extraction based on detection of interest points. The architecture is conceived for 3-D integrated circuit technologies consisting of two layers (tiers) plus memory. The top tier includes sensing and processing circuitry aimed at performing Gaussian filtering and generating Gaussian pyramids in a fully concurrent way. The circuitry in this tier operates in the mixed-signal domain. It embeds in-pixel correlated double sampling, a switched-capacitor network for Gaussian pyramid generation, analog memories and a comparator for in-pixel analog-to-digital conversion. This tier can be further split into two for improved resolution: one containing the sensors and another containing a capacitor per sensor plus the mixed-signal processing circuitry. The bottom tier embeds digital circuitry for the calculation of Harris, Hessian, and difference-of-Gaussian detectors. The overall system can hence be configured by the user to detect interest points with whichever of these three algorithms is best suited to the application. The paper describes the different kinds of algorithms featured and the circuitry employed at the top and bottom tiers. The Gaussian pyramid is implemented with a switched-capacitor network in less than 50 μs, outperforming more conventional solutions.

Funding: Xunta de Galicia 10PXIB206037PR; Ministerio de Ciencia e Innovación TEC2009-12686, IPT-2011-1625-430000; Office of Naval Research N00014111031

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

Multiscale approaches to music audio feature learning

Author: Dieleman Sander
Schrauwen Benjamin
Publication venue: 'Pontificia Universidade Catolica do Parana - PUCPR'
Publication date: 01/01/2013

Content-based music information retrieval (MIR) tasks are typically solved with a two-stage approach: features are extracted from music audio signals, and are then used as input to a regressor or classifier. These features can be engineered or learned from data. Although the former approach was dominant in the past, feature learning has started to receive more attention from the MIR community in recent years. Recent results in feature learning indicate that simple algorithms such as K-means can be very effective, sometimes surpassing more complicated approaches based on restricted Boltzmann machines, autoencoders or sparse coding. Furthermore, there has recently been increased interest in multiscale representations of music audio. Such representations are more versatile because music audio exhibits structure on multiple timescales, which are relevant for different MIR tasks to varying degrees. We develop and compare three approaches to multiscale audio feature learning using the spherical K-means algorithm. We evaluate them in an automatic tagging task and a similarity metric learning task on the Magnatagatune dataset.
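For readers unfamiliar with spherical K-means, the following is a minimal sketch of the basic clustering step on toy 2-D data. It does not reproduce the paper's multiscale audio pipeline. The function names, the fixed initial centroids and the sample points are assumptions chosen to keep the example deterministic.

```python
import math


def normalize(v):
    """Scale `v` to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(points, init_centroids, iterations=10):
    """Cluster unit-normalised points by cosine similarity.

    Assignment picks the centroid with the largest dot product; the update
    sums each cluster's members and projects the sum back onto the unit sphere.
    """
    data = [normalize(p) for p in points]
    centroids = [normalize(c) for c in init_centroids]
    labels = [0] * len(data)
    for _ in range(iterations):
        labels = [max(range(len(centroids)), key=lambda k: dot(p, centroids[k]))
                  for p in data]
        for k in range(len(centroids)):
            members = [p for p, lab in zip(data, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                summed = [sum(col) for col in zip(*members)]
                centroids[k] = normalize(summed)
    return centroids, labels


points = [[1.0, 0.1], [2.0, 0.0], [0.9, -0.1], [0.1, 1.0], [0.0, 3.0], [-0.1, 1.1]]
centroids, labels = spherical_kmeans(points, init_centroids=[[1.0, 0.0], [0.0, 1.0]])
print(labels)
```

Because points and centroids are constrained to unit length, only direction matters: `[2.0, 0.0]` and `[1.0, 0.1]` land in the same cluster despite their different magnitudes.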

Ghent University Academic Bibliography

Single image example-based super-resolution using cross-scale patch matching and Markov random field modelling

Author: H. Takeda
J.S. Yedidia
K. Kim
M. Ebrahimi
M. Irani
S. Baker
S. Farsiu
S.Z. Li
W.T. Freeman
Z. Wang
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2011

Example-based super-resolution has become increasingly popular over the last few years for its ability to overcome the limitations of the classical multi-frame approach. In this paper we present a new example-based method that uses the input low-resolution image itself as a search space for high-resolution patches by exploiting self-similarity across different resolution scales. Found examples are combined in a high-resolution image by means of Markov Random Field modelling that forces their global agreement. Additionally, we apply back-projection and steering kernel regression as post-processing techniques. In this way, we are able to produce sharp and artefact-free results that are comparable to or better than standard interpolation and state-of-the-art super-resolution techniques.

Crossref

Ghent University Academic Bibliography

Salient Regions for Query by Image Content

Author: Hare Jonathon S.
Lewis Paul H.
Publication date: 01/01/2004

Much previous work on image retrieval has used global features such as colour and texture to describe the content of the image. However, these global features are insufficient to accurately describe the image content when different parts of the image have different characteristics. This paper discusses how this problem can be circumvented by using salient interest points. It compares and contrasts an extension to previous work in which the concept of scale is incorporated into the selection of salient regions, so as to select the areas of the image that are most interesting and generate local descriptors that describe the image characteristics in each region. The paper describes and contrasts two such salient region descriptors and compares them through their repeatability rate under a range of common image transforms. Finally, the paper investigates the performance of one of the salient region detectors in an image retrieval situation.

Southampton (e-Prints Soton)