Convolutional Drift Networks for Video Classification
Analyzing spatio-temporal data like video is a challenging task that requires
processing visual and temporal information effectively. Convolutional Neural
Networks have shown promise as baseline fixed feature extractors through
transfer learning, a technique that helps minimize the training cost on visual
information. Temporal information is often handled using hand-crafted features
or Recurrent Neural Networks, but this can be overly specific or prohibitively
complex. Building a fully trainable system that can efficiently analyze
spatio-temporal data without hand-crafted features or complex training is an
open challenge. We present a new neural network architecture to address this
challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines
the visual feature extraction power of deep Convolutional Neural Networks with
the intrinsically efficient temporal processing provided by Reservoir
Computing. In this introductory paper on the CDN, we provide a very simple
baseline implementation tested on two egocentric (first-person) video activity
datasets. We achieve video-level activity classification results on par with
state-of-the-art methods. Notably, performance on this complex spatio-temporal
task was produced by training only a single feed-forward layer in the CDN.
Comment: Published in IEEE Rebooting Computing
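To make the architecture concrete, here is a minimal sketch of a CDN-style pipeline in Python, assuming per-frame CNN features have already been extracted (random data stands in for them below). A fixed echo-state reservoir summarises each video's frame sequence and only a ridge-regression readout is trained, mirroring the single trained feed-forward layer described above; all sizes and hyperparameters are illustrative, not the paper's.

```python
# Minimal CDN-style sketch: fixed reservoir over pre-extracted CNN
# features, with ridge regression as the only trained component.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 videos, 30 frames each, 512-D CNN features, 4 classes.
n_videos, n_frames, feat_dim, n_classes = 100, 30, 512, 4
X = rng.standard_normal((n_videos, n_frames, feat_dim))
y = rng.integers(0, n_classes, n_videos)

# Fixed (untrained) reservoir weights, scaled for the echo state property.
res_size, leak, rho = 300, 0.3, 0.9
W_in = rng.uniform(-0.5, 0.5, (res_size, feat_dim))
W = rng.standard_normal((res_size, res_size))
W *= rho / np.max(np.abs(np.linalg.eigvals(W)))  # set spectral radius

def reservoir_state(video):
    """Run frames through a leaky-integrator reservoir; return final state."""
    h = np.zeros(res_size)
    for frame in video:
        h = (1 - leak) * h + leak * np.tanh(W_in @ frame + W @ h)
    return h

H = np.stack([reservoir_state(v) for v in X])   # (n_videos, res_size)
Y = np.eye(n_classes)[y]                        # one-hot targets

# Ridge-regression readout: the only trained parameters in the model.
lam = 1e-2
W_out = np.linalg.solve(H.T @ H + lam * np.eye(res_size), H.T @ Y)
pred = (H @ W_out).argmax(axis=1)
print("train accuracy:", (pred == y).mean())
```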
A Neural Network Model for the Spatial and Temporal Response of Retinal Ganglion Cells
This article introduces a quantitative model of early visual system function. The model is formulated to unify analyses of spatial and temporal information processing by the nervous system. Functional constraints of the model suggest mechanisms analogous to photoreceptors, bipolar cells, and retinal ganglion cells, which can be formally represented with first-order differential equations. Preliminary numerical simulations and analytical results show that the same formal mechanisms can explain the behavior of both X (linear) and Y (nonlinear) retinal ganglion cell classes by simple changes in the relative width of the receptive field (RF) center and surround mechanisms. Specifically, an increase in the width of the RF center results in a change from X-like to Y-like response, in agreement with anatomical data on the relationship between α- and β-cell RF profiles. Simulations of model response to various spatio-temporal input patterns replicate many of the classical properties of X and Y cells, including transient (Y) versus sustained (X) responses, null-phase responses to alternating gratings in X cells, on-off or frequency-doubling responses in Y cells, and phase-independent on-off responses in Y cells at high spatial frequencies. The model's formal mechanisms may be used in other portions of the visual system and more generally in nervous system structures involved with spatio-temporal information processing.
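As a hedged illustration of the kind of mechanism the model describes (not the article's exact equations), the sketch below drives a one-dimensional difference-of-Gaussians receptive field through a first-order low-pass stage with a contrast-reversing grating, reproducing the classical null-phase behaviour of a linear, X-like unit. All parameters are invented for the demo.

```python
# Toy linear (X-like) unit: DoG spatial RF + first-order temporal dynamics,
# probed with a contrast-reversing grating at two spatial phases.
import numpy as np

x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]

def dog(center_sigma, surround_sigma=2.0, surround_gain=0.8):
    g = lambda s: np.exp(-x**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
    return g(center_sigma) - surround_gain * g(surround_sigma)

def response(rf, phase, freq=0.5, t_end=2.0, tau=0.05, dt=0.001):
    """Euler-integrate dr/dt = (-r + drive(t)) / tau for a reversing grating."""
    grating = np.sin(2 * np.pi * freq * x + phase)
    drive_amp = np.sum(rf * grating) * dx            # spatial inner product
    r, out = 0.0, []
    for t in np.arange(0, t_end, dt):
        drive = drive_amp * np.sign(np.sin(2 * np.pi * t))  # 1 Hz reversal
        r += dt * (-r + drive) / tau
        out.append(r)
    return np.array(out)

rf_x = dog(center_sigma=0.5)   # narrow centre: X-like
for phase in (0.0, np.pi / 2):
    r = response(rf_x, phase)
    print(f"phase {phase:.2f}: peak response {np.max(np.abs(r)):.4f}")
# At the null phase (grating antisymmetric about the RF centre, phase 0
# here), the linear unit shows essentially zero modulation.
```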
Visual Analysis of Spatio-Temporal Event Predictions: Investigating the Spread Dynamics of Invasive Species
Invasive species are a major cause of ecological damage and commercial
losses. A current problem spreading in North America and Europe is the vinegar
fly Drosophila suzukii. Unlike other Drosophila, it infests non-rotting and
healthy fruits and is therefore of concern to fruit growers, such as vintners.
Consequently, large amounts of data about infestations have been collected in
recent years. However, there is a lack of interactive methods to investigate
this data. We employ ensemble-based classification to predict areas susceptible
to infestation by D. suzukii and bring them into a spatio-temporal context
using maps and glyph-based visualizations. Following the information-seeking
mantra, we provide Drosophigator, a visual analysis system for spatio-temporal
event prediction, enabling the investigation of the spread dynamics of invasive
species. We demonstrate the usefulness of this approach in two use cases.
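The prediction step can be sketched as follows; the RandomForest ensemble and the feature columns are illustrative stand-ins, since the abstract does not specify the exact ensemble or predictors. The per-cell probabilities produced this way are what the map and glyph views would visualise.

```python
# Illustrative ensemble-based susceptibility prediction for (location, week)
# cells; features, labels, and model choice are assumptions for the sketch.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Hypothetical training table: one row per (trap location, week).
n = 2000
X = np.column_stack([
    rng.uniform(47.0, 48.0, n),    # latitude
    rng.uniform(8.5, 9.5, n),      # longitude
    rng.integers(1, 53, n),        # week of year
    rng.normal(15.0, 6.0, n),      # mean temperature (degC)
    rng.uniform(40.0, 95.0, n),    # relative humidity (%)
])
y = (X[:, 3] > 14) & (X[:, 4] > 70)   # toy label: infestation observed

ensemble = RandomForestClassifier(n_estimators=200, random_state=0)
ensemble.fit(X, y)

# Susceptibility scores for new (location, week) cells, ready for mapping.
grid = X[:5]
print(ensemble.predict_proba(grid)[:, 1])
```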
Cortical spatio-temporal dimensionality reduction for visual grouping
The visual systems of many mammals, including humans, are able to integrate
the geometric information of visual stimuli and to perform cognitive tasks
even at the first stages of cortical processing. This is thought to be
the result of a combination of mechanisms, which include feature extraction at
the single-cell level and geometric processing by means of cell connectivity. We
present a geometric model of such connectivities in the space of detected
features associated with spatio-temporal visual stimuli, and show how they can be
used to obtain low-level object segmentation. The main idea is to define
a spectral clustering procedure with anisotropic affinities over datasets
consisting of embeddings of the visual stimuli into higher-dimensional spaces.
The neural plausibility of the proposed arguments is discussed.
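A minimal sketch of the core computational idea: spectral clustering with an anisotropic affinity over points lifted into a feature space. Here each point carries a position and a local orientation as a simplified stand-in for the paper's detected features, and the metric weights are invented for the demo.

```python
# Spectral clustering with an anisotropic (position + orientation) affinity
# over a toy stimulus of two oriented contours; groups emerge as clusters.
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(2)

# Toy stimulus: two oriented contours embedded as (x, y, theta) points.
t = np.linspace(0, 1, 60)
contour_a = np.column_stack([t, 0.2 + 0.1 * t, np.full_like(t, 0.1)])
contour_b = np.column_stack([t, 0.8 - 0.1 * t, np.full_like(t, -0.1)])
pts = np.vstack([contour_a, contour_b]) + rng.normal(0, 0.01, (120, 3))

# Anisotropic affinity: positional and orientation differences are
# weighted differently (scale factors are illustrative).
d_pos = np.linalg.norm(pts[:, None, :2] - pts[None, :, :2], axis=-1)
d_ori = np.abs(pts[:, None, 2] - pts[None, :, 2])
affinity = np.exp(-(d_pos**2 / 0.02 + d_ori**2 / 0.005))

labels = SpectralClustering(
    n_clusters=2, affinity="precomputed", random_state=0
).fit_predict(affinity)
print("cluster sizes:", np.bincount(labels))   # low-level grouping result
```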
Modeling geometric-temporal context with directional pyramid co-occurrence for action recognition
In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidean space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves state-of-the-art performance and improves on the recognition performance of bag-of-visual-words (BOVW) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy while the BOVW approach only achieves 88.06%. On both the Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved.
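The descriptor construction can be sketched for a single cuboid; the seven low-level measurements per pixel are an assumed example, not necessarily the paper's feature set. The key step is mapping the covariance matrix through the matrix logarithm, after which ordinary Euclidean distance (and hence standard clustering) applies.

```python
# Log-Euclidean covariance descriptor for one spatio-temporal cuboid.
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical cuboid features: one row per pixel, e.g. (x, y, t,
# intensity, |Ix|, |Iy|, |It|): 7 low-level measurements.
F = rng.standard_normal((500, 7))

C = np.cov(F, rowvar=False) + 1e-6 * np.eye(7)   # regularised covariance

# Matrix logarithm via eigendecomposition (C is symmetric positive-definite).
w, V = np.linalg.eigh(C)
logC = V @ np.diag(np.log(w)) @ V.T

# Vectorise the upper triangle (off-diagonals scaled by sqrt(2) so the
# Euclidean norm of the vector equals the Frobenius norm of log(C)).
iu = np.triu_indices(7, k=1)
descriptor = np.concatenate([np.diag(logC), np.sqrt(2) * logC[iu]])
print(descriptor.shape)   # (28,): ready for k-means vector quantisation
```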
What has been missed for predicting human attention in viewing driving clips?
Recent research progress on the topic of human visual attention allocation in scene perception and its simulation is based mainly on studies with static images. However, natural vision requires us to extract visual information that constantly changes due to egocentric movements or dynamics of the world. It is unclear to what extent spatio-temporal regularity, an inherent regularity in dynamic vision, affects human gaze distribution and saliency computation in visual attention models. In this free-viewing eye-tracking study we manipulated the spatio-temporal regularity of traffic videos by presenting them in normal video sequence, reversed video sequence, normal frame sequence, and randomised frame sequence. The recorded human gaze allocation was then used as the ‘ground truth’ to examine the predictive ability of a number of state-of-the-art visual attention models. The analysis revealed high inter-observer agreement across individual human observers, but all the tested attention models performed significantly worse than humans. The models' inferior predictive power was evident in gaze predictions that were indistinguishable across stimulus presentation sequences and in a weak central fixation bias. Our findings suggest that a realistic visual attention model for the processing of dynamic scenes should incorporate human visual sensitivity to spatio-temporal regularity and a central fixation bias.
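One common way to score a model against recorded gaze is normalised scanpath saliency (NSS); the sketch below illustrates this comparison step with invented data, without claiming it is the metric used in the study.

```python
# NSS: mean z-scored saliency at fixated pixels; higher = better prediction.
import numpy as np

def nss(saliency_map, fixations):
    """saliency_map: 2-D model output for one frame.
    fixations: iterable of (row, col) gaze positions on that frame."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-12)
    return float(np.mean([s[r, c] for r, c in fixations]))

rng = np.random.default_rng(4)
sal = rng.random((120, 160))             # hypothetical model saliency frame
gaze = [(60, 80), (62, 85), (58, 79)]    # hypothetical human fixations
print("NSS:", nss(sal, gaze))
# Repeating this over normal vs. reversed/randomised sequences would reveal
# whether a model's predictions are sensitive to spatio-temporal regularity.
```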
Using treemaps for variable selection in spatio-temporal visualisation
We demonstrate and reflect upon the use of enhanced treemaps that incorporate spatial and temporal ordering for exploring a large multivariate spatio-temporal data set. The resulting data-dense views summarise and simultaneously present hundreds of space-, time-, and variable-constrained subsets of a large multivariate data set in a structure that facilitates their meaningful comparison and supports visual analysis. Interactive techniques allow localised patterns to be explored and subsets of interest selected and compared with the spatial aggregate. Spatial variation is considered through interactive raster maps and high-resolution local road maps. The techniques are developed in the context of 42.2 million records of vehicular activity in a 98 km² area of central London and informally evaluated through a design used in the exploratory visualisation of this data set. The main advantages of our technique are the means to simultaneously display hundreds of summaries of the data and to interactively browse hundreds of variable combinations with ordering and symbolism that are consistent and appropriate for space- and time-based variables. These capabilities are difficult to achieve in the case of spatio-temporal data with categorical attributes using existing geovisualisation methods. We acknowledge limitations in the treemap representation but enhance the cognitive plausibility of this popular layout through our two-dimensional ordering algorithm and interactions. Patterns that are expected (e.g. more traffic in central London), interesting (e.g. the spatial and temporal distribution of particular vehicle types) and anomalous (e.g. low speeds on particular road sections) are detected at various scales and locations using the approach. In many cases, anomalies identify biases that may have implications for future use of the data set for analyses and applications. Ordered treemaps appear to have potential as interactive interfaces for variable selection in spatio-temporal visualisation. Information Visualization (2008) 7, 210-224. doi: 10.1057/palgrave.ivs.950018
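For intuition, here is a sketch of the simplest order-preserving treemap layout, slice-and-dice; the paper's two-dimensional ordering algorithm is more sophisticated, so this only illustrates how ordered, space-filling rectangles arise from time-ordered aggregates (the traffic counts are invented).

```python
# Order-preserving slice-and-dice treemap layout for one hierarchy level.
def slice_and_dice(sizes, x, y, w, h, horizontal=True):
    """Return one (x, y, w, h) rectangle per item, preserving item order."""
    total = sum(sizes)
    rects, offset = [], 0.0
    for s in sizes:
        frac = s / total
        if horizontal:                      # slice along the x axis
            rects.append((x + offset * w, y, frac * w, h))
        else:                               # slice along the y axis
            rects.append((x, y + offset * h, w, frac * h))
        offset += frac
    return rects

# Hypothetical aggregates: traffic counts for four consecutive hours.
hourly_counts = [120, 340, 560, 230]
for hour, rect in zip(range(4), slice_and_dice(hourly_counts, 0, 0, 1, 1)):
    print(f"hour {hour}: rect {tuple(round(v, 3) for v in rect)}")
# Alternating the slicing direction per hierarchy level yields the familiar
# nested treemap while keeping time-ordered variables in reading order.
```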
