Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection

Author: Lee Kisuk
Seung H. Sebastian
Vishwanathan Ashwin
Zlateski Aleksandar
Publication date: 01/01/2015

Efforts to automate the reconstruction of neural circuits from 3D electron microscopic (EM) brain images are critical for the field of connectomics. An important computation for reconstruction is the detection of neuronal boundaries. Images acquired by serial section EM, a leading 3D EM technique, are highly anisotropic, with inferior quality along the third dimension. For such images, the 2D max-pooling convolutional network has set the standard for performance at boundary detection. Here we achieve a substantial gain in accuracy through three innovations. First, following the trend towards deeper networks for object recognition, we use a much deeper network than previously employed for boundary detection. Second, we incorporate 3D as well as 2D filters, to enable computations that use 3D context. Finally, we adopt a recursively trained architecture in which a first network generates a preliminary boundary map that is provided as input along with the original image to a second network that generates a final boundary map. Backpropagation training is accelerated by ZNN, a new implementation of 3D convolutional networks that uses multicore CPU parallelism for speed. Our hybrid 2D-3D architecture could be more generally applicable to other types of anisotropic 3D images, including video, and our recursive framework to any image labeling problem.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Data compression techniques applied to high resolution high frame rate video technology

Author: Alexovich Robert E.
Hartz William G.
Neustadter Marc S.

An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and an assessment of image degradation and video data parameters. An assessment is made of present and near-term future technology for implementing video data compression in high-speed imaging systems. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented, and specific compression techniques and implementations are recommended.

NASA Technical Reports Server

Methods and Approaches for Real-Time Hierarchical Motion Detection

Author: Allen Peter K.
Singh Ajit
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1988

Recent work on perception and measurement of visual motion has consistently advocated the use of a hierarchical representation and analysis. In most practical applications of motion perception it is absolutely necessary to be able to construct and process these hierarchical image representations in real time. First, we discuss a simple scheme for coarse motion detection that highlights the capabilities of the PIPE image processor, showing its ability to work in both the spatial and temporal dimensions in real time. Second, we show how this architecture can be used to build pyramid structures useful for motion detection, again emphasizing the real-time nature of the computations. Using the PIPE architecture, we have constructed a Pyramid of Oriented Edges (POE), which is a logical extension of Burt's pyramid and also a version of Mallat's pyramid. The results of these algorithms are available on a video tape to highlight their real-time performance on moving images. Third, we propose a new method using PIPE that will allow dense optic flow computation and which relates the intensity-correlation and spatio-temporal frequency based methods of determining optic flow.

Columbia University Academic Commons

3D Tracking Using Multi-view Based Particle Filters

Author: García Santos Narciso
Jaureguizar Núñez Fernando
Mohedano del Pozo Raúl
Salgado Álvarez de Sotomayor Luis
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2009

Visual surveillance and monitoring of indoor environments using multiple cameras has become a field of great activity in computer vision. Usual 3D tracking and positioning systems rely on several independent 2D tracking modules applied over individual camera streams, fused using geometrical relationships across cameras. As 2D tracking systems suffer inherent difficulties due to point-of-view limitations (perceptually similar foreground and background regions causing fragmentation of moving objects, occlusions), 3D tracking based on partially erroneous 2D tracks is likely to fail when handling multiple-people interaction. To overcome this problem, this paper proposes a Bayesian framework for combining 2D low-level cues from multiple cameras directly into the 3D world through 3D Particle Filters. This method allows estimating the probability of a certain volume being occupied by a moving object, and thus segmenting and tracking multiple people across the monitored area. The proposed method is developed on the basis of simple, binary 2D moving region segmentation on each camera, considered as different state observations. In addition, the method proves well suited for integrating additional 2D low-level cues to increase system robustness to occlusions: along these lines, a naïve color-based (HSI) appearance model has been integrated, resulting in clear performance improvements when dealing with complex scenarios.
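
The predict, weight, and resample cycle behind a particle filter can be illustrated with a deliberately reduced sketch. It is not the multi-camera 3D method of the abstract above: it tracks a single 1D position, and the motion model, the noise levels, and the name `particle_filter_step` are all assumptions chosen for illustration.

```python
import math
import random

def particle_filter_step(particles, observation, rng,
                         velocity=1.0, motion_sigma=0.2, obs_sigma=0.5):
    """One predict / weight / resample cycle; returns (new_particles, estimate)."""
    # Predict: move every particle with the assumed motion model plus noise.
    predicted = [p + velocity + rng.gauss(0.0, motion_sigma) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_sigma) ** 2)
               for p in predicted]
    total = sum(weights)  # assumed non-zero; holds when particles cover the target
    estimate = sum(w * p for w, p in zip(weights, predicted)) / total
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate

# Hypothetical scenario: a target starting at 2.0 and moving one unit per step.
rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
true_position = 2.0
for _ in range(10):
    true_position += 1.0
    observation = true_position + rng.gauss(0.0, 0.5)
    particles, estimate = particle_filter_step(particles, observation, rng)
```

In a multi-view setting the single Gaussian likelihood would be replaced by a product of per-camera likelihoods, which is where cues such as binary motion masks or a color model could enter.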

Archivo Digital UPM

Complexity Analysis Of Next-Generation VVC Encoding and Decoding

Author: Adelimanesh Mohammad Ali
Gabbouj Moncef
Hashemi Mahmoud Reza
Pakdaman Farhad
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/05/2020

While the next-generation video compression standard, Versatile Video Coding (VVC), provides superior compression efficiency, its computational complexity increases dramatically. This paper thoroughly analyzes this complexity for both the encoder and decoder of VVC Test Model 6, by quantifying the complexity breakdown for each coding tool and measuring the complexity and memory requirements for VVC encoding/decoding. These extensive analyses are performed for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD), Random-Access (RA), and All-Intra (AI) conditions (a total of 320 encodings/decodings). Results indicate that the VVC encoder and decoder are 5x and 1.5x more complex than HEVC in LD, and 31x and 1.8x in AI, respectively. Detailed analysis of coding tools reveals that in LD, on average, motion estimation tools with 53%, transformation and quantization with 22%, and entropy coding with 7% dominate the encoding complexity. In decoding, loop filters with 30%, motion compensation with 20%, and entropy decoding with 16% are the most complex modules. Moreover, the memory bandwidths required for VVC encoding and decoding, measured through memory profiling, are 30x and 3x those of HEVC. The reported results and insights are a guide for future research and implementations of energy-efficient VVC encoder/decoder. Comment: IEEE ICIP 202

arXiv.org e-Print Archive

Crossref

Optimization of the motion estimation for parallel embedded systems in the context of new video standards

Author: Déforges Olivier
Nezan Jean François
Urban Fabrice
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 13/08/2012

The efficiency of video compression methods mainly depends on the motion compensation stage, and the design of efficient motion estimation techniques is still an important issue. A highly accurate motion estimation can significantly reduce the bit-rate, but involves a high computational complexity. This is particularly true for new generations of video compression standards, MPEG AVC and HEVC, which involve techniques such as different reference frames, sub-pixel estimation, and variable block sizes. In this context, the design of fast motion estimation solutions is necessary, and can concern two linked aspects: a high-quality algorithm and its efficient implementation. This paper summarizes our main contributions in this domain. In particular, we first present the HME (Hierarchical Motion Estimation) technique. It is based on a multi-level refinement process where the motion estimation vectors are first estimated on a sub-sampled image. The multi-level decomposition provides robust predictions and is particularly suited for variable-block-size motion estimation. The HME method has been integrated in an AVC encoder, and we propose a parallel implementation of this technique, with the motion estimation at pixel level performed by a DSP processor, and the sub-pixel refinement realized in an FPGA. The second technique that we present is called HDS, for Hierarchical Diamond Search. It combines the multi-level refinement of HME with a fast search at pixel accuracy inspired by the EPZS method. This paper also presents its parallel implementation on a multi-DSP platform and its use in the HEVC context.
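
The coarse-to-fine idea described in this abstract (estimate a vector on a sub-sampled image, then refine it at full resolution) can be sketched as follows. This is a minimal two-level illustration under stated assumptions, not the authors' HME or HDS implementation: the decimation by two, the SAD cost, the full-search window, and every function name are hypothetical choices.

```python
import random

def downsample(frame):
    """Halve resolution by keeping every second pixel (simple decimation)."""
    return [row[::2] for row in frame[::2]]

def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(size) for i in range(size))

def search(cur, ref, bx, by, size, center, radius):
    """Full search in a (2*radius+1)^2 window around center; returns the best (dx, dy)."""
    h, w = len(ref), len(ref[0])
    cx, cy = center
    # Keep only displacements whose reference block lies inside the frame.
    candidates = [(dx, dy)
                  for dy in range(cy - radius, cy + radius + 1)
                  for dx in range(cx - radius, cx + radius + 1)
                  if 0 <= bx + dx and bx + dx + size <= w
                  and 0 <= by + dy and by + dy + size <= h]
    return min(candidates, key=lambda v: sad(cur, ref, bx, by, v[0], v[1], size))

def hierarchical_search(cur, ref, bx, by, size, coarse_radius=4):
    """Two-level estimate: coarse search at half resolution, then +/-1 pixel refinement."""
    cdx, cdy = search(downsample(cur), downsample(ref),
                      bx // 2, by // 2, size // 2, (0, 0), coarse_radius)
    # Scale the coarse vector back to full resolution and refine around it.
    return search(cur, ref, bx, by, size, (2 * cdx, 2 * cdy), 1)

# Synthetic test frames: cur is ref displaced by 4 pixels in x and 2 in y.
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
cur = [[ref[y + 2][x + 4] if y + 2 < 32 and x + 4 < 32 else 0
        for x in range(32)] for y in range(32)]
vector = hierarchical_search(cur, ref, bx=8, by=8, size=8)
```

The coarse level covers a displacement range of plus or minus 8 full-resolution pixels with 81 candidates, and the refinement adds 9 more, whereas a direct full-resolution search over the same range would evaluate 289 candidates.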

HAL-CentraleSupelec

Crossref

HAL-Rennes 1

Structured Sequence Modeling with Graph Convolutional Recurrent Networks

Author: Bresson Xavier
Defferrard Michaël
Seo Youngjoo
Vandergheynst Pierre
Publication date: 22/12/2016

This paper introduces Graph Convolutional Recurrent Network (GCRN), a deep learning model able to predict structured sequences of data. Precisely, GCRN is a generalization of classical recurrent neural networks (RNN) to data structured by an arbitrary graph. Such structured sequences can represent series of frames in videos, spatio-temporal measurements on a network of sensors, or random walks on a vocabulary graph for natural language modeling. The proposed model combines convolutional neural networks (CNN) on graphs to identify spatial structures and RNN to find dynamic patterns. We study two possible architectures of GCRN, and apply the models to two practical problems: predicting moving MNIST data, and modeling natural language with the Penn Treebank dataset. Experiments show that simultaneously exploiting graph spatial and dynamic information about data can improve both precision and learning speed.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Image sequence restoration by median filtering

Author: Jackson Shawn R.
Publication venue: RIT Scholar Works
Publication date: 01/01/2004

Median filters are non-linear filters that fit in the generic category of order-statistic filters. Median filters are widely used for reducing random defects, commonly characterized by impulse or salt-and-pepper noise in a single image. Motion estimation is the process of estimating the displacement vector between like pixels in the current frame and the reference frame. When dealing with a motion sequence, the motion vectors are the key for operating on corresponding pixels in several frames. This work explores the use of various motion estimation algorithms in combination with various median filter algorithms to provide noise suppression. The results are compared using two sets of metrics: performance-based and objective image quality-based. These results are used to determine the best motion estimation / median filter combination for image sequence restoration. The primary goals of this work are to implement a motion estimation and median filter algorithm in hardware and to develop and benchmark a flexible software alternative restoration process. Two median filter algorithms are unique to this work. The first filter is a modification of a single-frame adaptive median filter. The modification applies motion compensation and temporal concepts. The other is an adaptive extension to the multi-level (ML3D) filter, called the adaptive multi-level (AML3D) filter. The extension provides adaptable filter window sizes to the multiple filter sets that comprise the ML3D filter. The adaptive median filter is capable of filtering an image in 26.88 seconds per frame and results in a PSNR improvement of 5.452 dB. The AML3D is capable of filtering an image in 14.73 seconds per frame and results in a PSNR improvement of 6.273 dB. The AML3D is a suitable alternative to the other median filters.
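
To make the two measured quantities concrete, the sketch below applies a plain per-pixel temporal median across three frames and computes PSNR in dB as 10·log10(peak² / MSE). It is only an illustration: it assumes a static scene, so no motion compensation is needed, and it is neither the adaptive median filter nor the AML3D filter of the abstract. The function names and the tiny 2x2 frames are hypothetical.

```python
import math
from statistics import median

def temporal_median(frames):
    """Per-pixel median across a list of equally sized frames (lists of rows)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(width)]
            for y in range(height)]

def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    errors = [(r - t) ** 2 for row_r, row_t in zip(reference, test)
              for r, t in zip(row_r, row_t)]
    mse = sum(errors) / len(errors)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# A static 2x2 scene; each frame carries one impulse defect at a different pixel.
clean = [[100, 100], [100, 100]]
noisy_prev = [[100, 255], [100, 100]]
noisy_cur = [[0, 100], [100, 100]]
noisy_next = [[100, 100], [100, 255]]

restored = temporal_median([noisy_prev, noisy_cur, noisy_next])
psnr_before = psnr(clean, noisy_cur)
psnr_after = psnr(clean, restored)
```

Because each impulse hits a different pixel, the median of the three frames recovers the clean value everywhere in this toy case. With real motion the frames would first have to be aligned using the estimated motion vectors, which is the role motion estimation plays in the work above.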

RIT Scholar Works

Vision Science and Technology at NASA: Results of a Workshop

Author: Mulligan Jeffrey B.
Watson Andrew B.

A broad review is given of vision science and technology within NASA. The subject is defined and its applications in both NASA and the nation at large are noted. A survey of current NASA efforts is given, noting strengths and weaknesses of the NASA program.

NASA Technical Reports Server