4,814 research outputs found
Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection
Efforts to automate the reconstruction of neural circuits from 3D electron
microscopic (EM) brain images are critical for the field of connectomics. An
important computation for reconstruction is the detection of neuronal
boundaries. Images acquired by serial section EM, a leading 3D EM technique,
are highly anisotropic, with inferior quality along the third dimension. For
such images, the 2D max-pooling convolutional network has set the standard for
performance at boundary detection. Here we achieve a substantial gain in
accuracy through three innovations. Following the trend towards deeper networks
for object recognition, we use a much deeper network than previously employed
for boundary detection. Second, we incorporate 3D as well as 2D filters, to
enable computations that use 3D context. Finally, we adopt a recursively
trained architecture in which a first network generates a preliminary boundary
map that is provided as input along with the original image to a second network
that generates a final boundary map. Backpropagation training is accelerated by
ZNN, a new implementation of 3D convolutional networks that uses multicore CPU
parallelism for speed. Our hybrid 2D-3D architecture could be more generally
applicable to other types of anisotropic 3D images, including video, and our
recursive framework for any image labeling problem
Data compression techniques applied to high resolution high frame rate video technology
An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and assessment of image degradation and video data parameters. An assessment is made of present and near term future technology for implementation of video data compression in high speed imaging system. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented and specific compression techniques and implementations are recommended
Recommended from our members
Methods and Approaches for Real-Time Hierarchical Motion Detection
The recent work on perception and measurement of visual motion has consistently advocated the use of a hierarchical representation and analysis. In most of the practical applications of motion perception it is absolutely necessary to be able to construct and process these hierarchical image representations in real-time. First, we discuss a simple scheme for coarse motion detection that highlights the capabilities of the PIPE image processor, showing its ability to work in both the spatial and temporal dimensions in real-time. Secondly, we show how this architecture can be used to build pyramid structures useful for motion detection, again emphasizing the real-time nature of the computations. Using the PIPE architecture, we have constructed a Pyramid of Oriented Edges (POE) which is a logical extension of Burt's pyramid and also a version of Mallat's pyramid. The results of these algorithms are available on a video tape to highlight their real-time performance on moving images. Third, we propose a new method using PIPE that will allow dense optic flow computation and which relates the intensity-correlation and spatio-temporal frequency based methods of determining optic flow
3D Tracking Using Multi-view Based Particle Filters
Visual surveillance and monitoring of indoor environments using multiple cameras has become a field of great activity in computer vision. Usual 3D tracking and positioning systems rely on several independent 2D tracking modules applied over individual camera streams, fused using geometrical relationships across cameras. As 2D tracking systems suffer inherent difficulties due to point of view limitations (perceptually similar foreground and background regions causing fragmentation of moving objects, occlusions), 3D tracking based on partially erroneous 2D tracks are likely to fail when handling multiple-people interaction. To overcome this problem, this paper proposes a Bayesian framework for combining 2D low-level cues from multiple cameras directly into the 3D world through 3D Particle Filters. This method allows to estimate the probability of a certain volume being occupied by a moving object, and thus to segment and track multiple people across the monitored area. The proposed method is developed on the basis of simple, binary 2D moving region segmentation on each camera, considered as different state observations. In addition, the method is proved well suited for integrating additional 2D low-level cues to increase system robustness to occlusions: in this line, a naïve color-based (HSI) appearance model has been integrated, resulting in clear performance improvements when dealing with complex scenarios
Complexity Analysis Of Next-Generation VVC Encoding and Decoding
While the next generation video compression standard, Versatile Video Coding
(VVC), provides a superior compression efficiency, its computational complexity
dramatically increases. This paper thoroughly analyzes this complexity for both
encoder and decoder of VVC Test Model 6, by quantifying the complexity
break-down for each coding tool and measuring the complexity and memory
requirements for VVC encoding/decoding. These extensive analyses are performed
for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD),
Random-Access (RA), and All-Intra (AI) conditions (a total of 320
encoding/decoding). Results indicate that the VVC encoder and decoder are 5x
and 1.5x more complex compared to HEVC in LD, and 31x and 1.8x in AI,
respectively. Detailed analysis of coding tools reveals that in LD on average,
motion estimation tools with 53%, transformation and quantization with 22%, and
entropy coding with 7% dominate the encoding complexity. In decoding, loop
filters with 30%, motion compensation with 20%, and entropy decoding with 16%,
are the most complex modules. Moreover, the required memory bandwidth for VVC
encoding/decoding are measured through memory profiling, which are 30x and 3x
of HEVC. The reported results and insights are a guide for future research and
implementations of energy-efficient VVC encoder/decoder.Comment: IEEE ICIP 202
Optimization of the motion estimation for parallel embedded systems in the context of new video standards
15 pagesInternational audienceThe effciency of video compression methods mainly depends on the motion compensation stage, and the design of effcient motion estimation techniques is still an important issue. An highly accurate motion estimation can significantly reduce the bit-rate, but involves a high computational complexity. This is particularly true for new generations of video compression standards, MPEG AVC and HEVC, which involves techniques such as different reference frames, sub-pixel estimation, variable block sizes. In this context, the design of fast motion estimation solutions is necessary, and can concerned two linked aspects: a high quality algorithm and its effcient implementation. This paper summarizes our main contributions in this domain. In particular, we first present the HME (Hierarchical Motion Estimation) technique. It is based on a multi-level refinement process where the motion estimation vectors are first estimated on a sub-sampled image. The multi-levels decomposition provides robust predictions and is particularly suited for variable block sizes motion estimations. The HME method has been integrated in a AVC encoder, and we propose a parallel implementation of this technique, with the motion estimation at pixel level performed by a DSP processor, and the sub-pixel refinement realized in an FPGA. The second technique that we present is called HDS for Hierarchical Diamond Search. It combines the multi-level refinement of HME, with a fast search at pixel-accuracy inspired by the EPZS method. This paper also presents its parallel implementation onto a multi-DSP platform and the its use in the HEVC context
Structured Sequence Modeling with Graph Convolutional Recurrent Networks
This paper introduces Graph Convolutional Recurrent Network (GCRN), a deep
learning model able to predict structured sequences of data. Precisely, GCRN is
a generalization of classical recurrent neural networks (RNN) to data
structured by an arbitrary graph. Such structured sequences can represent
series of frames in videos, spatio-temporal measurements on a network of
sensors, or random walks on a vocabulary graph for natural language modeling.
The proposed model combines convolutional neural networks (CNN) on graphs to
identify spatial structures and RNN to find dynamic patterns. We study two
possible architectures of GCRN, and apply the models to two practical problems:
predicting moving MNIST data, and modeling natural language with the Penn
Treebank dataset. Experiments show that exploiting simultaneously graph spatial
and dynamic information about data can improve both precision and learning
speed
Image sequence restoration by median filtering
Median filters are non-linear filters that fit in the generic category of order-statistic filters. Median filters are widely used for reducing random defects, commonly characterized by impulse or salt and pepper noise in a single image. Motion estimation is the process of estimating the displacement vector between like pixels in the current frame and the reference frame. When dealing with a motion sequence, the motion vectors are the key for operating on corresponding pixels in several frames. This work explores the use of various motion estimation algorithms in combination with various median filter algorithms to provide noise suppression. The results are compared using two sets of metrics: performance-based and objective image quality-based. These results are used to determine the best motion estimation / median filter combination for image sequence restoration. The primary goals of this work are to implement a motion estimation and median filter algorithm in hardware and develop and benchmark a flexible software alternative restoration process. There are two unique median filter algorithms to this work. The first filter is a modification to a single frame adaptive median filter. The modification applied motion compensation and temporal concepts. The other is an adaptive extension to the multi-level (ML3D) filter, called adaptive multi-level (AML3D) filter. The extension provides adaptable filter window sizes to the multiple filter sets that comprise the ML3D filter. The adaptive median filter is capable of filtering an image in 26.88 seconds per frame and results in a PSNR improvement of 5.452dB. The AML3D is capable of filtering an image in 14.73 seconds per frame and results in a PSNR improvement of 6.273dB. The AML3D is a suitable alternative to the other median filters
Vision Science and Technology at NASA: Results of a Workshop
A broad review is given of vision science and technology within NASA. The subject is defined and its applications in both NASA and the nation at large are noted. A survey of current NASA efforts is given, noting strengths and weaknesses of the NASA program
- …