UNSUPERVISED CONVOLUTIONAL NEURAL NETWORKS FOR MOTION ESTIMATION
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research. Traditional methods for motion estimation estimate the motion field F between a pair of images as the one that minimizes a predesigned cost function. In this paper, we propose a direct method and train a Convolutional Neural Network (CNN) that, when given a pair of images as input at test time, produces a dense motion field F at its output layer. In the absence of large datasets with ground truth motion that would allow classical supervised training, we propose to train the network in an unsupervised manner. The proposed cost function that is optimized during training is based on the classical optical flow constraint. The latter is differentiable with respect to the motion field and, therefore, allows backpropagation of the error to previous layers of the network. Our method is tested on both synthetic and real image sequences and performs similarly to the state-of-the-art methods.
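The abstract above does not spell out its cost function beyond saying it is based on the classical optical flow constraint. As a rough sketch only, the classical linearized constraint Ix*u + Iy*v + It = 0 can be turned into a mean squared residual as below. The function name, the finite-difference scheme and the toy frames are assumptions made for illustration, not the authors' implementation, and any regularization the paper may use is omitted.

```python
def flow_constraint_loss(img1, img2, flow_u, flow_v):
    """Mean squared residual of Ix*u + Iy*v + It over interior pixels.

    Assumed conventions (not from the abstract): central differences for
    the spatial gradients of img1, a forward difference in time.
    """
    height, width = len(img1), len(img1[0])
    total, count = 0.0, 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            ix = (img1[y][x + 1] - img1[y][x - 1]) / 2.0  # horizontal gradient
            iy = (img1[y + 1][x] - img1[y - 1][x]) / 2.0  # vertical gradient
            it = img2[y][x] - img1[y][x]                  # temporal difference
            residual = ix * flow_u[y][x] + iy * flow_v[y][x] + it
            total += residual * residual
            count += 1
    return total / count


# Toy data: a horizontal intensity ramp that moves one pixel to the right.
size = 6
frame1 = [[float(x) for x in range(size)] for _ in range(size)]
frame2 = [[float(x - 1) for x in range(size)] for _ in range(size)]
true_u = [[1.0] * size for _ in range(size)]  # one pixel per frame in x
zero = [[0.0] * size for _ in range(size)]

loss_true = flow_constraint_loss(frame1, frame2, true_u, zero)
loss_zero = flow_constraint_loss(frame1, frame2, zero, zero)
```

On this toy ramp the residual vanishes for the true flow and is nonzero for a zero flow, which is the property a network trained without ground truth motion would rely on. Because the residual is linear in u and v, its gradient with respect to the flow is straightforward to compute.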
A Large Imaging Database and Novel Deep Neural Architecture for Covid-19 Diagnosis
Deep learning methodologies are now the main approach for medical image analysis and disease prediction. Large annotated databases are necessary for developing these methodologies; such databases are difficult to obtain and to make publicly available for use by researchers and medical experts. In this paper, we focus on diagnosis of COVID-19 based on chest 3-D CT scans and develop a dual knowledge framework, including a large imaging database and a novel deep neural architecture. We introduce COV19-CT-DB, a very large database annotated for COVID-19 that consists of 7,750 3-D CT scans, 1,650 of which refer to COVID-19 cases and 6,100 to non-COVID-19 cases. We use this database to train and develop the RACNet architecture. This architecture performs 3-D analysis based on a CNN-RNN network and handles input CT scans of different lengths through the introduction of dynamic routing, feature alignment and a mask layer. We conduct a large experimental study that illustrates that the RACNet network has the best performance compared to other deep neural networks i) when trained and tested on COV19-CT-DB; ii) when tested on, or applied through transfer learning to, other public databases.
Generalized Inpainting Method for Hyperspectral Image Acquisition
A recently designed hyperspectral imaging device enables multiplexed
acquisition of an entire data volume in a single snapshot thanks to
monolithically-integrated spectral filters. Such an agile imaging technique
comes at the cost of a reduced spatial resolution and the need for a
demosaicing procedure on its interleaved data. In this work, we address both
issues and propose an approach inspired by recent developments in compressed
sensing and analysis sparse models. We formulate our superresolution and
demosaicing task as a 3-D generalized inpainting problem. Interestingly, the
target spatial resolution can be adjusted for mitigating the compression level
of our sensing. The reconstruction procedure uses a fast greedy method called
Pseudo-inverse IHT. We also show in simulations that a random arrangement of
the spectral filters on the sensor is preferable to a regular mosaic layout as it
improves the quality of the reconstruction. The efficiency of our technique is
demonstrated through numerical experiments on both synthetic and real data as
acquired by the snapshot imager.
Comment: Keywords: Hyperspectral, inpainting, iterative hard thresholding,
sparse models, CMOS, Fabry-P\'ero
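The abstract names its solver, Pseudo-inverse IHT, without describing it. For orientation only, the sketch below shows plain iterative hard thresholding (IHT) for a synthesis-sparse signal, which is a simpler relative and not the paper's variant. The sensing matrix, the step size and all names are hypothetical.

```python
def hard_threshold(vec, k):
    """Keep the k largest-magnitude entries of vec and zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def iht(matrix, y, k, step=1.0, iterations=50):
    """Plain IHT: x <- H_k(x + step * A^T (y - A x)), starting from x = 0."""
    rows, cols = len(matrix), len(matrix[0])
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [y[r] - sum(matrix[r][c] * x[c] for c in range(cols))
                    for r in range(rows)]
        gradient = [sum(matrix[r][c] * residual[r] for r in range(rows))
                    for c in range(cols)]
        x = hard_threshold([x[c] + step * gradient[c] for c in range(cols)], k)
    return x


# Hypothetical 3x4 sensing matrix with orthonormal rows (fewer measurements
# than unknowns) and a 2-sparse signal to recover.
A = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 0.6, 0.8]]
x_true = [0.0, 2.0, 0.0, 1.0]
y = [sum(a * b for a, b in zip(row, x_true)) for row in A]

x_hat = iht(A, y, k=2)
```

In this small example the support is identified in the first iteration and the remaining iterations refine the nonzero values. A unit step is safe here because the rows of the toy matrix are orthonormal; a general matrix would need a step chosen from its spectral norm.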
Depth coding using depth discontinuity prediction and in-loop boundary reconstruction filtering
This paper presents a depth coding strategy that employs K-means clustering to segment the sequence of depth images into K clusters. The resulting clusters are losslessly compressed and transmitted as supplemental enhancement information to aid the decoder in predicting macroblocks containing depth discontinuities. This method further employs an in-loop boundary reconstruction filter to reduce distortions at the edges. The proposed algorithm was integrated within both the H.264/AVC and H.264/MVC video coding standards. Simulation results demonstrate that the proposed scheme outperforms state-of-the-art depth coding schemes, with rendered Peak Signal to Noise Ratio (PSNR) gains between 0.1 dB and 0.5 dB.
peer-reviewe
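The abstract states that K-means clustering segments the depth images into K clusters but gives no further detail. A minimal sketch of Lloyd's K-means on scalar depth values follows. The even initialization between the minimum and maximum depth, the iteration count and the sample values are assumptions, and the paper may cluster on different features.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalar depth samples.

    Centers start evenly spaced between min and max (an assumed choice).
    Returns (labels, centers).
    """
    lo, hi = min(values), max(values)
    if k > 1:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        centers = [sum(values) / len(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest center for every sample.
        labels = [min(range(k), key=lambda c: abs(v - centers[c]))
                  for v in values]
        # Update step: mean of each cluster; empty clusters keep their center.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Hypothetical depth samples: a near object and a far background.
depths = [10, 12, 11, 200, 205, 198]
labels, centers = kmeans_1d(depths, k=2)
```

Pixels whose labels differ from those of their neighbours mark candidate depth discontinuities, which is the kind of side information the abstract describes sending to the decoder.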
Improved rate-adaptive codes for distributed video coding
The research work is partially funded by STEPS Malta. This scholarship is partly financed by the European Union - European Social Fund (ESF 1.25). Distributed Video Coding (DVC) is a coding paradigm which shifts the major computationally intensive tasks from the encoder to the decoder. Temporal correlation is exploited at the decoder by predicting the Wyner-Ziv (WZ) frames from the adjacent key frames. Compression is then achieved by transmitting just the parity information required to correct the predicted frame and recover the original frame. This paper proposes an algorithm which identifies most of the unreliable bits in the predicted bit planes, by considering the discrepancies in the previously decoded bit plane. The design of the Low Density Parity Check (LDPC) codes used is then biased to provide better protection to the unreliable bits. Simulation results show that, for the same target quality, the proposed scheme can reduce the WZ bit rates by up to 7% compared to traditional schemes.
peer-reviewe
Adaptive rounding operator for efficient Wyner-Ziv video coding
The research work disclosed in this publication is partially funded by the Strategic Educational Pathways Scholarship Scheme (Malta). The scholarship is part-financed by the European Union – European Social Fund (ESF 1.25). The Distributed Video Coding (DVC) paradigm can theoretically reach the same coding efficiency as predictive block-based video coding schemes, like H.264/AVC. However, current DVC architectures are still far from this ideal performance. This is mainly attributed to inaccuracies in the Side Information (SI) predicted at the decoder. The work in this paper presents a coding scheme which tries to avoid mismatch in the SI predictions caused by small variations in light intensity. Using the appropriate rounding operator for every coefficient, the proposed method significantly reduces the correlation noise between the Wyner-Ziv (WZ) frame and the corresponding SI, achieving higher coding efficiencies. Experimental results demonstrate that the average Peak Signal-to-Noise Ratio (PSNR) is improved by up to 0.56 dB relative to the DISCOVER codec.
peer-reviewe
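Several of the abstracts in this listing report gains in PSNR. As a reminder of what that figure measures, here is a small sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE), for 8-bit samples. The sample values are made up for illustration and are unrelated to any result reported above.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length sample lists."""
    squared_errors = [(r - d) ** 2 for r, d in zip(reference, distorted)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Made-up 8-bit samples: decoded_a is off by 1 everywhere, decoded_b by 2.
reference = [52, 55, 61, 59]
decoded_a = [53, 54, 62, 58]
decoded_b = [54, 53, 63, 57]

psnr_a = psnr(reference, decoded_a)
psnr_b = psnr(reference, decoded_b)
gain_db = psnr_a - psnr_b  # how much better decoded_a is, in dB
```

Because the scale is logarithmic, halving the error amplitude raises PSNR by about 6 dB, so gains of a few tenths of a dB correspond to modest reductions in mean squared error.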
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Visio
Multifocal image processing
In this paper, we present a processing method for digital images from an optical microscope. High-pass type filters are generally used for image focusing. They enhance the high spatial frequencies. These filters are not appropriate if the lack of sharpness is caused by other factors. On the other hand, the (un)sharpness can be taken as an advantage and can be used for studies of the spatial distribution of structures in the observed scene. In many cases, it is possible to construct a three-dimensional model of the observed object by analyzing image sharpness. Interesting two-dimensional images and a three-dimensional model can be obtained by applying the theory for multifocal image processing described in this paper. We improve the quality of the results compared to the previous methods using the Fourier transform for the analysis of local sharpness in the images.
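The abstract describes building a three-dimensional model by analysing local sharpness across images taken at different focus settings, and mentions the Fourier transform for that analysis. The sketch below illustrates only the general idea and swaps in a much simpler sharpness measure, the squared discrete Laplacian, in place of any Fourier-based one. The function names and the toy focal stack are hypothetical.

```python
def laplacian_energy(img, y, x):
    """Squared 4-neighbour Laplacian at (y, x): large where detail is sharp."""
    lap = (img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
           - 4.0 * img[y][x])
    return lap * lap


def sharpest_slice_map(stack):
    """For every interior pixel, index of the focal slice with highest sharpness."""
    height, width = len(stack[0]), len(stack[0][0])
    result = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            scores = [laplacian_energy(img, y, x) for img in stack]
            row.append(scores.index(max(scores)))
        result.append(row)
    return result


# Toy focal stack: slice 0 is featureless (out of focus), slice 1 shows a
# crisp checkerboard (in focus).
size = 5
blurred = [[5.0] * size for _ in range(size)]
sharp = [[10.0 * ((x + y) % 2) for x in range(size)] for y in range(size)]

depth_index = sharpest_slice_map([blurred, sharp])
```

Mapping each slice index to the focus distance at which that slice was captured would turn the index map into a coarse height map of the specimen.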
Improved Wyner-Ziv video coding efficiency using bit plane prediction
The research work is partially funded by STEPS-Malta and partially by the European Union - ESF 1.25. Distributed Video Coding (DVC) is a coding paradigm where video statistics are exploited, partially or totally, at the decoder. The performance of such a codec depends on the accuracy of the soft-input information estimated at the decoder, which is affected by the quality of the side information (SI) and the dependency model. This paper studies the discrepancies between the bit planes of the Wyner-Ziv (WZ) frames and the corresponding bit planes of the SI. The relationship between these discrepancies is then exploited to predict the locations where the bit plane of the SI is expected to differ from that of the original WZ frame. This information is then used to derive more accurate soft-input values that achieve better compression efficiencies. Simulation results demonstrate that a WZ bit-rate reduction of 9.4% is achieved for a given video quality.
peer-reviewe