72 research outputs found

    A new ADMM algorithm for the Euclidean median and its application to robust patch regression

    Full text link
    The Euclidean Median (EM) of a set of points Ω in a Euclidean space is the point x minimizing the (weighted) sum of the Euclidean distances of x to the points in Ω. While there exists no closed-form expression for the EM, it can nevertheless be computed using iterative methods such as the Weiszfeld algorithm. The EM has classically been used as a robust estimator of centrality for multivariate data. It was recently demonstrated that the EM can be used to perform robust patch-based denoising of images by generalizing the popular Non-Local Means algorithm. In this paper, we propose a novel algorithm for computing the EM (and its box-constrained counterpart) using variable splitting and the method of augmented Lagrangian. The attractive feature of this approach is that the subproblems involved in the ADMM-based optimization of the augmented Lagrangian can be resolved using simple closed-form projections. The proposed ADMM solver is used for robust patch-based image denoising and is shown to exhibit faster convergence compared to an existing solver. Comment: 5 pages, 3 figures, 1 table. To appear in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, April 19-24, 201
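
    As context for the classical baseline named above, here is a minimal sketch of the (weighted) Weiszfeld iteration for the Euclidean median; the stopping tolerance and the damping of near-zero distances are illustrative assumptions, not details from the paper.

```python
import numpy as np

def weiszfeld_median(points, weights=None, n_iter=100, tol=1e-8, eps=1e-12):
    """Iteratively re-weighted estimate of the (weighted) Euclidean median.

    points  : (n, d) array of points in Euclidean space (the set Omega)
    weights : optional (n,) non-negative weights
    """
    points = np.asarray(points, dtype=float)
    w = np.ones(len(points)) if weights is None else np.asarray(weights, dtype=float)
    x = np.average(points, axis=0, weights=w)   # start from the weighted mean
    for _ in range(n_iter):
        d = np.linalg.norm(points - x, axis=1)
        d = np.maximum(d, eps)                  # avoid division by zero at a data point
        coef = w / d
        x_new = (coef[:, None] * points).sum(axis=0) / coef.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Example: median of a small 2D point cloud with one outlier
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
print(weiszfeld_median(pts))
```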

    Bayesian image restoration and bacteria detection in optical endomicroscopy

    Get PDF
    Optical microscopy systems can be used to obtain high-resolution microscopic images of tissue cultures and ex vivo tissue samples. This imaging technique can be translated to in vivo, in situ applications by using optical fibres and miniature optics. Fibred optical endomicroscopy (OEM) can enable optical biopsy in organs inaccessible to any other imaging system, and hence can provide rapid and accurate diagnosis in a short time. The raw data the system produces is difficult to interpret, as it is modulated by a fibre bundle pattern, producing what is called the “honeycomb effect”. Moreover, the data is further degraded by the fibre core cross-coupling problem. On the other hand, there is an unmet clinical need for automatic tools that can help clinicians detect fluorescently labelled bacteria in distal lung images. The aim of this thesis is to develop advanced image processing algorithms that can address the above-mentioned problems. First, we provide a statistical model for the fibre core cross-coupling problem and the sparse sampling by imaging fibre bundles (honeycomb artefact), which are formulated here as a restoration problem for the first time in the literature. We then introduce a non-linear interpolation method, based on Gaussian process regression, in order to recover an interpretable scene from the deconvolved data. Second, we develop two bacteria detection algorithms, each of which provides different characteristics. The first approach considers a joint formulation of the sparse coding and anomaly detection problems. The anomalies here are considered as candidate bacteria, which are annotated with the help of a trained clinician. Although this approach provides good detection performance and outperforms existing methods in the literature, the user has to carefully tune some crucial model parameters. Hence, we propose a more adaptive approach, for which a Bayesian framework is adopted. This approach not only outperforms the proposed supervised approach and existing methods in the literature but also provides computation time that competes with optimization-based methods.
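
    As a rough illustration of the kind of non-linear interpolation described above (not the exact model from the thesis), the sketch below fits a Gaussian process regressor to intensities sampled at scattered fibre-core locations and predicts a full-grid image; the kernel choice and the synthetic core layout are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Synthetic stand-in for fibre-core measurements: scattered (x, y) sites and intensities
core_xy = rng.uniform(0, 1, size=(300, 2))                         # core centre positions
core_val = np.sin(4 * core_xy[:, 0]) * np.cos(4 * core_xy[:, 1])   # measured intensities

# GP regression: a smooth kernel plus a noise term for the corrupted measurements
kernel = RBF(length_scale=0.1) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(core_xy, core_val)

# Predict on a regular pixel grid to obtain an interpretable, honeycomb-free image
gx, gy = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
grid = np.column_stack([gx.ravel(), gy.ravel()])
image = gp.predict(grid).reshape(64, 64)
print(image.shape)
```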

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    Get PDF
    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building, within walking distance of both hotels and town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problems; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    Multi-frame reconstruction using super-resolution, inpainting, segmentation and codecs

    Get PDF
    In this thesis, different aspects of video and light field reconstruction are considered, such as super-resolution, inpainting, segmentation and codecs. For this purpose, each of these strategies is analyzed with respect to a specific goal and a specific database. Accordingly, databases relevant to the film industry, sport videos, light fields and hyperspectral videos are used. This thesis is constructed around six related manuscripts, in which several approaches are proposed for multi-frame reconstruction. Initially, a novel multi-frame reconstruction strategy is proposed for light field super-resolution, in which graph-based regularization is applied along with edge-preserving filtering to improve the spatio-angular quality of the light field. Second, a novel video reconstruction method is proposed, built on compressive sensing (CS), Gaussian mixture models (GMM) and sparse 3D transform-domain block matching. The motivation for the proposed technique is to improve the visual quality of the video frames and to decrease the reconstruction error compared with earlier video reconstruction methods. In the next approach, Student-t mixture models and edge-preserving filtering are applied for video super-resolution. The Student-t mixture model has heavy tails, which makes it robust and suitable as a video frame patch prior, and rich in terms of log-likelihood for information retrieval. In another approach, a hyperspectral video database is considered, and a Bayesian dictionary learning process is used for hyperspectral video super-resolution. To that end, a Beta process is used in the Bayesian dictionary learning, and a sparse coding is generated for the hyperspectral video super-resolution. The spatial super-resolution is followed by a spectral video restoration strategy, and the whole process leverages two different dictionary learning stages, in which the first is trained for spatial super-resolution and the second for spectral restoration. Furthermore, in another approach, a novel framework is proposed for automatically replacing advertisement content in soccer videos using deep learning strategies. For this purpose, a U-Net architecture (a convolutional neural network for image segmentation) is applied for content segmentation and detection. Subsequently, after reconstructing the segmented content in the video frames (accounting for the apparent loss in detection), the unwanted content is replaced by new content using a homography mapping procedure. In addition, in another research work, a novel video compression framework is presented using autoencoder networks that encode and decode videos using less chroma information than luma information. For this purpose, instead of converting Y'CbCr 4:2:2/4:2:0 videos to and from RGB 4:4:4, the video is kept in Y'CbCr 4:2:2/4:2:0, and the luma and chroma channels are merged after the luma is downsampled to match the chroma size. The decoder performs the inverse operation. The performance of these models is evaluated using the CPSNR, MS-SSIM, and VMAF metrics. The experiments reveal that, compared with video compression involving conversion to and from RGB 4:4:4, the proposed method increases the video quality by about 5.5% for Y'CbCr 4:2:2 and 8.3% for Y'CbCr 4:2:0, while reducing the amount of computation by nearly 37% for Y'CbCr 4:2:2 and 40% for Y'CbCr 4:2:0.
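
    A minimal numpy sketch of the channel-packing step described above, under assumed 4:2:0 subsampling and nearest-neighbour luma downsampling (details the abstract does not specify):

```python
import numpy as np

def pack_ycbcr_420(y, cb, cr):
    """Downsample luma to the chroma resolution and stack Y', Cb, Cr as one tensor.

    y      : (H, W) luma plane
    cb, cr : (H//2, W//2) chroma planes (4:2:0)
    returns: (H//2, W//2, 3) tensor fed to the autoencoder
    """
    y_small = y[::2, ::2]                      # simple 2x nearest-neighbour downsampling
    return np.stack([y_small, cb, cr], axis=-1)

def unpack_ycbcr_420(packed):
    """Inverse of pack_ycbcr_420: split channels and upsample luma back to full size."""
    y_small, cb, cr = packed[..., 0], packed[..., 1], packed[..., 2]
    y = np.repeat(np.repeat(y_small, 2, axis=0), 2, axis=1)
    return y, cb, cr

y = np.random.rand(8, 8)
cb = np.random.rand(4, 4)
cr = np.random.rand(4, 4)
print(pack_ycbcr_420(y, cb, cr).shape)   # (4, 4, 3)
```
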
The thread that ties these approaches together is the reconstruction of video and light field frames under different kinds of degradation, such as loss of information, blur in the frames, residual noise after reconstruction, unwanted content, excessive data size and high computational overhead. In three of the proposed approaches, we use the Plug-and-Play ADMM model for the first time for the reconstruction of videos and light fields, in order to address both information retrieval in the frames and the removal of noise and blur at the same time. In two of the proposed models, we apply sparse dictionary learning to reduce the data dimension and to represent frames as efficient linear combinations of basis frame patches. Two of the proposed approaches were developed in collaboration with industry, in which deep learning frameworks are used to handle large sets of features and to learn high-level features from the data.
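
A generic Plug-and-Play ADMM loop of the kind referred to above, sketched for a linear degradation model; the denoiser is a placeholder (a Gaussian filter) rather than any specific learned prior from the thesis, and the step sizes are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pnp_admm(y, A, At, denoise, rho=1.0, n_iter=50):
    """Plug-and-Play ADMM for  min_x 0.5||A x - y||^2 + prior, prior handled by `denoise`.

    y       : observed (degraded) image
    A, At   : forward operator and its adjoint (functions on images)
    denoise : plug-in denoiser acting as the proximal operator of the prior
    """
    x = At(y).copy()
    v = x.copy()
    u = np.zeros_like(x)
    for _ in range(n_iter):
        # x-update: data-fidelity proximal step, solved here by a few gradient iterations
        for _ in range(10):
            grad = At(A(x) - y) + rho * (x - v + u)
            x = x - 0.1 * grad
        # v-update: the prior's proximal operator is replaced by the plug-in denoiser
        v = denoise(x + u)
        # dual update
        u = u + x - v
    return x

# Toy example: deblurring with a (self-adjoint) Gaussian blur and a Gaussian-filter "denoiser"
blur = lambda img: gaussian_filter(img, sigma=1.5)
truth = np.zeros((64, 64)); truth[24:40, 24:40] = 1.0
y = blur(truth) + 0.01 * np.random.randn(64, 64)
x_hat = pnp_admm(y, A=blur, At=blur, denoise=lambda img: gaussian_filter(img, sigma=0.8))
print(x_hat.shape)
```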

    Robust motion segmentation with subspace constraints

    No full text
    Motion segmentation is an important task in computer vision with many applications such as dynamic scene understanding and multi-body structure from motion. When the point correspondences across frames are given, motion segmentation can be addressed as a subspace clustering problem under an affine camera model. In the first two parts of this thesis, we target the general subspace clustering problem and propose two novel methods, namely Efficient Dense Subspace Clustering (EDSC) and the Robust Shape Interaction Matrix (RSIM) method. Instead of following the standard compressive sensing approach, in EDSC we formulate subspace clustering as a Frobenius norm minimization problem, which inherently yields denser connections between data points. While in the noise-free case we rely on the self-expressiveness of the observations, in the presence of noise we recover a clean dictionary to represent the data. Our formulation lets us solve the subspace clustering problem efficiently. More specifically, for outlier-free observations, the solution can be obtained in closed form, and in the presence of outliers, we solve the problem by performing a series of linear operations. Furthermore, we show that our Frobenius norm formulation shares the same solution as the popular nuclear norm minimization approach when the data is free of any noise. In RSIM, we revisit the Shape Interaction Matrix (SIM) method, one of the earliest approaches for motion segmentation (or subspace clustering), and reveal its connections to several recent subspace clustering methods. We derive a simple, yet effective algorithm to robustify the SIM method and make it applicable to real-world scenarios where the data is corrupted by noise. We validate the proposed method with intuitive examples and justify it using matrix perturbation theory. Moreover, we show that RSIM can be extended to handle missing data with a Grassmannian gradient descent method. The above subspace clustering methods work well for motion segmentation, yet they require that point trajectories across frames are known a priori. However, finding point correspondences is in itself a challenging task. Existing approaches tackle the correspondence estimation and motion segmentation problems separately. In the third part of this thesis, given a set of feature points detected in each frame of the sequence, we develop an approach which simultaneously performs motion segmentation and finds point correspondences across the frames. We formulate this problem in terms of Partial Permutation Matrices (PPMs) and aim to match feature descriptors while simultaneously encouraging point trajectories to satisfy subspace constraints. This lets us handle outliers in both point locations and feature appearance. The resulting optimization problem is solved via the Alternating Direction Method of Multipliers (ADMM), where each subproblem has an efficient solution. In particular, we show that most of the subproblems can be solved in closed form, and one binary assignment subproblem can be solved by the Hungarian algorithm. Obtaining reliable feature tracks in a frame-by-frame manner is desirable in applications such as online motion segmentation. In the final part of the thesis, we introduce a novel multi-body feature tracker that exploits a multi-body rigidity assumption to improve tracking robustness under a general perspective camera model.
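
    For the self-expressiveness idea mentioned above, a ridge-regularized Frobenius-norm formulation admits a closed-form coefficient matrix; the sketch below (closed-form coefficients followed by spectral clustering on the induced affinity) follows that general recipe and is not a faithful reproduction of EDSC.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def frobenius_subspace_clustering(X, n_clusters, lam=1e-2):
    """Subspace clustering via  min_C ||X - X C||_F^2 + lam ||C||_F^2.

    X : (d, n) data matrix with one point per column
    """
    n = X.shape[1]
    # Closed-form solution of the ridge-regularized self-expression problem
    G = X.T @ X
    C = np.linalg.solve(G + lam * np.eye(n), G)
    # Symmetric affinity from the coefficient magnitudes, then spectral clustering
    W = np.abs(C) + np.abs(C).T
    labels = SpectralClustering(n_clusters=n_clusters,
                                affinity="precomputed").fit_predict(W)
    return labels

# Toy example: points drawn from two 1-D subspaces in R^3
rng = np.random.default_rng(0)
X = np.hstack([np.outer([1, 0, 0], rng.standard_normal(20)),
               np.outer([0, 1, 1], rng.standard_normal(20))])
print(frobenius_subspace_clustering(X, n_clusters=2))
```
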
A conventional approach to addressing this problem would consist of alternating between solving two subtasks: motion segmentation and feature tracking under rigidity constraints for each segment. This approach, however, requires knowing the number of motions, as well as assigning points to motion groups, which is typically sensitive to motion estimates. By contrast, we introduce a segmentation-free solution to multi-body feature tracking that bypasses the motion assignment step and reduces to solving a series of subproblems with closed-form solutions. In summary, in this thesis, we exploit powerful subspace constraints and develop robust motion segmentation methods for different challenging scenarios where the trajectories are either given as input or unknown beforehand. We also present a general robust multi-body feature tracker which can be used as the first step of motion segmentation to obtain reliable trajectories.
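
The binary assignment subproblem mentioned above is the classical linear assignment problem; a minimal illustration of solving it with the Hungarian algorithm (via SciPy) on a feature-descriptor cost matrix is sketched below, with the descriptors and cost defined purely for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Descriptors detected in two consecutive frames (rows are hypothetical feature vectors)
rng = np.random.default_rng(0)
desc_frame1 = rng.standard_normal((5, 16))
desc_frame2 = desc_frame1[[2, 0, 4, 1, 3]] + 0.05 * rng.standard_normal((5, 16))

# Cost = pairwise descriptor distance; the Hungarian algorithm returns the minimum-cost
# one-to-one matching, i.e. the optimal permutation for this cost matrix
cost = np.linalg.norm(desc_frame1[:, None, :] - desc_frame2[None, :, :], axis=-1)
rows, cols = linear_sum_assignment(cost)
print(list(zip(rows, cols)))   # expected matching: 0->1, 1->3, 2->0, 3->4, 4->2
```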

    Advanced Restoration Techniques for Images and Disparity Maps

    Get PDF
    With the increasing popularity of digital cameras, the field of Computational Photography emerges as one of the most demanding areas of research. In this thesis we study and develop novel priors and optimization techniques to solve inverse problems, including disparity estimation and image restoration. The disparity map estimation method proposed in this thesis incorporates multiple frames of a stereo video sequence to ensure temporal coherency. To enforce smoothness, we use spatio-temporal connections between the pixels of the disparity map to constrain our solution. Apart from smoothness, we enforce a consistency constraint for the disparity assignments by using connections between the left and right views. These constraints are then formulated in a graphical model, which we solve using mean-field approximation. We use a filter-based mean-field optimization that performs efficiently by updating the disparity variables in parallel. The parallel update scheme, however, is not guaranteed to converge to a stationary point. To compare and demonstrate the effectiveness of our approach, we developed a new optimization technique that uses sequential updates, which runs efficiently and guarantees convergence. Our empirical results indicate that with proper initialization, we can employ the parallel update scheme and efficiently optimize our disparity maps without loss of quality. Our method ranks amongst the state of the art in common benchmarks, and significantly reduces the temporal flickering artifacts in the disparity maps. In the second part of this thesis, we address several image restoration problems such as image deblurring, demosaicing and super-resolution. We propose to use denoising autoencoders to learn an approximation of the true natural image distribution. We parametrize our denoisers using deep neural networks and show that they learn the gradient of the smoothed density of natural images. Based on this analysis, we propose a restoration technique that moves the solution towards the local extrema of this distribution by minimizing the difference between the input and output of our denoiser. We demonstrate the effectiveness of our approach using a single trained neural network in several restoration tasks such as deblurring and super-resolution. In a more general framework, we define a new Bayes formulation for the restoration problem, which leads to a more efficient and robust estimator. The proposed framework achieves state-of-the-art performance in various restoration tasks such as deblurring and demosaicing, and also for more challenging tasks such as noise- and kernel-blind image deblurring. Keywords: disparity map estimation, stereo matching, mean-field optimization, graphical models, image processing, linear inverse problems, image restoration, image deblurring, image denoising, single image super-resolution, image demosaicing, deep neural networks, denoising autoencoder
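
    The restoration idea described above, moving the estimate toward an extremum of the learned density by driving the denoiser residual to zero, can be sketched as a simple gradient-style iteration; the placeholder denoiser, the box-blur forward model and the step sizes below are assumptions, not the thesis's trained autoencoder.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, convolve

def dae_prior_restore(y, forward, adjoint, denoise, step=0.2, lam=0.5, n_iter=100):
    """Restore x from y = forward(x) + noise using a denoiser-residual prior term.

    The residual (x - denoise(x)) plays the role of the gradient of the smoothed
    negative log-density learned by the denoising autoencoder.
    """
    x = adjoint(y).copy()
    for _ in range(n_iter):
        data_grad = adjoint(forward(x) - y)   # gradient of 0.5||forward(x) - y||^2
        prior_grad = x - denoise(x)           # denoiser residual as prior gradient
        x = x - step * (data_grad + lam * prior_grad)
    return x

# Toy deblurring example with a box blur and a Gaussian-filter stand-in denoiser
kernel = np.ones((5, 5)) / 25.0
blur = lambda img: convolve(img, kernel, mode="reflect")
truth = np.zeros((64, 64)); truth[20:44, 20:44] = 1.0
y = blur(truth) + 0.01 * np.random.randn(64, 64)
x_hat = dae_prior_restore(y, blur, blur, denoise=lambda img: gaussian_filter(img, 1.0))
print(float(np.mean((x_hat - truth) ** 2)))
```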

    What's in a Prior? Learned Proximal Networks for Inverse Problems

    Full text link
    Proximal operators are ubiquitous in inverse problems, commonly appearing as part of algorithmic strategies to regularize problems that are otherwise ill-posed. Modern deep learning models have been brought to bear for these tasks too, as in the framework of plug-and-play or deep unrolling, where they loosely resemble proximal operators. Yet, something essential is lost in employing these purely data-driven approaches: there is no guarantee that a general deep network represents the proximal operator of any function, nor is there any characterization of the function for which the network might provide some approximate proximal operator. This not only makes guaranteeing convergence of iterative schemes challenging but, more fundamentally, complicates the analysis of what has been learned by these networks about their training data. Herein we provide a framework to develop learned proximal networks (LPN), prove that they provide exact proximal operators for a data-driven nonconvex regularizer, and show how a new training strategy, dubbed proximal matching, provably promotes the recovery of the log-prior of the true data distribution. Such LPN provide general, unsupervised, expressive proximal operators that can be used for general inverse problems with convergence guarantees. We illustrate our results in a series of cases of increasing complexity, demonstrating that these models not only result in state-of-the-art performance, but provide a window into the resulting priors learned from data.
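
    To make the role of a learned proximal operator concrete, here is a generic proximal-gradient loop into which any callable playing the part of an LPN could be plugged; the linear operator, step size, and the toy soft-thresholding "prox" are assumptions for illustration only, not the paper's architecture.

```python
import numpy as np

def proximal_gradient(y, A, prox, step=None, n_iter=200):
    """Solve  min_x 0.5||A x - y||^2 + R(x)  when only prox_R (e.g. an LPN) is available.

    A    : (m, n) measurement matrix
    prox : callable approximating the proximal operator of the (possibly nonconvex) regularizer
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the data-fidelity gradient
    x = A.T @ y
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = prox(x - step * grad)                # plug in the learned proximal operator
    return x

# Toy compressed-sensing example with soft-thresholding standing in for a learned prox
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100); x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
y = A @ x_true + 0.01 * rng.standard_normal(40)
soft = lambda v, t=0.01: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
x_hat = proximal_gradient(y, A, prox=soft)
print(np.round(np.linalg.norm(x_hat - x_true), 3))
```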

    Generative Models for Preprocessing of Hospital Brain Scans

    Get PDF
    I will in this thesis present novel computational methods for processing routine clinical brain scans. Such scans were originally acquired for qualitative assessment by trained radiologists, and present a number of difficulties for computational models, such as those within common neuroimaging analysis software. The overarching objective of this work is to enable efficient and fully automated analysis of large neuroimaging datasets, of the type currently present in many hospitals worldwide. The methods presented are based on probabilistic, generative models of the observed imaging data, and therefore rely on informative priors and realistic forward models. The first part of the thesis will present a model for image quality improvement, whose key component is a novel prior for multimodal datasets. I will demonstrate its effectiveness for super-resolving thick-sliced clinical MR scans and for denoising CT images and MR-based, multi-parametric mapping acquisitions. I will then show how the same prior can be used for within-subject, intermodal image registration, for more robustly registering large numbers of clinical scans. The second part of the thesis focusses on improved, automatic segmentation and spatial normalisation of routine clinical brain scans. I propose two extensions to a widely used segmentation technique. First, a method that allows this model to handle missing data, enabling me to predict entirely missing modalities from one, or a few, MR contrasts. Second, a principled way of combining the strengths of probabilistic, generative models with the unprecedented discriminative capability of deep learning. By introducing a convolutional neural network as a Markov random field prior, I can model nonlinear class interactions and learn these using backpropagation. I show that this model is robust to sequence and scanner variability. Finally, I show examples of fitting a population-level, generative model to various neuroimaging data, which can model, e.g., CT scans with haemorrhagic lesions.
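
    As a rough illustration of the generative, intensity-based segmentation machinery that the extensions above build on (not the thesis's actual model, which also includes spatial priors and registration), a minimal EM loop for a Gaussian mixture over voxel intensities might look like this:

```python
import numpy as np

def gmm_segment(intensities, n_classes=3, n_iter=50):
    """EM for a 1-D Gaussian mixture over voxel intensities; returns soft class responsibilities."""
    x = np.asarray(intensities, dtype=float).ravel()
    # Initialise means from intensity quantiles, with equal weights and variances
    mu = np.quantile(x, np.linspace(0.1, 0.9, n_classes))
    var = np.full(n_classes, x.var())
    pi = np.full(n_classes, 1.0 / n_classes)
    for _ in range(n_iter):
        # E-step: responsibilities of each tissue class for each voxel
        ll = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var)) + np.log(pi)
        ll -= ll.max(axis=1, keepdims=True)
        resp = np.exp(ll); resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means and variances
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return resp

# Toy "scan": three intensity clusters standing in for tissue classes
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(m, 0.05, 500) for m in (0.2, 0.5, 0.8)])
print(gmm_segment(x).argmax(axis=1)[:10])
```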