Wavelet-based image compression for mobile applications.
The transmission of digital colour images is rapidly becoming popular on mobile telephones, Personal Digital Assistant (PDA) technology and other wireless image services. However, transmitting digital colour images to mobile devices is severely constrained by limited wireless bandwidth. Advances in communication channels (for example, 3G networks) go some way towards addressing this problem, but the rapid increase in traffic and the demand for ever better image quality mean that effective data compression techniques are essential for transmitting and storing digital images. The main objective of this thesis is to offer a novel image compression technique that helps to overcome the bandwidth problem. The thesis investigates and implements three different wavelet-based compression schemes, with a focus on a compression method suitable for mobile applications.
The first algorithm is a dual wavelet compression algorithm, a modification of the conventional wavelet compression method. It uses different wavelet filters to decompose the luminance and chrominance components separately; in addition, a different number of decomposition levels can be applied to each component. The second algorithm is a segmented wavelet-based scheme, which segments an image into its smooth and non-smooth parts; different wavelet filters are then applied to the segmented parts of the image. Finally, the third algorithm is the hybrid wavelet-based compression system (HWCS), in which the subject of interest is cropped and then compressed using a wavelet-based method. The background detail is reduced by averaging, and the background is sent separately from the compressed subject of interest. The final image is reconstructed by replacing the corresponding averaged background pixels with the compressed cropped image.
For each algorithm, the experimental results presented in this thesis clearly demonstrate that the encoder output can be effectively reduced while maintaining acceptable visual image quality, particularly when compared to a conventional wavelet-based compression scheme.
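The dual wavelet idea sketches easily with PyWavelets. Note this is a generic reconstruction from the abstract: the filter names ("bior4.4", "haar"), decomposition depths, and colour transform below are illustrative assumptions, not the thesis's actual parameters.

```python
import numpy as np
import pywt  # PyWavelets

def rgb_to_ycbcr(rgb):
    # JPEG-style (ITU-R BT.601) colour transform; the thesis's exact
    # transform is not specified in the abstract.
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def dual_wavelet_decompose(rgb, lum_wavelet="bior4.4", chroma_wavelet="haar",
                           lum_levels=4, chroma_levels=2):
    # Luminance carries most perceptual detail, so it gets a smoother
    # filter and a deeper decomposition; chrominance is treated coarsely.
    y, cb, cr = rgb_to_ycbcr(rgb)
    y_coeffs = pywt.wavedec2(y, lum_wavelet, level=lum_levels)
    cb_coeffs = pywt.wavedec2(cb, chroma_wavelet, level=chroma_levels)
    cr_coeffs = pywt.wavedec2(cr, chroma_wavelet, level=chroma_levels)
    return y_coeffs, cb_coeffs, cr_coeffs
```

Quantizing and entropy-coding the chrominance coefficients more aggressively than the luminance ones is what would yield the bandwidth savings the thesis targets.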
3D Scene Geometry Estimation from 360 Imagery: A Survey
This paper provides a comprehensive survey of pioneering and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured with omnidirectional optics. We first revisit the basic concepts of the spherical camera model and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting recent advances in learning-based solutions suited for spherical data. Classical stereo matching is then revisited in the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extrapolated to multiple-view camera setups, categorizing them into light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and mapping). We also compile and discuss commonly adopted datasets and figures of merit for each purpose, and list recent results for completeness. We conclude the paper by pointing out current and future trends.

Comment: Published in ACM Computing Surveys
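The spherical camera model the survey revisits maps each equirectangular pixel to a ray on the unit sphere. A minimal sketch under one assumed convention (longitude and latitude origins and the y-up axis order vary between papers):

```python
import numpy as np

def equirect_to_rays(width, height):
    """Map equirectangular pixel centers to unit direction vectors.
    Convention (an assumption; papers differ): longitude in [-pi, pi),
    latitude in [-pi/2, pi/2], y axis pointing up."""
    u = (np.arange(width) + 0.5) / width      # horizontal coord in [0, 1)
    v = (np.arange(height) + 0.5) / height    # vertical coord in [0, 1)
    lon = (u - 0.5) * 2.0 * np.pi             # longitude per column
    lat = (0.5 - v) * np.pi                   # latitude per row
    lon, lat = np.meshgrid(lon, lat)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)       # (H, W, 3) unit vectors
```

Depth or disparity estimated per pixel then scales these rays directly into 3D points, which is why the ray parameterization is the common starting point for the monocular and stereo methods the survey covers.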
Methods for Real-time Visualization and Interaction with Landforms
This thesis presents methods to enrich data modeling and analysis in the geoscience domain, with a particular focus on geomorphological applications. First, a short overview of the relevant characteristics of the remote sensing data used, and the basics of its processing and visualization, is provided.

Then, two new methods for the visualization of vector-based maps on digital elevation models (DEMs) are presented. The first method uses a texture-based approach that generates a texture from the input maps at runtime, taking into account the current viewpoint. In contrast, the second method utilizes the stencil buffer to create a mask in image space that is then used to render the map on top of the DEM. A particular challenge in this context is posed by the view-dependent level-of-detail representation of the terrain geometry.

After suitable visualization methods for vector-based maps have been investigated, two landform mapping tools for the interactive generation of such maps are presented. The user can carry out the mapping directly on the textured digital elevation model and thus benefit from the 3D visualization of the relief. Additionally, semi-automatic image segmentation techniques are applied in order to reduce the amount of user interaction required and thus make the mapping process more efficient and convenient. The challenge in the adaptation of these methods lies in the transfer of the algorithms to the quadtree representation of the data and in the application of out-of-core and hierarchical methods to ensure interactive performance.

Although high-resolution remote sensing data are often available today, their effective resolution at steep slopes is rather low due to the oblique acquisition angle. For this reason, remote sensing data are suitable only to a limited extent for visualization and landform mapping purposes. To provide an easy way to supply additional imagery, an algorithm for registering uncalibrated photos to a textured digital elevation model is presented. A particular challenge in registering the images is posed by large variations between the photos in resolution, lighting conditions, seasonal changes, etc.

The registered photos can be used to increase the visual quality of the textured DEM, in particular at steep slopes. To this end, a method is presented that combines several georegistered photos into textures for the DEM. The difficulty in this compositing process is to create a consistent appearance and avoid visible seams between the photos. In addition, the photos also provide a valuable means to improve landform mapping. To this end, an extension of the landform mapping methods is presented that allows the registered photos to be utilized during mapping. In this way, a detailed and exact mapping becomes feasible even at steep slopes.
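The stencil-buffer overlay described above follows a classic two-pass pattern. A minimal PyOpenGL sketch of that pattern; the two drawing callbacks are hypothetical placeholders, and the thesis's actual pipeline (including its level-of-detail terrain handling) is not reproduced here:

```python
from OpenGL.GL import (
    glEnable, glDisable, glClear, glColorMask, glStencilFunc, glStencilOp,
    GL_STENCIL_TEST, GL_STENCIL_BUFFER_BIT, GL_ALWAYS, GL_EQUAL,
    GL_KEEP, GL_REPLACE, GL_FALSE, GL_TRUE,
)

def draw_map_over_dem(draw_map_polygons, draw_dem_overlay):
    # Pass 1: rasterize the projected map geometry into the stencil
    # buffer only, producing an image-space mask of covered pixels.
    glClear(GL_STENCIL_BUFFER_BIT)
    glEnable(GL_STENCIL_TEST)
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE)
    glStencilFunc(GL_ALWAYS, 1, 0xFF)          # always pass, write ref = 1
    glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE)  # mark pixels the map covers
    draw_map_polygons()
    # Pass 2: render the map colour over the DEM only where the mask is set.
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE)
    glStencilFunc(GL_EQUAL, 1, 0xFF)           # pass only inside the mask
    glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP)
    draw_dem_overlay()
    glDisable(GL_STENCIL_TEST)
```

Because the mask lives in image space, it remains valid regardless of which level-of-detail representation of the terrain geometry is currently rendered, which is what makes this approach attractive for view-dependent DEMs.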
CamP: Camera Preconditioning for Neural Radiance Fields
Neural Radiance Fields (NeRF) can be optimized to obtain high-fidelity 3D scene reconstructions of objects and large-scale scenes. However, NeRFs require accurate camera parameters as input -- inaccurate camera parameters result in blurry renderings. Extrinsic and intrinsic camera parameters are usually estimated using Structure-from-Motion (SfM) methods as a pre-processing step to NeRF, but these techniques rarely yield perfect estimates. Thus, prior works have proposed jointly optimizing camera parameters alongside a NeRF, but these methods are prone to local minima in challenging settings. In this work, we analyze how different camera parameterizations affect this joint optimization problem, and observe that standard parameterizations exhibit large differences in magnitude with respect to small perturbations, which can lead to an ill-conditioned optimization problem. We propose using a proxy problem to compute a whitening transform that eliminates the correlation between camera parameters and normalizes their effects, and we propose to use this transform as a preconditioner for the camera parameters during joint optimization. Our preconditioned camera optimization significantly improves reconstruction quality on scenes from the Mip-NeRF 360 dataset: we reduce error rates (RMSE) by 67% compared to state-of-the-art NeRF approaches that do not optimize for cameras, like Zip-NeRF, and by 29% relative to state-of-the-art joint optimization approaches using the camera parameterization of SCNeRF. Our approach is easy to implement, does not significantly increase runtime, can be applied to a wide variety of camera parameterizations, and can straightforwardly be incorporated into other NeRF-like models.

Comment: SIGGRAPH Asia 2023. Project page: https://camp-nerf.github.io
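The preconditioning idea lends itself to a short sketch: build a whitening transform from the Jacobian of a proxy problem and let the optimizer work in the transformed space. This is a generic reconstruction from the abstract, not the paper's code; the eps regularizer and the eigendecomposition route are assumptions:

```python
import numpy as np

def camera_preconditioner(jacobian, eps=1e-8):
    """Whitening preconditioner from a proxy-problem Jacobian.
    jacobian: (num_residuals, num_camera_params) sensitivities of the
    proxy outputs w.r.t. camera parameters. eps regularizes near-null
    directions (the value is an assumption)."""
    jtj = jacobian.T @ jacobian  # proxy for the parameter covariance
    eigvals, eigvecs = np.linalg.eigh(jtj + eps * np.eye(jtj.shape[0]))
    # Inverse matrix square root: decorrelates the parameters and
    # equalizes the magnitude of their effect on the proxy residuals.
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

# During joint optimization the model would consume params = P @ z,
# where z is the variable the optimizer actually updates, so a unit
# step in any direction of z perturbs the proxy outputs comparably.
```

Equalizing per-parameter sensitivity in this way is exactly the conditioning fix the abstract describes: without it, a gradient step that is tiny for focal length can be enormous for a rotation component, and vice versa.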
3D Shape Understanding and Generation
In recent years, Machine Learning techniques have revolutionized solutions to longstanding image-based problems, like image classification, generation, semantic segmentation, object detection and many others. However, if we want to be able to build agents that can successfully interact with the real world, those techniques need to be capable of reasoning about the world as it truly is: a three-dimensional space. There are two main challenges in handling 3D information in machine learning models. First, it is not clear which 3D representation is best. For images, convolutional neural networks (CNNs) operating on raster images yield the best results in virtually all image-based benchmarks, whereas for 3D data the best combination of model and representation is still an open question. Second, 3D data is not available at the same scale as images: taking pictures is a common procedure in our daily lives, whereas capturing 3D content is an activity usually restricted to specialized professionals. This thesis focuses on addressing both of these issues. Which model and representation should we use for generating and recognizing 3D data? What are efficient ways of learning 3D representations from a few examples? Is it possible to leverage image data to build models capable of reasoning about the world in 3D?
Our research findings show that it is possible to build models that efficiently generate 3D shapes as irregularly structured representations. These models require significantly less memory while generating higher-quality shapes than those based on voxels and multi-view representations. We start by developing techniques to generate shapes represented as point clouds. This class of models leads to high-quality reconstructions and better unsupervised feature learning. However, since point clouds are not amenable to editing and human manipulation, we also present models capable of generating shapes as sets of shape handles -- simpler primitives that summarize complex 3D shapes and are specifically designed for high-level tasks and user interaction. Despite their effectiveness, these approaches require some form of 3D supervision, which is scarce. We present multiple alternatives to this problem. First, we investigate how approximate convex decomposition techniques can be used as self-supervision to improve recognition models when only a limited number of labels are available. Second, we study how neural network architectures induce shape priors that can be used in multiple reconstruction tasks, using both volumetric and manifold representations. In this regime, reconstruction is performed from a single example: either a sparse point cloud or multiple silhouettes. Finally, we demonstrate how to train generative models of 3D shapes without any 3D supervision by combining differentiable rendering techniques and Generative Adversarial Networks.
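One concrete touchpoint for the point-cloud generation work: generated clouds are commonly scored against references with the Chamfer distance. The sketch below is a standard NumPy formulation of that metric, offered as an illustration rather than the thesis's exact evaluation protocol:

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3).
    Builds the full (N, M) pairwise distance matrix, so it is meant for
    small clouds; production implementations use KD-trees or GPU kernels."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

The metric averages each point's distance to its nearest neighbor in the other cloud, in both directions, so it penalizes both missing geometry and spurious points, which is why it is a natural fit for irregularly structured outputs like point clouds.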