Pre-Training LiDAR-Based 3D Object Detectors Through Colorization
Accurate 3D object detection and understanding for self-driving cars rely
heavily on LiDAR point clouds, which require large amounts of labeled data for
training. In this work, we introduce an innovative pre-training approach, Grounded
Point Colorization (GPC), to bridge the gap between data and labels by teaching
the model to colorize LiDAR point clouds, equipping it with valuable semantic
cues. To tackle challenges arising from color variations and selection bias, we
incorporate color as "context" by providing ground-truth colors as hints during
colorization. Experimental results on the KITTI and Waymo datasets demonstrate
GPC's remarkable effectiveness. Even with limited labeled data, GPC
significantly improves fine-tuning performance; notably, on just 20% of the
KITTI dataset, GPC outperforms training from scratch with the entire dataset.
In sum, we introduce a fresh perspective on pre-training for 3D object
detection, aligning the objective with the model's intended role and ultimately
advancing the accuracy and efficiency of 3D object detection for autonomous
vehicles.
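The abstract above gives no implementation details; the following PyTorch sketch merely illustrates the general idea of colorization-style pre-training with color hints. The MLP point encoder, the 50% hint ratio, and the masked regression loss are illustrative assumptions, not the paper's actual architecture or objective.

```python
# A minimal sketch of colorization-style pre-training for point clouds,
# assuming each point already carries a paired ground-truth color (e.g. from
# a calibrated camera). The PointColorizer backbone and pretrain_step helper
# are hypothetical stand-ins for illustration only.
import torch
import torch.nn as nn

class PointColorizer(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        # Input per point: xyz (3) + hint color (3) + hint mask (1).
        self.net = nn.Sequential(
            nn.Linear(7, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # predicted RGB per point
        )

    def forward(self, xyz, hint_rgb, hint_mask):
        # Zero out colors where no hint is given so the model must infer them.
        hints = hint_rgb * hint_mask
        x = torch.cat([xyz, hints, hint_mask], dim=-1)
        return self.net(x)

def pretrain_step(model, optimizer, xyz, rgb, hint_ratio=0.5):
    """One pre-training step: reveal a random subset of ground-truth colors
    as hints and supervise the colors of the remaining points."""
    mask = (torch.rand(xyz.shape[:-1], device=xyz.device) < hint_ratio)
    mask = mask.unsqueeze(-1).float()
    pred = model(xyz, rgb, mask)
    # Loss only on points whose color was hidden from the model.
    hidden = 1.0 - mask
    loss = ((pred - rgb) ** 2 * hidden).sum() / hidden.sum().clamp(min=1.0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: a batch of 2 clouds with 1024 points each.
model = PointColorizer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xyz = torch.randn(2, 1024, 3)
rgb = torch.rand(2, 1024, 3)
print(pretrain_step(model, opt, xyz, rgb))
```

Hiding the supervised colors while revealing others as context forces the network to infer color from geometry, which is where the semantic signal for later fine-tuning would plausibly come from.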
Colorization and Automated Segmentation of Human T2 MR Brain Images for Characterization of Soft Tissues
Characterization of tissues such as the brain using magnetic resonance (MR) images, and colorization of the grayscale image, have been reported in the literature, along with their advantages and drawbacks. Here, we present two independent methods: (i) a novel colorization method to underscore the variability in brain MR images, indicative of the underlying physical density of biological tissue, and (ii) a segmentation method (both hard and soft segmentation) to characterize gray brain MR images. The segmented images are then transformed into color using the above-mentioned colorization method, yielding promising results for manual tracing. Our color transformation incorporates voxel classification by matching the luminance of voxels in the source MR image and a provided color image, measuring the distance between them. The segmentation method is based on single-phase clustering for 2D and 3D image segmentation with a new automatic centroid selection method, which divides the image into three distinct regions (gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF)) using prior anatomical knowledge. The results have been successfully validated on human T2-weighted (T2) brain MR images. The proposed method can potentially be applied to grayscale images from other imaging modalities, bringing out additional diagnostic tissue information through the colorized image processing approach described.
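As a rough, self-contained illustration of the luminance-matching color transfer described above, the NumPy sketch below assigns each gray voxel the color of the reference pixel with the nearest luminance. The Rec. 601 luma weights and the binary-search lookup are illustrative choices, not details taken from the paper.

```python
# A minimal sketch of luminance-matching color transfer: each gray voxel of
# the source image receives the color of the reference pixel whose luminance
# is closest. colorize_by_luminance is a hypothetical helper for illustration.
import numpy as np

def colorize_by_luminance(gray, ref_rgb):
    """gray: (H, W) float array in [0, 1]; ref_rgb: (N, 3) reference colors."""
    ref_luma = ref_rgb @ np.array([0.299, 0.587, 0.114])  # Rec. 601 luma
    order = np.argsort(ref_luma)
    sorted_luma = ref_luma[order]
    # For each voxel, locate the reference pixel with the nearest luminance.
    idx = np.searchsorted(sorted_luma, gray.ravel())
    idx = np.clip(idx, 1, len(sorted_luma) - 1)
    left_closer = (gray.ravel() - sorted_luma[idx - 1]) < (sorted_luma[idx] - gray.ravel())
    idx = np.where(left_closer, idx - 1, idx)
    return ref_rgb[order[idx]].reshape(*gray.shape, 3)

# Toy usage: a gradient "MR slice" colorized from random reference colors.
gray = np.linspace(0, 1, 64 * 64).reshape(64, 64)
ref = np.random.rand(500, 3)
print(colorize_by_luminance(gray, ref).shape)  # (64, 64, 3)
```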
An interactive color pre-processing method to improve tumor segmentation in digital medical images
In the last few decades the medical imaging field has grown considerably, and new techniques such as computerized axial tomography (CAT) and Magnetic Resonance Imaging (MRI) are able to obtain medical images in noninvasive ways. These new technologies have opened the medical field, offering opportunities to improve patient diagnosis, education and training, treatment monitoring, and surgery planning. One of these opportunities is in the tumor segmentation field.
Tumor segmentation is the process of virtually extracting the tumor from the healthy tissues of the body by computer algorithms. This is a complex process, since tumors have different shapes, sizes, tissue densities, and locations. The algorithms that have been developed cannot take into account all of these variations, and higher accuracy is achieved with specialized methods that generally work with specific types of tissue data.
In this thesis a color pre-processing method for segmentation is presented. Most tumor segmentation methods are based on the grayscale values of medical images. The method proposed in this thesis adds color information to the original values of the image. The user selects the region of interest (ROI), usually the tumor, from the grayscale medical image, and from this initial selection the image is mapped into a colored space. Tissue densities that are part of the tumor are assigned an RGB component, and any tissues outside the tumor are set to black. The user can tweak the color ranges in real time to achieve better results in cases where the tumor pixels are non-homogeneous in intensity. The user then places a seed in the center of the tumor and begins segmentation. A pixel in the image is segmented as part of the tumor if it lies within an initial 10% threshold of the average RGB values of the tumor around the seed, and within the search region. The search region is calculated by growing or shrinking the previous region using the information from previously segmented regions of the set of slices. The method automatically segments all the slices in the set from the inputs on the first slice. Throughout the segmentation process the user can tweak different parameters and visualize the segmentation results in real time.
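A minimal sketch of the seed-and-threshold growing step just described, under stated assumptions: the 10% tolerance comes from the text, while the 4-connectivity, the running region average, and the single-slice scope are illustrative simplifications of the full multi-slice method.

```python
# A toy seed-based region grower: starting from a user-placed seed, a
# neighboring pixel joins the tumor region while its color stays within a
# tolerance of the running region average. grow_region is a hypothetical
# helper for illustration, not the thesis implementation.
from collections import deque
import numpy as np

def grow_region(rgb, seed, tol=0.10):
    """rgb: (H, W, 3) float image in [0, 1]; seed: (row, col); tol: fraction."""
    h, w, _ = rgb.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    region_sum, region_n = rgb[seed].astype(float), 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                mean = region_sum / region_n
                # Accept the neighbor if every channel is within tolerance.
                if np.all(np.abs(rgb[nr, nc] - mean) <= tol):
                    mask[nr, nc] = True
                    region_sum = region_sum + rgb[nr, nc]
                    region_n += 1
                    queue.append((nr, nc))
    return mask

# Toy usage: a bright square on a dark background.
img = np.zeros((32, 32, 3)); img[8:24, 8:24] = 0.8
print(grow_region(img, (16, 16)).sum())  # 256 pixels segmented
```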
The method was run on ten test cases, with multiple runs performed per case, for a total of 20 test runs. Ten of the 20 test runs gave false positives of 25% or less, and 10 of the 20 gave false negatives of 25% or less. Using only grayscale thresholding methods, the results for the same test cases show false positives of up to 52% on the easy cases and up to 284% on the difficult cases, and false negatives of up to 14% on the easy cases and up to 99% on the difficult cases. While the results of the grayscale and color pre-processing methods on easy cases were similar, the results of color pre-processing were much better on difficult cases, supporting the claim that adding color to medical images for segmentation can significantly improve the accuracy of tumor segmentation.
Colorization of Multispectral Image Fusion using Convolutional Neural Network approach
The proposed technique offers a significant advantage in enhancing multi-band nighttime imagery for surveillance and navigation purposes. The multi-band image data set comprises visual and infrared motion sequences covering various military and civilian surveillance scenarios, including stationary, walking, or running people, vehicles, and buildings or other man-made structures. The colorization method provides superior discrimination, identification of objects (lesions), faster reaction times, and increased scene understanding compared with a monochrome fused image. A guided filtering approach is used to decompose the source images into two parts, an approximation part and a detail content part; a weighted-averaging method is then used to fuse the approximation part. Multi-layer features are extracted from the detail content part using the VGG-19 network. Finally, the approximation part and the detail content part are combined to reconstruct the fused image. The proposed approach offers better outcomes than prevailing state-of-the-art techniques in terms of quantitative and qualitative parameters. In the future, the proposed technique will help battlefield monitoring, defence situation awareness, surveillance, target tracking, and person authentication.
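The following NumPy/SciPy sketch illustrates the two-scale fusion pipeline in simplified form: a box filter stands in for the guided filter, and a per-pixel max-absolute rule stands in for the VGG-19 multi-layer weighting of the detail layers, so the example stays self-contained. The function name and parameters are illustrative.

```python
# A minimal two-scale fusion sketch: decompose each source into an
# approximation (base) and a detail layer, fuse the bases by weighted
# averaging, and keep the stronger detail response per pixel.
import numpy as np
from scipy.ndimage import uniform_filter

def fuse_two_scale(visible, infrared, size=15, w_vis=0.5):
    """visible, infrared: (H, W) float images in [0, 1]."""
    # Decompose each source into an approximation and a detail layer.
    base_v, base_i = uniform_filter(visible, size), uniform_filter(infrared, size)
    detail_v, detail_i = visible - base_v, infrared - base_i
    # Weighted averaging fuses the approximation parts.
    base = w_vis * base_v + (1.0 - w_vis) * base_i
    # Max-absolute rule stands in for the VGG-19 feature weighting.
    detail = np.where(np.abs(detail_v) >= np.abs(detail_i), detail_v, detail_i)
    return np.clip(base + detail, 0.0, 1.0)

# Toy usage with random stand-ins for a visible/IR pair.
vis = np.random.rand(128, 128)
ir = np.random.rand(128, 128)
print(fuse_two_scale(vis, ir).shape)  # (128, 128)
```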
IST Austria Thesis
Modern computer vision systems heavily rely on statistical machine learning models, which typically require large amounts of labeled data to be learned reliably. Moreover, computer vision research has recently and widely adopted techniques for representation learning, which further increase the demand for labeled data. However, for many important practical problems there is only a relatively small amount of labeled data available, so it is problematic to leverage the full potential of representation learning methods. One way to overcome this obstacle is to invest substantial resources into producing large labeled datasets. Unfortunately, this can be prohibitively expensive in practice. In this thesis we focus on an alternative way of tackling the aforementioned issue. We concentrate on methods that make use of weakly-labeled or even unlabeled data. Specifically, the first half of the thesis is dedicated to the semantic image segmentation task. We develop a technique that achieves competitive segmentation performance and only requires annotations in the form of global image-level labels instead of dense segmentation masks. Subsequently, we present a new methodology that further improves segmentation performance by leveraging tiny amounts of additional feedback from a human annotator. By using our methods, practitioners can greatly reduce the data annotation effort required to learn modern image segmentation models. In the second half of the thesis we focus on methods for learning from unlabeled visual data. We study a family of autoregressive models for modeling the structure of natural images and discuss potential applications of these models. Moreover, we conduct an in-depth study of one of these applications, where we develop the state-of-the-art model for the probabilistic image colorization task.
Two Decades of Colorization and Decolorization for Images and Videos
Colorization is a computer-aided process, which aims to give color to a gray
image or video. It can be used to enhance black-and-white images, including
black-and-white photos, old-fashioned films, and scientific imaging results.
Conversely, decolorization converts a color image or video into a
grayscale one. A grayscale image or video refers to an image or video with only
brightness information without color information. It is the basis of some
downstream image processing applications such as pattern recognition, image
segmentation, and image enhancement. Different from image decolorization, video
decolorization should not only consider the image contrast preservation in each
video frame, but also respect the temporal and spatial consistency between
video frames. Researchers have been devoted to developing decolorization
methods that balance spatial-temporal consistency and algorithm efficiency.
With the prevalence of digital cameras and mobile phones, image and video
colorization and decolorization have received more and more attention from
researchers. This paper gives an overview of the progress of image and video
colorization and decolorization methods in the last two decades.
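As a toy illustration of contrast-aware decolorization, the sketch below forms the gray image as a weighted sum of the RGB channels and searches a coarse weight grid for the combination that maximizes output variance, a crude stand-in for the contrast-preservation energies used by the methods this survey covers. The function and the variance criterion are illustrative assumptions.

```python
# A toy contrast-aware decolorizer: the gray image is a per-channel weighted
# sum, with weights chosen from a 0.1 grid to maximize a crude contrast
# proxy (output variance). decolorize is a hypothetical helper.
import numpy as np
from itertools import product

def decolorize(rgb):
    """rgb: (H, W, 3) float image in [0, 1] -> (H, W) grayscale."""
    best_gray, best_score = None, -1.0
    steps = [i / 10.0 for i in range(11)]
    # Enumerate weight triples (wr, wg, wb) on a 0.1 grid summing to 1.
    for wr, wg in product(steps, steps):
        wb = 1.0 - wr - wg
        if wb < -1e-9:
            continue
        wb = max(wb, 0.0)
        gray = wr * rgb[..., 0] + wg * rgb[..., 1] + wb * rgb[..., 2]
        score = gray.var()  # crude stand-in for contrast preservation
        if score > best_score:
            best_gray, best_score = gray, score
    return best_gray

# Toy usage.
img = np.random.rand(64, 64, 3)
print(decolorize(img).shape)  # (64, 64)
```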
Principal Uncertainty Quantification with Spatial Correlation for Image Restoration Problems
Uncertainty quantification for inverse problems in imaging has drawn much
attention lately. Existing approaches towards this task define uncertainty
regions based on probable values per pixel, while ignoring spatial correlations
within the image, resulting in an exaggerated volume of uncertainty. In this
paper, we propose PUQ (Principal Uncertainty Quantification) -- a novel
definition and corresponding analysis of uncertainty regions that takes into
account spatial relationships within the image, thus providing reduced volume
regions. Using recent advancements in generative models, we derive uncertainty
intervals around principal components of the empirical posterior distribution,
forming an ambiguity region that guarantees the inclusion of true unseen values
with a user-defined confidence probability. To improve computational efficiency
and interpretability, we also guarantee the recovery of true unseen values
using only a few principal directions, resulting in more informative
uncertainty regions. Our approach is verified through experiments on image
colorization, super-resolution, and inpainting; its effectiveness is shown
through comparison to baseline methods, demonstrating significantly tighter
uncertainty regions.
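The following NumPy sketch conveys the core idea only: given samples from an approximate posterior (random arrays stand in for generative-model outputs here), it builds intervals along the top principal components of the empirical posterior rather than per pixel. The simple quantile calibration does not reproduce the paper's coverage guarantees, and principal_intervals is a hypothetical helper.

```python
# A minimal sketch of PCA-aligned uncertainty intervals: project posterior
# samples onto their top principal directions and take per-direction
# quantile intervals instead of per-pixel bounds.
import numpy as np

def principal_intervals(samples, k=3, alpha=0.1):
    """samples: (N, d) flattened posterior samples; returns the mean,
    the top-k principal directions, and (k, 2) projected intervals."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Top-k principal directions of the empirical posterior via SVD.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dirs = vt[:k]                       # (k, d)
    proj = centered @ dirs.T            # (N, k) coefficients
    lo = np.quantile(proj, alpha / 2, axis=0)
    hi = np.quantile(proj, 1 - alpha / 2, axis=0)
    return mean, dirs, np.stack([lo, hi], axis=1)

# Toy usage: 200 "posterior samples" of a 16x16 image.
samples = np.random.randn(200, 256)
mean, dirs, intervals = principal_intervals(samples)
print(dirs.shape, intervals.shape)  # (3, 256) (3, 2)
```

Because spatially correlated errors concentrate in a few principal directions, intervals expressed in this basis can be much tighter than independent per-pixel bounds, which is the intuition behind the reduced-volume regions claimed above.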
3D Human Face Reconstruction and 2D Appearance Synthesis
3D human face reconstruction has been an active research topic for decades due to its wide applications, such as animation, recognition, and 3D-driven appearance synthesis. Although commodity depth sensors have become widely available in recent years, image-based face reconstruction remains significantly valuable, as images are much easier to access and store.
In this dissertation, we first propose three image-based face reconstruction approaches according to different assumptions about the inputs.
In the first approach, face geometry is extracted from multiple key frames of a video sequence with different head poses. The camera should be calibrated under this assumption.
As the first approach is limited to videos, we then propose a second approach that focuses on a single image. This approach also refines the geometry by adding fine-grained detail using shading cues. We propose a novel albedo estimation and linear optimization algorithm in this approach.
In the third approach, we further loosen the constraints on the input, allowing arbitrary in-the-wild images. Our proposed approach can robustly reconstruct high-quality models even with extreme expressions and large poses.
We then explore the applicability of our face reconstructions in four interesting applications: video face beautification, generating personalized facial blendshapes from image sequences, face video stylization, and video face replacement. We demonstrate the great potential of our reconstruction approaches in these real-world applications. In particular, with the recent surge of interest in VR/AR, it is increasingly common to see people wearing head-mounted displays (HMDs). However, the large occlusion of the face is a big obstacle to communicating in a face-to-face manner. In another application, we therefore explore hardware/software solutions for synthesizing the face image in the presence of HMDs. We design two setups (experimental and mobile) that integrate two near-IR cameras and one color camera to solve this problem. With our algorithm and prototype, we can achieve photo-realistic results.
We further propose a deep neural network to solve the HMD removal problem, treating it as a face inpainting problem. This approach does not need special hardware and runs in real time with satisfying results.