TRansPose: Large-Scale Multispectral Dataset for Transparent Object
Transparent objects are encountered frequently in our daily lives, yet
recognizing them poses challenges for conventional vision sensors, because
their unique material properties are not well perceived by RGB or depth cameras.
Overcoming this limitation, thermal infrared cameras have emerged as a
solution, offering improved visibility and shape information for transparent
objects. In this paper, we present TRansPose, the first large-scale
multispectral dataset that combines stereo RGB-D, thermal infrared (TIR)
images, and object poses to promote transparent object research. The dataset
includes 99 transparent objects, encompassing 43 household items, 27 recyclable
trash items, and 29 pieces of chemical laboratory equipment, along with 12
non-transparent objects. It
comprises a vast collection of 333,819 images and 4,000,056 annotations,
providing instance-level segmentation masks, ground-truth poses, and completed
depth information. The data was acquired using a FLIR A65 thermal infrared
(TIR) camera, two Intel RealSense L515 RGB-D cameras, and a Franka Emika Panda
robot manipulator. Spanning 87 sequences, TRansPose covers various challenging
real-life scenarios, including objects filled with water, diverse lighting
conditions, heavy clutter, non-transparent or translucent containers, objects
in plastic bags, and multi-stacked objects. The TRansPose dataset can be
accessed at the following link: https://sites.google.com/view/transpose-dataset
Multisensory Imagery Cues for Object Separation, Specularity Detection and Deep Learning based Inpainting
Multisensory imagery cues have been actively investigated in diverse applications in the computer vision community to provide additional geometric information that is either absent or difficult to capture from mainstream two-dimensional imaging. The inherent features of multispectral polarimetric light field imagery (MSPLFI) include object distribution over spectra, surface properties, shape, shading and pixel flow in light space. The aim of this dissertation is to explore these inherent properties to exploit new structures and methodologies for the tasks of object separation, specularity detection and deep learning-based inpainting in MSPLFI.
In the first part of this research, an application to separate foreground objects from the background in both outdoor and indoor scenes using multispectral polarimetric imagery (MSPI) cues is examined. Based on the pixel neighbourhood relationship, an on-demand clustering technique is proposed and implemented to separate artificial objects from natural background in a complex outdoor scene. However, due to indoor scenes only containing artificial objects, with vast variations in energy levels among spectra, a multiband fusion technique followed by a background segmentation algorithm is proposed to separate the foreground from the background. In this regard, first, each spectrum is decomposed into low and high frequencies using the fast Fourier transform (FFT) method. Second, principal component analysis (PCA) is applied on both frequency images of the individual spectrum and then combined with the first principal components as a fused image.
Finally, a polarimetric background segmentation (BS) algorithm based on the Stokes vector is proposed and implemented on the fused image. The performance of the proposed approaches is evaluated and compared using publicly available MSPI datasets and the dice similarity coefficient (DSC). The proposed multiband fusion and BS methods demonstrate better fusion quality and higher segmentation accuracy compared with other studies on several metrics, including mean absolute percentage error (MAPE), peak signal-to-noise ratio (PSNR), Pearson correlation coefficient (PCOR), mutual information (MI), accuracy, geometric mean (G-mean), precision, recall and F1-score.
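The multiband fusion step described above can be sketched as follows. This is a minimal illustration assuming a (H, W, B) multispectral cube and an illustrative radial frequency cutoff; it is not the dissertation's exact implementation:

```python
import numpy as np

def decompose_spectrum(band, cutoff=0.1):
    """Split one spectral band into low- and high-frequency images via the FFT.
    cutoff is a radial frequency in cycles/pixel (Nyquist is 0.5); illustrative."""
    F = np.fft.fftshift(np.fft.fft2(band))
    h, w = band.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    low_mask = radius <= cutoff
    low = np.real(np.fft.ifft2(np.fft.ifftshift(F * low_mask)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(F * ~low_mask)))
    return low, high

def first_pc(images):
    """First principal component across a stack of images, as an image."""
    X = np.stack([im.ravel() for im in images], axis=1)
    X = X - X.mean(axis=0)
    # SVD gives the principal directions; project onto the leading one
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return (X @ Vt[0]).reshape(images[0].shape)

def fuse_cube(cube, cutoff=0.1):
    """Fuse a (H, W, B) cube: FFT split per band, PCA per frequency component,
    then combine the two first principal components into one fused image."""
    lows, highs = zip(*(decompose_spectrum(cube[..., b], cutoff)
                        for b in range(cube.shape[-1])))
    return first_pc(list(lows)) + first_pc(list(highs))
```

By construction the low- and high-frequency images of each band sum back to the original band, so no information is lost before the PCA stage.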
In the second part of this work, a twofold framework for specular reflection detection (SRD) and specular reflection inpainting (SRI) in transparent objects is proposed. The SRD algorithm is based on the mean, the covariance and the Mahalanobis distance for predicting anomalous pixels in MSPLFI. The SRI algorithm first selects four-connected neighbouring pixels from sub-aperture images and then replaces each SRD pixel with the closest matched pixel. For both algorithms, a 6D MSPLFI transparent object dataset is captured from multisensory imagery cues, owing to the unavailability of this kind of dataset. The experimental results demonstrate that the proposed algorithms achieve higher SRD accuracy and better SRI quality than the existing approaches considered here, in terms of F1-score, G-mean, accuracy, the structural similarity index (SSIM), the PSNR, the mean squared error (IMMSE) and the mean absolute deviation (MAD). However, because it synthesises SRD pixels from the pixel neighbourhood relationship, the proposed inpainting method produces artefacts and errors when inpainting large specularity areas with irregular holes.
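The statistical core of the SRD step (mean, covariance, Mahalanobis distance) can be sketched as below. This is a minimal illustration; the threshold value and function names are assumptions, not the dissertation's exact algorithm:

```python
import numpy as np

def detect_specular(pixels, threshold=3.0):
    """Flag anomalous (candidate specular) pixels by their Mahalanobis
    distance from the per-channel mean under the sample covariance.
    pixels: (N, C) array of per-pixel feature vectors (e.g. spectral bands).
    threshold: illustrative cutoff in Mahalanobis units."""
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    inv_cov = np.linalg.pinv(cov)      # pseudo-inverse guards against singularity
    diff = pixels - mu
    # Quadratic form diff^T * inv_cov * diff, evaluated per pixel
    d2 = np.einsum('nc,cd,nd->n', diff, inv_cov, diff)
    return np.sqrt(d2) > threshold     # boolean mask of suspected specular pixels
```

In practice the mask would be reshaped back to (H, W) and handed to the inpainting stage.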
Therefore, the last part of this research focuses on inpainting large specularity areas with irregular holes based on deep feature extraction from multisensory imagery cues. The proposed six-stage deep learning inpainting (DLI) framework is based on the generative adversarial network (GAN) architecture and consists of a generator network and a discriminator network. First, the pixels' global flow in the sub-aperture images is calculated by applying the large displacement optical flow (LDOF) method. The proposed training algorithm combines global flow with local flow and with coarse inpainting results predicted by the baseline method. The generator attempts to generate best-matched features, while the discriminator seeks to predict the maximum difference between the predicted results and the actual results. The experimental results demonstrate that, in terms of the PSNR, MSSIM, IMMSE and MAD, the proposed DLI framework achieves superior inpainting quality to the baseline method and to the earlier parts of this research.
There and Back Again: Self-supervised Multispectral Correspondence Estimation
Across a wide range of applications, from autonomous vehicles to medical
imaging, multi-spectral images provide an opportunity to extract additional
information not present in color images. One of the most important steps in
making this information readily available is the accurate estimation of dense
correspondences between different spectra.
Due to the nature of cross-spectral images, most correspondence solving
techniques for the visual domain are simply not applicable. Furthermore, most
cross-spectral techniques utilize spectra-specific characteristics to perform
the alignment. In this work, we aim to address the dense correspondence
estimation problem in a way that generalizes to more than one spectrum. We do
this by introducing a novel cycle-consistency metric that allows us to
self-supervise. This, combined with our spectra-agnostic loss functions, allows
us to train the same network across multiple spectra.
We demonstrate our approach on the challenging task of dense RGB-FIR
correspondence estimation. We also show the performance of our unmodified
network on the cases of RGB-NIR and RGB-RGB, where we achieve higher accuracy
than similar self-supervised approaches. Our work shows that cross-spectral
correspondence estimation can be solved in a common framework that learns to
generalize alignment across spectra.
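The "there and back again" self-supervision signal can be sketched with plain dense flows. This is a minimal numpy illustration using nearest-neighbour lookup, whereas a real implementation would use differentiable sampling inside the training loop:

```python
import numpy as np

def cycle_consistency_error(flow_ab, flow_ba):
    """Forward-backward consistency for dense correspondence fields.
    flow_ab: (H, W, 2) flow from image A to image B, in pixels (dx, dy).
    flow_ba: (H, W, 2) flow from B back to A.
    Returns the per-pixel length of the round-trip displacement, which is
    zero wherever the two flows are perfectly cycle-consistent."""
    h, w, _ = flow_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Where each pixel of A lands in B (nearest-neighbour lookup for brevity)
    xb = np.clip(np.rint(xs + flow_ab[..., 0]).astype(int), 0, w - 1)
    yb = np.clip(np.rint(ys + flow_ab[..., 1]).astype(int), 0, h - 1)
    # Pull the backward flow at those landing points and sum the round trip
    round_trip = flow_ab + flow_ba[yb, xb]
    return np.linalg.norm(round_trip, axis=-1)
```

Minimising this error needs no cross-spectral ground truth, which is what makes the metric usable as a self-supervised loss.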
Neural Spectro-polarimetric Fields
Modeling the spatial radiance distribution of light rays in a scene has been
extensively explored for applications, including view synthesis. Spectrum and
polarization, the wave properties of light, are often neglected due to their
integration into three RGB spectral bands and their non-perceptibility to human
vision. Despite this, these properties encompass substantial material and
geometric information about a scene. In this work, we propose to model
spectro-polarimetric fields, the spatial Stokes-vector distribution of any
light ray at an arbitrary wavelength. We present Neural Spectro-polarimetric
Fields (NeSpoF), a neural representation that models the physically-valid
Stokes vector at given continuous variables of position, direction, and
wavelength. NeSpoF manages inherently noisy raw measurements, showcases memory
efficiency, and preserves physically vital signals, factors that are crucial
for representing the high-dimensional signal of a spectro-polarimetric field.
To validate NeSpoF, we introduce the first multi-view
hyperspectral-polarimetric image dataset, comprised of both synthetic and
real-world scenes. These were captured using our compact
hyperspectral-polarimetric imaging system, which has been calibrated for
robustness against system imperfections. We demonstrate the capabilities of
NeSpoF on diverse scenes.
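A Stokes vector is physically valid when its intensity is non-negative and its degree of polarization does not exceed one. One simple way to keep raw network outputs valid is to project them onto that cone; the sketch below is illustrative, and NeSpoF's actual parameterisation may differ:

```python
import numpy as np

def project_valid_stokes(s, eps=1e-8):
    """Project raw 4-vectors onto the physically valid Stokes cone:
    S0 >= 0 and sqrt(S1^2 + S2^2 + S3^2) <= S0 (degree of polarization <= 1).
    s: (..., 4) array of raw (S0, S1, S2, S3) predictions."""
    s0 = np.maximum(s[..., 0], 0.0)                  # intensity is non-negative
    pol = s[..., 1:]
    norm = np.linalg.norm(pol, axis=-1, keepdims=True)
    # Shrink the polarized part wherever it exceeds the intensity
    scale = np.minimum(1.0, s0[..., None] / (norm + eps))
    return np.concatenate([s0[..., None], pol * scale], axis=-1)

def degree_of_polarization(s, eps=1e-8):
    """DoP = |(S1, S2, S3)| / S0, in [0, 1] for a valid Stokes vector."""
    return np.linalg.norm(s[..., 1:], axis=-1) / (s[..., 0] + eps)
```

Because the projection is applied per ray and per wavelength, it composes naturally with a field queried at continuous position, direction, and wavelength.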
An automatic building extraction and regularisation technique using LiDAR point cloud data and orthoimage
The development of robust and accurate methods for automatic building detection and regularisation using multisource data continues to be a challenge due to point cloud sparsity, high spectral variability, differences among urban objects, surrounding complexity, and data misalignment. To address these challenges, constraints on an object's size, height, area, and orientation are generally employed, which adversely affects detection performance. Buildings that are small, under shadow, or partly occluded are often discarded during the elimination of superfluous objects. To overcome these limitations, a methodology is developed to extract and regularise buildings using features from point cloud and orthoimagery. The building delineation process is carried out by identifying candidate building regions and segmenting them into grids. Vegetation elimination, building detection, and extraction of partially occluded building parts are achieved by synthesising the point cloud and image data. Finally, the detected buildings are regularised by exploiting image lines in the building regularisation process. The detection and regularisation processes have been evaluated using the ISPRS benchmark and four Australian data sets, which differ in point density (1 to 29 points/m²), building size, shadow, terrain, and vegetation. Results indicate 83% to 93% per-area completeness with correctness above 95%, demonstrating the robustness of the approach. The absence of over-segmentation and many-to-many segmentation errors in the ISPRS data set indicates that the technique has higher per-object accuracy. Compared with six existing similar methods, the proposed detection and regularisation approach performs significantly better on the more complex (Australian) data sets, while on the ISPRS benchmark it does better than or equal to its counterparts. © 2016 by the authors
Multispectral lensless digital holographic microscope: imaging MCF-7 and MDA-MB-231 cancer cell cultures
Digital holography is the process where an object’s phase and amplitude information is retrieved from intensity images
obtained using a digital camera (e.g. CCD or CMOS sensor). In-line digital holographic techniques offer full use of the
recording device’s sampling bandwidth, unlike off-axis holography, where object information is modulated onto carrier fringes. Reconstructed images are obscured by the linear superposition of the unwanted, out-of-focus twin images. In addition, speckle noise degrades the overall quality of the reconstructed images; the speckle effect is a phenomenon of the laser sources used in digital holographic systems. Minimizing the effects of speckle noise, removing the twin image, and using the full sampling bandwidth of the capture device all aid overall reconstructed image quality. Such improvements to digital holography can benefit applications such as holographic microscopy, where the reconstructed images are obscured by twin-image information. Overcoming these problems allows greater flexibility in current image processing techniques, which can be applied to segmenting biological cells (e.g. MCF-7 and MDA-MB-231) to determine their overall cell density and viability. This could potentially be used to distinguish between apoptotic and necrotic cells in large-scale mammalian cell processes, currently the system of choice within the biopharmaceutical industry.
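Numerical reconstruction in digital holography typically propagates the recorded field back to the object plane. A minimal angular-spectrum sketch follows; the function name and parameter values are illustrative:

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    """Propagate a complex field a distance z by the angular spectrum method.
    field: (H, W) complex array; dx: pixel pitch, same units as wavelength."""
    h, w = field.shape
    fy = np.fft.fftfreq(h, d=dx)[:, None]
    fx = np.fft.fftfreq(w, d=dx)[None, :]
    # Longitudinal spatial frequency; negative values are evanescent waves
    under = 1.0 - (wavelength * fx) ** 2 - (wavelength * fy) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(under, 0.0))
    transfer = np.where(under > 0, np.exp(1j * kz * z), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)
```

Back-propagating the recorded hologram (negative z) brings the object into focus; the out-of-focus twin image is the component the abstract describes as obscuring the in-line reconstruction.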
Long-Term Monitoring of the Flooding Regime and Hydroperiod of Doñana Marshes with Landsat Time Series (1974–2014)
This paper presents a semi-automatic procedure to discriminate seasonally flooded areas in the shallow temporary marshes of Doñana National Park (SW Spain) by using a radiometrically normalized long time series of Landsat MSS, TM, and ETM+ images (1974–2014). Extensive field campaigns for ground-truth data retrieval were carried out simultaneously with Landsat overpasses. The ground truth was used as training and testing areas to check the performance of the method. Simple thresholds on TM and ETM+ band 5 (1.55–1.75 μm) worked significantly better than other empirical modeling techniques and supervised classification methods for delineating flooded areas in the Doñana marshes. A classification tree was applied to band 5 reflectance values to classify flooded versus non-flooded pixels for every scene. Inter-scene cross-validation identified the most accurate threshold on band 5 reflectance (ρ < 0.186) to classify flooded areas (Kappa = 0.65). A joint TM–MSS acquisition was used to find the MSS band 4 (0.8 to 1.1 μm) threshold: the TM flooded area was identical to the results from the MSS band 4 threshold of ρ < 0.10, despite spectral and spatial resolution differences. Band slicing was retrospectively applied to the complete time series of MSS and TM images. In total, 391 flood masks were used to reconstruct historical spatial and temporal patterns of Doñana marshes flooding, including hydroperiod. Historical hydroperiod trends were used as a baseline to understand Doñana's flooding regime, test hydrodynamic models, and assess relevant management and restoration decisions. The historical trends in the hydroperiod of the Doñana marshes show two opposite spatial patterns: while the north-western part of the marsh is increasing its hydroperiod, the southwestern part shows a steady decline.
Anomalies in each flooding cycle allowed us to assess recent management decisions and monitor their hydrological effects.
This study was funded by the Spanish Ministry of Science and Innovation through the
research projects HYDRA (#CGL2006-02247/BOS) and HYDRA2 (CGL2009-09801/BOS), and by funding from
the European Union’s Horizon 2020 research and innovation program under grant agreement No. 641762 to
ECOPOTENTIAL project. The Espacio Natural de Doñana provided permits for fieldwork in protected areas with
restricted access and historical data from water column readings. We are grateful to many MSc students who
helped in image processing and field sampling.
We acknowledge support by the CSIC Open Access Publication Initiative through its Unit of Information Resources for Research (URICI).
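The band-slicing step described in this abstract can be sketched as follows. The threshold comes from the text; the hydroperiod here is simply counted in flooded observations per pixel, a simplification of the study's reconstruction:

```python
import numpy as np

def flood_mask(band5_reflectance, threshold=0.186):
    """Classify flooded vs non-flooded pixels by thresholding TM/ETM+ band 5
    (1.55-1.75 um) reflectance: water strongly absorbs SWIR, so reflectance
    below the cross-validated threshold indicates flooding."""
    return band5_reflectance < threshold

def hydroperiod(masks):
    """Per-pixel count of flooded observations across a stack of scene masks,
    a proxy for the hydroperiod reconstructed from the Landsat time series."""
    return np.sum(np.stack(masks), axis=0)
```

Applied scene by scene over the 1974–2014 archive, the per-pixel counts reveal spatial trends such as the opposite hydroperiod patterns reported for the north-western and southwestern marsh.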
Multi feature-rich synthetic colour to improve human visual perception of point clouds
Although point features have shown their usefulness in classification with machine learning, point cloud visualization enhancement methods focus mainly on lighting. Visualizing point features helps to improve the perception of the 3D environment. This paper proposes Multi Feature-Rich Synthetic Colour (MFRSC) as an alternative, non-photorealistic colouring of natural-coloured point clouds. The method is based on the selection of nine features (reflectance, return number, inclination, depth, height, point density, linearity, planarity, and scattering) associated with five human perception descriptors (edges, texture, shape, size, depth, orientation). The features are reduced to fit the RGB display channels. All feature permutations are analysed according to colour distance from the natural-coloured point cloud and Image Quality Assessment. As a result, the selected feature permutations allow a clear visualization of the scene's objects, highlighting edges, planes, and volumetric objects. MFRSC effectively replaces natural colour, even yielding less distorted visualization according to BRISQUE, NIQE and PIQE. In addition, the assignment of features to RGB channels enables the use of MFRSC in software that does not support colorization based on point attributes (most commercially available software). MFRSC can be combined with other non-photorealistic techniques such as Eye-Dome Lighting or Ambient Occlusion.
Xunta de Galicia | Ref. ED481B-2019-061; Xunta de Galicia | Ref. ED431F 2022/08; Agencia Estatal de Investigación | Ref. PID2019-105221RB-C43; Universidade de Vigo/CISU
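The core idea of packing per-point features into the RGB display channels can be sketched as below. This is a simplified illustration: it min-max normalises one feature per channel under a chosen permutation, whereas the paper reduces nine features into the three channels and evaluates all permutations:

```python
import numpy as np

def mfrsc_colour(features, permutation=(0, 1, 2)):
    """Pack per-point features into a synthetic RGB colour.
    features: (N, F) per-point attributes (e.g. reflectance, return number,
    inclination, ...). Each feature selected by the permutation is min-max
    normalised to [0, 255] and assigned to one display channel."""
    rgb = np.empty((features.shape[0], 3), dtype=np.uint8)
    for channel, idx in enumerate(permutation):
        f = features[:, idx].astype(float)
        span = f.max() - f.min()
        # Constant features carry no contrast; map them to zero
        rgb[:, channel] = np.round(255 * (f - f.min()) / span) if span else 0
    return rgb
```

Because the result is an ordinary RGB attribute, it can be loaded by viewers that only support colour, which is the compatibility benefit the abstract highlights.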
Application of Multi-Sensor Fusion Technology in Target Detection and Recognition
The application of multi-sensor fusion technology has drawn considerable industrial and academic interest in recent years. Multi-sensor fusion methods are widely used in many applications, such as autonomous systems, remote sensing, video surveillance, and the military. These methods can capture the complementary properties of targets by considering multiple sensors, and they can achieve a detailed environment description and accurate detection of targets of interest based on information from different sensors. This book collects novel developments in the field of multi-sensor, multi-source, and multi-process information fusion. Articles emphasize one or more of three facets: architectures, algorithms, and applications. The published papers deal with fundamental theoretical analyses as well as demonstrations of their application to real-world problems.