
    Computation of outer inverse of tensors based on t-product

    Tensor operations play an essential role in various fields of science and engineering, including multiway data analysis. In this study, we establish a few basic properties of the range and null space of a tensor using block circulant matrices and the discrete Fourier matrix. We then discuss the outer inverse of third-order tensors, based on the t-product, with a prescribed range and kernel. We address the relation of this outer inverse to other generalized inverses, such as the Moore-Penrose inverse, the group inverse, and the Drazin inverse. In addition, we present a few algorithms for computing the outer inverses of tensors. In particular, an algorithm based on the t-QR decomposition is developed for computing the outer inverses.
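The link between the tensor product of third-order tensors (the t-product), block circulant matrices, and the discrete Fourier transform mentioned above can be illustrated with a minimal NumPy sketch. It is not the paper's algorithm: the function names `t_product`, `bcirc`, `unfold`, and `t_pinv` are hypothetical, and the generalized inverse computed here is the Moore-Penrose inverse, used only as one familiar instance of an outer inverse (it satisfies X * A * X = X).

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3).

    Computed as slice-wise matrix products in the Fourier domain,
    which is equivalent to circular convolution along the third mode.
    """
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    # For real inputs the result is real up to round-off.
    return np.real(np.fft.ifft(C_hat, axis=2))

def bcirc(A):
    """Block circulant matrix of A: block (i, j) is frontal slice (i - j) mod n3."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def unfold(B):
    """Stack the frontal slices of B vertically into a matrix."""
    return np.concatenate([B[:, :, k] for k in range(B.shape[2])], axis=0)

def t_pinv(A):
    """Moore-Penrose inverse under the t-product (assumed illustration only).

    Each frontal slice is pseudo-inverted in the Fourier domain.
    """
    A_hat = np.fft.fft(A, axis=2)
    X_hat = np.stack(
        [np.linalg.pinv(A_hat[:, :, k]) for k in range(A.shape[2])], axis=2
    )
    return np.real(np.fft.ifft(X_hat, axis=2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 2, 4))
    B = rng.standard_normal((2, 5, 4))
    # The FFT route and the block circulant route should agree.
    print(np.allclose(unfold(t_product(A, B)), bcirc(A) @ unfold(B)))
    # Outer-inverse identity X * A * X = X for the Moore-Penrose inverse.
    X = t_pinv(A)
    print(np.allclose(t_product(t_product(X, A), X), X))
```

The Fourier route avoids forming the full block circulant matrix, which is why it is the usual way to evaluate such products in practice; `bcirc` is kept here only to check the equivalence on small inputs.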

    Uncertainty-aware Salient Object Detection

    Saliency detection models are trained to discover the region(s) of an image that attract human attention. According to whether depth data is used, static image saliency detection models can be divided into RGB image saliency detection models and RGB-D image saliency detection models. The former predict salient regions of the RGB image, while the latter take both the RGB image and the depth data as input. Conventional saliency prediction models typically learn a deterministic mapping from images to the corresponding ground truth saliency maps without modeling the uncertainty of predictions, following the supervised learning pipeline. This thesis is dedicated to learning a conditional distribution over saliency maps, given an input image, and modeling the uncertainty of predictions. For RGB-D saliency detection, we present the first generative-model-based framework to achieve uncertainty-aware prediction. Our framework includes two main models: 1) a generator model and 2) an inference model. The generator model is an encoder-decoder saliency network. To infer the latent variable, we introduce two different solutions: i) a Conditional Variational Auto-encoder with an extra encoder to approximate the posterior distribution of the latent variable; and ii) an Alternating Back-Propagation technique, which directly samples the latent variable from the true posterior distribution. One drawback of the above model is that it fails to explicitly model the connection between the RGB image and the depth data to achieve effective cooperative learning. We further introduce a novel complementary learning framework, based on a latent variable model, to explicitly model the complementary information between the two modes, namely the RGB mode and the depth mode. Specifically, we first design a regularizer using mutual-information minimization to reduce the redundancy between appearance features from RGB and geometric features from depth in the latent space. Then we fuse the latent features of each mode to achieve multi-modal feature fusion. Extensive experiments on benchmark RGB-D saliency datasets illustrate the effectiveness of our framework. For RGB saliency detection, we propose a generative saliency prediction model based on the conditional generative cooperative network, where a conditional latent variable model and a conditional energy-based model are jointly trained to predict saliency in a cooperative manner. The latent variable model serves as a coarse saliency model to produce a fast initial prediction, which is then refined by Langevin revision of the energy-based model that serves as a fine saliency model. Apart from the fully supervised learning framework, we also investigate weakly supervised learning, and propose the first scribble-based weakly-supervised salient object detection model. In doing so, we first relabel an existing large-scale salient object detection dataset with scribbles, yielding the S-DUTS dataset. To compensate for the structure information missing from scribble annotations, we propose an auxiliary edge detection task to localize object edges explicitly, and a gated structure-aware loss to place constraints on the scope of structure to be recovered. To further reduce the labeling burden, we introduce a noise-aware encoder-decoder framework to disentangle a clean saliency predictor from noisy training examples, where the noisy labels are generated by unsupervised handcrafted feature-based methods. The whole model that represents noisy labels is a sum of the two sub-models. The goal of training the model is to estimate the parameters of both sub-models, and simultaneously infer the corresponding latent vector of each noisy label. We propose to train the model by using an alternating back-propagation algorithm. To prevent the network from converging to trivial solutions, we utilize an edge-aware smoothness loss to regularize hidden saliency maps to have structures similar to those of their corresponding images.
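The coarse-to-fine step described above, in which an initial prediction is refined by Langevin revision under an energy-based model, can be sketched with plain NumPy. This is a toy illustration under stated assumptions, not the thesis's model: the quadratic energy, the `target` map, and the names `langevin_refine` and `grad_energy` are all hypothetical stand-ins for a learned conditional energy network and its gradient.

```python
import numpy as np

def langevin_refine(s0, grad_energy, step=0.1, n_steps=50, noise_scale=1.0, rng=None):
    """Refine an initial map s0 with Langevin dynamics.

    Each iteration applies
        s <- s - (step**2 / 2) * grad_energy(s) + noise_scale * step * eps,
    with eps drawn from a standard normal distribution. Setting
    noise_scale=0 reduces the update to plain gradient descent on the energy.
    """
    rng = np.random.default_rng() if rng is None else rng
    s = np.array(s0, dtype=float)
    for _ in range(n_steps):
        eps = rng.standard_normal(s.shape)
        s = s - 0.5 * step**2 * grad_energy(s) + noise_scale * step * eps
    return s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical "fine" saliency map that the toy energy prefers.
    target = np.zeros((8, 8))
    target[2:6, 2:6] = 1.0
    # Toy energy E(s) = 0.5 * ||s - target||^2, so its gradient is s - target.
    grad_energy = lambda s: s - target
    # Hypothetical coarse initial prediction: the target plus noise.
    coarse = target + 0.5 * rng.standard_normal(target.shape)
    refined = langevin_refine(coarse, grad_energy, step=0.5, n_steps=100,
                              noise_scale=0.0, rng=rng)
    print(np.linalg.norm(coarse - target), np.linalg.norm(refined - target))
```

In a learned setting the gradient would come from automatic differentiation of the energy network with respect to the saliency map, and the noise term would be kept so that repeated runs yield different samples, which is what makes the prediction uncertainty observable.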