Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs
The human visual system relies on both binocular stereo cues and monocular
focus cues to achieve effective 3D perception. In computer vision, the two
problems are traditionally solved in separate tracks. In this paper, we present
a unified learning-based technique that simultaneously uses both types of cues
for depth inference. Specifically, we use a pair of focal stacks as input to
emulate human perception. We first construct a comprehensive focal stack
training dataset synthesized by depth-guided light field rendering. We then
build three individual networks: a Focus-Net to extract depth from a single
focal stack, an EDoF-Net to obtain the extended depth-of-field (EDoF) image from
the focal stack, and a Stereo-Net to conduct stereo matching. We show how to
integrate them into a unified BDfF-Net to obtain high-quality depth maps.
Comprehensive experiments show that our approach outperforms the
state-of-the-art in both accuracy and speed and effectively emulates the human
visual system.
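The monocular focus cue that Focus-Net learns can be illustrated with a classical depth-from-focus baseline: for every pixel, choose the focal slice in the stack where a local sharpness measure peaks. This is a minimal hand-crafted sketch, not the paper's learned network; the Laplacian focus measure and the function name are illustrative choices.

```python
import numpy as np

def depth_from_focus(stack, depths):
    """Classical depth-from-focus baseline: for each pixel, pick the
    focal slice with the highest local sharpness (Laplacian energy).

    stack:  (S, H, W) array of grayscale images focused at S depths.
    depths: (S,) focus distance associated with each slice.
    Returns an (H, W) depth map.
    """
    sharpness = np.empty_like(stack)
    for s, img in enumerate(stack):
        # Discrete 4-neighbor Laplacian as a simple focus measure
        # (np.roll wraps at the borders, which is fine for a sketch).
        lap = (-4 * img
               + np.roll(img, 1, 0) + np.roll(img, -1, 0)
               + np.roll(img, 1, 1) + np.roll(img, -1, 1))
        sharpness[s] = lap ** 2
    best = sharpness.argmax(axis=0)   # index of the sharpest slice per pixel
    return depths[best]
```

A learned Focus-Net replaces the fixed sharpness measure and winner-take-all selection with features and regression trained on the synthesized focal-stack dataset.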
Receiver Architectures for MIMO-OFDM Based on a Combined VMP-SP Algorithm
Iterative information processing, either based on heuristics or analytical
frameworks, has been shown to be a very powerful tool for the design of
efficient, yet feasible, wireless receiver architectures. Within this context,
algorithms performing message-passing on a probabilistic graph, such as the
sum-product (SP) and variational message passing (VMP) algorithms, have become
increasingly popular.
In this contribution, we apply a combined VMP-SP message-passing technique to
the design of receivers for MIMO-OFDM systems. The message-passing equations of
the combined scheme can be obtained from the equations of the stationary points
of a constrained region-based free energy approximation. When applied to a
MIMO-OFDM probabilistic model, we obtain a generic receiver architecture
performing iterative channel weight and noise precision estimation,
equalization and data decoding. We show that this generic scheme can be
particularized to a variety of different receiver structures, ranging from
high-performance iterative structures to low-complexity receivers. This allows
for a flexible design of the signal processing, specifically tailored to the
requirements of each application. The numerical assessment of our
solutions, based on Monte Carlo simulations, corroborates the high performance
of the proposed algorithms and their superiority over heuristic approaches.
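The iterative channel-weight and noise-precision estimation can be illustrated on a toy scalar model rather than the full MIMO-OFDM factor graph: mean-field variational updates alternate between a Gaussian factor for the channel weight and a Gamma factor for the noise precision. This is a minimal sketch under simplifying assumptions (scalar channel, known pilots, vague priors); the function name and priors are illustrative, not the paper's receiver.

```python
import numpy as np

def vmp_channel_noise(x, y, iters=50):
    """Toy mean-field VMP for the scalar model y = h*x + w,
    w ~ N(0, 1/lam), with Gaussian prior h ~ N(0, 1) and a
    vague Gamma prior on the noise precision lam.

    Alternates the variational updates for q(h) (Gaussian) and
    q(lam) (Gamma); returns (posterior mean of h, E[lam]).
    """
    n = len(y)
    a0, b0 = 1e-3, 1e-3          # vague Gamma prior on lam
    e_lam = 1.0                  # initial guess for E[lam]
    for _ in range(iters):
        # q(h) update: Gaussian, combining prior and likelihood terms.
        prec_h = 1.0 + e_lam * np.sum(x * x)
        mu_h = e_lam * np.sum(x * y) / prec_h
        # q(lam) update: Gamma(a, b) from the *expected* squared
        # residual under q(h), including the posterior variance 1/prec_h.
        a = a0 + n / 2.0
        resid = (np.sum(y * y) - 2 * mu_h * np.sum(x * y)
                 + (mu_h ** 2 + 1.0 / prec_h) * np.sum(x * x))
        b = b0 + 0.5 * resid
        e_lam = a / b
    return mu_h, e_lam
```

In the receiver of the abstract, the same alternation runs over vectors of channel weights, equalized symbols, and decoder messages on the MIMO-OFDM probabilistic graph.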
GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector
In this paper, we present a novel end-to-end group collaborative learning
network, termed GCoNet+, which can effectively and efficiently (250 fps)
identify co-salient objects in natural scenes. The proposed GCoNet+ achieves
the new state-of-the-art performance for co-salient object detection (CoSOD)
through mining consensus representations based on the following two essential
criteria: 1) intra-group compactness to better formulate the consistency among
co-salient objects by capturing their inherent shared attributes using our
novel group affinity module (GAM); 2) inter-group separability to effectively
suppress the influence of noisy objects on the output by introducing our new
group collaborating module (GCM) conditioned on the inconsistent consensus. To
further improve the accuracy, we design a series of simple yet effective
components as follows: i) a recurrent auxiliary classification module (RACM)
promoting the model learning at the semantic level; ii) a confidence
enhancement module (CEM) helping the model to improve the quality of the final
predictions; and iii) a group-based symmetric triplet (GST) loss guiding the
model to learn more discriminative features. Extensive experiments on three
challenging benchmarks, i.e., CoCA, CoSOD3k, and CoSal2015, demonstrate that
our GCoNet+ outperforms 12 existing cutting-edge models. Code has been
released at https://github.com/ZhengPeng7/GCoNet_plus.
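The intra-group consensus idea behind the group affinity module can be sketched with a much simpler stand-in: pool and normalize each image's features, average them into a group consensus descriptor, and score every spatial location against it by cosine similarity. This is a hedged simplification, not the paper's GAM; the function name and shapes are illustrative.

```python
import numpy as np

def group_affinity(feats):
    """Simplified group-consensus attention: given per-image feature
    maps feats of shape (N, C, H, W) for one image group, build a
    consensus descriptor from globally pooled, L2-normalized features
    and score every spatial location against it.

    Returns (N, H, W) affinity maps in [-1, 1]; locations whose
    features align with the group consensus score near 1.
    """
    n, c, h, w = feats.shape
    pooled = feats.mean(axis=(2, 3))                       # (N, C)
    pooled /= np.linalg.norm(pooled, axis=1, keepdims=True) + 1e-8
    consensus = pooled.mean(axis=0)                        # (C,)
    consensus /= np.linalg.norm(consensus) + 1e-8
    # Cosine similarity between each location's feature and the consensus.
    fl = feats.reshape(n, c, h * w)
    norms = np.linalg.norm(fl, axis=1) + 1e-8              # (N, H*W)
    aff = np.einsum('c,nck->nk', consensus, fl) / norms
    return aff.reshape(n, h, w)
```

In GCoNet+ the consensus is learned jointly with the backbone, and the inter-group GCM additionally pushes features of *different* groups apart; the sketch only captures the intra-group affinity direction.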
Transformer Transforms Salient Object Detection and Camouflaged Object Detection
Transformer networks are particularly good at modeling long-range
dependencies within long sequences. In this paper, we investigate
applying transformer networks to salient object detection (SOD). We adopt
the dense transformer backbone for fully supervised RGB image based SOD, RGB-D
image pair based SOD, and weakly supervised SOD within a unified framework
based on the observation that the transformer backbone can provide accurate
structure modeling, which makes it powerful in learning from weak labels with
less structure information. Further, we find that the vision transformer
architectures do not offer direct spatial supervision, instead encoding
position as a feature. Therefore, we investigate the contributions of two
strategies to provide stronger spatial supervision through the transformer
layers within our unified framework, namely deep supervision and
difficulty-aware learning. We find that deep supervision can propagate gradients
back into the higher-level features, thus leading to uniform activation within
the same semantic object. Difficulty-aware learning, on the other hand, is capable of
identifying the hard pixels for effective hard negative mining. We also
visualize features of the conventional and transformer backbones before and
after fine-tuning them for SOD, and find that the transformer backbone encodes
more accurate object structure information and more distinct semantic
information within the lower- and higher-level features, respectively. We also apply our
model to camouflaged object detection (COD) and make observations similar to
those on the above three SOD tasks. Extensive experimental results on various SOD and
COD tasks illustrate that transformer networks can transform SOD and COD,
leading to new benchmarks for each related task. The source code and
experimental results are available via our project page:
https://github.com/fupiao1998/TrasformerSOD
Comment: Technical report, 18 pages, 22 figures.
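The deep-supervision strategy described above amounts to attaching a loss to every decoder stage rather than only the final prediction, so gradients reach the higher-level features directly. A minimal sketch, assuming per-stage saliency probabilities and average-pooled ground truth (the function name and pooling choice are illustrative, not the paper's exact losses):

```python
import numpy as np

def deep_supervision_loss(side_outputs, gt, eps=1e-7):
    """Sum a binary cross-entropy loss over every decoder stage.

    side_outputs: list of (H_i, W_i) predicted saliency probabilities,
                  one per decoder stage, each dividing the gt resolution.
    gt:           (H, W) binary ground-truth mask at full resolution.
    """
    total = 0.0
    for pred in side_outputs:
        # Downsample gt to this stage's resolution by average pooling.
        fh = gt.shape[0] // pred.shape[0]
        fw = gt.shape[1] // pred.shape[1]
        g = gt.reshape(pred.shape[0], fh, pred.shape[1], fw).mean(axis=(1, 3))
        p = np.clip(pred, eps, 1 - eps)
        total += -np.mean(g * np.log(p) + (1 - g) * np.log(1 - p))
    return total
```

During training, every stage receives its own gradient signal; difficulty-aware learning would further reweight the hard pixels inside each per-stage loss.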
Imputation estimators for unnormalized models with missing data
Many statistical models are specified as unnormalized densities for which
calculation of the normalization constant is intractable. We propose
estimation methods for such unnormalized models with missing data. The key
concept is to combine imputation techniques with estimators for unnormalized
models including noise contrastive estimation and score matching. In addition,
we derive asymptotic distributions of the proposed estimators and construct
confidence intervals. Simulation results with truncated Gaussian graphical
models and the application to real data of wind direction reveal that the
proposed methods effectively enable statistical inference with unnormalized
models from missing data.
Comment: To appear (AISTATS 2020).
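The key combination of imputation with an estimator for unnormalized models can be illustrated in the simplest possible setting: for the unnormalized density p(x) ∝ exp(-lam * x²/2), score matching has the closed-form minimizer lam = 1/mean(x²), and missing entries can be filled with their expected contribution under the current estimate before re-estimating. This scalar sketch with expected-value imputation is a stand-in for the paper's methods (which handle truncated Gaussian graphical models and noise contrastive estimation); the function names are illustrative.

```python
import numpy as np

def score_matching_gaussian(x):
    """Closed-form score matching for p(x) proportional to
    exp(-lam * x**2 / 2): the objective
    J(lam) = E[0.5 * lam**2 * x**2 - lam]
    is minimized at lam_hat = 1 / mean(x**2)."""
    return 1.0 / np.mean(x ** 2)

def imputed_score_matching(x_obs, n_missing, iters=30):
    """Toy imputation + score matching loop for the same model:
    fill each missing entry's squared value with its expectation
    E[x^2] = 1/lam under the current estimate, then re-run the
    closed-form score matching update on the completed data."""
    s_obs = np.sum(x_obs ** 2)
    lam = 1.0
    for _ in range(iters):
        s_full = s_obs + n_missing / lam          # expected completed sum
        lam = (len(x_obs) + n_missing) / s_full   # score matching update
    return lam
```

For i.i.d. scalar data the loop converges to the observed-data estimator, as it should; the interesting cases in the paper are multivariate models where observed coordinates genuinely inform the imputation of missing ones.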