Advancing Transformer's Capabilities in Commonsense Reasoning
Recent advances in general-purpose pre-trained language models have shown
great potential for commonsense reasoning. However, current models still
perform poorly on standard commonsense reasoning benchmarks, including the
Com2Sense dataset. We argue that this is because they are disconnected from
current cutting-edge machine learning techniques. In this work, we aim to
bridge the gap by applying such techniques to improve general-purpose
pre-trained language models on the task of commonsense reasoning.
Specifically, we experiment with and systematically evaluate methods including
knowledge transfer, model ensembling, and an additional pairwise contrastive
objective. Our best model outperforms the strongest previous work by ~15%
absolute in Pairwise Accuracy and ~8.7% absolute in Standard Accuracy.
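The abstract names a pairwise contrastive objective but does not spell out its form; a minimal sketch of one common margin-based variant in PyTorch (the function name, margin value, and loss form are illustrative assumptions, not the paper's method) might look like:

```python
import torch
import torch.nn.functional as F

def pairwise_contrastive_loss(emb_a, emb_b, labels, margin=0.5):
    """Margin-based pairwise contrastive objective (illustrative sketch).

    emb_a, emb_b: (batch, dim) embeddings of the two statements in a pair.
    labels: float tensor, 1.0 if the pair should agree, 0.0 otherwise.
    """
    sim = F.cosine_similarity(emb_a, emb_b)        # (batch,) cosine similarities
    pos = labels * (1.0 - sim)                     # pull agreeing pairs together
    neg = (1.0 - labels) * F.relu(sim - margin)    # push disagreeing pairs apart
    return (pos + neg).mean()
```

Such a term would typically be added to the standard classification loss so that paired Com2Sense statements are also separated in embedding space.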
Block-Randomized Stochastic Methods for Tensor Ring Decomposition
Tensor ring (TR) decomposition is a simple but effective tensor network for
analyzing and interpreting latent patterns of tensors. In this work, we propose
a doubly randomized optimization framework for computing TR decomposition. It
can be regarded as a sensible mix of randomized block coordinate descent and
stochastic gradient descent; it therefore operates in a doubly random manner
and achieves lightweight updates with a small memory footprint. Further, to
improve convergence, especially on ill-conditioned problems, we propose a
scaled version of the framework that can be viewed as an adaptively
preconditioned or diagonally scaled variant. We also provide four different
probability distributions for selecting the mini-batch and an adaptive
strategy for determining the step size. Finally, we present the theoretical
properties and numerical performance of our proposals.
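As a rough illustration of the doubly random idea, the sketch below picks a random TR core (the block) and a random mini-batch of tensor entries, then takes a gradient step on that core alone. It leans on PyTorch autograd rather than the paper's derived updates, and it omits the four sampling distributions, the adaptive step size, and the scaled variant; all names are assumptions for illustration.

```python
import torch

def tr_entry(cores, idx):
    """One entry of a tensor-ring model: trace(G1[:, i1, :] @ G2[:, i2, :] @ ...)."""
    mats = [g[:, i, :] for g, i in zip(cores, idx)]
    prod = mats[0]
    for m in mats[1:]:
        prod = prod @ m
    return torch.einsum('ii->', prod)  # trace closes the ring

def doubly_random_step(X, cores, batch=64, lr=1e-2):
    """One update: random core (block) + random mini-batch of entries."""
    k = torch.randint(len(cores), (1,)).item()           # random block
    idx = [torch.randint(n, (batch,)) for n in X.shape]  # random entries
    cores[k].requires_grad_(True)
    pred = torch.stack([tr_entry(cores, [i[b] for i in idx]) for b in range(batch)])
    loss = ((pred - X[tuple(idx)]) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        cores[k] -= lr * cores[k].grad                   # update selected core only
        cores[k].grad = None
    cores[k].requires_grad_(False)
    return loss.item()

# Example: fit a rank-4 TR model to a random 10x10x10 tensor (illustrative only).
X = torch.randn(10, 10, 10)
cores = [torch.randn(4, 10, 4) * 0.1 for _ in range(3)]
for _ in range(200):
    doubly_random_step(X, cores)
```

Updating a single randomly chosen core on a small batch of entries is what keeps each step lightweight and the memory footprint small.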
Unsupervised Hierarchical Domain Adaptation for Adverse Weather Optical Flow
Optical flow estimation has made great progress but usually suffers from
degradation under adverse weather. Although semi- and fully supervised methods
have made good attempts, the domain shift between synthetic and real
adverse-weather images deteriorates their performance. To alleviate this
issue, our starting point is to transfer knowledge from the clean source
domain to the degraded target domain in an unsupervised manner. Our key
insight is that adverse weather does not change the intrinsic optical flow of
the scene, but it does cause a significant difference in the warp error
between clean and degraded images. In this work, we propose the first
unsupervised framework for adverse-weather optical flow via hierarchical
motion-boundary adaptation. Specifically, we first employ image translation to
construct the transformation relationship between the clean and degraded
domains. In motion adaptation, we utilize flow-consistency knowledge to align
the cross-domain optical flows in a motion-invariant common space, where the
optical flow from clean weather serves as guidance knowledge for obtaining a
preliminary optical flow for adverse weather. Furthermore, we leverage the
warp-error inconsistency, which measures the motion misalignment of boundaries
between the clean and degraded domains, and propose a joint intra- and
inter-scene boundary contrastive adaptation to refine the motion boundary. The
hierarchical motion and boundary adaptation jointly improve optical flow in a
unified framework. Extensive quantitative and qualitative experiments verify
the superiority of the proposed method.
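The warp error that underpins the key insight can be made concrete: backward-warp the second image with the estimated flow and measure the photometric residual, which is small where the flow explains the motion well and large near misaligned boundaries. The sketch below shows only this quantity, not the paper's hierarchical adaptation; the helper names are hypothetical.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Backward-warp img (B,C,H,W) with flow (B,2,H,W) via bilinear sampling."""
    B, _, H, W = img.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing='ij')
    grid = torch.stack((xs, ys), dim=0).float().to(img)   # pixel coordinates (2,H,W)
    coords = grid.unsqueeze(0) + flow                     # where each pixel came from
    # normalize coordinates to [-1, 1] as expected by grid_sample
    cx = 2 * coords[:, 0] / (W - 1) - 1
    cy = 2 * coords[:, 1] / (H - 1) - 1
    return F.grid_sample(img, torch.stack((cx, cy), dim=-1), align_corners=True)

def warp_error(img1, img2, flow):
    """Per-pixel photometric warp error between img1 and flow-warped img2."""
    return (img1 - warp(img2, flow)).abs().mean(dim=1, keepdim=True)
```

Comparing this map between a clean image pair and its degraded counterpart is one natural way to expose the inconsistency the abstract describes.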
Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow
Single RGB or LiDAR is the mainstream sensor for the challenging task of scene
flow, which relies heavily on visual features to match motion features. Beyond
a single modality, existing methods adopt a fusion strategy that directly
fuses cross-modal complementary knowledge in motion space. However, such
direct fusion may suffer from the modality gap caused by the intrinsic visual
heterogeneity between RGB and LiDAR, thus deteriorating the motion features.
We discover that events are homogeneous with RGB and LiDAR in both the visual
and the motion spaces. In this work, we bring in events as a bridge between
RGB and LiDAR and propose a novel hierarchical visual-motion fusion framework
for scene flow, which explores a homogeneous space in which to fuse
cross-modal complementary knowledge with a physical interpretation. In visual
fusion, we find that events are complementary to RGB (relative vs. absolute)
in luminance space for high-dynamic-range imaging, and complementary to LiDAR
(local boundary vs. global shape) in scene-structure space for structural
integrity. In motion fusion, we find that RGB, events, and LiDAR are
complementary to each other (spatially dense and temporally dense vs.
spatiotemporally sparse) in correlation space, which motivates us to fuse
their motion correlations for motion continuity. The proposed hierarchical
fusion explicitly fuses the multimodal knowledge to progressively improve
scene flow from visual space to motion space. Extensive experiments verify the
superiority of the proposed method.
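As a loose illustration of motion-space fusion, the sketch below builds an all-pairs correlation volume per modality and combines them with a weighted sum. The abstract does not specify the actual fusion operator, so the weighting scheme and all function names here are assumptions.

```python
import torch

def correlation(feat1, feat2):
    """All-pairs correlation volume between two feature maps (B,C,H,W)."""
    B, C, H, W = feat1.shape
    f1 = feat1.flatten(2)                                  # (B,C,HW)
    f2 = feat2.flatten(2)
    corr = torch.einsum('bci,bcj->bij', f1, f2) / C ** 0.5
    return corr.view(B, H, W, H, W)

def fuse_motion_correlations(corr_rgb, corr_event, corr_lidar,
                             weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine per-modality correlation volumes; a weighted sum stands in
    for the paper's (unspecified) fusion of spatially/temporally dense vs.
    spatiotemporally sparse correlations."""
    w1, w2, w3 = weights
    return w1 * corr_rgb + w2 * corr_event + w3 * corr_lidar
```

Fusing at the level of correlation volumes, rather than raw features, is one way to sidestep the visual modality gap the abstract points to, since each volume already lives in a common matching space.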
- …