Cascade Residual Learning: A Two-stage Convolutional Neural Network for Stereo Matching
Leveraging the recent developments in convolutional neural networks
(CNNs), matching dense correspondence from a stereo pair has been cast as a
learning problem, with performance exceeding traditional approaches. However,
it remains challenging to generate high-quality disparities for the inherently
ill-posed regions. To tackle this problem, we propose a novel cascade CNN
architecture composed of two stages. The first stage advances the recently
proposed DispNet by equipping it with extra up-convolution modules, leading to
disparity images with more details. The second stage explicitly rectifies the
disparity initialized by the first stage; it couples with the first stage and
generates residual signals across multiple scales. The summation of the outputs
from the two stages gives the final disparity. As opposed to directly learning
the disparity at the second stage, we show that residual learning provides more
effective refinement. Moreover, it also benefits the training of the overall
cascade network. Experimentation shows that our cascade residual learning
scheme provides state-of-the-art performance for matching stereo
correspondence. By the time of the submission of this paper, our method ranks
first in the KITTI 2015 stereo benchmark, surpassing the prior works by a
noteworthy margin.
Comment: Accepted at ICCVW 2017. The first two authors contributed equally to
this paper.
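The summation step described in the abstract above can be illustrated with a minimal sketch. It assumes NumPy arrays stand in for disparity maps; `refine_disparity`, `downsample`, and the toy values are hypothetical names and data for illustration only, not the authors' network or code.

```python
import numpy as np

def downsample(disparity, factor):
    """Average-pool a disparity map by `factor` (hypothetical helper).

    Disparity values are divided by `factor` because pixel offsets shrink
    with the image resolution.
    """
    h, w = disparity.shape
    pooled = disparity.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return pooled / factor

def refine_disparity(initial_per_scale, residual_per_scale):
    """Final disparity at each scale = first-stage disparity + second-stage residual."""
    return [d + r for d, r in zip(initial_per_scale, residual_per_scale)]

# Toy stand-ins for network outputs (assumed values, not real predictions).
d1_full = np.full((4, 4), 8.0)                        # first-stage disparity, full resolution
d1 = [d1_full, downsample(d1_full, 2)]                # same estimate at two scales
r2 = [np.full((4, 4), -0.5), np.full((2, 2), 0.25)]   # second-stage residuals per scale
final = refine_disparity(d1, r2)
```

In this form the second stage only has to predict a correction to the first-stage estimate, which is the motivation the abstract gives for residual learning over predicting the disparity directly.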
A Study on Satisfaction and Willingness to Continuously Participate in Business Class Virtual Simulation Competitions -- An Empirical Analysis Based on Technology Acceptance Model (TAM)
Combining virtual simulation technology with innovation and entrepreneurship education lets students interact with both the simulated and the real environment during virtual simulation. This attracts active participation and helps college students understand the risks and opportunities that enterprises face as they operate and grow, improving their enterprise operation and management skills and deepening both their theoretical understanding and their practical ability. In this paper, we construct a technology acceptance model (TAM) with seven dimensions: perceived usefulness, perceived ease of use, external environment, teacher guidance, willingness to participate, satisfaction, and willingness to continue to use, and use it to investigate students’ satisfaction and willingness to continue participating in virtual simulation competitions at several universities. The results of the data analysis show that the satisfaction and willingness to continue to participate in the virtual simulation competition play a good role
Diffuse gamma-ray emission around the Rosette Nebula
The Rosette Nebula is a young stellar cluster and molecular cloud complex,
located at the edge of the southern shell of the middle-aged SNR Monoceros Loop
(G205.5+0.5). We revisited the GeV gamma-ray emission towards the Rosette
Nebula using more than 13 years of Fermi-LAT data. We tested several spatial
models and found that compared to the result using the CO gas template only,
the inclusion of the HII gas template can significantly improve the likelihood
fit. We performed spectral analysis using the new spatial template. With both
the gamma-ray observation and CO+HII gas data, we derived the cosmic ray
spectrum of different components in the vicinity of the Rosette Nebula. We
found the gamma-ray emissions from the Rosette Nebula are substantially harder than
previously reported, which may imply that the Rosette Nebula is another example of
a gamma-ray emitting young massive star cluster.
Comment: 6 pages, 5 figures, published in MNRAS
ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: Semi-Supervised Video Object Segmentation
The Associating Objects with Transformers (AOT) framework has exhibited
exceptional performance in a wide range of complex scenarios for video object
segmentation. In this study, we introduce MSDeAOT, a variant of the AOT series
that incorporates transformers at multiple feature scales. Leveraging the
hierarchical Gated Propagation Module (GPM), MSDeAOT efficiently propagates
object masks from previous frames to the current frame using a feature scale
with a stride of 16. Additionally, we employ GPM in a more refined feature
scale with a stride of 8, leading to improved accuracy in detecting and
tracking small objects. Through the implementation of test-time augmentations
and model ensemble techniques, we achieve the top-ranking position in the
EPIC-KITCHEN VISOR Semi-supervised Video Object Segmentation Challenge.
Comment: Top 1 solution for EPIC-KITCHEN Challenge 2023: Semi-Supervised Video
Object Segmentation
ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: TREK-150 Single Object Tracking
The Associating Objects with Transformers (AOT) framework has exhibited
exceptional performance in a wide range of complex scenarios for video object
tracking and segmentation. In this study, we convert the bounding boxes to
masks in reference frames with the help of the Segment Anything Model (SAM) and
Alpha-Refine, and then propagate the masks to the current frame, transforming
the task from Video Object Tracking (VOT) to video object segmentation (VOS).
Furthermore, we introduce MSDeAOT, a variant of the AOT series that
incorporates transformers at multiple feature scales. MSDeAOT efficiently
propagates object masks from previous frames to the current frame using two
feature scales with strides of 16 and 8. As a testament to the effectiveness of our design,
we achieved 1st place in the EPIC-KITCHENS TREK-150 Object Tracking
Challenge.
Comment: Top 1 solution for EPIC-KITCHEN Challenge 2023: TREK-150 Single
Object Tracking. arXiv admin note: text overlap with arXiv:2307.0201
On symbology and differential equations of Feynman integrals from Schubert analysis
We take the first step in generalizing the so-called "Schubert analysis",
originally proposed in twistor space for four-dimensional kinematics, to the
study of symbol letters and more detailed information on canonical differential
equations for Feynman integral families in general dimensions with general
masses. The basic idea is to work in embedding space and compute possible
cross-ratios built from (Lorentz products of) maximal cut solutions for all
integrals in the family. We demonstrate the power of the method using the most
general one-loop integrals, as well as various two-loop planar integral
families (such as sunrise, double-triangle and double-box) in general
dimensions. Not only can we obtain all symbol letters as cross-ratios from
maximal-cut solutions, but we also reproduce entries in the canonical
differential equations satisfied by a basis of dlog integrals.
Comment: 51 pages, many figures
Robotic grasp detection based on image processing and random forest
© 2019, The Author(s). Real-time grasp detection plays a key role in manipulation, and it is a complex task, especially when detecting how to grasp novel objects. This paper proposes a fast and accurate approach to detecting robotic grasps. The main idea is to perform grasping of novel objects in a typical RGB-D scene view. Our goal is not to find the best grasp for every object but to obtain the locally optimal grasps among candidate grasp rectangles. Our detection work makes three main contributions. First, an improved graph segmentation approach is used for object detection; it separates objects from the background directly and quickly. Second, we develop a morphological image processing method to generate a set of candidate grasp rectangles, which avoids a global search for grasp rectangles. Finally, we train a random forest model to predict grasps and achieve an accuracy of 94.26%. The model is mainly used to score every element in our candidate grasp set, and the one with the highest score is converted to the final grasp configuration for the robot. For real-world experiments, we set up our system on a tabletop scene with multiple objects, and when executing robotic grasps, we control a Baxter robot with a different inverse kinematics strategy rather than the built-in one
Simulation of a Solar Jet Formed from an Untwisting Flux Rope Interacting with a Null Point
Coronal jets are eruptions identified by a collimated, sometimes twisted
spire. They are small-scale energetic events compared with flares. Using
multi-wavelength observations from the Solar Dynamics Observatory/Atmospheric
Imaging Assembly (SDO/AIA) and a magnetogram from Hinode/Spectro-Polarimeter
(Hinode/SP), we study the formation and evolution of a jet occurring on 2019
March 22 in the active region NOAA 12736. A zero-β magnetohydrodynamic
(MHD) simulation is conducted to probe the initiation mechanisms and appearance
of helical motion during this jet event. As the simulation reveals, there are
two pairs of field lines at the jet base, indicating two distinct magnetic
structures. One structure outlines a flux rope lying low above the photosphere
in the north of a bald patch region and the other structure shows a null point
high in the corona in the south. The untwisting motions of the observed flux
rope were recovered by adding an anomalous (artificial) resistivity in the
simulation. A reconnection occurs at the bald patch in the flux rope structure,
which is moving upwards and simultaneously encounters the field lines of the
null point structure. The interaction of the two structures results in the jet
while the twist of the flux rope is transferred to the jet by the reconnected
field lines. The rotational motion of the flux rope is proposed to be an
underlying trigger of this process and responsible for helical motions in the
jet spire.
Comment: 17 pages, 9 figures. Accepted for publication in The Astrophysical
Journal
JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery
In this study, we focus on the problem of 3D human mesh recovery from a
single image under obscured conditions. Most state-of-the-art methods aim to
improve 2D alignment technologies, such as spatial averaging and 2D joint
sampling. However, they tend to neglect the crucial aspect of 3D alignment by
improving 3D representations. Furthermore, recent methods struggle to separate
the target human from occlusion or background in crowded scenes as they
optimize the 3D space of the target human with 3D joint coordinates as local
supervision. To address these issues, a desirable method would involve a
framework for fusing 2D and 3D features and a strategy for optimizing the 3D
space globally. Therefore, this paper presents the 3D JOint contrastive learning
with TRansformers (JOTR) framework for handling occluded 3D human mesh
recovery. Our method includes an encoder-decoder transformer architecture to
fuse 2D and 3D representations for achieving 2D&3D aligned results in a
coarse-to-fine manner and a novel 3D joint contrastive learning approach for
adding explicit global supervision for the 3D feature space. The contrastive
learning approach includes two contrastive losses: joint-to-joint contrast for
enhancing the similarity of semantically similar voxels (i.e., human joints),
and joint-to-non-joint contrast for ensuring discrimination from others (e.g.,
occlusions and background). Qualitative and quantitative analyses demonstrate
that our method outperforms state-of-the-art competitors on both
occlusion-specific and standard benchmarks, significantly improving the
reconstruction of occluded humans.
Comment: Camera Ready Version for ICCV 202
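The contrast between joint and non-joint features described above can be sketched with a generic supervised-contrastive (InfoNCE-style) loss. This is a swapped-in, simplified stand-in that folds both contrasts into one term, not the two losses defined in the JOTR paper; `joint_contrastive_loss`, the temperature value, and the toy features are all assumptions for illustration.

```python
import numpy as np

def joint_contrastive_loss(features, is_joint, temperature=0.1):
    """Generic contrastive loss over voxel features (illustrative only).

    features: (N, D) array of feature vectors.
    is_joint: (N,) boolean mask; True marks features sampled at human joints.
    For each joint anchor, other joints act as positives, and every other
    feature (joint or not) appears in the softmax denominator.
    Requires at least two joint features.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature                 # scaled cosine similarities
    joints = np.flatnonzero(is_joint)
    losses = []
    for a in joints:
        positives = [j for j in joints if j != a]
        others = [k for k in range(len(f)) if k != a]
        log_denominator = np.log(np.sum(np.exp(sim[a, others])))
        losses.append(np.mean([log_denominator - sim[a, p] for p in positives]))
    return float(np.mean(losses))

# Toy features: two nearby vectors and two vectors pointing elsewhere.
feats = np.array([[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0], [0.0, -1.0]])
aligned = joint_contrastive_loss(feats, np.array([True, True, False, False]))
mismatched = joint_contrastive_loss(feats, np.array([True, False, True, False]))
```

The loss is small when joint features cluster together and away from non-joint features (`aligned`), and large when features labeled as joints point in opposite directions (`mismatched`).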