Frustum PointNets for 3D Object Detection from RGB-D Data
In this work, we study 3D object detection from RGB-D data in both indoor and
outdoor scenes. While previous methods focus on images or 3D voxels, often
obscuring natural 3D patterns and invariances of 3D data, we directly operate
on raw point clouds by popping up RGB-D scans. However, a key challenge of this
approach is how to efficiently localize objects in point clouds of large-scale
scenes (region proposal). Instead of solely relying on 3D proposals, our method
leverages both mature 2D object detectors and advanced 3D deep learning for
object localization, achieving efficiency as well as high recall for even small
objects. Benefiting from learning directly in raw point clouds, our method is
also able to precisely estimate 3D bounding boxes even under strong occlusion
or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection
benchmarks, our method outperforms the state of the art by remarkable margins
while having real-time capability.
Comment: 15 pages, 12 figures, 14 tables
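The abstract above says that 2D detections are used to narrow down where an object lies in the point cloud. One plausible reading, suggested by the word "frustum" in the title, is to keep only the 3D points whose image projection falls inside a 2D bounding box. The sketch below illustrates that idea only; the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the box format, and the function name `points_in_frustum` are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# (x, y, z) in the camera frame, with z pointing forward (assumed convention).
Point = Tuple[float, float, float]


def points_in_frustum(
    points: List[Point],
    box2d: Tuple[float, float, float, float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Keep the 3D points whose pinhole projection lies inside a 2D box.

    box2d is (u_min, v_min, u_max, v_max) in pixels (hypothetical format).
    """
    u_min, v_min, u_max, v_max = box2d
    kept = []
    for x, y, z in points:
        if z <= 0:
            # Points behind the camera cannot project into the image.
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 5.0), (3.0, 0.0, 5.0), (0.0, 0.0, -1.0)]
    print(points_in_frustum(cloud, (40, 40, 60, 60), 100, 100, 50, 50))
```

The retained points form a frustum-shaped subset of the scene, which a 3D network could then process instead of the full cloud.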
Leptogenesis and light scalar dark matter in a model
We discuss the possibility of light scalar dark matter in a
model, in which the dark matter carries
charge but it is a singlet in the Standard Model. We
consider the case that the right-handed neutrinos not only generate baryon
asymmetry but are also related to dark matter production. We assume that dark
matter production mainly comes from scattering associated with a pair of
right-handed neutrinos while other related processes are highly suppressed due
to the tiny charge of dark matter, and the dark
matter relic density is generated via the freeze-in mechanism. A feasible
parameter space is considered, and we find that the correct dark matter relic
density can be obtained without influencing the result of leptogenesis, and the
allowed dark matter mass region is
Resolved shear stress intensity coefficient and fatigue crack growth in large crystals
Fatigue crack growth in a large-grain Al alloy was studied. Fatigue crack growth is caused primarily by shear decohesion due to dislocation motion in the crack tip region. The crack paths in the large crystals are very irregular and zigzag. The crack planes are often inclined to the loading axis in both the in-plane direction and the thickness direction. The stress intensity factors of such inclined cracks are approximated from two-dimensional finite element calculations. The plastic deformation in a large crystal is highly anisotropic, and dislocation motion in such crystals is driven by the resolved shear stress. The resolved shear stress intensity coefficient in a cracked solid, RSSIC, is defined, and the coefficients for the slip systems at a crack tip are evaluated from the calculated stress intensity factors. The orientations of the crack planes are closely related to the slip planes with high RSSIC values. If a single slip system has a much higher RSSIC than all the others, the crack will follow the slip plane, and the slip plane becomes the crack plane. If two or more slip systems have a high RSSIC, the crack plane is the result of the decohesion processes on these active slip planes.
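The abstract does not give the formula for the RSSIC, so the sketch below shows only the underlying textbook quantity it builds on: the shear stress resolved onto a slip system, computed from a uniform stress tensor as the slip direction dotted with the traction on the slip plane. The function names and the example slip system are assumptions for illustration; this is not the crack-tip coefficient defined in the paper.

```python
import math
from typing import List, Sequence


def unit(v: Sequence[float]) -> List[float]:
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def resolved_shear_stress(
    sigma: Sequence[Sequence[float]],
    plane_normal: Sequence[float],
    slip_direction: Sequence[float],
) -> float:
    """Shear stress resolved on a slip system: tau = s . (sigma n).

    sigma is a 3x3 stress tensor; plane_normal and slip_direction need not
    be normalized, but are assumed to be perpendicular to each other.
    """
    n = unit(plane_normal)
    s = unit(slip_direction)
    # Traction vector acting on the slip plane.
    traction = [sum(sigma[i][j] * n[j] for j in range(3)) for i in range(3)]
    return sum(s[i] * traction[i] for i in range(3))


if __name__ == "__main__":
    # Uniaxial tension of 100 (arbitrary units) along z.
    sigma = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 100.0]]
    print(resolved_shear_stress(sigma, (0, 1, 1), (0, -1, 1)))
```

Evaluating this quantity for every slip system and comparing magnitudes mirrors the ranking logic described in the abstract, where the system with the highest value is the one the crack is expected to follow.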
THE BIOMECHANICAL DIAGNOSIS OF TRIP-PUTTING TECHNIQUE
The purpose of this paper was to apply biomechanical analysis to the shot put technique of a Chinese high performance athlete. The data on the trip-putting technique were recorded with a fixed-spot method using a PANASONIC DP200 Professional Camera at a speed of 50 fields/second. The AIJIE 2D Analysis System was used to analyze the images and collect relevant kinetic data. In order to determine the cause of an unsatisfactory performance, the focus of the study was on the technique during the transition phase.
During the transition phase, the interval between landings is too long and the distance between the two feet after landing is too small. In addition, it was important to determine the main reason for an apparent loss of power in this phase.
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding
This work presents DocPedia, a novel large multimodal model (LMM) for
versatile OCR-free document understanding, capable of parsing images up to
2,560×2,560 resolution. Unlike existing works, which either struggle with
high-resolution documents or give up the large language model and thereby
constrain vision or language ability, DocPedia directly processes visual input in
the frequency domain rather than the pixel space. This characteristic
enables DocPedia to capture a greater amount of visual and textual information
using a limited number of visual tokens. To consistently enhance both
perception and comprehension abilities of our model, we develop a dual-stage
training strategy and enrich instructions/annotations of all training tasks
covering multiple document types. Extensive quantitative and qualitative
experiments conducted on various publicly available benchmarks confirm the
mutual benefits of jointly learning perception and comprehension tasks. The
results provide further evidence of the effectiveness and superior performance
of our DocPedia over other methods.
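The abstract states that visual input is processed in the frequency domain but does not say which transform is used. As a rough illustration of what a frequency-domain representation of an image patch looks like, the sketch below applies an orthonormal 2D discrete cosine transform (DCT-II) to a small block. The choice of DCT, the block size, and the function names are assumptions, not details from the paper.

```python
import math
from typing import List, Sequence


def dct_1d(x: Sequence[float]) -> List[float]:
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        out.append(scale * total)
    return out


def dct_2d(block: Sequence[Sequence[float]]) -> List[List[float]]:
    """2D DCT-II computed separably: along rows first, then along columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed [vertical freq][horizontal freq].
    return [list(r) for r in zip(*cols)]


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    coeffs = dct_2d(flat)
    # A constant block has all of its energy in the top-left (DC) coefficient.
    print(round(coeffs[0][0], 6))
```

For smooth image regions most of the energy lands in a few low-frequency coefficients, which is one reason a frequency representation can be more compact than raw pixels.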