310 research outputs found
A Decision Making Framework for Recommended Maintenance of Road Segments
With the rapid development of global road transportation, countries worldwide
have completed the construction of road networks. However, the ensuing
challenge lies in the maintenance of existing roads. It is well-known that
countries allocate limited budgets to road maintenance projects, and road
management departments face difficulties in making scientifically informed
maintenance decisions. Therefore, integrating artificial intelligence
decision-making techniques to mine historical maintenance data thoroughly and
apply the findings to scientific road maintenance decision-making
has become an urgent issue. This integration aims to provide road management
departments with more scientific tools and evidence for decision-making. The
framework proposed in this paper primarily addresses the following four issues:
1) predicting the pavement performance of various routes, 2) determining the
prioritization of maintenance routes, 3) making maintenance decisions based on
the evaluation of the effects of past maintenance, and considering
comprehensive technical and management indicators, and 4) determining the
prioritization of maintenance sections based on the maintenance effectiveness
and recommended maintenance effectiveness. By tackling these four problems, the
framework enables intelligent decision-making for the optimal maintenance plan
and maintenance sections, taking into account limited funding and historical
maintenance management experience. Comment: 19 pages, 8 figures, 4 tables, and 2 algorithms
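The abstract does not say how sections are ranked under a limited budget. As an illustration only, the following minimal sketch assumes a simple greedy rule: rank sections by predicted effectiveness per unit cost and fund them in that order. The `Section` fields, the `prioritize` helper, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    cost: float           # hypothetical maintenance cost
    effectiveness: float  # hypothetical predicted maintenance benefit


def prioritize(sections, budget):
    """Greedy rule (an assumption, not the paper's method): rank sections by
    effectiveness per unit cost, then fund them in order while budget remains."""
    ranked = sorted(sections, key=lambda s: s.effectiveness / s.cost, reverse=True)
    selected, remaining = [], budget
    for s in ranked:
        if s.cost <= remaining:
            selected.append(s.name)
            remaining -= s.cost
    return selected, remaining


# Made-up example data.
sections = [
    Section("A", cost=40.0, effectiveness=8.0),  # ratio 0.2
    Section("B", cost=30.0, effectiveness=9.0),  # ratio 0.3
    Section("C", cost=50.0, effectiveness=5.0),  # ratio 0.1
]
selected, remaining = prioritize(sections, budget=80.0)
print(selected, remaining)
```

A greedy ratio rule is easy to audit but is not guaranteed to be optimal; an exact budget-constrained selection would be a knapsack problem.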
Abrasion Behavior of High Manganese Steel under Low Impact Energy and Corrosive Conditions
The abrasion behavior of high manganese steel is investigated under three levels of impact energy in acid-ironstone slurry. The wear tests were carried out on an MLDF-10 tester with impact energies of 0.7 J, 1.2 J, and 1.7 J. The impact abrasion properties of high manganese steel under corrosive conditions were compared using the wear mass loss curves. The wear mechanism was analysed by SEM examination of the worn surface and optical metallography of the section perpendicular to the worn surface. The results show that the impact energy has a great effect on the impact corrosion and abrasion properties of the steel. Under corrosive conditions, the abrasion mechanism at an impact energy of 0.7 J is mainly microploughing and breakage of plastically deformed ridges and wedges. It is mainly the spalling of plastically deformed ridges and wedges at 1.2 J and the spalling of the work-hardened layer at 1.7 J after prolonged testing.
Learning a Deep Color Difference Metric for Photographic Images
Most well-established and widely used color difference (CD) metrics are
handcrafted and subject-calibrated against uniformly colored patches, which do
not generalize well to photographic images characterized by natural scene
complexities. Constructing CD formulae for photographic images is still an
active research topic in imaging/illumination, vision science, and color
science communities. In this paper, we aim to learn a deep CD metric for
photographic images with four desirable properties. First, it aligns well with
the observations in vision science that color and form are linked inextricably
in visual cortical processing. Second, it is a proper metric in the
mathematical sense. Third, it computes accurate CDs between photographic
images, differing mainly in color appearances. Fourth, it is robust to mild
geometric distortions (e.g., translation or due to parallax), which are often
present in photographic images of the same scene captured by different digital
cameras. We show that all these properties can be satisfied at once by learning
a multi-scale autoregressive normalizing flow for feature transform, followed
by the Euclidean distance which is linearly proportional to the human
perceptual CD. Quantitative and qualitative experiments on the large-scale SPCD
dataset demonstrate the promise of the learned CD metric.
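The "proper metric" property can be spelled out as follows. This is a sketch under stated assumptions: the symbols $f$ (the learned feature transform) and $d$ (the resulting CD measure) are introduced here for illustration, and $f$ is assumed to be injective, as an invertible normalizing flow is.

```latex
% Assumed notation: f is the learned feature transform, x, y, z are images.
\[
  d(x, y) = \lVert f(x) - f(y) \rVert_2
\]
% Metric axioms inherited from the Euclidean norm:
\begin{align*}
  d(x, y) &\ge 0, \\
  d(x, y) = 0 &\iff f(x) = f(y) \iff x = y
      && \text{(second step uses injectivity of } f\text{)}, \\
  d(x, y) &= d(y, x), \\
  d(x, z) &\le \lVert f(x) - f(y) \rVert_2 + \lVert f(y) - f(z) \rVert_2
           = d(x, y) + d(y, z).
\end{align*}
```

Without injectivity, two distinct images could map to the same feature vector, and $d$ would only be a pseudometric.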
Light Field Diffusion for Single-View Novel View Synthesis
Single-view novel view synthesis, the task of generating images from new
viewpoints based on a single reference image, is an important but challenging
task in computer vision. Recently, the Denoising Diffusion Probabilistic Model
(DDPM) has become popular in this area due to its strong ability to generate
high-fidelity images. However, current diffusion-based methods directly rely on
camera pose matrices as viewing conditions, globally and implicitly introducing
3D constraints. These methods may suffer from inconsistency among generated
images from different perspectives, especially in regions with intricate
textures and structures. In this work, we present Light Field Diffusion (LFD),
a conditional diffusion-based model for single-view novel view synthesis.
Unlike previous methods that employ camera pose matrices, LFD transforms the
camera view information into light field encoding and combines it with the
reference image. This design introduces local pixel-wise constraints within the
diffusion models, thereby encouraging better multi-view consistency.
Experiments on several datasets show that our LFD can efficiently generate
high-fidelity images and maintain better 3D consistency even in intricate
regions. Our method can generate images with higher quality than NeRF-based
models, and we obtain sample quality similar to other diffusion-based models
but with only one-third of the model size.
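The abstract does not specify the exact form of the light field encoding. As a rough illustration of how a camera pose can be turned into per-pixel conditioning, the sketch below assumes a common parameterization: each pixel gets the origin and unit direction of its viewing ray. The `pixel_rays` helper, the pinhole conventions, and the example pose are all hypothetical.

```python
import math


def pixel_rays(height, width, focal, cam_to_world):
    """Return a height x width grid of 6-tuples (origin xyz + unit direction xyz).

    Assumed conventions (not from the paper): pinhole camera looking along +z,
    x to the right, y down, principal point at the image centre, and
    cam_to_world given as a 3x4 row-major matrix [R | t].
    """
    cx, cy = width / 2.0, height / 2.0
    origin = tuple(row[3] for row in cam_to_world)
    rays = []
    for v in range(height):
        row_rays = []
        for u in range(width):
            # Ray direction through the pixel centre, in camera coordinates.
            d_cam = ((u + 0.5 - cx) / focal, (v + 0.5 - cy) / focal, 1.0)
            # Rotate the direction into world coordinates.
            d_world = [
                sum(cam_to_world[i][j] * d_cam[j] for j in range(3))
                for i in range(3)
            ]
            norm = math.sqrt(sum(c * c for c in d_world))
            direction = tuple(c / norm for c in d_world)
            row_rays.append(origin + direction)
        rays.append(row_rays)
    return rays


# Example pose: identity rotation, camera centre at (0, 0, 2).
example_pose = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
]
rays = pixel_rays(height=2, width=2, focal=1.0, cam_to_world=example_pose)
print(rays[0][0])
```

Because every pixel carries its own ray, such an encoding can be concatenated with the reference image channel-wise, which is one way to obtain local, pixel-wise conditioning rather than a single global pose vector.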
Regeneration under crisis - research on the renewal and evolution of the forms of future urban residential communities
CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer
Reconstructing personalized animatable head avatars has significant
implications in the fields of AR/VR. Existing methods for achieving explicit
face control of 3D Morphable Models (3DMM) typically rely on multi-view images
or videos of a single subject, making the reconstruction process complex.
Additionally, the traditional rendering pipeline is time-consuming, limiting
real-time animation possibilities. In this paper, we introduce CVTHead, a novel
approach that generates controllable neural head avatars from a single
reference image using point-based neural rendering. CVTHead considers the
sparse vertices of the mesh as the point set and employs the proposed
Vertex-feature Transformer to learn local feature descriptors for each vertex.
This enables the modeling of long-range dependencies among all the vertices.
Experimental results on the VoxCeleb dataset demonstrate that CVTHead achieves
comparable performance to state-of-the-art graphics-based methods. Moreover, it
enables efficient rendering of novel human heads with various expressions, head
poses, and camera views. These attributes can be explicitly controlled using
the coefficients of 3DMMs, facilitating versatile and realistic animation in
real-time scenarios. Comment: WACV202
Transfer Learning with Optimal Transportation and Frequency Mixup for EEG-based Motor Imagery Recognition
Peer reviewed. Publisher PD
Turning a CLIP Model into a Scene Text Spotter
We exploit the potential of the large-scale Contrastive Language-Image
Pretraining (CLIP) model to enhance scene text detection and spotting tasks,
transforming it into a robust backbone, FastTCM-CR50. This backbone utilizes
visual prompt learning and cross-attention in CLIP to extract image and
text-based prior knowledge. Using predefined and learnable prompts,
FastTCM-CR50 introduces an instance-language matching process to enhance the
synergy between image and text embeddings, thereby refining text regions. Our
Bimodal Similarity Matching (BSM) module facilitates dynamic language prompt
generation, enabling offline computations and improving performance.
FastTCM-CR50 offers several advantages: 1) It can enhance existing text
detectors and spotters, improving performance by an average of 1.7% and 1.5%,
respectively. 2) It outperforms the previous TCM-CR50 backbone, yielding an
average improvement of 0.2% and 0.56% in text detection and spotting tasks,
along with a 48.5% increase in inference speed. 3) It showcases robust few-shot
training capabilities. Utilizing only 10% of the supervised data, FastTCM-CR50
improves performance by an average of 26.5% and 5.5% for text detection and
spotting tasks, respectively. 4) It consistently enhances performance on
out-of-distribution text detection and spotting datasets, particularly the
NightTime-ArT subset from ICDAR2019-ArT and the DOTA dataset for oriented
object detection. The code is available at https://github.com/wenwenyu/TCM. Comment: arXiv admin note: text overlap with arXiv:2302.1433