97 research outputs found
Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning
Compositional Zero-Shot Learning (CZSL) aims to recognize novel concepts
formed by known states and objects during training. Existing methods either
learn the combined state-object representation, challenging the generalization
of unseen compositions, or design two classifiers to identify state and object
separately from image features, ignoring the intrinsic relationship between
them. To jointly eliminate the above issues and construct a more robust CZSL
system, we propose a novel framework termed Decomposed Fusion with Soft Prompt
(DFSP), by involving vision-language models (VLMs) for unseen composition
recognition. Specifically, DFSP constructs a vector combination of learnable
soft prompts with state and object to establish the joint representation of
them. In addition, a cross-modal decomposed fusion module is designed between
the language and image branches, which decomposes state and object among
language features instead of image features. Notably, being fused with the
decomposed features, the image features can be more expressive for learning the
relationship with states and objects, respectively, to improve the response of
unseen compositions in the pair space, hence narrowing the domain gap between
seen and unseen sets. Experimental results on three challenging benchmarks
demonstrate that our approach significantly outperforms other state-of-the-art
methods by large margins.
Comment: 10 pages included reference, conferenc
Analysing Railway Safety with Systems Thinking
A railway system is a socio-technical system because its operation relies heavily on the management of human activities and operating procedures in the organisation, as well as on the execution of technical subsystems. The safety of these systems is therefore about more than engineering their technical subsystems. The latest approach from systems engineering considers that an accident is due to inadequately controlled interactions in the system and is usually a dynamic event chain that starts with the activation of a hazard and culminates in a complex process of sequential and concurrent events until the system is eventually out of control. Meanwhile, analysing the safety of these systems becomes much harder when simply applying the traditional techniques of safety assessment. First, a socio-technical system consists of many complex and non-linear interactions, and traditional techniques show their limits when analysing complex systems. Second, the safety of a socio-technical system requires a system perspective, which should take all the behaviours (desired, and undesired but predicted) of a system as a whole in the context of its environment. To capture the information needed, the models for these analyses (i.e., fault tree and FMEA table) would become too complex to give a systemic view of each individual causal factor. In this paper, we propose an approach based on systems thinking and system dynamics to analyse the safety of a socio-technical system. The case study of a tram accident is simple enough for the purpose of demonstrating its feasibility and benefits. A comparison with fault tree analysis was conducted, but it was not intended as an evaluation of our approach. The real evaluation will come from extensive applications in the real world.
Parallel implementation of 3D global MHD simulations for Earth’s magnetosphere
This paper presents a dynamic domain decomposition (D3) technique for implementing the parallelization of the piecewise parabolic method (PPM) for solving the ideal magnetohydrodynamics (MHD) equations. The key point of D3 is distributing the work dynamically among processes during the execution of the PPM algorithm. This parallel code utilizes D3 with a message passing interface (MPI) in order to permit efficient implementation on clusters of distributed memory machines and may also simultaneously exploit threading for multiprocessing shared address space architectures. 3D global MHD simulation results for the Earth’s magnetosphere on the massively parallel supercomputers Deepcomp 1800 and 6800 demonstrate the scalability and efficiency of our parallelization strategy.
GBE-MLZSL: A Group Bi-Enhancement Framework for Multi-Label Zero-Shot Learning
This paper investigates a challenging problem of zero-shot learning in the
multi-label scenario (MLZSL), wherein the model is trained to recognize
multiple unseen classes within a sample (e.g., an image) based on seen classes
and auxiliary knowledge, e.g., semantic information. Existing methods usually
resort to analyzing the relationship of various seen classes residing in a
sample from the dimension of spatial or semantic characteristics, and transfer
the learned model to unseen ones. But they ignore the effective integration of
local and global features. That is, in the process of inferring unseen classes,
global features represent the principal direction of the image in the feature
space, while local features should maintain uniqueness within a certain range.
Neglecting this integration makes the model lose its grasp of the main
components of the image. Relying only on the local existence of seen classes
during the inference stage introduces unavoidable bias. In this paper, we
propose a novel and effective group bi-enhancement framework for MLZSL, dubbed
GBE-MLZSL, to fully make use of such properties and enable a more accurate and
robust visual-semantic projection. Specifically, we split the feature maps into
several feature groups, of which each feature group can be trained
independently with the Local Information Distinguishing Module (LID) to ensure
uniqueness. Meanwhile, a Global Enhancement Module (GEM) is designed to
preserve the principal direction. Besides, a static graph structure is designed
to construct the correlation of local features. Experiments on large-scale
MLZSL benchmark datasets NUS-WIDE and Open-Images-v4 demonstrate that the
proposed GBE-MLZSL outperforms other state-of-the-art methods with large
margins.
Comment: 11 pages, 8 figure
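The feature-group split described in the GBE-MLZSL abstract can be illustrated with a minimal sketch. This is not the authors' LID or GEM module: it assumes a feature map of shape (channels, height, width), splits it into equal channel groups, and uses plain global average pooling as a stand-in for a global descriptor. The function names and the shapes are hypothetical.

```python
import numpy as np


def split_into_groups(feature_map, num_groups):
    """Split a (C, H, W) feature map into num_groups chunks along the channel axis."""
    channels = feature_map.shape[0]
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    return np.split(feature_map, num_groups, axis=0)


def global_descriptor(feature_map):
    """Average over all spatial positions to get one (C,) global vector.

    Assumption: average pooling is only a placeholder for a learned global module.
    """
    return feature_map.mean(axis=(1, 2))


# Hypothetical toy input: 8 channels on a 4x4 spatial grid.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))

groups = split_into_groups(fmap, num_groups=4)  # four local groups of 2 channels each
global_vec = global_descriptor(fmap)            # one global vector of length 8
```

Each group could then be processed independently, while the global vector summarizes the whole image; how the two are combined is specific to the paper and is not reproduced here.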
Galactic Cosmic Rays Modulation in the Vicinity of Corotating Interaction Regions: Observations During the Last Two Solar Minima
Corotating interaction regions (CIRs) are responsible for short-term recurrent cosmic-ray modulation, prominent near solar minima. Using the OMNI data sets for two periods of low solar activity near the beginning and end of solar cycle 24, superposed epoch analysis was performed on the solar wind plasma features for 53 and 43 events during periods 2007–2008 and 2017–2018, respectively. Turbulent properties of the solar wind were studied using the variance method for each CIR. Power spectra have been constructed for overlapped subintervals in the vicinity of stream interfaces (SIs). Using measured correlation lengths and turbulent energies, parallel and perpendicular diffusion mean free paths for cosmic-ray ions have been inferred based on two distinct theoretical formulations. For the two periods with opposite solar polarities, our results show that unlike solar wind speed, magnetic field strength, flow pressure, and proton density are relatively higher during the latest period. Increased turbulent energy and reduced parallel transport coefficients of energetic particles at the SIs are observed. The diffusion coefficients follow the same trends during both periods. The perpendicular diffusion starts increasing nearly a day before SIs and is higher in the fast wind. Superposed epoch analysis is performed on the >120 MeV proton count rate obtained from the CRIS instrument on board the ACE spacecraft for the same events. The recorded proton rates have peaks half a day before a SI and reach their minimum more than a day after a SI and have a high anticorrelation with the perpendicular diffusion coefficient
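Superposed epoch analysis, as used in the abstract above, aligns a time series on a set of event times and averages across events. The following is a minimal sketch on synthetic data, assuming an evenly sampled series and integer event indices; the function name, window lengths, and the synthetic dip are hypothetical and do not reproduce the OMNI or ACE/CRIS data.

```python
import numpy as np


def superposed_epoch(series, event_indices, before, after):
    """Average windows of `series` aligned on each event index.

    Returns (lags, mean_profile). Events whose window would run past
    either end of the series are skipped.
    """
    series = np.asarray(series, dtype=float)
    windows = []
    for idx in event_indices:
        start, stop = idx - before, idx + after + 1
        if start < 0 or stop > len(series):
            continue
        windows.append(series[start:stop])
    if not windows:
        raise ValueError("no event has a complete window")
    lags = np.arange(-before, after + 1)
    # nanmean tolerates data gaps encoded as NaN in individual windows
    return lags, np.nanmean(np.vstack(windows), axis=0)


# Synthetic example: a dip two samples after each (hypothetical) event.
series = np.zeros(100)
events = [20, 50, 80]
for e in events:
    series[e + 2] = -1.0

lags, profile = superposed_epoch(series, events, before=5, after=5)
```

In this toy case the averaged profile reaches its minimum at a lag of two samples after the event, which is the kind of systematic offset the technique is designed to expose.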
Semisupervised hypergraph discriminant learning for dimensionality reduction of hyperspectral image
Semisupervised learning is an effective technique to represent the intrinsic features of a hyperspectral image (HSI), which can reduce the cost to obtain the labeled information of samples. However, traditional semisupervised learning methods fail to consider multiple properties of an HSI, which has restricted the discriminant performance of feature representation. In this article, we introduce the hypergraph into semisupervised learning to reveal the complex multistructures of an HSI, and construct a semisupervised discriminant hypergraph learning (SSDHL) method by designing an intraclass hypergraph and an interclass graph with the labeled samples. SSDHL constructs an unsupervised hypergraph with the unlabeled samples. In addition, a total scatter matrix is used to measure the distribution of the labeled and unlabeled samples. Then, a low-dimensional projection function is constructed to compact the properties of the intraclass hypergraph and the unsupervised hypergraph, and simultaneously separate the characteristics of the interclass graph and the total scatter matrix. Finally, according to the objective function, we can obtain the projection matrix and the low-dimensional features. Experiments on three HSI data sets (Botswana, KSC, and PaviaU) show that the proposed method can achieve better classification results compared with a few state-of-the-art methods. The result indicates that SSDHL can simultaneously utilize the labeled and unlabeled samples to represent the homogeneous properties and restrain the heterogeneous characteristics of an HSI
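The abstract does not give the formula for the total scatter matrix. A minimal sketch, assuming the standard definition S_t = sum_i (x_i - m)(x_i - m)^T with m the mean of all samples (labeled and unlabeled together), is shown below. The function name and the toy data are hypothetical, and the paper's own formulation may differ.

```python
import numpy as np


def total_scatter(X):
    """Total scatter matrix of the rows of X.

    Assumes the standard definition: sum over samples of the outer
    product of each mean-centered sample with itself.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    return centered.T @ centered


# Hypothetical toy data: three samples with two features each.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 9.0]])
S_t = total_scatter(X)
```

Under this definition, S_t equals (n - 1) times the sample covariance matrix, so it measures the overall spread of the samples regardless of their labels.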
DRPT: Disentangled and Recurrent Prompt Tuning for Compositional Zero-Shot Learning
Compositional Zero-shot Learning (CZSL) aims to recognize novel concepts
composed of known knowledge without training samples. Standard CZSL either
identifies visual primitives or enhances unseen composed entities, and as a
result, entanglement between state and object primitives cannot be fully
utilized. Admittedly, vision-language models (VLMs) could naturally cope with
CZSL through tuning prompts, while uneven entanglement leads prompts to be
dragged into a local optimum. In this paper, we take a further step and introduce
a novel Disentangled and Recurrent Prompt Tuning framework termed DRPT to
better tap the potential of VLMs in CZSL. Specifically, the state and object
primitives are deemed as learnable tokens of vocabulary embedded in prompts and
tuned on seen compositions. Instead of jointly tuning state and object, we
devise a disentangled and recurrent tuning strategy to suppress the traction
force caused by entanglement and gradually optimize the token parameters,
leading to a better prompt space. Notably, we develop a progressive fine-tuning
procedure that allows for incremental updates to the prompts, optimizing the
object first, then the state, and vice versa. Meanwhile, the optimization of
state and object is independent, so clearer features can be learned to
further alleviate the issue of entanglement misleading the optimization. Moreover, we
quantify and analyze the entanglement in CZSL and supplement entanglement
rebalancing optimization schemes. DRPT surpasses representative
state-of-the-art methods on extensive benchmark datasets, demonstrating
superiority in both accuracy and efficiency
DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning
Federated learning (FL) has emerged as a powerful paradigm for learning from
decentralized data, and federated domain generalization further considers the
case where the test dataset (target domain) is absent from the decentralized
training data (source domains). However, most existing FL methods assume that domain labels
are provided during training, and their evaluation imposes explicit constraints
on the number of domains, which must strictly match the number of clients.
Because of the underutilization of numerous edge devices and additional
cross-client domain annotations in the real world, such restrictions may be
impractical and involve potential privacy leaks. In this paper, we propose an
efficient and novel approach, called Disentangled Prompt Tuning (DiPrompT), a
method that tackles the above restrictions by learning adaptive prompts for
domain generalization in a distributed manner. Specifically, we first design
two types of prompts, i.e., a global prompt to capture general knowledge across
all clients and domain prompts to capture domain-specific knowledge. They
eliminate the restriction on the one-to-one mapping between source domains and
local clients. Furthermore, a dynamic query metric is introduced to
automatically search for a suitable domain label for each sample, which includes
two-substep text-image alignments based on prompt tuning without
labor-intensive annotation. Extensive experiments on multiple datasets
demonstrate that our DiPrompT achieves superior domain generalization
performance over state-of-the-art FL methods when domain labels are not
provided, and even outperforms many centralized learning methods using domain
labels