97 research outputs found

    Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning

    Compositional Zero-Shot Learning (CZSL) aims to recognize novel concepts formed from states and objects seen during training. Existing methods either learn a combined state-object representation, which hampers generalization to unseen compositions, or design two classifiers to identify state and object separately from image features, ignoring the intrinsic relationship between them. To address both issues and construct a more robust CZSL system, we propose a novel framework termed Decomposed Fusion with Soft Prompt (DFSP), which uses vision-language models (VLMs) for unseen composition recognition. Specifically, DFSP constructs a vector combination of learnable soft prompts with state and object to establish their joint representation. In addition, a cross-modal decomposed fusion module is designed between the language and image branches, which decomposes state and object among language features instead of image features. Notably, once fused with the decomposed features, the image features become more expressive for learning the relationship with states and objects, respectively, improving the response to unseen compositions in the pair space and hence narrowing the domain gap between seen and unseen sets. Experimental results on three challenging benchmarks demonstrate that our approach significantly outperforms other state-of-the-art methods by large margins. Comment: 10 pages included reference, conferenc

    Analysing Railway Safety with Systems Thinking

    A railway system is a socio-technical system because its operation relies heavily on the management of human activities and operating procedures in the organisation, as well as on the execution of technical subsystems. The safety of such systems is therefore about more than engineering their technical subsystems. The latest approach from systems engineering considers that an accident results from inadequately controlled interactions in the system and is usually a dynamic event chain that starts with the activation of a hazard and develops through a complex process of sequential and concurrent events until the system is eventually out of control. Meanwhile, analysing the safety of these systems becomes much harder when traditional safety assessment techniques are simply applied. First, a socio-technical system involves many complex and non-linear interactions, and traditional techniques show their limits when analysing complex systems. Second, the safety of a socio-technical system requires a system perspective, which should take all the behaviours of a system (desired, and undesired but predicted) as a whole in the context of its environment. To capture the information needed, the models for these analyses (i.e., fault tree and FMEA table) become too complex to give a systemic view of each individual causal factor. In this paper, we propose an approach based on systems thinking and system dynamics to analyse the safety of a socio-technical system. The case study of a tram accident is simple enough to demonstrate its feasibility and benefits. A comparison with fault tree analysis was conducted, but not for the purpose of evaluating our approach. The real evaluation will come from extensive applications in the real world.

    Parallel implementation of 3D global MHD simulations for Earth’s magnetosphere

    This paper presents a dynamic domain decomposition (D3) technique for implementing the parallelization of the piecewise parabolic method (PPM) for solving the ideal magnetohydrodynamics (MHD) equations. The key point of D3 is distributing the work dynamically among processes during the execution of the PPM algorithm. This parallel code utilizes D3 with a message passing interface (MPI) in order to permit efficient implementation on clusters of distributed memory machines and may also simultaneously exploit threading for multiprocessing shared address space architectures. 3D global MHD simulation results for the Earth’s magnetosphere on the massively parallel supercomputers Deepcomp 1800 and 6800 demonstrate the scalability and efficiency of our parallelization strategy.

    GBE-MLZSL: A Group Bi-Enhancement Framework for Multi-Label Zero-Shot Learning

    This paper investigates the challenging problem of zero-shot learning in the multi-label scenario (MLZSL), wherein the model is trained to recognize multiple unseen classes within a sample (e.g., an image) based on seen classes and auxiliary knowledge, e.g., semantic information. Existing methods usually analyze the relationship among the various seen classes residing in a sample in terms of spatial or semantic characteristics, and transfer the learned model to unseen ones. However, they ignore the effective integration of local and global features. That is, in the process of inferring unseen classes, global features represent the principal direction of the image in the feature space, while local features should maintain uniqueness within a certain range. Neglecting this integration makes the model lose its grasp of the main components of the image. Relying only on the local existence of seen classes during the inference stage introduces unavoidable bias. In this paper, we propose a novel and effective group bi-enhancement framework for MLZSL, dubbed GBE-MLZSL, to make full use of such properties and enable a more accurate and robust visual-semantic projection. Specifically, we split the feature maps into several feature groups, each of which can be trained independently with the Local Information Distinguishing Module (LID) to ensure uniqueness. Meanwhile, a Global Enhancement Module (GEM) is designed to preserve the principal direction. In addition, a static graph structure is designed to construct the correlation of local features. Experiments on the large-scale MLZSL benchmark datasets NUS-WIDE and Open-Images-v4 demonstrate that the proposed GBE-MLZSL outperforms other state-of-the-art methods by large margins. Comment: 11 pages, 8 figure
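
    The grouping step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say along which axis the feature maps are split, so splitting evenly along the channel axis is an assumption here, and the function name `split_into_groups` is hypothetical rather than taken from the paper.

```python
import numpy as np


def split_into_groups(feature_map, num_groups):
    """Split a (channels, height, width) feature map into equal channel groups.

    Assumption: groups are formed along the channel axis; the abstract
    does not specify this detail.
    """
    feature_map = np.asarray(feature_map)
    channels = feature_map.shape[0]
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    # np.split with an integer produces equally sized sub-arrays.
    return np.split(feature_map, num_groups, axis=0)


# Toy feature map with 8 channels of size 2x2, split into 4 groups.
toy_map = np.arange(8 * 2 * 2).reshape(8, 2, 2)
groups = split_into_groups(toy_map, 4)
```

    Each group could then be passed to its own local module, while the unsplit map is kept for any global branch.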

    Galactic Cosmic Rays Modulation in the Vicinity of Corotating Interaction Regions: Observations During the Last Two Solar Minima

    Corotating interaction regions (CIRs) are responsible for short-term recurrent cosmic-ray modulation, prominent near solar minima. Using the OMNI data sets for two periods of low solar activity near the beginning and end of solar cycle 24, superposed epoch analysis was performed on the solar wind plasma features for 53 and 43 events during periods 2007–2008 and 2017–2018, respectively. Turbulent properties of the solar wind were studied using the variance method for each CIR. Power spectra have been constructed for overlapped subintervals in the vicinity of stream interfaces (SIs). Using measured correlation lengths and turbulent energies, parallel and perpendicular diffusion mean free paths for cosmic-ray ions have been inferred based on two distinct theoretical formulations. For the two periods with opposite solar polarities, our results show that unlike solar wind speed, magnetic field strength, flow pressure, and proton density are relatively higher during the latest period. Increased turbulent energy and reduced parallel transport coefficients of energetic particles at the SIs are observed. The diffusion coefficients follow the same trends during both periods. The perpendicular diffusion starts increasing nearly a day before SIs and is higher in the fast wind. Superposed epoch analysis is performed on the >120 MeV proton count rate obtained from the CRIS instrument on board the ACE spacecraft for the same events. The recorded proton rates have peaks half a day before a SI and reach their minimum more than a day after a SI and have a high anticorrelation with the perpendicular diffusion coefficient.
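
    Superposed epoch analysis, as used in this abstract, aligns a time series on a set of key times (here, stream interfaces) and averages across events. The sketch below shows the generic technique on synthetic data; the function name, window sizes, and the choice to skip events with incomplete windows are assumptions for illustration, not details from the paper.

```python
import numpy as np


def superposed_epoch(series, event_indices, before, after):
    """Average windows of `series` aligned on each event index.

    Returns the lag axis, the mean profile across events, and the number
    of events actually used.
    """
    series = np.asarray(series, dtype=float)
    windows = []
    for idx in event_indices:
        start, stop = idx - before, idx + after + 1
        if start < 0 or stop > len(series):
            continue  # assumption: drop events whose window leaves the record
        windows.append(series[start:stop])
    if not windows:
        raise ValueError("no event has a complete window")
    stacked = np.vstack(windows)
    lags = np.arange(-before, after + 1)
    # nanmean ignores data gaps marked as NaN in individual events.
    return lags, np.nanmean(stacked, axis=0), stacked.shape[0]


# Synthetic record: a unit spike at each hypothetical event time.
record = np.zeros(100)
events = [2, 20, 50, 80]
record[events] = 1.0
lags, profile, used = superposed_epoch(record, events, before=5, after=5)
```

    The event at index 2 is dropped because its window would start before the record begins, so three events contribute to the averaged profile.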

    Semisupervised hypergraph discriminant learning for dimensionality reduction of hyperspectral image

    Semisupervised learning is an effective technique to represent the intrinsic features of a hyperspectral image (HSI), and it can reduce the cost of obtaining labeled information for samples. However, traditional semisupervised learning methods fail to consider multiple properties of an HSI, which restricts the discriminant performance of the feature representation. In this article, we introduce the hypergraph into semisupervised learning to reveal the complex multistructures of an HSI, and construct a semisupervised discriminant hypergraph learning (SSDHL) method by designing an intraclass hypergraph and an interclass graph with the labeled samples. SSDHL constructs an unsupervised hypergraph with the unlabeled samples. In addition, a total scatter matrix is used to measure the distribution of the labeled and unlabeled samples. Then, a low-dimensional projection function is constructed to compact the properties of the intraclass hypergraph and the unsupervised hypergraph, and simultaneously separate the characteristics of the interclass graph and the total scatter matrix. Finally, according to the objective function, we can obtain the projection matrix and the low-dimensional features. Experiments on three HSI data sets (Botswana, KSC, and PaviaU) show that the proposed method can achieve better classification results compared with a few state-of-the-art methods. The result indicates that SSDHL can simultaneously utilize the labeled and unlabeled samples to represent the homogeneous properties and restrain the heterogeneous characteristics of an HSI.
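
    The total scatter matrix mentioned in this abstract has a standard textbook form, the sum of outer products of mean-centred samples. The sketch below computes that standard form; whether SSDHL normalizes or weights it differently is not stated in the abstract, so the unnormalized version and the function name `total_scatter` are assumptions.

```python
import numpy as np


def total_scatter(samples):
    """Unnormalized total scatter: sum over i of (x_i - mean)(x_i - mean)^T.

    `samples` has one row per sample (labeled and unlabeled together)
    and one column per feature, e.g. per spectral band.
    """
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean(axis=0)
    return centered.T @ centered


# Four toy samples with two features each.
toy_samples = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
scatter = total_scatter(toy_samples)
```

    For these toy samples the mean is (1, 1), so each feature contributes a squared deviation of 1 per sample and the cross terms cancel.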

    DRPT: Disentangled and Recurrent Prompt Tuning for Compositional Zero-Shot Learning

    Compositional Zero-shot Learning (CZSL) aims to recognize novel concepts composed of known knowledge without training samples. Standard CZSL either identifies visual primitives or enhances unseen composed entities, and as a result, entanglement between state and object primitives cannot be fully utilized. Admittedly, vision-language models (VLMs) could naturally cope with CZSL through prompt tuning, but uneven entanglement drags the prompts into local optima. In this paper, we take a further step and introduce a novel Disentangled and Recurrent Prompt Tuning framework, termed DRPT, to better tap the potential of VLMs in CZSL. Specifically, the state and object primitives are treated as learnable vocabulary tokens embedded in prompts and tuned on seen compositions. Instead of jointly tuning state and object, we devise a disentangled and recurrent tuning strategy to suppress the traction force caused by entanglement and gradually optimize the token parameters, leading to a better prompt space. Notably, we develop a progressive fine-tuning procedure that allows for incremental updates to the prompts, optimizing the object first, then the state, and vice versa. Meanwhile, the optimization of state and object is independent, so clearer features can be learned, further alleviating the issue of entanglement misleading optimization. Moreover, we quantify and analyze the entanglement in CZSL and supplement entanglement rebalancing optimization schemes. DRPT surpasses representative state-of-the-art methods on extensive benchmark datasets, demonstrating superiority in both accuracy and efficiency.

    DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning

    Federated learning (FL) has emerged as a powerful paradigm for learning from decentralized data, and federated domain generalization further considers the setting in which the test dataset (target domain) is absent from the decentralized training data (source domains). However, most existing FL methods assume that domain labels are provided during training, and their evaluation imposes explicit constraints on the number of domains, which must strictly match the number of clients. Because of the underutilization of numerous edge devices and additional cross-client domain annotations in the real world, such restrictions may be impractical and involve potential privacy leaks. In this paper, we propose an efficient and novel approach, called Disentangled Prompt Tuning (DiPrompT), a method that tackles the above restrictions by learning adaptive prompts for domain generalization in a distributed manner. Specifically, we first design two types of prompts, i.e., a global prompt to capture general knowledge across all clients and domain prompts to capture domain-specific knowledge. They eliminate the restriction on the one-to-one mapping between source domains and local clients. Furthermore, a dynamic query metric is introduced to automatically search for a suitable domain label for each sample, which includes two-substep text-image alignments based on prompt tuning without labor-intensive annotation. Extensive experiments on multiple datasets demonstrate that our DiPrompT achieves superior domain generalization performance over state-of-the-art FL methods when domain labels are not provided, and even outperforms many centralized learning methods using domain labels.