20 research outputs found

    Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning

    Full text link
    Recent compositional zero-shot learning (CZSL) methods adapt pre-trained vision-language models (VLMs) by constructing trainable prompts only for composed state-object pairs. Relying on learning the joint representation of seen compositions, these methods ignore the explicit modeling of the state and object, thus limiting the exploitation of pre-trained knowledge and generalization to unseen compositions. With a particular focus on the universality of the solution, in this work, we propose a novel paradigm for CZSL models that establishes three identification branches (i.e., Multi-Path) to jointly model the state, object, and composition. The presented Troika is our implementation that aligns the branch-specific prompt representations with decomposed visual features. To calibrate the bias between semantically similar multi-modal representations, we further devise a Cross-Modal Traction module into Troika that shifts the prompt representation towards the current visual content. We conduct extensive experiments on three popular benchmarks, where our method significantly outperforms existing methods in both closed-world and open-world settings. Comment: 14 pages

    Logic Diffusion for Knowledge Graph Reasoning

    Full text link
    Most recent works focus on answering first-order logical queries to explore knowledge graph reasoning via multi-hop logic predictions. However, existing reasoning models are limited by the circumscribed logical paradigms of training samples, which leads to weak generalization to unseen logic. To address these issues, we propose a plug-in module called Logic Diffusion (LoD) to discover unseen queries from surroundings and achieve dynamical equilibrium between different kinds of patterns. The basic idea of LoD is relation diffusion and sampling sub-logic by random walking, as well as a special training mechanism called gradient adaption. Besides, LoD is accompanied by a novel loss function to further achieve robust logical diffusion when facing noisy data in training or testing sets. Extensive experiments on four public datasets demonstrate the superiority of mainstream knowledge graph reasoning models with LoD over state-of-the-art. Moreover, our ablation study proves the general effectiveness of LoD on the noise-rich knowledge graph. Comment: 10 pages, 6 figures

    UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

    Full text link
    This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data. Particularly focusing on visual and linguistic modalities, we categorize data knowledge into five unit types, namely, in-image, in-text, cross-image, cross-text, and image-text. Following this protocol, we collect, from public international news, a large-scale multimodal knowledge graph dataset that consists of 1,388,568 nodes (with 571,791 vision-related ones) and 3,673,817 triplets. The dataset is also annotated with rich event tags, including 96 coarse labels and 9,185 fine labels, expanding its potential usage. To further verify that UKnow can serve as a standard protocol, we set up an efficient pipeline to help reorganize existing datasets under UKnow format. Finally, we benchmark the performance of some widely-used baselines on the tasks of common-sense reasoning and vision-language pre-training. Results on both our new dataset and the reformatted public datasets demonstrate the effectiveness of UKnow in knowledge organization and method evaluation. Code, dataset, conversion tool, and baseline models will be made public

    Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone

    Full text link
    Parameter-efficient tuning has become a trend in transferring large-scale foundation models to downstream applications. Existing methods typically embed some light-weight tuners into the backbone, where both the design and the learning of the tuners are highly dependent on the base model. This work offers a new tuning paradigm, dubbed Res-Tuning, which intentionally unbinds tuners from the backbone. With both theoretical and empirical evidence, we show that popular tuning approaches have their equivalent counterparts under our unbinding formulation, and hence can be integrated into our framework effortlessly. Thanks to the structural disentanglement, we manage to free the design of tuners from the network architecture, facilitating flexible combination of various tuning strategies. We further propose a memory-efficient variant of Res-Tuning, where the bypass (i.e., formed by a sequence of tuners) is effectively detached from the main branch, such that the gradients are back-propagated only to the tuners but not to the backbone. Such a detachment also allows one-time backbone forward for multi-task inference. Extensive experiments on both discriminative and generative tasks demonstrate the superiority of our method over existing alternatives from the perspectives of efficacy and efficiency. Project page: https://res-tuning.github.io/. Comment: Accepted to NeurIPS 202

    Post-Processing Approach for Refining Raw Land Cover Change Detection of Very High-Resolution Remote Sensing Images

    Get PDF
    In recent decades, land cover change detection (LCCD) using very high-spatial resolution (VHR) remote sensing images has been a major research topic. However, VHR remote sensing images usually contain a large amount of spectral noise, thereby reducing the reliability of the detected results. To solve this problem, this study proposes an object-based expectation maximization (OBEM) post-processing approach for enhancing raw LCCD results. OBEM defines a refinement of the labeling in a detected map to enhance its raw detection accuracies. Current mainstream change detection (preprocessing) techniques concentrate on proposing a change magnitude measurement or considering image spatial features to obtain a change detection map. The proposed OBEM approach is a new solution to enhance change detection accuracy by refining the raw result. Post-processing approaches can achieve accuracies competitive with the preprocessing methods, but in a direct and succinct manner. The proposed OBEM post-processing method synthetically considers multi-scale segmentation and expectation maximization algorithms to refine the raw change detection result. Then, the influence of the scale of segmentation on the LCCD accuracy of the proposed OBEM is investigated. Four pairs of remote sensing images are used in the experiments to evaluate the effectiveness of the proposed approach, including two pairs of aerial images (0.5 m/pixel resolution) that depict two landslide sites on Lantau Island, Hong Kong, China. In addition, the proposed approach is applied to and validated by two case studies: LCCD in Tianjin City, China (SPOT-5 satellite image with 2.5 m/pixel resolution) and the Mexico forest fire case (Landsat TM images with 30 m/pixel resolution). Quantitative evaluations show that the proposed OBEM post-processing approach can achieve better performance and higher accuracies than several commonly used preprocessing methods. To the best of the authors’ knowledge, this type of post-processing framework is first proposed here for the field of LCCD using VHR remote sensing images. This work was supported by the National Science Foundation China (61701396 and D010701), the Science Foundation of Hunan Province (Grant No. 2016JJ6100), the Natural Science Foundation of Shaan Xi Province (2017JQ4006), and the project from the China Postdoctoral Science Foundation (2015M572658XB). Peer Reviewed

    Multi-dimensional, multi-branch hyperspectral remote sensing image classification with limited training samples

    No full text
    Deep learning-based hyperspectral remote sensing image classification methods are currently a research hotspot. However, they suffer from issues such as large network parameter counts, complex calculations, and the need for a large amount of training data to achieve good classification results. Moreover, hyperspectral remote sensing images face challenges such as difficulty in obtaining the ground truth of land cover, limited availability of effective datasets for training, and endmember spectral variability, making it difficult for existing algorithm models to be widely adopted. To address these issues, this paper proposes a multi-branch classification model with multi-dimensional feature fusion, constructing lightweight deep network models for one-dimensional spectral, two-dimensional spatial, and three-dimensional depth feature extraction, respectively. This enriches feature information while reducing the parameters of each branch’s deep model, effectively improving the land cover classification accuracy using hyperspectral remote sensing images under limited training sample conditions. Experimental verification with open-source hyperspectral remote sensing datasets shows that the proposed classification method can obtain over 90% classification accuracy when the training set accounts for only 5% of the total dataset, which is significantly better than current mainstream deep network classification models.

    Electromagnetic forming of AA1060 sheet based on mixed forces generated by a three-coil dual-power system

    No full text
    The electromagnetic force used in electromagnetic forming is mainly divided into attraction and repulsion. Dual-coil attractive electromagnetic forming can be used in the field of sheet pit repair. However, the magnetic field and eddy current generated by the two coils compete with each other, and the energy utilization rate is low. Therefore, a compensation coil is introduced, and a three-coil dual-power electromagnetic sheet forming scheme based on mixed force is proposed and verified by simulation. It is found that the three-coil mixed force can effectively mitigate the competition between the magnetic field and eddy current. The loading of the mixed force is not a simple superposition of attraction and repulsion, but the mutual promotion of the two. The forming displacement of the three-coil mixed force forming scheme is 582% higher than that of the dual-coil attraction forming scheme, and 89% higher than that of the attract-first-then-repel forming scheme. The forming effect of the three-coil mixed force is related to the number of turns of the compensation coil. The research results can improve the energy utilization rate of electromagnetic forming and provide a new idea for the loading scheme of the electromagnetic forming force field.

    Performance Simulation of the Active Magnetic Regenerator under a Pulsed Magnetic Field

    No full text
    Magnetic refrigeration is acknowledged as a potential substitute for the conventional vapor-compression refrigeration technology, owing to its high efficiency and environmental friendliness. Existing magnetic refrigeration systems are mostly based on permanent magnets, which are characterized by lower magnetic field intensity, non-uniform magnetic field distribution, and lower operating frequency due to the moving parts, resulting in a low cooling capacity and small temperature difference. Thus, this study proposes the application of a pulsed magnetic field, with a high intensity and frequency, to a magnetic refrigeration system to achieve high performance. A verified numerical model is established to investigate the thermodynamic cycle and cooling performance of an active magnetic regenerator (AMR). The transient and steady-state performances of the AMR under pulsed and permanent magnetic fields are compared. The results suggest that an AMR can establish a stable temperature difference under a pulsed magnetic field 40 times faster than under a permanent magnetic field. The maximum steady-state cooling capacity under a pulsed magnetic field is 2.5 times that under a permanent magnetic field when the temperature difference is 20 K. Additionally, the effects of pulsed magnetic field waveforms, frequency, and intensity on the performance of the AMR are investigated under various utilization factors. These results can guide the improvement of room-temperature magnetic refrigerators.

    Multi-Scale Object Histogram Distance for LCCD Using Bi-Temporal Very-High-Resolution Remote Sensing Images

    Get PDF
    To improve the performance of land-cover change detection (LCCD) using remote sensing images, this study utilises spatial information in an adaptive and multi-scale manner. It proposes a novel multi-scale object histogram distance (MOHD) to measure the change magnitude between bi-temporal remote sensing images. The proposed MOHD involves three major steps. Firstly, multi-scale objects for the post-event image are extracted through a widely used algorithm called the fractional net evaluation approach. The pixels within a segmental object are taken to construct the pairwise frequency distribution histograms. An arithmetic frequency-mean feature is then defined from the red, green and blue band histograms. Secondly, a bin-to-bin distance is adapted to measure the change magnitude between the pairwise objects of bi-temporal images. The change magnitude image (CMI) of the bi-temporal images can be generated object by object. Finally, the classical Otsu binarisation method is used to divide the CMI into a binary change detection map. Experimental results based on two real datasets with different land-cover change scenes demonstrate the effectiveness of the proposed MOHD approach in detecting land-cover change compared with three widely used existing approaches.
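
    The three steps in the abstract above (per-object histograms, a bin-to-bin distance, Otsu thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes objects are already segmented, uses a single band instead of the red, green and blue frequency-mean feature, and uses a plain L1 distance. All function names, the bin count, and the sample pixel values are hypothetical.

```python
def histogram(values, bins=16, max_value=256):
    """Normalized frequency histogram of a non-empty list of pixel values."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v) * bins // max_value, bins - 1)] += 1
    return [c / len(values) for c in counts]


def bin_to_bin_distance(h1, h2):
    """L1 distance between corresponding bins (an assumed choice of metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def otsu_threshold(magnitudes, levels=64):
    """Otsu's method: pick the cut that maximizes between-class variance."""
    lo, hi = min(magnitudes), max(magnitudes)
    if hi == lo:
        return lo
    step = (hi - lo) / levels
    counts = [0] * levels
    for m in magnitudes:
        counts[min(int((m - lo) / step), levels - 1)] += 1
    total = len(magnitudes)
    sum_all = sum(i * c for i, c in enumerate(counts))
    best_var, best_level = -1.0, 0
    w0, sum0 = 0, 0.0
    for i in range(levels):
        w0 += counts[i]
        sum0 += i * counts[i]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_level = var, i
    return lo + (best_level + 1) * step


def change_map(objects_before, objects_after):
    """One boolean per object: True where the change magnitude passes Otsu."""
    magnitudes = [
        bin_to_bin_distance(histogram(b), histogram(a))
        for b, a in zip(objects_before, objects_after)
    ]
    threshold = otsu_threshold(magnitudes)
    return [m >= threshold for m in magnitudes]


# Hypothetical pixel values for five objects at two dates.
before = [[10, 12, 11, 13], [200, 210, 205, 199], [100, 101, 99, 102],
          [20, 22, 21, 23], [50, 52, 51, 49]]
after = [[11, 12, 10, 13], [201, 209, 204, 198], [100, 102, 99, 101],
         [220, 222, 221, 223], [250, 240, 245, 251]]
print(change_map(before, after))  # [False, False, False, True, True]
```

    In this toy input the first three objects keep the same histogram and the last two shift to much brighter bins, so only the last two are flagged as changed.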
