64 research outputs found

    Sparse Spatial Transformers for Few-Shot Learning

    Learning from limited data is a challenging task since the scarcity of data leads to poor generalization of the trained model. The classical global pooled representation is likely to lose useful local information. Recently, many few-shot learning methods address this challenge by using deep descriptors and learning a pixel-level metric. However, using deep descriptors as feature representations may lose the contextual information of the image. Moreover, most of these methods deal with each class in the support set independently, which cannot sufficiently utilize discriminative information and task-specific embeddings. In this paper, we propose a novel Transformer-based neural network architecture called Sparse Spatial Transformers (SSFormers), which can find task-relevant features and suppress task-irrelevant features. Specifically, we first divide each input image into several image patches of different sizes to obtain dense local features. These features retain contextual information while expressing local information. Then, a sparse spatial transformer layer is proposed to find spatial correspondence between the query image and the entire support set to select task-relevant image patches and suppress task-irrelevant image patches. Finally, we propose to use an image patch matching module to calculate the distance between dense local representations and thus determine which category in the support set the query image belongs to. Extensive experiments on popular few-shot learning benchmarks show that our method achieves state-of-the-art performance.
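
    The final step of the abstract above, matching dense local representations between a query and each support class, can be illustrated with a generic nearest-patch classifier. This is only a minimal sketch, not the SSFormers patch matching module: the cosine metric, the sum-of-best-matches score, the function names and the toy 2-D features are all assumptions made for illustration, and the sparse spatial transformer layer is omitted entirely.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_score(query_patches, support_patches):
    """Sum, over query patches, of the best cosine match among one class's support patches."""
    return sum(max(cosine(q, s) for s in support_patches) for q in query_patches)


def classify(query_patches, support_set):
    """Return the label whose support patches best match the query patches."""
    return max(support_set, key=lambda label: class_score(query_patches, support_set[label]))


# Toy 2-D patch features, invented for illustration (hypothetical data).
support_set = {
    "cat": [[1.0, 0.0], [0.9, 0.1]],
    "dog": [[0.0, 1.0], [0.1, 0.9]],
}
query = [[0.8, 0.2], [1.0, 0.1]]
prediction = classify(query, support_set)
```

    In this toy setup both query patches point roughly along the first axis, so the "cat" class collects the higher score. A real system would obtain patch features from a trained backbone rather than hand-written vectors.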

    Shaping Visual Representations with Attributes for Few-Shot Recognition

    Few-shot recognition aims to recognize novel categories under low-data regimes. Some recent few-shot recognition methods introduce an auxiliary semantic modality, i.e., category attribute information, into representation learning, which enhances feature discrimination and improves recognition performance. Most of these existing methods only consider the attribute information of the support set while ignoring the query set, resulting in a potential loss of performance. In this letter, we propose a novel attribute-shaped learning (ASL) framework, which can jointly perform query attribute generation and discriminative visual representation learning for few-shot recognition. Specifically, a visual-attribute predictor (VAP) is constructed to predict the attributes of queries. By leveraging the attribute information, an attribute-visual attention module (AVAM) is designed, which can adaptively utilize attributes and visual representations to learn more discriminative features. Under the guidance of the attribute modality, our method can learn enhanced semantic-aware representations for classification. Experiments demonstrate that our method can achieve competitive results on the CUB and SUN benchmarks. Our source code is available at: https://github.com/chenhaoxing/ASL. Comment: accepted by IEEE Signal Process. Let

    Segment Anything Model Meets Image Harmonization

    Image harmonization is a crucial technique in image composition that aims to seamlessly match the background by adjusting the foreground of composite images. Current methods adopt either global-level or pixel-level feature matching. Global-level feature matching ignores the proximity prior, treating foreground and background as separate entities. On the other hand, pixel-level feature matching loses contextual information. Therefore, it is necessary to use the information from semantic maps that describe different objects to guide harmonization. In this paper, we propose Semantic-guided Region-aware Instance Normalization (SRIN) that can utilize the semantic segmentation maps output by a pre-trained Segment Anything Model (SAM) to guide the visual consistency learning of foreground and background features. Abundant experiments demonstrate the superiority of our method for image harmonization over state-of-the-art methods. Comment: Accepted by ICASSP 202

    BoolGebra: Attributed Graph-learning for Boolean Algebraic Manipulation

    Boolean algebraic manipulation is at the core of logic synthesis in the Electronic Design Automation (EDA) design flow. Existing methods struggle to fully exploit optimization opportunities, and often suffer from an explosive search space and limited scalability. This work presents BoolGebra, a novel attributed graph-learning approach for Boolean algebraic manipulation that aims to improve fundamental logic synthesis. BoolGebra incorporates Graph Neural Networks (GNNs) and takes initial feature embeddings from both structural and functional information as inputs. A fully connected neural network is employed as the predictor for direct optimization result predictions, significantly reducing the search space and efficiently locating the optimization space. The experiments involve training the BoolGebra model w.r.t. design-specific and cross-design inferences using the trained model, where BoolGebra demonstrates generalizability for cross-design inference and its potential to scale from small, simple training datasets to large, complex inference datasets. Finally, BoolGebra is integrated with the existing synthesis tool ABC to perform end-to-end logic minimization evaluation w.r.t. SOTA baselines. Comment: DATE 2024 extended version. arXiv admin note: text overlap with arXiv:2310.0784

    Defending against Contagious Attacks on a Network with Resource Reallocation

    In classic network security games, the defender distributes defending resources to the nodes of the network, and the attacker attacks a node, with the objective of maximizing the damage caused. Existing models assume that the attack at node u causes damage only at u. However, in many real-world security scenarios, the attack at a node u spreads to the neighbors of u and can cause damage at multiple nodes, e.g., in the outbreak of a virus. In this paper, we consider the network defending problem against contagious attacks. Existing works that study shared resources assume that the resource allocated to a node can be shared or duplicated between neighboring nodes. However, in the real world, sharing resources naturally leads to a decrease in the defending power of the source node, especially when defending against contagious attacks. To this end, we study the model in which resources allocated to a node can only be transferred to its neighboring nodes, which we refer to as a reallocation process. We show that this more general model is difficult in two aspects: (1) even for a fixed allocation of resources, we show that computing the optimal reallocation is NP-hard; (2) for the case when reallocation is not allowed, we show that computing the optimal allocation (against contagious attack) is also NP-hard. For positive results, we give a mixed integer linear program formulation for the problem and a bi-criteria approximation algorithm. Our experimental results demonstrate that the allocation and reallocation strategies our algorithm computes perform well in terms of minimizing the damage due to contagious attacks. Comment: 9 pages; AAAI 202

    DiffUTE: Universal Text Editing Diffusion Model

    Diffusion-model-based language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle with rendering correct text and text style during generation. To tackle this problem, we propose a universal self-supervised text editing diffusion model (DiffUTE), which aims to replace or modify words in the source image with other words while maintaining its realistic appearance. Specifically, we build our model on a diffusion model and carefully modify the network structure to enable the model to draw multilingual characters with the help of glyph and position information. Moreover, we design a self-supervised learning framework to leverage large amounts of web data to improve the representation ability of the model. Experimental results show that our method achieves impressive performance and enables controllable editing on in-the-wild images with high fidelity. Our code will be available at https://github.com/chenhaoxing/DiffUTE

    ChipNeMo: Domain-Adapted LLMs for Chip Design

    ChipNeMo aims to explore the applications of large language models (LLMs) for industrial chip design. Instead of directly deploying off-the-shelf commercial or open-source LLMs, we adopt the following domain adaptation techniques: domain-adaptive tokenization, domain-adaptive continued pretraining, model alignment with domain-specific instructions, and domain-adapted retrieval models. We evaluate these methods on three selected LLM applications for chip design: an engineering assistant chatbot, EDA script generation, and bug summarization and analysis. Our evaluations demonstrate that domain-adaptive pretraining of language models can lead to superior performance in domain-related downstream tasks compared to their base LLaMA2 counterparts, without degradations in generic capabilities. In particular, our largest model, ChipNeMo-70B, outperforms the highly capable GPT-4 on two of our use cases, namely engineering assistant chatbot and EDA script generation, while exhibiting competitive performance on bug summarization and analysis. These results underscore the potential of domain-specific customization for enhancing the effectiveness of large language models in specialized applications. Comment: Updated results for ChipNeMo-70B mode

    Genetic Diversity of Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici in Yunnan, China

    Stripe rust of wheat, caused by the fungus Puccinia striiformis f. sp. tritici (Pst), is one of the devastating diseases in China. Yunnan Province is located in the south-western part of China and has distinctive geographical and climatic features, and its wheat growth and stripe rust epidemics differ from those of the major wheat-growing regions of China. It is important to discover the origin and migration of the pathogen to control the disease. In this study, 352 isolates were sampled from 11 spots in Yunnan Province during the wheat growing seasons from 2004 to 2015 and analyzed with SNP markers of housekeeping genes. Results revealed that 220 haplotypes were inferred from the concatenated sequences; among them, 5 haplotypes (viz., 'H86', 'H18', 'H8', 'H15' and 'H23') comprised over 24.5% of the population. The haplotype diversity, nucleotide diversity, mutation rate and recombination events were 0.992, 6.04 × 10⁻³, 4.46 × 10⁻³ and 18.0 respectively, which revealed the genetic diversity of Pst populations among all locations. Four grouping methods, namely UPGMA-tree, PCA, PLS-DA and STRUCTURE, were employed for the categorization of the Pst populations according to their races and topographical localities. All methods were found significant and mostly had co-linear relations with each other. The analysis of molecular variance (AMOVA) conferred total variation was 9.09%, and 86.20% of variation was within the populations. The current study also exposed a comparatively high genetic multiplicity within the population, while low genetic inconsistency among the populations. Furthermore, the molecular records on the gene pole (Nm = 18.45) established that the migration of the stripe rust pathogen occurred among all locations in Yunnan province. The ancestral haplotype was detected in Yuxi. Based on the trajectories of upper airflow and genetic diversity of Pst populations in different locations, it is suggested that the locations Dehong, Dali, Lincang and Baoshan are probably a major source of Pst in Yunnan
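
    The haplotype diversity reported in the abstract above is a summary statistic over haplotype frequencies. The abstract does not say which estimator was used; the sketch below assumes Nei's unbiased estimator, H = n/(n-1) · (1 - Σ pᵢ²), and the sample labels are invented for illustration rather than taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` is a list with one haplotype label per sampled isolate.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies across the sample.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample of four isolates carrying three haplotypes.
sample = ["H1", "H1", "H2", "H3"]
h = haplotype_diversity(sample)
```

    Values close to 1 mean that two isolates drawn at random almost always carry different haplotypes, which is the situation a figure such as 0.992 describes under this estimator.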