64 research outputs found
Sparse Spatial Transformers for Few-Shot Learning
Learning from limited data is a challenging task since the scarcity of data
leads to a poor generalization of the trained model. The classical global
pooled representation is likely to lose useful local information. Recently,
many few-shot learning methods address this challenge by using deep descriptors
and learning a pixel-level metric. However, using deep descriptors as feature
representations may lose the contextual information of the image. Moreover, most
of these methods treat each class in the support set independently, and
therefore cannot fully exploit discriminative information and task-specific
embeddings. In this paper, we propose a novel Transformer-based neural network
architecture called Sparse Spatial Transformers (SSFormers), which can find
task-relevant features and suppress task-irrelevant features. Specifically, we
first divide each input image into several image patches of different sizes to
obtain dense local features. These features retain contextual information while
expressing local information. Then, a sparse spatial transformer layer is
proposed to find spatial correspondence between the query image and the entire
support set to select task-relevant image patches and suppress task-irrelevant
image patches. Finally, we propose an image patch matching module that
calculates the distance between dense local representations, thereby
determining which support-set category the query image belongs to. Extensive
experiments on popular few-shot learning benchmarks show that our method
achieves state-of-the-art performance.
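The pipeline described in this abstract (patch-level features, task-relevant patch selection against the whole support set, patch matching for classification) can be illustrated with a toy sketch. This is not the SSFormers implementation; the function names, the top-k relevance rule, and the scoring are all simplifying assumptions for illustration.

```python
# Toy sketch of sparse patch-level matching (illustrative only, not SSFormers).
import math

def cosine(u, v):
    # Cosine similarity between two patch feature vectors.
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)) or 1.0
    return num / den

def sparse_patch_match(query_patches, support_patches_by_class, k=2):
    """Classify a query by its k most task-relevant patches.

    A query patch's relevance is its best similarity to ANY support patch
    (so task-irrelevant patches are suppressed); each class is then scored
    by how well those selected patches match its own support patches."""
    all_support = [p for ps in support_patches_by_class.values() for p in ps]
    relevance = [max(cosine(q, s) for s in all_support) for q in query_patches]
    top_idx = sorted(range(len(query_patches)), key=lambda i: -relevance[i])[:k]
    scores = {
        cls: sum(max(cosine(query_patches[i], s) for s in patches) for i in top_idx)
        for cls, patches in support_patches_by_class.items()
    }
    return max(scores, key=scores.get)
```

For example, a query whose selected patches align with class "A"'s support patches is assigned to "A", even if other query patches are background noise.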
Shaping Visual Representations with Attributes for Few-Shot Recognition
Few-shot recognition aims to recognize novel categories under low-data
regimes. Some recent few-shot recognition methods introduce auxiliary semantic
modality, i.e., category attribute information, into representation learning,
which enhances the feature discrimination and improves the recognition
performance. However, most existing methods consider only the attribute
information of the support set while ignoring the query set, resulting in a
potential loss of performance. In this letter, we propose a novel
attribute-shaped learning (ASL) framework, which can jointly perform query
attributes generation and discriminative visual representation learning for
few-shot recognition. Specifically, a visual-attribute predictor (VAP) is
constructed to predict the attributes of queries. By leveraging the attribute
information, an attribute-visual attention module (AVAM) is designed, which can
adaptively utilize attributes and visual representations to learn more
discriminative features. Under the guidance of attribute modality, our method
can learn enhanced semantic-aware representation for classification.
Experiments demonstrate that our method can achieve competitive results on CUB
and SUN benchmarks. Our source code is available at:
\url{https://github.com/chenhaoxing/ASL}.
Comment: Accepted by IEEE Signal Process. Lett.
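The two components named above, a visual-attribute predictor and an attribute-visual attention module, can be caricatured in a few lines. This is a hedged stand-in, not the paper's code: the linear predictor, the sigmoid, and the mixing matrix `M` are invented for illustration.

```python
# Illustrative stand-ins for VAP and AVAM (assumed shapes, not the ASL code).
import math

def predict_attributes(visual_feat, W):
    """VAP stand-in: a linear map from visual features to attribute logits,
    squashed to [0, 1] with a sigmoid."""
    logits = [sum(w * v for w, v in zip(row, visual_feat)) for row in W]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def attribute_shaped_feature(visual_feat, attrs, M):
    """AVAM stand-in: attributes vote for feature channels through a mixing
    matrix M (one row per channel), and the resulting weights re-scale the
    visual features, emphasizing attribute-relevant channels."""
    weights = [sum(m * a for m, a in zip(row, attrs)) for row in M]
    return [w * v for w, v in zip(weights, visual_feat)]
```

The point of the sketch is only the data flow: predicted query attributes feed back into the visual representation instead of being used for the support set alone.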
Segment Anything Model Meets Image Harmonization
Image harmonization is a crucial technique in image composition that aims to
seamlessly match the background by adjusting the foreground of composite
images. Current methods adopt either global-level or pixel-level feature
matching. Global-level feature matching ignores the proximity prior, treating
foreground and background as separate entities. On the other hand, pixel-level
feature matching loses contextual information. Therefore, it is necessary to
use the information from semantic maps that describe different objects to guide
harmonization. In this paper, we propose Semantic-guided Region-aware Instance
Normalization (SRIN) that can utilize the semantic segmentation maps output by
a pre-trained Segment Anything Model (SAM) to guide the visual consistency
learning of foreground and background features. Extensive experiments
demonstrate the superiority of our method for image harmonization over
state-of-the-art methods.
Comment: Accepted by ICASSP 202
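The core idea, using semantic regions to reconcile foreground statistics with the background, can be sketched on plain scalars. This simplified stand-in is not SRIN (which operates on deep features inside a network); the per-region mean/std alignment is an assumption used only to make the mechanism concrete.

```python
# Toy region-aware normalization guided by a segmentation map (illustrative).
import statistics

def region_align(values, region_ids, is_foreground, eps=1e-5):
    """Within each semantic region, re-normalize foreground values to the
    mean/std of that region's background values, so composited foreground
    statistics match their local background context."""
    out = list(values)
    for r in set(region_ids):
        fg = [v for v, rid, f in zip(values, region_ids, is_foreground) if rid == r and f]
        bg = [v for v, rid, f in zip(values, region_ids, is_foreground) if rid == r and not f]
        if len(fg) < 2 or len(bg) < 2:
            continue  # not enough pixels to estimate statistics
        mu_f, sd_f = statistics.fmean(fg), statistics.pstdev(fg) + eps
        mu_b, sd_b = statistics.fmean(bg), statistics.pstdev(bg) + eps
        for i, (rid, f) in enumerate(zip(region_ids, is_foreground)):
            if rid == r and f:
                out[i] = (values[i] - mu_f) / sd_f * sd_b + mu_b
    return out
```

Because the statistics are computed per region rather than globally or per pixel, the sketch sits between the two extremes the abstract criticizes.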
BoolGebra: Attributed Graph-learning for Boolean Algebraic Manipulation
Boolean algebraic manipulation is at the core of logic synthesis in
Electronic Design Automation (EDA) design flow. Existing methods struggle to
fully exploit optimization opportunities, and often suffer from an explosive
search space and limited scalability. This work presents BoolGebra,
a novel attributed graph-learning approach for Boolean algebraic manipulation
that aims to improve fundamental logic synthesis. BoolGebra incorporates Graph
Neural Networks (GNNs) and takes initial feature embeddings from both
structural and functional information as inputs. A fully connected neural
network is employed as the predictor to directly predict optimization results,
significantly reducing the search space and efficiently locating the
optimization space. Our experiments train BoolGebra models and use them for
both design-specific and cross-design inference, where BoolGebra demonstrates
generalizability for cross-design inference and
its potential to scale from small, simple training datasets to large, complex
inference datasets. Finally, BoolGebra is integrated with the existing
synthesis tool ABC to perform end-to-end logic minimization evaluation against
SOTA baselines.
Comment: DATE 2024 extended version. arXiv admin note: text overlap with
arXiv:2310.0784
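The prediction pipeline this abstract outlines (graph embeddings in, a fully connected predictor out) might look roughly like the toy stand-in below. The mean-neighbor aggregation and the linear scorer are illustrative assumptions, not BoolGebra's actual GNN architecture.

```python
# Toy graph-learning predictor (illustrative, not BoolGebra's architecture).
def aggregate(adj, feats):
    """One message-passing step: each node averages its own embedding with
    its neighbors' (a minimal stand-in for a GNN layer)."""
    out = {}
    for u, f in feats.items():
        nbrs = [feats[v] for v in adj.get(u, ())] + [f]
        out[u] = [sum(x[i] for x in nbrs) / len(nbrs) for i in range(len(f))]
    return out

def predict(adj, feats, w, b=0.0):
    """Mean-pool node embeddings into a graph embedding, then apply a linear
    scorer (stand-in for the fully connected predictor that ranks candidate
    optimizations without enumerating the search space)."""
    h = aggregate(adj, feats)
    pooled = [sum(v[i] for v in h.values()) / len(h) for i in range(len(w))]
    return sum(wi * pi for wi, pi in zip(w, pooled)) + b
```

Scoring candidates with such a predictor, instead of running synthesis on each one, is what prunes the search space.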
Defending against Contagious Attacks on a Network with Resource Reallocation
In classic network security games, the defender distributes defending
resources to the nodes of the network, and the attacker attacks a node with
the objective of maximizing the damage caused. Existing models assume that an
attack at node u causes damage only at u. However, in many real-world security
scenarios, an attack at a node u spreads to the neighbors of u and can cause
damage at multiple nodes, e.g., in the outbreak of a virus. In this paper, we
consider the network defending problem against contagious attacks.
Existing works that study shared resources assume that the resource allocated
to a node can be shared or duplicated between neighboring nodes. In reality,
however, sharing a resource naturally decreases the defending power of the
source node, especially when defending against contagious attacks. To
this end, we study the model in which resources allocated to a node can only be
transferred to its neighboring nodes, which we refer to as a reallocation
process.
We show that this more general model is difficult in two respects: (1) even
for a fixed allocation of resources, computing the optimal reallocation is
NP-hard; (2) when reallocation is not allowed, computing the optimal
allocation (against contagious attacks) is also NP-hard. On the positive side,
we give a mixed integer linear program
formulation for the problem and a bi-criteria approximation algorithm. Our
experimental results demonstrate that the allocation and reallocation
strategies our algorithm computes perform well in terms of minimizing the
damage due to contagious attacks.
Comment: 9 pages; AAAI 202
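The damage model described above, where an attack at u also hits u's neighbors and defending resources absorb damage, can be made concrete in a few lines. The additive value-minus-defense accounting is an illustrative assumption, not the paper's exact objective.

```python
# Toy contagious-attack damage model (illustrative accounting, not the paper's).
def contagious_damage(adj, value, defense, attacked):
    """Total residual damage when `attacked` is hit and the attack spreads to
    its neighbors; each hit node loses its value minus its defense (floored
    at zero)."""
    hit = {attacked} | set(adj.get(attacked, ()))
    return sum(max(0.0, value[v] - defense.get(v, 0.0)) for v in hit)

def best_attack(adj, value, defense):
    """The attacker best-responds by picking the node whose contagious attack
    maximizes total damage; the defender's goal is an allocation minimizing
    this worst case."""
    return max(value, key=lambda u: contagious_damage(adj, value, defense, u))
```

On a path graph, for instance, attacking the middle node hits all three nodes at once, which is why defense concentrated there can still leave the endpoints exposed.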
DiffUTE: Universal Text Editing Diffusion Model
Diffusion model based language-guided image editing has achieved great
success recently. However, existing state-of-the-art diffusion models struggle
with rendering correct text and text style during generation. To tackle this
problem, we propose a universal self-supervised text editing diffusion model
(DiffUTE), which aims to replace or modify words in the source image with new
ones while maintaining a realistic appearance. Specifically, we build our
model on a diffusion model and carefully modify the network structure to
enable the model to draw multilingual characters with the help of glyph and
position information. Moreover, we design a self-supervised learning framework
to leverage large amounts of web data to improve the representation ability of
the model. Experimental results show that our method achieves an impressive
performance and enables controllable editing on in-the-wild images with high
fidelity. Our code will be avaliable in
\url{https://github.com/chenhaoxing/DiffUTE}
ChipNeMo: Domain-Adapted LLMs for Chip Design
ChipNeMo aims to explore the applications of large language models (LLMs) for
industrial chip design. Instead of directly deploying off-the-shelf commercial
or open-source LLMs, we adopt the following domain adaptation techniques:
domain-adaptive tokenization, domain-adaptive continued
pretraining, model alignment with domain-specific instructions, and
domain-adapted retrieval models. We evaluate these methods on three selected
LLM applications for chip design: an engineering assistant chatbot, EDA script
generation, and bug summarization and analysis. Our evaluations demonstrate
that domain-adaptive pretraining of language models can lead to superior
performance on domain-related downstream tasks compared to their base LLaMA2
counterparts, without degradation in generic capabilities. In particular, our
largest model, ChipNeMo-70B, outperforms the highly capable GPT-4 on two of our
use cases, namely engineering assistant chatbot and EDA scripts generation,
while exhibiting competitive performance on bug summarization and analysis.
These results underscore the potential of domain-specific customization for
enhancing the effectiveness of large language models in specialized
applications.
Comment: Updated results for ChipNeMo-70B model
Genetic Diversity of Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici in Yunnan, China
Stripe rust of wheat, caused by the fungus Puccinia striiformis f. sp. tritici
(Pst), is one of the most devastating wheat diseases in China. Yunnan
Province, in south-western China, has distinctive geographical and climatic
features, and both wheat growth and stripe rust epidemics there differ
markedly from those in China's major wheat-growing regions. Discovering the
pathogen's origin and migration routes is important for controlling the
disease. In this study, 352 isolates were sampled from 11 sites in Yunnan
Province during the wheat growing seasons from 2004 to 2015 and analyzed with
SNP markers of housekeeping genes. A total of 220 haplotypes were inferred
from the concatenated sequences; among them, 5 haplotypes (viz., 'H86', 'H18',
'H8', 'H15' and 'H23') comprised over 24.5% of the population. The haplotype
diversity, nucleotide diversity, mutation rate and number of recombination
events were 0.992, 6.04 × 10⁻³, 4.46 × 10⁻³ and 18.0, respectively, revealing
the genetic diversity of Pst populations across all locations. Four grouping
methods (UPGMA tree, PCA, PLS-DA and STRUCTURE) were employed to categorize
the Pst populations according to their races and geographical localities. All
methods were significant and largely concordant with one another. The analysis
of molecular variance (AMOVA) showed a total variation of 9.09% among
populations, with 86.20% of the variation within populations. The study thus
revealed comparatively high genetic diversity within populations but low
genetic differentiation among populations. Furthermore, the molecular evidence
on gene flow (Nm = 18.45) established that migration of the stripe rust
pathogen occurred among all locations in Yunnan Province. The ancestral
haplotype was detected in Yuxi.
Based on the trajectories of upper airflow and the genetic diversity of Pst
populations in different locations, it is suggested that Dehong, Dali, Lincang
and Baoshan are probably major sources of Pst in Yunnan.
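The haplotype diversity reported here (0.992) is conventionally computed with Nei's unbiased estimator, Hd = n/(n-1) · (1 - Σ pᵢ²), where pᵢ is the frequency of haplotype i among n sampled sequences. A minimal sketch, assuming that standard estimator was the one used:

```python
# Nei's haplotype diversity estimator (standard formula; whether this exact
# estimator was used in the study above is an assumption).
from collections import Counter

def haplotype_diversity(haplotypes):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p2)
```

A value near 1, as reported for this Pst population, means nearly every sampled isolate carries a distinct haplotype.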