Scalable Language Model with Generalized Continual Learning
Continual learning has gained increasing importance as it facilitates the
acquisition and refinement of scalable knowledge and skills in language models.
However, existing methods typically face strict limitations in real-world scenarios, such as reliance on experience replay, optimization constraints, and the need for task IDs at inference. In this study, we introduce the Scalable
Language Model (SLM) to overcome these limitations within a more challenging
and generalized setting, representing a significant advancement toward
practical applications for continual learning. Specifically, we propose the
Joint Adaptive Re-Parameterization (JARe), integrated with Dynamic Task-related
Knowledge Retrieval (DTKR), to enable adaptive adjustment of language models
based on specific downstream tasks. This approach leverages the task
distribution within the vector space, aiming to achieve a smooth and effortless
continual learning process. Our method demonstrates state-of-the-art
performance on diverse backbones and benchmarks, achieving effective continual
learning in both full-set and few-shot scenarios with minimal forgetting.
Moreover, while prior research primarily focused on a single task type such as classification, our study goes further, applying the large language model LLaMA-2 to explore the effects across diverse domains and task types, so that a single language model can be scaled to broader applications.
Comment: The Twelfth International Conference on Learning Representations
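To make the retrieval-then-adapt idea concrete, here is a minimal sketch in which an input is routed to the nearest task centroid in embedding space and a base weight is re-parameterized with that task's delta. The centroid-based routing, the function names, and all shapes are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of dynamic task-related knowledge retrieval (DTKR):
# each past task is summarized by a centroid in a vector space, and an
# incoming input is routed to the closest task's adapter weights.
import numpy as np

def embed(x: np.ndarray) -> np.ndarray:
    """Stand-in for a frozen backbone encoder; returns a unit vector."""
    v = np.tanh(x)  # placeholder featurizer, not a real encoder
    return v / np.linalg.norm(v)

def retrieve_task(x: np.ndarray, task_centroids: np.ndarray) -> int:
    """Pick the task whose centroid has the highest cosine similarity."""
    sims = task_centroids @ embed(x)
    return int(np.argmax(sims))

def joint_adaptive_reparam(base_weight, task_deltas, task_id, alpha=1.0):
    """Re-parameterize the base weight with the retrieved task's delta."""
    return base_weight + alpha * task_deltas[task_id]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_tasks = 16, 3
    centroids = rng.normal(size=(n_tasks, d))
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    deltas = rng.normal(scale=0.01, size=(n_tasks, d, d))
    W = rng.normal(size=(d, d))
    x = rng.normal(size=d)
    t = retrieve_task(x, centroids)
    print("routed to task", t)
    print("adapted weight norm:", np.linalg.norm(joint_adaptive_reparam(W, deltas, t)))
```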
Hierarchical Dense Correlation Distillation for Few-Shot Segmentation-Extended Abstract
Few-shot semantic segmentation (FSS) aims to form class-agnostic models
segmenting unseen classes with only a handful of annotations. Previous methods, limited to semantic features and prototype representations, suffer from coarse segmentation granularity and overfitting to the training set. In this work, we design
Hierarchically Decoupled Matching Network (HDMNet) mining pixel-level support
correlation based on the transformer architecture. The self-attention modules
are used to assist in establishing hierarchical dense features, as a means to
accomplish the cascade matching between query and support features. Moreover,
we propose a matching module to reduce train-set overfitting and introduce
correlation distillation leveraging semantic correspondence from coarse
resolution to boost fine-grained segmentation. Our method performs strongly in experiments, achieving 50.0% mIoU on the COCO dataset in the one-shot setting and 56.0% in the five-shot setting. The code will be available on the project website. We hope our work can benefit broader industrial applications where novel classes with limited annotations must be reliably identified.
Comment: Accepted to CVPR 2023 VISION Workshop (Oral). The extended abstract of Hierarchical Dense Correlation Distillation for Few-Shot Segmentation. arXiv admin note: substantial text overlap with arXiv:2303.1465
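The correlation-distillation idea can be sketched as follows: a coarse-level query-support correlation map acts as a soft teacher for the fine-level map via a KL divergence. The function names, the temperature, and the simplification that both levels share one resolution are assumptions for illustration, not HDMNet's released code.

```python
# Illustrative sketch of correlation distillation between feature levels.
import torch
import torch.nn.functional as F

def correlation(query: torch.Tensor, support: torch.Tensor) -> torch.Tensor:
    """Pixel-level correlation: (B, Nq, C) x (B, Ns, C) -> (B, Nq, Ns)."""
    q = F.normalize(query, dim=-1)
    s = F.normalize(support, dim=-1)
    return torch.bmm(q, s.transpose(1, 2))

def correlation_distill_loss(coarse_corr, fine_corr, tau=0.1):
    """KL(teacher || student) over the support dimension."""
    teacher = F.softmax(coarse_corr / tau, dim=-1)       # coarse level as teacher
    student = F.log_softmax(fine_corr / tau, dim=-1)     # fine level as student
    return F.kl_div(student, teacher, reduction="batchmean")

B, Nq, Ns, C = 2, 64, 64, 32
coarse = correlation(torch.randn(B, Nq, C), torch.randn(B, Ns, C)).detach()
fine = correlation(torch.randn(B, Nq, C), torch.randn(B, Ns, C))
print(correlation_distill_loss(coarse, fine).item())
```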
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding
Self-supervised 3D representation learning aims to learn effective
representations from large-scale unlabeled point clouds. Most existing
approaches adopt point discrimination as the pretext task, which assigns
matched points in two distinct views as positive pairs and unmatched points as
negative pairs. However, this approach often results in semantically identical
points having dissimilar representations, leading to a high number of false
negatives and introducing a "semantic conflict" problem. To address this issue,
we propose GroupContrast, a novel approach that combines segment grouping and
semantic-aware contrastive learning. Segment grouping partitions points into
semantically meaningful regions, which enhances semantic coherence and provides
semantic guidance for the subsequent contrastive representation learning.
Semantic-aware contrastive learning augments the semantic information extracted
from segment grouping and helps to alleviate the issue of "semantic conflict".
We conducted extensive experiments on multiple 3D scene understanding tasks.
The results demonstrate that GroupContrast learns semantically meaningful
representations and achieves promising transfer learning performance.
Comment: CVPR 2024
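A minimal sketch of the semantic-aware contrastive objective, assuming each point carries a segment ID from an unsupervised grouping step: points sharing a segment are treated as positives instead of false negatives. This illustrates the idea only; it is not GroupContrast's released code.

```python
# Segment-aware InfoNCE: same-segment points count as positives.
import torch
import torch.nn.functional as F

def segment_aware_info_nce(feats_a, feats_b, segments, tau=0.07):
    """feats_a/feats_b: (N, C) matched points from two views; segments: (N,)."""
    a = F.normalize(feats_a, dim=-1)
    b = F.normalize(feats_b, dim=-1)
    logits = a @ b.t() / tau                              # (N, N) similarities
    pos_mask = segments[:, None] == segments[None, :]     # same segment = positive
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average log-likelihood over all positives of each anchor point.
    loss = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

N, C = 128, 64
segs = torch.randint(0, 10, (N,))
print(segment_aware_info_nce(torch.randn(N, C), torch.randn(N, C), segs).item())
```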
Collaboration of Pre-trained Models Makes Better Few-shot Learner
Few-shot classification requires deep neural networks to learn generalized
representations only from limited training images, which is challenging but
significant in low-data regimes. Recently, CLIP-based methods have shown
promising few-shot performance, benefiting from contrastive language-image pre-training. Building on this observation, we ask whether large-scale pre-training can alleviate the few-shot data deficiency and also assist representation learning through pre-learned knowledge. In this paper, we propose CoMo, a
Collaboration of pre-trained Models that incorporates diverse prior knowledge
from various pre-training paradigms for better few-shot learning. Our CoMo
includes: CLIP's language-contrastive knowledge, DINO's vision-contrastive
knowledge, and DALL-E's language-generative knowledge. Specifically, CoMo works
in two aspects: few-shot data expansion and diverse knowledge ensemble. For
one, we generate synthetic images via zero-shot DALL-E to enrich the few-shot training data without any manual effort. For the other, we introduce a learnable
Multi-Knowledge Adapter (MK-Adapter) to adaptively blend the predictions from
CLIP and DINO. Through such collaboration, CoMo can fully unleash the potential of different pre-training methods, unifying them to achieve state-of-the-art performance for few-shot classification. We conduct extensive experiments on 11 datasets to demonstrate the superiority and generalization ability of our approach.
Comment: 10 pages, 6 figures
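A hedged sketch of the multi-knowledge ensemble idea: blend CLIP-style and DINO-style class logits with a learnable per-class mixing weight. The adapter structure and names (MKAdapter, alpha) are illustrative assumptions, not CoMo's actual MK-Adapter.

```python
# Toy adapter that learns how to mix predictions from two pre-trained models.
import torch
import torch.nn as nn

class MKAdapter(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # One learnable blending weight per class; zeros give equal mixing.
        self.alpha = nn.Parameter(torch.zeros(num_classes))

    def forward(self, clip_logits, dino_logits):
        w = torch.sigmoid(self.alpha)           # per-class weight in (0, 1)
        return w * clip_logits + (1 - w) * dino_logits

adapter = MKAdapter(num_classes=11)
clip_logits = torch.randn(4, 11)   # e.g. CLIP zero-shot similarities
dino_logits = torch.randn(4, 11)   # e.g. DINO cache-model predictions
print(adapter(clip_logits, dino_logits).shape)  # torch.Size([4, 11])
```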
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model
While LISA effectively bridges the gap between segmentation and large
language models to enable reasoning segmentation, it has certain limitations: it cannot distinguish different instances of the target region and is constrained by pre-defined textual response formats. In this work, we introduce LISA++,
an update to the existing LISA model, focusing on improving core
functionalities while keeping the base architecture intact. The main
enhancements in LISA++ include: 1) Enhanced Segmentation: The instance
segmentation ability has been added, providing a more detailed scene analysis
along with the existing multi-region semantic segmentation. 2) More Natural Conversation: Improved capability for multi-turn dialogue, with the
ability to incorporate segmentation results directly into text responses, i.e.,
Segmentation in Dialogue (SiD). These improvements are achieved by curating existing samples from generic segmentation datasets, specifically to enhance segmentation and conversational skills without structural changes or additional data sources. Comparative analysis with the original LISA model
shows significant advancements in these areas, positioning LISA++ as a notable
upgrade in visual understanding and interaction. LISA++'s adaptability and
improved features highlight the versatility of the mask-as-embedding paradigm
proposed by LISA, and its potential as a foundational model for diverse applications.
Comment: Typo fixed
Nlrp2, a Maternal Effect Gene Required for Early Embryonic Development in the Mouse
Maternal effect genes encode proteins that are produced during oogenesis and play an essential role during early embryogenesis. Genetic ablation of such genes in oocytes can result in female subfertility or infertility. Here we report a newly identified maternal effect gene, Nlrp2, which plays a role in early embryogenesis in the mouse. Nlrp2 mRNAs and their proteins (~118 kDa) are expressed in oocytes and granulosa cells during folliculogenesis. The transcripts show a striking decline in early preimplantation embryos before zygotic genome activation, but the proteins remain present through to the blastocyst stage. Immunogold electron microscopy revealed that the NLRP2 protein is located in the cytoplasm, the nucleus, and close to nuclear pores in oocytes, as well as in the surrounding granulosa cells. Using RNA interference, we knocked down Nlrp2 transcription specifically in mouse germinal vesicle oocytes. The knockdown oocytes could progress through the metaphase of meiosis I and emit the first polar body. However, the development of parthenogenetic embryos derived from Nlrp2 knockdown oocytes was mainly blocked at the 2-cell stage. The maternal depletion of Nlrp2 in zygotes led to early embryonic arrest. In addition, overexpression of Nlrp2 in zygotes appears to permit normal development but increases blastomere apoptosis in blastocysts. These results provide the first evidence that Nlrp2 is a mammalian maternal effect gene required for early embryonic development in the mouse.
SCMA Codebook Design Based on Decomposition of the Superposed Constellation for AWGN Channel
In this study, we propose a method named decomposition of the superposed constellation (DCSC) to design sparse code multiple access (SCMA) codebooks for the additive white Gaussian noise (AWGN) channel. We prove that the power of the user symbols (USs) is accurately determined by the power of the superposed constellation (SC). Thus, we select quadrature amplitude modulation (QAM) constellations as the SC and decompose the SC into several groups of USs with power diversity. The minimum Euclidean distance (MED) between superposed symbols (SS-MED) at the receiver is determined by the selected QAM, and the MED between the multi-dimensional codewords (CW-MED) is optimized by matching the symbols across dimensions. We further propose a simplified DCSC (S-DCSC) that modifies the factor graph and avoids transmitting USs with low power, which greatly reduces the complexity of the message passing algorithm (MPA). Simulations show that the SS-MEDs of DCSC and S-DCSC are larger than those reported in previous work, and the BER performance of the proposed codebooks surpasses that of existing designs.
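A small worked example of the SS-MED notion: enumerate all superposed symbols from a set of user constellations and compute the minimum Euclidean distance between distinct superpositions. The toy two-user constellations below are assumptions for illustration, not a codebook from the paper; note that the two power-diverse QPSK sets superpose to a 16-QAM constellation, mirroring the DCSC decomposition in reverse.

```python
# Compute the minimum Euclidean distance between superposed symbols (SS-MED).
import itertools
import numpy as np

def ss_med(user_constellations):
    """Minimum distance between distinct superposed symbols."""
    sums = np.array([sum(c) for c in itertools.product(*user_constellations)])
    dists = np.abs(sums[:, None] - sums[None, :])  # pairwise distances
    np.fill_diagonal(dists, np.inf)                # ignore self-distances
    return dists.min()

# Two users with power-diverse QPSK-like symbol sets (complex baseband);
# their superposition forms a regular 16-QAM constellation.
user1 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
user2 = [0.5 * s for s in user1]
print("SS-MED:", ss_med([user1, user2]))  # 1.0 for this toy pair
```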
A New Hybrid Prediction Method of Ultra-Short-Term Wind Power Forecasting Based on EEMD-PE and LSSVM Optimized by the GSA
Wind power time series data always exhibits nonlinear and non-stationary features, making it very difficult to predict accurately. In this paper, a novel hybrid wind power time series prediction model, based on ensemble empirical mode decomposition-permutation entropy (EEMD-PE), the least squares support vector machine (LSSVM), and the gravitational search algorithm (GSA), is proposed to improve the accuracy of ultra-short-term wind power forecasting. To process the data, the original wind power series was decomposed by the EEMD-PE technique into a number of subsequences with distinct complexity differences. Then, the GSA heuristic was utilized to optimize the parameters of the LSSVM. The optimized model was developed for wind power forecasting and improved regression prediction accuracy. The proposed model was validated with practical wind power generation data from Hebei province, China. A comprehensive error metric analysis was carried out to compare the performance of our method with other approaches. The results showed that the proposed model enhanced forecasting performance compared to other benchmark models.
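A compact sketch of permutation entropy (the PE in EEMD-PE), which scores the complexity of each decomposed subsequence so that components of similar complexity can be grouped before forecasting. The parameter choices (order m=3, delay 1) are illustrative assumptions.

```python
# Normalized permutation entropy of a 1-D series, in [0, 1].
import math
from collections import Counter
import numpy as np

def permutation_entropy(x, m=3, delay=1):
    """Shannon entropy of ordinal patterns, normalized by log(m!)."""
    patterns = Counter(
        tuple(np.argsort(x[i : i + m * delay : delay]))
        for i in range(len(x) - (m - 1) * delay)
    )
    total = sum(patterns.values())
    h = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return h / math.log(math.factorial(m))

rng = np.random.default_rng(0)
print("noise PE:", permutation_entropy(rng.normal(size=2000)))    # near 1
print("trend PE:", permutation_entropy(np.linspace(0, 1, 2000)))  # 0
```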
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
The booming of 3D recognition in the 2020s began with the introduction of
point cloud transformers. They quickly overwhelmed sparse CNNs and became
state-of-the-art models, especially in 3D semantic segmentation. However,
sparse CNNs remain valuable networks due to their efficiency and ease of application. In this work, we reexamine the design distinctions and test the limits of what a sparse CNN can achieve. We find that the key factor behind the performance difference is adaptivity. Specifically, we propose
two key components, i.e., adaptive receptive fields (spatially) and adaptive
relation, to bridge the gap. This exploration led to the creation of
Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a
lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal
computational cost. Without any self-attention modules, OA-CNNs favorably
surpass point transformers in terms of accuracy in both indoor and outdoor
scenes, with much less latency and memory cost. Notably, it achieves 76.1%,
78.9%, and 70.6% mIoU on ScanNet v2, nuScenes, and SemanticKITTI validation
benchmarks respectively, while running up to 5x faster than transformer counterparts. This finding highlights the potential of pure sparse CNNs to outperform transformer-based networks.
Comment: CVPR 2024
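An illustrative take on the adaptive-receptive-field idea: run parallel convolutions with different kernel sizes and let a lightweight gate choose a per-voxel mixture. A dense 3-D convolution stands in for a sparse CNN here, and the module name and structure are assumptions, not the OA-CNNs implementation.

```python
# Per-voxel gating over convolution branches with different receptive fields.
import torch
import torch.nn as nn

class AdaptiveReceptiveField3d(nn.Module):
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv3d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        )
        # Lightweight gate: softmax over candidate branches at each voxel.
        self.gate = nn.Conv3d(channels, len(kernel_sizes), kernel_size=1)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=1)           # (B, K, D, H, W)
        outs = torch.stack([b(x) for b in self.branches], 1)   # (B, K, C, D, H, W)
        return (weights.unsqueeze(2) * outs).sum(dim=1)        # (B, C, D, H, W)

block = AdaptiveReceptiveField3d(channels=8)
x = torch.randn(1, 8, 16, 16, 16)
print(block(x).shape)  # torch.Size([1, 8, 16, 16, 16])
```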
Hierarchical model predictive control strategy based on dynamic active power dispatch for wind power cluster integration
Large-scale wind power clusters with distributed wind farms create active power dispatch and control problems in the power system. In this paper, a novel hierarchical model predictive control (HMPC) strategy based on dynamic active power dispatch is proposed to improve wind power scheduling and increase wind power accommodation. The strategy consists of four layers with refined time scales: an intra-day dispatch layer, a real-time dispatch layer, a cluster optimization layer, and a wind farm modulation layer. A dynamic grouping strategy is specifically developed to allocate the schedule among wind farms in the cluster optimization layer. To maximize wind power output, downward spinning reserve and transmission pathway utilization are incorporated in the wind farm modulation layer. Meanwhile, a stratification analysis approach for ultra-short-term wind power forecasting error is presented as feedback correction to increase forecasting accuracy. The proposed strategy is evaluated in a case study on an IEEE test network with wind power cluster integration. Results show that wind power accommodation is enhanced with the proposed HMPC strategy, compared with conventional dispatch and allocation methods.
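A toy sketch of the cluster-optimization step: allocate a cluster-level schedule across wind farms in proportion to their ultra-short-term forecasts, capping each farm at its capacity and redistributing any surplus. The allocation rule and all numbers are illustrative assumptions, not the paper's HMPC formulation.

```python
# Proportional dispatch with capacity caps for a single time step.
import numpy as np

def dispatch(cluster_target, forecasts, capacities):
    """Split cluster_target across farms proportionally to forecasts."""
    alloc = np.zeros_like(forecasts, dtype=float)
    remaining = cluster_target
    free = np.ones_like(forecasts, dtype=bool)     # farms not yet at capacity
    while remaining > 1e-9 and free.any():
        share = remaining * forecasts[free] / forecasts[free].sum()
        alloc[free] += share
        over = alloc > capacities
        remaining = np.where(over, alloc - capacities, 0.0).sum()  # surplus
        alloc = np.minimum(alloc, capacities)       # cap overloaded farms
        free &= ~over                               # freeze capped farms
    return alloc  # any unserved remainder is simply dropped in this toy

forecasts = np.array([80.0, 50.0, 30.0])    # MW, ultra-short-term forecasts
capacities = np.array([60.0, 70.0, 40.0])   # MW, farm limits
print(dispatch(150.0, forecasts, capacities))  # [60.0, 56.25, 33.75]
```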