Diffusion-Model-Assisted Supervised Learning of Generative Models for Density Estimation
We present a supervised learning framework of training generative models for
density estimation. Generative models, including generative adversarial
networks, normalizing flows, variational auto-encoders, are usually considered
as unsupervised learning models, because labeled data are usually unavailable
for training. Despite the success of the generative models, there are several
issues with the unsupervised training, e.g., requirement of reversible
architectures, vanishing gradients, and training instability. To enable
supervised learning in generative models, we utilize the score-based diffusion
model to generate labeled data. Unlike existing diffusion models that train
neural networks to learn the score function, we develop a training-free score
estimation method. This approach uses mini-batch-based Monte Carlo estimators
to directly approximate the score function at any spatial-temporal location in
solving an ordinary differential equation (ODE), corresponding to the
reverse-time stochastic differential equation (SDE). This approach can offer
both high accuracy and substantial time savings in neural network training.
Once the labeled data are generated, we can train a simple fully connected
neural network to learn the generative model in the supervised manner. Compared
with existing normalizing flow models, our method does not require
reversible neural networks and avoids the computation of the Jacobian matrix.
Compared with existing diffusion models, our method does not need to solve the
reverse-time SDE to generate new samples. As a result, the sampling efficiency
is significantly improved. We demonstrate the performance of our method by
applying it to a set of 2D datasets as well as real data from the UCI
repository.
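The training-free score estimation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a forward process whose transition density is Gaussian, x_t | x_0 ~ N(alpha * x_0, beta^2 * I); the function name `mc_score` and the parameters `alpha` and `beta` are hypothetical and do not come from the abstract. Under that assumption, the score of the marginal density is a softmax-weighted average of per-sample Gaussian scores over a mini-batch of data.

```python
import numpy as np

def mc_score(x, data, alpha, beta):
    """Mini-batch Monte Carlo estimate of the score at point x.

    Assumes (hypothetically) x_t | x_0 ~ N(alpha * x_0, beta**2 * I).
    x:    array of shape (d,), the query location
    data: array of shape (n, d), a mini-batch of samples of x_0
    """
    diff = x[None, :] - alpha * data                  # (n, d) residuals
    logw = -np.sum(diff ** 2, axis=1) / (2.0 * beta ** 2)
    logw -= logw.max()                                # avoid overflow in exp
    w = np.exp(logw)
    w /= w.sum()                                      # normalized weights
    # Weighted average of the per-sample scores -(x - alpha * x0) / beta^2
    return -(w[:, None] * diff).sum(axis=0) / beta ** 2

# With a single data point, the estimate reduces to the Gaussian score.
one_point = mc_score(np.zeros(2), np.array([[1.0, 2.0]]), alpha=0.5, beta=2.0)

# With two mirror-image data points, the score at the midpoint is zero.
symmetric = mc_score(np.zeros(1), np.array([[1.0], [-1.0]]), alpha=0.5, beta=1.0)
```

An estimator of this kind could be evaluated at each step of an ODE solver to produce the labeled pairs mentioned in the abstract; the choice of solver and noise schedule is left open here.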
Connecting With Fundamental Mathematical Knowledge Directly: The Organizational Features of Good Mathematical Cognitive Structure
This paper reports on a study of good mathematical cognitive structure (GMCS) based on 43 top university students and 82 concepts from Calculus materials, using the social network analysis method. The results indicate that the GMCS has the following organizational features: (1) The mathematical knowledge (MK) in GMCS is widely interconnected, especially MK with a higher connection tightness; (2) Most connections between MK are direct; (3) MK at the basic and higher inclusive levels has a greater impact; and (4) There are multiple MK accumulation points that connect to other MK to form subsets. These new findings enrich the results of previous GMCS studies and promote further exploration of GMCS. In view of this, teachers should pay closer attention to basic and abstract MK and help their students construct various direct connections among the MK in their minds.
Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering
Large Language Model (LLM) has gained popularity and achieved remarkable
results in open-domain tasks, but its performance in real industrial
domain-specific scenarios is average since there is no specific knowledge in
it. This issue has attracted widespread attention, but there are few relevant
benchmarks available. In this paper, we provide a benchmark Question Answering
(QA) dataset named MSQA, which is about Microsoft products and IT technical
problems encountered by customers. This dataset contains industry
cloud-specific QA knowledge, which is not available for general LLM, so it is
well suited for evaluating methods aimed at improving domain-specific
capabilities of LLM. In addition, we propose a new model interaction paradigm
that can empower LLM to achieve better performance on domain-specific tasks
where it is not proficient. Extensive experiments demonstrate that the approach
following our model fusion framework outperforms the commonly used LLM with
retrieval methods. Comment: 13 pages, 1 figure
Self-Guard: Empower the LLM to Safeguard Itself
The jailbreak attack can bypass the safety measures of a Large Language Model
(LLM), generating harmful content. This misuse of LLM has led to negative
societal consequences. Currently, there are two main approaches to address
jailbreak attacks: safety training and safeguards. Safety training focuses on
further training LLM to enhance its safety. On the other hand, safeguards
involve implementing external models or filters to prevent harmful outputs.
However, safety training has constraints in its ability to adapt to new attack
types and often leads to a drop in model performance. Safeguards have proven to
be of limited help. To tackle these issues, we propose a novel approach called
Self-Guard, which combines the strengths of both safety methods. Self-Guard
includes two stages. In the first stage, we enhance the model's ability to
assess harmful content, and in the second stage, we instruct the model to
consistently perform harmful content detection on its own responses. The
experiment has demonstrated that Self-Guard is robust against jailbreak
attacks. In the bad case analysis, we find that LLM occasionally provides
harmless responses to harmful queries. Additionally, we evaluated the general
capabilities of the LLM before and after safety training, providing evidence
that Self-Guard does not result in the LLM's performance degradation. In
sensitivity tests, Self-Guard not only avoids inducing over-sensitivity in LLM
but can even mitigate this issue.
UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems
Large Language Models (LLMs) have shown exceptional capabilities in many
natural language understanding and generation tasks. However, the
personalization issue still remains a much-coveted property, especially when it
comes to the multiple sources involved in the dialogue system. To better plan
and incorporate the use of multiple sources in generating personalized
response, we firstly decompose it into three sub-tasks: Knowledge Source
Selection, Knowledge Retrieval, and Response Generation. We then propose a
novel Unified Multi-Source Retrieval-Augmented Generation system (UniMS-RAG).
Specifically, we unify these three sub-tasks with different formulations into
the same sequence-to-sequence paradigm during the training, to adaptively
retrieve evidence and evaluate the relevance on-demand using special tokens,
called acting tokens and evaluation tokens. Enabling language models to
generate acting tokens facilitates interaction with various knowledge sources,
allowing them to adapt their behavior to diverse task requirements. Meanwhile,
evaluation tokens gauge the relevance score between the dialogue context and
the retrieved evidence. In addition, we carefully design a self-refinement
mechanism to iteratively refine the generated response considering 1) the
consistency scores between the generated response and retrieved evidence; and
2) the relevance scores. Experiments on two personalized datasets (DuLeMon and
KBP) show that UniMS-RAG achieves state-of-the-art performance on the knowledge
source selection and response generation task with itself as a retriever in a
unified manner. Extensive analyses and discussions offer
new perspectives on personalized dialogue systems.
The Invasive MED/Q Bemisia tabaci Genome: A Tale of Gene Loss and Gene Gain
Background: Sweetpotato whitefly, Bemisia tabaci MED/Q and MEAM1/B, are two economically important invasive species that cause considerable damage to agricultural crops through direct feeding and indirect vectoring of plant pathogens. Recently, a draft genome of B. tabaci MED/Q has been assembled. In this study, we focus on the genomic comparison between MED/Q and MEAM1/B, with a special interest in MED/Q’s genomic signatures that may contribute to the highly invasive nature of this emerging insect pest.
Results: The genomes of both species share similarity in syntenic blocks, but have significant divergence in the gene coding sequence. Expansion of cytochrome P450 monooxygenases and UDP glycosyltransferases in MED/Q and MEAM1/B genome is functionally validated for mediating insecticide resistance in MED/Q using in vivo RNAi. The amino acid biosynthesis pathways in MED/Q genome are partitioned among the host and endosymbiont genomes in a manner distinct from other hemipterans. Evidence of horizontal gene transfer to the host genome may explain their obligate relationship. Putative loss-of-function in the immune deficiency-signaling pathway due to the gene loss is a shared ancestral trait among hemipteran insects.
Conclusions: The expansion of detoxification gene families, such as P450s, may contribute to the development of insecticide resistance traits and a broad host range in MED/Q and MEAM1/B, and facilitate the species’ invasions into intensively managed cropping systems. Numerical and compositional changes in multiple gene families (gene loss and gene gain) in the MED/Q genome set a foundation for future hypothesis testing that will advance our understanding of adaptation, viral transmission, symbiosis, and plant-insect-pathogen tritrophic interactions.
MPOLSAR-1.0: Multidimensional SAR Multiband Fully Polarized Fine Classification Dataset
Fine terrain classification is one of the main applications of Synthetic Aperture Radar (SAR). In the multiband fully polarized SAR operating mode, obtaining information on different frequency bands of the target and polarization response characteristics of a target is possible, which can improve target classification accuracy. However, the existing datasets at home and abroad only have low-resolution fully polarized classification data for individual bands, limited regions, and small samples. Thus, a multidimensional SAR dataset from Hainan is used to construct a multiband fully polarized fine classification dataset with ample sample size, diverse land cover categories, and high classification reliability. This dataset will promote the development of multiband fully polarized SAR classification applications, supported by the high-resolution aerial observation system application calibration and verification project. This paper provides an overview of the composition of the dataset, and describes the information and dataset production methods for the first batch of published data (MPOLSAR-1.0). Furthermore, this study presents the preliminary classification experimental results based on the polarization feature classification and classical machine learning classification methods, providing support for the sharing and application of the dataset
Diffusion basis spectrum imaging measures anti-inflammatory and neuroprotective effects of fingolimod on murine optic neuritis
OBJECTIVE: To prospectively determine whether diffusion basis spectrum imaging (DBSI) detects, differentiates and quantitates coexisting inflammation, demyelination, axonal injury and axon loss in mice with optic neuritis (ON) due to experimental autoimmune encephalomyelitis (EAE), and to determine if DBSI accurately measures effects of fingolimod on underlying pathology.
METHODS: EAE was induced in 7-week-old C57BL/6 female mice. Visual acuity (VA) was assessed daily to detect onset of ON after which daily oral-treatment with either fingolimod (1 mg/kg) or saline was given for ten weeks. In vivo DBSI scans of optic nerves were performed at baseline, 2-, 6- and 10-weeks post treatment. DBSI-derived metrics including restricted isotropic diffusion tensor fraction (putatively reflecting cellularity), non-restricted isotropic diffusion tensor fraction (putatively reflecting vasogenic edema), DBSI-derived axonal volume, axial diffusivity, λ
RESULTS: Optic nerves of fingolimod-treated mice exhibited significantly better (p < 0.05) VA than the saline-treated group at each time point. During the ten weeks of treatment, DBSI-derived non-restricted and restricted-isotropic-diffusion-tensor fractions, and axonal volumes were not significantly different (p > 0.05) from the baseline values in fingolimod-treated mice. Transient DBSI-λ
CONCLUSION: DBSI was used to assess changes of the underlying optic nerve pathologies in EAE mice with ON, exhibiting great potential as a noninvasive outcome measure for monitoring disease progression and therapeutic efficacy for MS.