156 research outputs found

    Diffusion-Model-Assisted Supervised Learning of Generative Models for Density Estimation

    We present a supervised learning framework for training generative models for density estimation. Generative models, including generative adversarial networks, normalizing flows, and variational auto-encoders, are usually considered unsupervised learning models because labeled data are usually unavailable for training. Despite the success of generative models, unsupervised training has several issues, e.g., the requirement of reversible architectures, vanishing gradients, and training instability. To enable supervised learning in generative models, we utilize the score-based diffusion model to generate labeled data. Unlike existing diffusion models that train neural networks to learn the score function, we develop a training-free score estimation method. This approach uses mini-batch-based Monte Carlo estimators to directly approximate the score function at any spatial-temporal location in solving an ordinary differential equation (ODE) corresponding to the reverse-time stochastic differential equation (SDE). This approach can offer both high accuracy and substantial time savings in neural network training. Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in a supervised manner. Compared with existing normalizing flow models, our method does not require reversible neural networks and avoids the computation of the Jacobian matrix. Compared with existing diffusion models, our method does not need to solve the reverse-time SDE to generate new samples. As a result, the sampling efficiency is significantly improved. We demonstrate the performance of our method by applying it to a set of 2D datasets as well as real data from the UCI repository.

    Connecting With Fundamental Mathematical Knowledge Directly: The Organizational Features of Good Mathematical Cognitive Structure

    This paper reports on a study of a good mathematical cognitive structure (GMCS) based on 43 top university students and 82 concepts from Calculus materials, using the social network analysis method. The results indicate that the GMCS has the following organizational features: (1) The mathematical knowledge (MK) in GMCS is widely interconnected, especially in MK with a higher connection tightness; (2) Most connections between MK are direct; (3) MK of the basic and higher inclusive level has a greater impact; and (4) There are multiple MK accumulation points connecting others to form subsets. These new findings enrich the results of previous GMCS studies and promote further exploration of GMCS. In view of this, teachers should pay closer attention to basic and abstract MK and help their students construct various direct connections among the MK in their minds.

    Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

    Large Language Models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks, but their performance in real industrial domain-specific scenarios is average since they lack domain-specific knowledge. This issue has attracted widespread attention, but there are few relevant benchmarks available. In this paper, we provide a benchmark Question Answering (QA) dataset named MSQA, which is about Microsoft products and IT technical problems encountered by customers. This dataset contains industry cloud-specific QA knowledge, which is not available to general LLMs, so it is well suited for evaluating methods aimed at improving the domain-specific capabilities of LLMs. In addition, we propose a new model interaction paradigm that can empower an LLM to achieve better performance on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our model fusion framework outperforms the commonly used LLM with retrieval methods. Comment: 13 pages, 1 figure

    Self-Guard: Empower the LLM to Safeguard Itself

    The jailbreak attack can bypass the safety measures of a Large Language Model (LLM), generating harmful content. This misuse of LLMs has led to negative societal consequences. Currently, there are two main approaches to address jailbreak attacks: safety training and safeguards. Safety training focuses on further training the LLM to enhance its safety. On the other hand, safeguards involve implementing external models or filters to prevent harmful outputs. However, safety training has constraints in its ability to adapt to new attack types and often leads to a drop in model performance. Safeguards have proven to be of limited help. To tackle these issues, we propose a novel approach called Self-Guard, which combines the strengths of both safety methods. Self-Guard includes two stages. In the first stage, we enhance the model's ability to assess harmful content, and in the second stage, we instruct the model to consistently perform harmful content detection on its own responses. The experiment has demonstrated that Self-Guard is robust against jailbreak attacks. In the bad case analysis, we find that the LLM occasionally provides harmless responses to harmful queries. Additionally, we evaluated the general capabilities of the LLM before and after safety training, providing evidence that Self-Guard does not result in degradation of the LLM's performance. In sensitivity tests, Self-Guard not only avoids inducing over-sensitivity in the LLM but can even mitigate this issue.

    UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems

    Large Language Models (LLMs) have shown exceptional capabilities in many natural language understanding and generation tasks. However, personalization still remains a much-coveted property, especially when it comes to the multiple sources involved in the dialogue system. To better plan and incorporate the use of multiple sources in generating personalized responses, we first decompose it into three sub-tasks: Knowledge Source Selection, Knowledge Retrieval, and Response Generation. We then propose a novel Unified Multi-Source Retrieval-Augmented Generation system (UniMS-RAG). Specifically, we unify these three sub-tasks with different formulations into the same sequence-to-sequence paradigm during training, to adaptively retrieve evidence and evaluate its relevance on demand using special tokens, called acting tokens and evaluation tokens. Enabling language models to generate acting tokens facilitates interaction with various knowledge sources, allowing them to adapt their behavior to diverse task requirements. Meanwhile, evaluation tokens gauge the relevance score between the dialogue context and the retrieved evidence. In addition, we carefully design a self-refinement mechanism to iteratively refine the generated response considering 1) the consistency scores between the generated response and retrieved evidence; and 2) the relevance scores. Experiments on two personalized datasets (DuLeMon and KBP) show that UniMS-RAG achieves state-of-the-art performance on the knowledge source selection and response generation tasks with itself as a retriever in a unified manner. Extensive analyses and discussions are provided to offer new perspectives on personalized dialogue systems.

    The Invasive MED/Q Bemisia tabaci Genome: A Tale of Gene Loss and Gene Gain

    Background: Sweetpotato whitefly, Bemisia tabaci MED/Q and MEAM1/B, are two economically important invasive species that cause considerable damage to agricultural crops through direct feeding and indirect vectoring of plant pathogens. Recently, a draft genome of B. tabaci MED/Q has been assembled. In this study, we focus on the genomic comparison between MED/Q and MEAM1/B, with a special interest in MED/Q’s genomic signatures that may contribute to the highly invasive nature of this emerging insect pest. Results: The genomes of both species share similarity in syntenic blocks, but have significant divergence in the gene coding sequence. Expansion of cytochrome P450 monooxygenases and UDP glycosyltransferases in the MED/Q and MEAM1/B genomes is functionally validated for mediating insecticide resistance in MED/Q using in vivo RNAi. The amino acid biosynthesis pathways in the MED/Q genome are partitioned among the host and endosymbiont genomes in a manner distinct from other hemipterans. Evidence of horizontal gene transfer to the host genome may explain their obligate relationship. Putative loss-of-function in the immune deficiency-signaling pathway due to gene loss is a shared ancestral trait among hemipteran insects. Conclusions: The expansion of detoxification gene families, such as P450s, may contribute to the development of insecticide resistance traits and a broad host range in MED/Q and MEAM1/B, and facilitate the species’ invasions into intensively managed cropping systems. Numerical and compositional changes in multiple gene families (gene loss and gene gain) in the MED/Q genome set a foundation for future hypothesis testing that will advance our understanding of adaptation, viral transmission, symbiosis, and plant-insect-pathogen tritrophic interactions.

    MPOLSAR-1.0: Multidimensional SAR Multiband Fully Polarized Fine Classification Dataset

    Fine terrain classification is one of the main applications of Synthetic Aperture Radar (SAR). In the multiband fully polarized SAR operating mode, it is possible to obtain information on the different frequency bands and polarization response characteristics of a target, which can improve target classification accuracy. However, the existing datasets at home and abroad only have low-resolution fully polarized classification data for individual bands, limited regions, and small samples. Thus, a multidimensional SAR dataset from Hainan is used to construct a multiband fully polarized fine classification dataset with ample sample size, diverse land cover categories, and high classification reliability. This dataset will promote the development of multiband fully polarized SAR classification applications, supported by the high-resolution aerial observation system application calibration and verification project. This paper provides an overview of the composition of the dataset, and describes the information and dataset production methods for the first batch of published data (MPOLSAR-1.0). Furthermore, this study presents the preliminary classification experimental results based on the polarization feature classification and classical machine learning classification methods, providing support for the sharing and application of the dataset.

    Diffusion basis spectrum imaging measures anti-inflammatory and neuroprotective effects of fingolimod on murine optic neuritis

    OBJECTIVE: To prospectively determine whether diffusion basis spectrum imaging (DBSI) detects, differentiates and quantitates coexisting inflammation, demyelination, axonal injury and axon loss in mice with optic neuritis (ON) due to experimental autoimmune encephalomyelitis (EAE), and to determine if DBSI accurately measures effects of fingolimod on underlying pathology. METHODS: EAE was induced in 7-week-old C57BL/6 female mice. Visual acuity (VA) was assessed daily to detect onset of ON after which daily oral treatment with either fingolimod (1 mg/kg) or saline was given for ten weeks. In vivo DBSI scans of optic nerves were performed at baseline, 2-, 6- and 10-weeks post treatment. DBSI-derived metrics including restricted isotropic diffusion tensor fraction (putatively reflecting cellularity), non-restricted isotropic diffusion tensor fraction (putatively reflecting vasogenic edema), DBSI-derived axonal volume, axial diffusivity, λ RESULTS: Optic nerves of fingolimod-treated mice exhibited significantly better (p < 0.05) VA than saline-treated group at each time point. During ten weeks of treatment, DBSI-derived non-restricted and restricted-isotropic-diffusion-tensor fractions, and axonal volumes were not significantly different (p > 0.05) from the baseline values in fingolimod-treated mice. Transient DBSI-λ CONCLUSION: DBSI was used to assess changes of the underlying optic nerve pathologies in EAE mice with ON, exhibiting great potential as a noninvasive outcome measure for monitoring disease progression and therapeutic efficacy for MS.