
    From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

    In the realm of Large Language Models, the balance between instruction data quality and quantity has become a focal point. Recognizing this, we introduce a self-guided methodology for LLMs to autonomously discern and select cherry samples from vast open-source datasets, effectively minimizing manual curation and the potential cost of instruction tuning an LLM. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal tool to identify discrepancies between a model's expected responses and its autonomous generation capability. Through the application of IFD, cherry samples are pinpointed, leading to a marked improvement in model training efficiency. Empirical validations on renowned datasets like Alpaca and WizardLM underpin our findings: with a mere 10% of the conventional data input, our strategy achieves improved results. This synthesis of self-guided cherry-picking and the IFD metric signifies a transformative step in the optimization of LLMs, promising both efficiency and resource-conscious advancements. Code, data, and models are available: https://github.com/MingLiiii/Cherry_LL
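    The IFD metric described above can be sketched in a few lines. This is a toy illustration, not the authors' implementation: in practice the two losses come from a causal language model scoring the answer with and without the instruction, whereas here they are hand-picked numbers, and `select_cherry_samples` is a hypothetical helper name.

    ```python
    def ifd_score(loss_answer_given_instruction, loss_answer_alone):
        """IFD = s(A|Q) / s(A): how little the instruction helps the model
        predict the answer. Higher IFD -> harder, more informative sample."""
        return loss_answer_given_instruction / loss_answer_alone

    def select_cherry_samples(samples, keep_ratio=0.10):
        """Keep the top `keep_ratio` fraction of samples by IFD, discarding
        scores >= 1.0 (the instruction hurt prediction -> likely noise)."""
        scored = [(ifd_score(s["cond_loss"], s["direct_loss"]), s) for s in samples]
        scored = [(ifd, s) for ifd, s in scored if ifd < 1.0]
        scored.sort(key=lambda t: t[0], reverse=True)
        k = max(1, int(len(samples) * keep_ratio))
        return [s for _, s in scored[:k]]

    samples = [
        {"id": 0, "cond_loss": 2.4, "direct_loss": 2.5},  # IFD 0.96: hard, kept
        {"id": 1, "cond_loss": 0.5, "direct_loss": 2.0},  # IFD 0.25: easy
        {"id": 2, "cond_loss": 3.0, "direct_loss": 2.0},  # IFD 1.50: filtered out
    ]
    print([s["id"] for s in select_cherry_samples(samples, keep_ratio=0.4)])  # [0]
    ```

    The 10% figure from the abstract corresponds to the default `keep_ratio`; the `< 1.0` cutoff reflects the intuition that a sample whose instruction makes the answer harder to predict is probably mislabeled.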

    Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

    In this work, we focus on open vocabulary instance segmentation to expand a segmentation model to classify and segment instance-level novel categories. Previous approaches have relied on massive caption datasets and complex pipelines to establish one-to-one mappings between image regions and words in captions. However, such methods build noisy supervision by matching non-visible words, such as adjectives and verbs, to image regions. Meanwhile, context words are also important for inferring the existence of novel objects, as they show high inter-correlations with novel categories. To overcome these limitations, we devise a joint Caption Grounding and Generation (CGG) framework, which incorporates a novel grounding loss that focuses only on matching object nouns to improve learning efficiency. We also introduce a caption generation head that enables additional supervision and contextual modeling as a complement to the grounding loss. Our analysis and results demonstrate that the grounding and generation components complement each other, significantly enhancing the segmentation performance for novel classes. Experiments on the COCO dataset with two settings, Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS), demonstrate the superiority of CGG. Specifically, CGG achieves a substantial improvement of 6.8% mAP for novel classes without extra data on the OVIS task and a 15% PQ improvement for novel classes on the OSPS benchmark. Comment: ICCV-202
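    The noun-only grounding idea above can be sketched as follows. This is a minimal illustration with hand-built 2-D embeddings, assuming words arrive pre-tagged; `noun_grounding_loss` is an illustrative name, not the paper's API, and the real loss operates on learned region and word features inside the training pipeline.

    ```python
    import math

    def cosine(u, v):
        """Cosine similarity between two equal-length vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    def noun_grounding_loss(region_embs, word_embs, word_tags):
        """For each object noun, find its best-matching region and penalise
        low similarity; adjectives, verbs, and other non-noun words are
        skipped entirely, so they can never create noisy region matches."""
        losses = []
        for emb, tag in zip(word_embs, word_tags):
            if tag != "NOUN":
                continue
            best = max(cosine(emb, r) for r in region_embs)
            losses.append(1.0 - best)  # 0 when some region matches perfectly
        return sum(losses) / len(losses) if losses else 0.0

    regions = [[1.0, 0.0], [0.0, 1.0]]        # two toy region embeddings
    words = [[1.0, 0.0], [0.6, 0.8]]          # "dog" matches region 0 exactly
    tags = ["NOUN", "VERB"]                   # the verb is ignored by the loss
    loss = noun_grounding_loss(regions, words, tags)
    ```

    Filtering to nouns before matching is the key design choice: it sidesteps the noisy supervision the abstract attributes to grounding adjectives and verbs, while the separate generation head still lets those context words contribute signal.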

    Protective Effect of Tetrahydroxystilbene Glucoside on 6-OHDA-Induced Apoptosis in PC12 Cells through the ROS-NO Pathway

    Oxidative stress plays an important role in the pathogenesis of neurodegenerative diseases, such as Parkinson's disease. The molecule 2,3,5,4′-tetrahydroxystilbene-2-O-β-D-glucoside (TSG) is a potent antioxidant derived from the Chinese herb Polygonum multiflorum Thunb. In this study, we investigated the protective effect of TSG against 6-hydroxydopamine-induced apoptosis in rat adrenal pheochromocytoma PC12 cells and the possible mechanisms. Our data demonstrated that TSG significantly reversed the 6-hydroxydopamine-induced decrease in cell viability, prevented 6-hydroxydopamine-induced nuclear condensation, and decreased the percentage of apoptotic cells in a dose-dependent manner. In addition, TSG slowed the accumulation of intracellular reactive oxygen species and nitric oxide, counteracted the overexpression of inducible nitric oxide synthase as well as neuronal nitric oxide synthase, and also reduced the level of protein-bound 3-nitrotyrosine. These results demonstrate that the protective effects of TSG on rat adrenal pheochromocytoma PC12 cells are mediated, at least in part, by the ROS-NO pathway. Our results indicate that TSG may be effective in providing protection against neurodegenerative diseases associated with oxidative stress.

    Biological Effects of Tetrahydroxystilbene Glucoside: An Active Component of a Rhizome Extracted from Polygonum multiflorum

    Polygonum multiflorum Thunb. (PM), a traditional Chinese medicinal herb, has been widely used in East Asia as a tonic and antiaging agent. 2,3,5,4′-Tetrahydroxystilbene-2-O-β-D-glucoside (TSG, C20H22O9, FW = 406.38928) is one of the active components extracted from PM. TSG is an antioxidant agent that exhibits remarkable antioxidative activity in vivo and in vitro. The antioxidant effect of TSG is achieved through its radical-scavenging activity. TSG can inhibit apoptosis and protect neuronal cells against injury through multifunctional cytoprotective pathways. TSG exhibits prophylactic and therapeutic activity against Alzheimer's disease, Parkinson's disease, and cerebral ischemia/reperfusion injury. It is also antiatherosclerotic and anti-inflammatory. However, the mechanisms underlying these pharmacological activities remain unclear. This study aimed to review experimental studies and describe the effectiveness and possible mechanisms of TSG.