Knowledge Distillation for Closed-Source Language Models
Closed-source language models such as GPT-4 have achieved remarkable
performance. Many recent studies focus on enhancing the capabilities of smaller
models through knowledge distillation from closed-source language models.
However, because the weights, hidden states, and output distributions of these
closed-source models cannot be accessed directly, distillation can only be
performed by fine-tuning smaller models on data samples generated by
closed-source language models, which constrains its effectiveness. In this
paper, we propose to estimate the output distributions of closed-source
language models within a Bayesian estimation framework involving both prior
and posterior estimation. The prior estimation derives a prior distribution
from the corpus generated by closed-source language models, while the
posterior estimation employs a proxy model to update the prior distribution
into a posterior distribution. By leveraging the estimated output distribution
of closed-source language models, traditional knowledge distillation can then
be executed. Experimental results demonstrate that our method surpasses
current models fine-tuned directly on data generated by closed-source language
models.
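The prior-then-posterior idea can be pictured with a minimal sketch: a prior over next tokens from counts in the teacher-generated corpus, a conjugate-style update using a proxy model's distribution, and a standard distillation loss against the estimate. The Dirichlet-smoothed prior, the pseudo-observation update, and all function names are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: Bayesian estimation of a closed-source model's next-token
# distribution for knowledge distillation. The conjugate-style update and
# all names here are assumptions for illustration only.
from collections import Counter
import math

def prior_from_corpus(corpus_tokens, vocab, alpha=1.0):
    """Prior over next tokens from counts in the teacher-generated corpus
    (Dirichlet smoothing with pseudo-count alpha)."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def posterior_update(prior, proxy_probs, strength=10.0):
    """Update the prior with a proxy model's distribution, treating the
    proxy output as `strength` pseudo-observations."""
    post = {w: prior[w] + strength * proxy_probs.get(w, 0.0) for w in prior}
    z = sum(post.values())
    return {w: p / z for w, p in post.items()}

def kd_loss(teacher_probs, student_probs, eps=1e-12):
    """Forward KL(teacher || student), the usual distillation objective."""
    return sum(p * (math.log(p + eps) - math.log(student_probs[w] + eps))
               for w, p in teacher_probs.items() if p > 0)

vocab = ["yes", "no", "maybe"]
prior = prior_from_corpus(["yes", "yes", "no"], vocab)
posterior = posterior_update(prior, {"yes": 0.7, "no": 0.2, "maybe": 0.1})
student = {"yes": 0.5, "no": 0.3, "maybe": 0.2}
print(posterior, kd_loss(posterior, student))
```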
Coordination control and analysis of TCSC devices to protect electrical power systems against disruptive disturbances
In this work, we study the coordination control and effective deployment of thyristor-controlled series compensation (TCSC) devices to protect power grids against disruptive disturbances. The power grid comprises flexible alternating current transmission system (FACTS) devices for regulating power flow, phasor measurement units (PMUs) for detecting system states, and a control station for generating regulation signals. We propose a novel coordination control approach for TCSC devices that changes branch impedance and regulates the power flow against unexpected disturbances on buses or branches. More significantly, a numerical method is developed to estimate a gradient vector for generating the regulation signals of TCSC devices while reducing computational costs. To describe the degree of power system stress, a performance index is designed based on the error between the desired and actual power flows. Moreover, technical analysis is presented to ensure the convergence of the proposed coordination control algorithm. Numerical simulations substantiate that the coordination control approach can effectively alleviate the stress caused by contingencies on the IEEE 24-bus system, compared to classic PID control. It is also demonstrated that the deployment of TCSCs can greatly alleviate system stress when both impedance magnitude and active power on branches are considered.
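The gradient-vector estimation over a flow-error performance index can be sketched as follows. The quadratic index matches the abstract's description; the toy linear flow model, the central-difference scheme, and the step sizes are assumptions, standing in for a real AC power-flow solver fed by PMU measurements.

```python
# Sketch: coordination control of TCSC reactance settings by gradient
# descent on a flow-error performance index, with the gradient estimated
# numerically. The toy flow model and constants are illustrative only.
import numpy as np

def branch_flows(x_tcsc):
    """Stand-in for a power-flow solve: maps TCSC reactance settings to
    branch active-power flows (arbitrary fixed linear model)."""
    A = np.array([[1.0, 0.3], [0.2, 1.1], [0.5, 0.4]])
    return A @ (1.0 / (1.0 + x_tcsc))

def performance_index(x_tcsc, p_desired):
    """Stress index: squared error between desired and actual flows."""
    return float(np.sum((branch_flows(x_tcsc) - p_desired) ** 2))

def estimate_gradient(x_tcsc, p_desired, h=1e-4):
    """Central finite differences: two flow evaluations per TCSC device."""
    g = np.zeros_like(x_tcsc)
    for i in range(len(x_tcsc)):
        e = np.zeros_like(x_tcsc)
        e[i] = h
        g[i] = (performance_index(x_tcsc + e, p_desired)
                - performance_index(x_tcsc - e, p_desired)) / (2 * h)
    return g

x = np.array([0.1, 0.1])            # TCSC reactance settings
p_star = np.array([0.9, 0.8, 0.6])  # desired branch flows
for _ in range(200):                # coordination control loop
    x -= 0.05 * estimate_gradient(x, p_star)
print(performance_index(x, p_star))
```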
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent
Large Language Model (LLM) agents significantly extend the capabilities of
standalone LLMs, empowering them to interact with external tools (e.g., APIs,
functions) and complete various tasks in a self-directed fashion. The challenge
of tool use demands that LLMs not only understand user queries and generate
answers accurately but also excel in task planning, tool invocation, and result
summarization. While traditional works focus on training a single LLM with all
these capabilities, performance limitations become apparent, particularly with
smaller models. To overcome these challenges, we propose a novel approach that
decomposes the aforementioned capabilities into a planner, caller, and
summarizer. Each component is implemented by a single LLM that focuses on a
specific capability and collaborates with others to accomplish the task. This
modular framework facilitates individual updates and the potential use of
smaller LLMs for building each capability. To effectively train this framework,
we introduce a two-stage training paradigm. First, we fine-tune a backbone LLM
on the entire dataset without discriminating sub-tasks, providing the model
with a comprehensive understanding of the task. Second, the fine-tuned LLM is
used to instantiate the planner, caller, and summarizer respectively, which are
continually fine-tuned on respective sub-tasks. Evaluation across various
tool-use benchmarks illustrates that our proposed multi-LLM framework surpasses
the traditional single-LLM approach, highlighting its efficacy and advantages
in tool learning.
Comment: Work in progress; GitHub repo: https://github.com/X-PLUG/Multi-LLM-Agen
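The planner/caller/summarizer decomposition can be pictured as a small control loop in which each role is a separate model call. This is a minimal sketch: the chat functions, prompts, tool registry, and stop condition are placeholders, not the paper's implementation or training setup.

```python
# Sketch: a planner/caller/summarizer agent loop. The three callables stand
# in for three separately fine-tuned LLMs; prompts and the FINISH protocol
# are invented for illustration.
from typing import Callable, Dict

def run_agent(query: str,
              planner: Callable[[str], str],
              caller: Callable[[str], str],
              summarizer: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 5) -> str:
    history = f"User: {query}"
    for _ in range(max_steps):
        plan = planner(history)                # decide the next action or finish
        if plan.startswith("FINISH"):
            break
        tool_call = caller(history + "\nPlan: " + plan)  # e.g. "search: llm agents"
        name, _, arg = tool_call.partition(":")
        result = tools.get(name.strip(), lambda a: "unknown tool")(arg.strip())
        history += f"\nPlan: {plan}\nCall: {tool_call}\nResult: {result}"
    return summarizer(history)                 # compose the final answer

# Toy instantiation with stub models and a single tool.
tools = {"search": lambda q: f"top hit for '{q}'"}
answer = run_agent(
    "What is tool learning?",
    planner=lambda h: "FINISH" if "Result:" in h else "look it up",
    caller=lambda h: "search: tool learning",
    summarizer=lambda h: "Summary based on: " + h.splitlines()[-1],
    tools=tools,
)
print(answer)
```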
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
Large language models (LLMs) have recently demonstrated remarkable
capabilities to comprehend human intentions, engage in reasoning, and exhibit
planning-like behavior. To further unleash the power of LLMs to accomplish
complex tasks, there is a growing trend to build agent frameworks that equip
LLMs, such as ChatGPT, with tool-use abilities to connect with massive external
APIs. In this work, we introduce ModelScope-Agent, a general and customizable
agent framework for real-world applications, based on open-source LLMs as
controllers. It provides a user-friendly system library, with customizable
engine design to support model training on multiple open-source LLMs, while
also enabling seamless integration with both model APIs and common APIs in a
unified way. To equip the LLMs with tool-use abilities, a comprehensive
framework has been proposed, spanning tool-use data collection, tool
retrieval, tool registration, memory control, customized model training, and
evaluation for practical real-world applications. Finally, we showcase
ModelScopeGPT, a real-world intelligent assistant of ModelScope Community based
on the ModelScope-Agent framework, which is able to connect open-source LLMs
with more than 1000 public AI models and localized community knowledge in
ModelScope. The ModelScope-Agent library
(https://github.com/modelscope/modelscope-agent) and online demo
(https://modelscope.cn/studios/damo/ModelScopeGPT/summary) are now publicly
available.
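The register/retrieve/invoke pattern the abstract describes can be sketched generically. This is NOT the modelscope-agent library's actual API; the class, method names, and naive word-overlap retriever below are invented for illustration, where a real system would use an embedding-based tool retriever.

```python
# Sketch: generic tool registration, retrieval, and invocation for an
# LLM-as-controller agent. All names are hypothetical; see the linked
# repository for the framework's real interfaces.
from typing import Callable, Dict, List

class ToolRegistry:
    def __init__(self):
        self._tools: Dict[str, dict] = {}

    def register(self, name: str, description: str, fn: Callable[[str], str]):
        self._tools[name] = {"description": description, "fn": fn}

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        """Naive retrieval: rank tools by word overlap with the query."""
        q = set(query.lower().split())
        scored = sorted(
            self._tools,
            key=lambda n: -len(q & set(self._tools[n]["description"].lower().split())),
        )
        return scored[:k]

    def invoke(self, name: str, arg: str) -> str:
        return self._tools[name]["fn"](arg)

registry = ToolRegistry()
registry.register("translate", "translate text between languages",
                  lambda s: f"<translated {s}>")
registry.register("tts", "synthesize speech audio from text",
                  lambda s: f"<audio for {s}>")
candidates = registry.retrieve("please translate this sentence to English")
print(candidates, registry.invoke(candidates[0], "ni hao"))
```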
Construction and Applications of Billion-Scale Pre-trained Multimodal Business Knowledge Graph
Business Knowledge Graphs (KGs) are important to many enterprises today,
providing factual knowledge and structured data that steer many products and
make them more intelligent. Despite their promising benefits, building a
business KG requires overcoming prohibitive issues of deficient structure and
multiple modalities. In this paper, we advance the understanding of the practical
challenges related to building KG in non-trivial real-world systems. We
introduce the process of building an open business knowledge graph (OpenBG)
derived from a well-known enterprise, Alibaba Group. Specifically, we define a
core ontology to cover various abstract products and consumption demands, with
fine-grained taxonomy and multimodal facts in deployed applications. OpenBG is
an open business KG of unprecedented scale: 2.6 billion triples with more than
88 million entities covering over 1 million core classes/concepts and 2,681
types of relations. We release all the open resources (OpenBG benchmarks)
derived from it for the community and report experimental results of KG-centric
tasks. We also ran an online competition based on the OpenBG benchmarks, which
attracted thousands of teams. We further pre-train OpenBG and apply it to many
KG-enhanced downstream tasks in business scenarios, demonstrating the
effectiveness of billion-scale multimodal knowledge for e-commerce. All the
resources and code have been released at
https://github.com/OpenBGBenchmark/OpenBG.
Comment: OpenBG. Work in progress.
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Large language models (LLMs) have demonstrated impressive zero-shot abilities
on a variety of open-ended tasks, while recent research has also explored the
use of LLMs for multi-modal generation. In this study, we introduce mPLUG-Owl,
a novel training paradigm that equips LLMs with multi-modal abilities through
modularized learning of foundation LLM, a visual knowledge module, and a visual
abstractor module. This approach can support multiple modalities and facilitate
diverse unimodal and multimodal abilities through modality collaboration. The
training paradigm of mPLUG-Owl involves a two-stage method for aligning image
and text, which learns visual knowledge with the assistance of LLM while
maintaining and even improving the generation abilities of LLM. In the first
stage, the visual knowledge module and abstractor module are trained with a
frozen LLM module to align the image and text. In the second stage,
language-only and multi-modal supervised datasets are used to jointly fine-tune
a low-rank adaptation (LoRA) module on the LLM and the abstractor module while freezing
the visual knowledge module. We carefully build a visually-related instruction
evaluation set OwlEval. Experimental results show that our model outperforms
existing multi-modal models, demonstrating mPLUG-Owl's impressive instruction
and visual understanding ability, multi-turn conversation ability, and
knowledge reasoning ability. In addition, we observe some unexpected and
exciting abilities, such as multi-image correlation and scene text
understanding, which make it possible to leverage the model in harder
real-world scenarios, such as vision-only document comprehension. Our code,
pre-trained model, instruction-tuned models,
and evaluation set are available at https://github.com/X-PLUG/mPLUG-Owl. The
online demo is available at https://www.modelscope.cn/studios/damo/mPLUG-Owl.
Comment: Work in progress.
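The two-stage freezing schedule can be sketched in PyTorch terms: stage one trains the visual modules against a frozen LLM, and stage two freezes the visual knowledge module while tuning the LoRA adapter and abstractor. The module classes here are stand-ins; the linked repository contains the actual architecture and training code.

```python
# Sketch: mPLUG-Owl's two-stage modularized training as requires_grad
# toggles. All modules are stand-in Linear layers, not the real model.
import torch.nn as nn

class MultimodalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual_knowledge = nn.Linear(768, 768)   # stand-in visual encoder
        self.abstractor = nn.Linear(768, 768)         # stand-in visual abstractor
        self.llm = nn.Linear(768, 768)                # stand-in foundation LLM
        self.lora = nn.Linear(768, 768, bias=False)   # stand-in LoRA adapter

def set_trainable(module: nn.Module, flag: bool):
    for p in module.parameters():
        p.requires_grad = flag

model = MultimodalModel()

# Stage 1: align image and text -- train visual modules, freeze the LLM.
set_trainable(model.visual_knowledge, True)
set_trainable(model.abstractor, True)
set_trainable(model.llm, False)
set_trainable(model.lora, False)

# Stage 2: joint instruction tuning -- freeze the visual knowledge module,
# train the LoRA adapter on the LLM plus the abstractor.
set_trainable(model.visual_knowledge, False)
set_trainable(model.abstractor, True)
set_trainable(model.lora, True)

print([n for n, p in model.named_parameters() if p.requires_grad])
```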
Direct enzymatic ethanolysis of potential Nannochloropsis biomass for co-production of sustainable biodiesel and nutraceutical eicosapentaenoic acid
Background: The marine microalga Nannochloropsis is a promising source for the production of renewable and sustainable biodiesel to replace depleting petroleum. Beyond biodiesel, Nannochloropsis is a green and promising resource for the commercial production of the nutraceutical eicosapentaenoic acid (EPA, C20:5). Recent studies have shown that low-value biodiesel can be obtained by transesterification of Nannochloropsis biomass. However, from both nutritional and economic standpoints, it is wasteful to produce microalgal biodiesel that contains EPA. A new strategy was therefore developed to produce low-value bulk biodiesel along with EPA enrichment via enzymatic ethanolysis of Nannochloropsis biomass with a specific lipase.
Results: Cellulase pretreatment of Nannochloropsis sp. biomass significantly improved biodiesel conversion by direct ethanolysis with five enzymes from Candida antarctica (CALA and CALB), Thermomyces lanuginosus (TL), Rhizomucor miehei (RM), and Aspergillus oryzae (PLA). Among these five biocatalysts, CALA was the most suitable enzyme for achieving high biodiesel conversion while effectively enriching EPA. After optimization, the maximum biodiesel conversion (46.53–48.57%) was attained by CALA at an 8:1 ethanol/biomass ratio (v/w), 10–15% water content, and 10% lipase weight at 35 °C for 72 h. Meanwhile, EPA (60.81%) was highly enriched in the microalgal NPLs (neutral lipids and polar lipids), a 1.51-fold increase over the original EPA level. This process was then re-evaluated with two Nannochloropsis species (IMET1 and Salina 537). Under the optimized conditions, the biodiesel conversions of IMET1 and Salina 537 by CALA were 63.41% and 54.33%, respectively, and the EPA contents of the microalgal NPLs were 50.06% for IMET1 and 53.73% for Salina 537.
Conclusion: CALA is a promising biocatalyst for discriminating against EPA in the ethanolysis of Nannochloropsis biomass. Its biodiesel conversion and EPA-enrichment efficiency depend strongly on the lipid classes and fatty acid composition of the Nannochloropsis biomass. CALA-catalyzed ethanolysis of Nannochloropsis biomass is a promising approach for the co-production of low-value biodiesel and high-value EPA-rich microalgae products.
Rice Black-streaked Dwarf Virus Preparation and Infection on Rice
Rice black-streaked dwarf virus (RBSDV), a member of the genus Fijivirus in the family Reoviridae, infects rice, maize, barley, and wheat, and can seriously affect crop yields. RBSDV is transmitted by the small brown planthopper (Laodelphax striatellus, SBPH) in a persistent manner. RBSDV has 10 linear dsRNA genomic segments, making it difficult to construct infectious clones for functional studies in plants. Here we describe a method for inoculating and maintaining RBSDV on rice in a greenhouse for use in laboratory research. The protocol uses SBPHs mass-reared in the laboratory. We also describe in detail the propagation of a healthy planthopper population, the preparation of plant material, RBSDV inoculation, and the evaluation of rice plants after inoculation.
A Rice Receptor-like Protein Negatively Regulates Rice Resistance to Southern Rice Black-Streaked Dwarf Virus Infection
Plants rely on various receptor-like proteins and receptor-like kinases to recognize and defend against invading pathogens. However, research on the role of receptor-like proteins in plant antiviral defense, particularly in rice–virus interactions, is limited. In this study, we identified a receptor-like gene, OsBAP1, which was significantly induced upon southern rice black-streaked dwarf virus (SRBSDV) infection. A viral inoculation assay showed that the OsBAP1 knockout mutant exhibited enhanced resistance to SRBSDV infection, indicating that OsBAP1 negatively regulates rice resistance to viral infection. Transcriptome analysis revealed that genes involved in plant–pathogen interactions, plant hormone signal transduction, oxidation–reduction reactions, and protein phosphorylation pathways were significantly enriched in the OsBAP1 knockout mutant plants (osbap1-cas). Quantitative real-time PCR (RT-qPCR) analysis further demonstrated that some defense-related genes were significantly induced during SRBSDV infection in osbap1-cas mutants. Our findings provide new insights into the role of receptor-like proteins in plant immune signaling pathways and demonstrate that OsBAP1 negatively regulates rice resistance to SRBSDV infection.
Genome-Wide Identification and Gene Expression Analysis of the OTU DUB Family in Oryza sativa
Ovarian tumor domain (OTU)-containing deubiquitinating enzymes (DUBs) are essential DUBs that maintain protein stability in plants and play important roles in plant growth, development, and stress response. However, there has been little genome-wide identification and analysis of the OTU gene family in rice. In this study, we identified 20 OTU family genes in the rice genome and classified them into four groups based on phylogenetic analysis. Their gene structures, conserved motifs and domains, chromosomal distribution, and cis elements in promoters were further studied. In addition, OTU gene expression patterns in response to plant hormone treatments, including SA, MeJA, NAA, BL, and ABA, were investigated by RT-qPCR analysis. The results showed that OsOTU genes exhibited hormone-specific expression profiles. Expression levels of most rice OTU genes were significantly changed in response to rice stripe virus (RSV), rice black-streaked dwarf virus (RBSDV), southern rice black-streaked dwarf virus (SRBSDV), and rice stripe mosaic virus (RSMV). These results suggest that the rice OTU genes are involved in diverse hormone signaling pathways and varied responses to virus infection, providing new insights for further functional study of OsOTU genes.