51 research outputs found
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
Over the past decade, Artificial Intelligence (AI) has had great success
and is being used in a wide range of academic and industrial fields.
More recently, LLMs have made rapid advancements that have propelled AI to a
new level, enabling even more diverse applications and industrial domains with
intelligence, particularly in areas like software engineering and natural
language processing. Nevertheless, a number of emerging trustworthiness
concerns and issues exhibited in LLMs have already received much
attention; unless they are properly resolved, the widespread adoption of LLMs could
be greatly hindered in practice. The distinctive characteristics of LLMs, such
as the self-attention mechanism, extremely large model scale, and
autoregressive generation schema, differ from classic AI software based on CNNs
and RNNs and present new challenges for quality analysis. To date,
universal and systematic analysis techniques for LLMs are still lacking despite the
urgent industrial demand. Towards bridging this gap, we initiate an early
exploratory study and propose a universal analysis framework for LLMs, LUNA,
designed to be general and extensible, to enable versatile analysis of LLMs
from multiple quality perspectives in a human-interpretable manner. In
particular, we first leverage the data from desired trustworthiness
perspectives to construct an abstract model as an auxiliary analysis asset,
which is empowered by various abstract model construction methods. To assess
the quality of the abstract model, we collect and define a number of evaluation
metrics, targeting both the abstract model level and the semantics level. Then, the
semantics, which is the degree of satisfaction of the LLM w.r.t. the
trustworthiness perspective, is bound to the abstract model and enriches it,
which enables more detailed analysis applications for diverse
purposes. Comment: 44 pages, 9 figures
SurrealDriver: Designing Generative Driver Agent Simulation Framework in Urban Contexts based on Large Language Model
Simulation plays a critical role in the research and development of
autonomous driving and intelligent transportation systems. However, the current
simulation platforms exhibit limitations in the realism and diversity of agent
behaviors, which impede the transfer of simulation outcomes to the real world.
In this paper, we propose a generative driver agent simulation framework based
on large language models (LLMs), capable of perceiving complex traffic
scenarios and providing realistic driving maneuvers. Notably, we conducted
interviews with 24 drivers and used their detailed descriptions of driving
behavior as chain-of-thought prompts to develop a `coach agent' module, which
can evaluate and assist driver agents in accumulating driving experience and
developing human-like driving styles. Through practical simulation experiments
and user experiments, we validate the feasibility of this framework in
generating reliable driver agents and analyze the roles of each module. The
results show that the framework with full architecture decreased the collision
rate by 81.04% and increased the human-likeness by 50%. Our research proposes
the first urban context driver agent simulation framework based on LLMs and
provides valuable insights into the future of agent simulation for complex
tasks. Comment: 12 pages, 8 figures
Photosystem II Subunit S overexpression increases the efficiency of water use in a field-grown crop
Insufficient water availability for crop production is a mounting barrier to achieving the 70% increase in food production that will be needed by 2050. One solution is to develop crops that require less water per unit mass of production. Water vapor transpires from leaves through stomata, which also facilitate the influx of CO2 during photosynthetic assimilation. Here, we hypothesize that Photosystem II Subunit S (PsbS) expression affects a chloroplast-derived signal for stomatal opening in response to light, which can be used to improve water-use efficiency. Transgenic tobacco plants with a range of PsbS expression, from undetectable to 3.7 times wild-type, are generated. Plants with increased PsbS expression show less stomatal opening in response to light, resulting in a 25% reduction in water loss per CO2 assimilated under field conditions. Since the role of PsbS is universal across higher plants, this manipulation should be effective across all crops.
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity
This survey addresses the crucial issue of factuality in Large Language
Models (LLMs). As LLMs find applications across diverse domains, the
reliability and accuracy of their outputs become vital. We define the
Factuality Issue as the probability of LLMs to produce content inconsistent
with established facts. We first delve into the implications of these
inaccuracies, highlighting the potential consequences and challenges posed by
factual errors in LLM outputs. Subsequently, we analyze the mechanisms through
which LLMs store and process facts, seeking the primary causes of factual
errors. Our discussion then transitions to methodologies for evaluating LLM
factuality, emphasizing key metrics, benchmarks, and studies. We further
explore strategies for enhancing LLM factuality, including approaches tailored
for specific domains. We focus on two primary LLM configurations, standalone LLMs
and Retrieval-Augmented LLMs that utilize external data, and detail their
unique challenges and potential enhancements. Our survey offers a structured
guide for researchers aiming to fortify the factual reliability of LLMs. Comment: 62 pages; 300+ references
Molecular prevalence and subtype distribution of Blastocystis spp. among children who have diarrhea or are asymptomatic in Wenzhou, Zhejiang Province, China
Blastocystis sp., a significant zoonotic parasite with a global distribution, was the focus of this study, which aimed to investigate its prevalence and genetic diversity among diarrheic and asymptomatic children in Wenzhou, China. We collected 1,032 fecal samples from Yuying Children’s Hospital, Wenzhou, China, comprising 684 from children with diarrhea and 348 from asymptomatic children. Genomic DNA extracted from these samples was used to detect Blastocystis spp. by PCR, targeting the small subunit ribosomal RNA gene. Subsequently, a phylogenetic tree was constructed, applying the maximum likelihood method. Blastocystis spp. were detected in 67 (6.5%) of the fecal samples. The prevalence rate of Blastocystis spp. in diarrheic children (8.8%; 60/684) was significantly higher than that in asymptomatic children (2.0%; 7/348) (χ² = 17.3, p < 0.001). Sequence analysis of the SSU rRNA gene identified five known Blastocystis spp. subtypes, ST1 (n = 12), ST2 (n = 5), ST3 (n = 35), ST4 (n = 12), and ST7 (n = 3). ST1 and ST3 were present in both diarrheic and asymptomatic children, while ST2, ST4, and ST7 were exclusive to diarrheic children. Intra-subtype genetic polymorphisms were identified, comprising four variations in ST1 (ST1-1 to ST1-4), five in ST3 (ST3-1 to ST3-5), two in ST4 (ST4-1 and ST4-2), and two in ST7 (ST7-1 and ST7-2). Notably, ST1-2 to ST1-4, ST3-3 to ST3-5, and ST7-1 and ST7-2 represent newly identified variations. The composition and genetic characteristics of subtypes among children in this region suggest various sources of infection, including human-to-human and animal-to-human transmission.
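The prevalence comparison above can be checked from the counts given in the abstract. The following is a minimal sketch, assuming the authors used a Pearson chi-square test without continuity correction on the 2x2 table of infection status by symptom group; the abstract does not name the exact test, and the function name `pearson_chi_square_2x2` is introduced here only for illustration.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts taken from the abstract: positives / total per group.
diarrheic_pos, diarrheic_total = 60, 684
asymptomatic_pos, asymptomatic_total = 7, 348

chi2 = pearson_chi_square_2x2(
    diarrheic_pos, diarrheic_total - diarrheic_pos,              # positive, negative (diarrheic)
    asymptomatic_pos, asymptomatic_total - asymptomatic_pos,     # positive, negative (asymptomatic)
)
print(f"chi-square = {chi2:.2f}")
```

Under this assumption the statistic comes out at roughly 17.36, which is consistent with the reported value of 17.3; applying a continuity correction would give a noticeably lower value.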
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
Zinc finger proteins and regulation of the hallmarks of cancer
Zinc finger proteins (ZFPs) form one of the
largest families of transcription factors in human
genetics, via their conserved zinc finger motifs. ZFPs
function in many biological processes including
development, differentiation, metabolism and apoptosis.
In addition, recent studies have demonstrated that ZFPs
are closely associated with different stages of cancer
development. One of the hallmarks of cancer is altered
signal transduction cascades and an understanding of the
changes in these pathways is essential for targeted
cancer therapy. In this review, we discuss examples of
ZFPs involved in development and progression of
several types of cancer, which can provide new insights
into cancer treatment.
UAV Path Planning in Dynamical Environment: A Novel ICACO-IDWA Algorithm
In this paper, a novel UAV path planning algorithm based on improved cellular ant colony algorithm and dynamic window algorithm (ICACO-IDWA) is proposed to solve the problem of dynamically changing threats during actual flight. The main innovations of this paper are as follows. (a) The hexagon grid method is proposed to model the UAV flight space, which solves the problem of inconsistent simulation time step. (b) A novel ICACO-IDWA algorithm is proposed. In the first stage, the optimal path is obtained by the improved cellular ant colony algorithm (ICACO). In the second stage, the improved dynamic window algorithm (IDWA) is used to optimize the optimal path considering dynamic threat. Through the algorithm, the UAV path planning with dynamic threat change is realized. Finally, simulation results verify the effectiveness of the proposed model and algorithm.
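One plausible reading of the claim that a hexagon grid avoids an inconsistent simulation time step is that every move between adjacent hexagonal cells covers the same distance, whereas an 8-connected square grid mixes straight and diagonal moves of different lengths. The sketch below illustrates only that geometric property; it is not the paper's model, and the axial-coordinate layout and all names (`HEX_DIRECTIONS`, `hex_center`, `hex_step_lengths`, `square_step_lengths`) are assumptions made for illustration.

```python
import math

# Axial-coordinate offsets of the six neighbours of a hexagonal cell
# (an assumed layout; the paper's own grid encoding is not given).
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_center(q, r, size=1.0):
    """Cartesian centre of a pointy-top hexagon at axial coordinates (q, r)."""
    x = size * math.sqrt(3) * (q + r / 2)
    y = size * 1.5 * r
    return x, y


def hex_step_lengths(q, r, size=1.0):
    """Centre-to-centre distances from cell (q, r) to each of its six neighbours."""
    origin = hex_center(q, r, size)
    return [
        math.dist(origin, hex_center(q + dq, r + dr, size))
        for dq, dr in HEX_DIRECTIONS
    ]


def square_step_lengths(size=1.0):
    """Centre-to-centre distances to the eight neighbours of a square cell."""
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return [size * math.hypot(dx, dy) for dx, dy in offsets]


hex_steps = hex_step_lengths(0, 0)
square_steps = square_step_lengths()
print("distinct hex step lengths:   ", sorted({round(s, 6) for s in hex_steps}))
print("distinct square step lengths:", sorted({round(s, 6) for s in square_steps}))
```

With a constant flight speed, a single step length means each grid move takes the same time, which is the property a uniform simulation time step needs.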
miR2119, a Novel Transcriptional Regulator, Plays a Positive Role in Woody Plant Drought Tolerance by Mediating the Degradation of the CkBI-1 Gene Associated with Apoptosis
Caragana korshinskii, an important vegetation restoration species with economic and ecological benefits in the arid region of northwest China, is characterized by significant drought tolerance. However, the underlying molecular mechanisms by which miRNAs confer this trait in C. korshinskii are unclear. Here, we investigated the effect of CkmiR2119 on drought tolerance and identified its target gene, CkBI-1. A negative correlation of CkmiR2119 and CkBI-1 in both stems and leaves in a drought gradient treatment, followed by target gene validation, suggests that CkmiR2119 might negatively regulate CkBI-1. Consistently, a decrease in the expression of the CkBI-1 gene was observed after both transient transformation and stable transformation of CkamiR2119 in tobacco (Nicotiana tabacum). Moreover, the physiological analysis of CkamiR2119 and CkBI-1 transgenic plants further indicates that CkmiR2119 can enhance the drought tolerance of C. korshinskii in two aspects: (i) downregulating CkBI-1 expression to accelerate vessel maturation in stems; (ii) contributing to a higher level of CkBI-1 in mesophyll cells to inhibit programmed cell death (PCD). This work reveals that CkmiR2119 can increase plants’ drought tolerance by downregulating the expression of CkBI-1, providing a theoretical basis to improve plants’ ability to withstand stress by manipulating miRNAs.