127 research outputs found
Understanding and Improving In-Context Learning on Vision-language Models
Recently, in-context learning (ICL) on large language models (LLMs) has
received great attention, and this technique can also be applied to
vision-language models (VLMs) built upon LLMs. These VLMs can respond to
queries by conditioning responses on a series of multimodal demonstrations,
which comprise images, queries, and answers. Though ICL has been extensively
studied on LLMs, its research on VLMs remains limited. The inclusion of
additional visual information in the demonstrations motivates the following
research questions: which of the two modalities in the demonstration is more
significant? How can we select effective multimodal demonstrations to enhance
ICL performance? This study investigates the significance of both visual and
language information. Our findings indicate that ICL in VLMs is predominantly
driven by the textual information in the demonstrations whereas the visual
information in the demonstrations barely affects the ICL performance.
Subsequently, we provide an understanding of the findings by analyzing the
model information flow and comparing model inner states given different ICL
settings. Motivated by our analysis, we propose a simple yet effective
approach, termed Mixed Modality In-Context Example Selection (MMICES), which
considers both visual and language modalities when selecting demonstrations and
shows better ICL performance. Extensive experiments are conducted to support
our findings, understanding, and improvement of the ICL performance of VLMs. Comment: 8 pages, 10 figures
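The abstract states only that MMICES considers both visual and language modalities when selecting demonstrations. A minimal sketch of one way such a selection could work is given below, assuming a two-stage scheme (shortlist by image similarity, then rerank by text similarity) and precomputed embeddings. The function names, the dictionary keys `image_emb` and `text_emb`, and the parameters `k_visual` and `n_final` are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_demonstrations(query, pool, k_visual, n_final):
    """Pick n_final demonstrations for a query using both modalities.

    Assumption: query and every pool item are dicts holding precomputed
    'image_emb' and 'text_emb' vectors (hypothetical keys).
    """
    # Stage 1: shortlist the k_visual candidates whose images are closest to the query image.
    by_image = sorted(
        pool,
        key=lambda d: cosine(query["image_emb"], d["image_emb"]),
        reverse=True,
    )
    shortlist = by_image[:k_visual]
    # Stage 2: rerank the shortlist by similarity of the text embeddings.
    by_text = sorted(
        shortlist,
        key=lambda d: cosine(query["text_emb"], d["text_emb"]),
        reverse=True,
    )
    return by_text[:n_final]


query = {"image_emb": [1.0, 0.0], "text_emb": [1.0, 0.0]}
pool = [
    {"id": "a", "image_emb": [1.0, 0.0], "text_emb": [0.0, 1.0]},
    {"id": "b", "image_emb": [0.9, 0.1], "text_emb": [1.0, 0.0]},
    {"id": "c", "image_emb": [0.0, 1.0], "text_emb": [1.0, 0.0]},
    {"id": "d", "image_emb": [0.8, 0.6], "text_emb": [0.5, 0.5]},
]
chosen = [d["id"] for d in select_demonstrations(query, pool, k_visual=3, n_final=2)]
print(chosen)
```

In this toy pool, item `c` matches the query text exactly but is dropped in the first stage because its image is dissimilar, which illustrates how a two-stage scheme lets both modalities influence the final choice.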
Genome-Wide Linkage Mapping of QTL for Yield Components, Plant Height and Yield-Related Physiological Traits in the Chinese Wheat Cross Zhou 8425B/Chinese Spring
A photo-responsive F-box protein FOF2 regulates floral initiation by promoting FLC expression in Arabidopsis.
Floral initiation is regulated by various genetic pathways in response to light, temperature, hormones and developmental status; however, the molecular mechanisms underlying the interactions between different genetic pathways are not fully understood. Here, we show that the photoresponsive gene FOF2 (F-box of flowering 2) negatively regulates flowering. FOF2 encodes a putative F-box protein that interacts specifically with ASK14, and its overexpression results in later flowering under both long-day and short-day photoperiods. Conversely, transgenic plants expressing the F-box domain deletion mutant of FOF2 (FOF2ΔF), or double loss of function mutant of FOF2 and FOL1 (FOF2-LIKE 1) present early flowering phenotypes. The late flowering phenotype of the FOF2 overexpression lines is suppressed by the flc-3 loss-of-function mutation. Furthermore, FOF2 mRNA expression is regulated by autonomous pathway gene FCA, and the repressive effect of FOF2 in flowering can be overcome by vernalization. Interestingly, FOF2 expression is regulated by light. The protein level of FOF2 accumulates in response to light, whereas it is degraded under dark conditions via the 26S proteasome pathway. Our findings suggest a possible mechanistic link between light conditions and the autonomous floral promotion pathway in Arabidopsis
A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models
Prompt engineering is a technique that involves augmenting a large
pre-trained model with task-specific hints, known as prompts, to adapt the
model to new tasks. Prompts can be created manually as natural language
instructions or generated automatically as either natural language instructions
or vector representations. Prompt engineering makes it possible to perform
predictions based solely on prompts without updating model parameters, and
eases the application of large pre-trained models to real-world tasks. In past
years, prompt engineering has been well studied in natural language processing.
Recently, it has also been intensively studied in vision-language modeling.
However, there is currently a lack of a systematic overview of prompt
engineering on pre-trained vision-language models. This paper aims to provide a
comprehensive survey of cutting-edge research in prompt engineering on three
types of vision-language models: multimodal-to-text generation models (e.g.
Flamingo), image-text matching models (e.g. CLIP), and text-to-image generation
models (e.g. Stable Diffusion). For each type of model, a brief model summary,
prompting methods, prompting-based applications, and the corresponding
responsibility and integrity issues are summarized and discussed. Furthermore,
the commonalities and differences between prompting on vision-language models,
language models, and vision models are also discussed. The challenges, future
directions, and research opportunities are summarized to foster future research
on this topic
Analysis of miRNAs and their target genes associated with lipid metabolism in duck liver
Citation: He, J. et al. Analysis of miRNAs and their target genes associated with lipid metabolism in duck liver. Sci. Rep. 6, 27418; doi: 10.1038/srep27418 (2016). Fat character is an important index in duck culture that is linked to local flavor, feed cost and fat intake for customers. Since the regulatory networks in duck lipid metabolism had not been clearly reported, we aimed to explore the potential miRNA-mRNA pairs and their regulatory roles in duck lipid metabolism. Here, Cherry-Valley ducks were selected and fed with or without 5% added oil for 2 weeks, and fat content was then determined. The data showed that the fat contents and the fatty acid ratios of C17:1 and C18:2 were up-regulated in the livers of oil-added ducks, while the C12:0 ratio was down-regulated. Then 21 differential miRNAs, including 10 novel miRNAs, were obtained from the livers by sequencing, and 73 target genes of these miRNAs involved in lipid metabolic processes were found, which constituted 316 miRNA-mRNA pairs. Two miRNA-mRNA pairs, one with a novel miRNA and one with a known miRNA (N-miR-16020-FASN and gga-miR-144-ELOVL6), were selected to validate the negative miRNA-mRNA relation. The results showed that N-miR-16020 and gga-miR-144 could bind the 3′-UTRs of FASN and ELOVL6, respectively, to control their expression. This study provides new insights and useful information for future research on the regulatory network in duck lipid metabolism
Construction and validation of a nomogram of risk factors for new-onset atrial fibrillation in advanced lung cancer patients after non-surgical therapy
Objective: Risk factors of new-onset atrial fibrillation (NOAF) in advanced lung cancer patients are not well defined. We aim to construct and validate a nomogram model between NOAF and advanced lung cancer. Methods: We retrospectively enrolled 19484 patients with Stage III-IV lung cancer undergoing first-line antitumor therapy in Shanghai Chest Hospital between January 2016 and December 2020 (15837 in training set, and 3647 in testing set). Patients with pre-existing AF, valvular heart disease, or cardiomyopathy were excluded. Logistic regression analysis and propensity score matching (PSM) were performed to identify predictors of NOAF, and a nomogram model was constructed and validated. Results: A total of 1089 patients were included in this study (807 in the training set, and 282 in the testing set). Multivariate logistic regression analysis showed that age, C-reactive protein, centric pulmonary carcinoma, and pericardial effusion were independent risk factors, the last two of which were important independent risk factors as confirmed by PSM analysis. The nomogram included the independent risk factors of age, C-reactive protein, centric pulmonary carcinoma, and pericardial effusion. The AUC was 0.716 (95% CI 0.661–0.770) and further evaluation of this model showed that the C-index was 0.716, while the bias-corrected C-index after internal validation was 0.748 in the training set. The calibration curves presented good concordance between the predicted and actual outcomes. Conclusion: Centric pulmonary carcinoma and pericardial effusion were important independent risk factors for NOAF besides common ones in advanced lung cancer patients. Furthermore, the new nomogram model contributed to the prediction of NOAF
An Attack Threat Effect Analysis Method Based on K-Means Evaluation
To take full advantage of the specified features of the attack dataset in network attack effect evaluation, and to maximize the efficiency of evaluation without losing accuracy, this paper proposes a K-Means evaluation technique using dimensional entropy components, derived from changes in network entropy over the attack period and from the advantages of clustering algorithms in data mining. The method pre-processes the attack dataset on the basis of network entropy, mapping it to a two-dimensional plane, and uses the output of the pre-processing as the input to clustering. It then establishes a relation between the attack dataset and the effect category via an improved K-Means algorithm, thus achieving an explicit division of the attack effect set and providing an efficient evaluation result. The experimental results show that the method can process attack datasets with high efficiency and provide a visualized evaluation result in the form of a cluster tree
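The clustering step described above can be illustrated with a short sketch. It implements plain K-Means (Lloyd's algorithm), not the improved variant the paper proposes, and it assumes the attack records have already been mapped to points on a two-dimensional plane. The entropy-based pre-processing is not reproduced because the abstract does not specify it. The deterministic initialization from the first `k` points and the sample coordinates are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100):
    """Plain Lloyd's K-Means; returns (assignment, centroids).

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic. Practical code would use a better initialization.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


# Hypothetical 2-D points standing in for pre-processed attack records.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Each resulting cluster would correspond to one effect category in the abstract's terms. How clusters are mapped to categories and rendered as a cluster tree is left open here.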
Layered hidden Markov models for real-time daily activity monitoring using body sensor networks
This paper presents an inference and training architecture for long-term and continuous daily activity monitoring using a wearable body sensor network. Energy efficiency and system adaptivity to wearers are two of the most important requirements of a body sensor network. This paper discusses a two-layered hidden Markov model (HMM) architecture for in-network data processing to achieve energy efficiency and model individualization. The bottom-layer HMM is used to process sensory data locally at each wireless sensor node to significantly reduce data transmissions. The top-layer HMM is utilized to find the activity sequence from the result of the local processing. This approach is energy efficient in that only the results of the decoding procedure in each node need to be transmitted rather than raw sensing data. Therefore, the volume of data is significantly reduced. When the algorithm is applied in online monitoring systems, the results of local processing are transmitted only upon hidden state changes. The top-layer processing uses old data of one sensor node when it does not receive a new result sequence of the local processing from that sensor node. The adaptation to various wearers is also discussed, and the robustness of this classification system is depicted. Experiments on 19 activity sequences, performed by 5 subjects, are used to evaluate the performance of this system. © 2011 Springer-Verlag London Limited
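The bottom-layer idea, decoding locally and transmitting only when the hidden state changes, can be sketched as follows. This is a generic Viterbi decoder with a change filter, not the authors' implementation. The two states, the observation symbols, and all probabilities are hypothetical values chosen for illustration, and the top-layer HMM is not modeled.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for obs, computed in log space.

    Assumption: every probability that is looked up is strictly positive.
    """
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    back = []  # back[t][s] is the best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            score[s] = prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][o])
            pointers[s] = best
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def state_changes(path):
    """(index, state) pairs a node would transmit: the first state and every change."""
    return [(i, s) for i, s in enumerate(path) if i == 0 or s != path[i - 1]]


# Hypothetical two-state model for a single sensor node.
states = ["rest", "walk"]
start_p = {"rest": 0.5, "walk": 0.5}
trans_p = {"rest": {"rest": 0.8, "walk": 0.2}, "walk": {"rest": 0.2, "walk": 0.8}}
emit_p = {"rest": {"low": 0.9, "high": 0.1}, "walk": {"low": 0.1, "high": 0.9}}

decoded = viterbi(["low", "low", "high", "high", "high"], states, start_p, trans_p, emit_p)
messages = state_changes(decoded)
print(decoded)
print(messages)
```

For the five observations in this example, the node would send two messages instead of five raw readings, which is the source of the data reduction the abstract describes.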