SwitchGPT: Adapting Large Language Models for Non-Text Outputs
Large Language Models (LLMs), primarily trained on text-based datasets,
exhibit exceptional proficiencies in understanding and executing complex
linguistic instructions via text outputs. However, they falter when asked to
generate non-text outputs. Concurrently, modality conversion models, such as
text-to-image, despite generating high-quality images, suffer from a lack of
extensive textual pretraining. As a result, these models are only capable of
accommodating specific image descriptions rather than comprehending more
complex instructions. To bridge this gap, we propose a novel approach,
SwitchGPT, from a modality conversion perspective that evolves a text-based
LLM into a multi-modal one. We specifically employ a minimal dataset to
instruct LLMs to recognize the intended output modality as directed by the
instructions. Consequently, the adapted LLM can effectively summon various
off-the-shelf modality conversion models from the model zoos to generate
non-text responses. This circumvents the necessity for complicated pretraining
that typically requires immense quantities of paired multi-modal data, while
simultaneously inheriting the extensive knowledge of LLMs and the ability of
high-quality generative models. To evaluate and compare the adapted multi-modal
LLM with its traditional counterparts, we have constructed a multi-modal
instruction benchmark that solicits diverse modality outputs. The experimental
results reveal that, with minimal training, LLMs can be conveniently adapted to
comprehend requests for non-text responses, thus achieving higher flexibility
in multi-modal scenarios. Code and data will be made available at
https://github.com/xinke-wang/SwitchGPT
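The dispatch step described in the abstract above (the adapted LLM names the intended output modality, then an off-the-shelf conversion model produces the response) can be sketched roughly as follows. This is an illustration only, not the SwitchGPT code: the `<tag> description` output format, the function names, and the stand-in converter are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical type: a converter takes a text description and returns
# some handle to the generated non-text output (here just a string).
Converter = Callable[[str], str]


def parse_llm_output(raw: str) -> Tuple[str, str]:
    """Split an assumed '<tag> description' output into (modality, description).

    Output without a leading tag is treated as a plain text answer.
    """
    raw = raw.strip()
    if raw.startswith("<") and ">" in raw:
        tag, _, rest = raw[1:].partition(">")
        return tag.strip().lower(), rest.strip()
    return "text", raw


def dispatch(raw: str, zoo: Dict[str, Converter]) -> str:
    """Route the LLM output to a conversion model, or return it as text."""
    modality, description = parse_llm_output(raw)
    if modality == "text" or modality not in zoo:
        return description  # fall back to the text response
    return zoo[modality](description)


# Stand-in for a real text-to-image model taken from a model zoo.
zoo = {"image": lambda d: f"[image generated from: {d}]"}
print(dispatch("<image> a red fox in snow", zoo))
print(dispatch("Paris is the capital of France.", zoo))
```

In this sketch the LLM only has to learn to emit the tag and a suitable description; generation quality is left entirely to whichever converter is registered in `zoo`.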
Interpretable Ensemble Learning for Materials Property Prediction with Classical Interatomic Potentials: Carbon as an Example
Machine learning (ML) is widely used to explore crystal materials and predict
their properties. However, training deep-learning models is time-consuming,
and the regression process is a black box that is hard to interpret.
Also, the preprocessing step that converts a crystal structure into ML input,
called a descriptor, needs to be designed carefully. To efficiently predict
important properties of materials, we propose an approach based on ensemble
learning consisting of regression trees to predict formation energy and elastic
constants based on small-size datasets of carbon allotropes as an example.
Without using any descriptor, the inputs are the properties calculated by
molecular dynamics with 9 different classical interatomic potentials. Overall,
the results from ensemble learning are more accurate than those from classical
interatomic potentials, and ensemble learning can capture the relatively
accurate properties from the 9 classical potentials as criteria for predicting
the final properties.
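As a rough illustration of the idea in the abstract above (a tree ensemble that takes the values computed with several classical potentials as inputs and regresses a reference property), here is a minimal bagging sketch with depth-1 regression trees. It is not the authors' model: the tree depth, the bagging scheme, and every number in the toy data are assumptions made for illustration, and only three input columns are used instead of nine.

```python
import random


def fit_stump(X, y):
    """Fit a depth-1 regression tree: pick the (feature, threshold)
    split with the smallest summed squared error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:  # no valid split (all rows identical): predict the mean
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    f, thr, left_mean, right_mean = stump
    return left_mean if row[f] <= thr else right_mean


def fit_ensemble(X, y, n_trees=25, seed=0):
    """Bagging: each stump is fitted on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict(trees, row):
    """Average the predictions of all stumps."""
    return sum(predict_stump(t, row) for t in trees) / len(trees)


# Made-up placeholder data: each row holds one property as estimated by
# three hypothetical potentials; y is the hypothetical reference value.
X = [[1.0, 1.2, 0.9], [2.0, 2.1, 1.8], [3.0, 3.3, 2.7], [4.0, 4.2, 3.9]]
y = [1.1, 2.0, 3.1, 4.0]
trees = fit_ensemble(X, y)
print(predict(trees, [2.5, 2.6, 2.4]))
```

Because each stump records which input column it split on, counting how often each column is chosen gives a simple view of which potentials the ensemble relies on, which is one reason tree ensembles are easier to interpret than deep models.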
Contrastive Vision-Language Alignment Makes Efficient Instruction Learner
We study the task of extending the large language model (LLM) into a
vision-language instruction-following model. This task is crucial but
challenging since the LLM is trained on text modality only, making it hard to
effectively digest the visual modality. To address this, existing methods
typically train a visual adapter to align the representation between a
pre-trained vision transformer (ViT) and the LLM by a generative image
captioning loss. However, we find that the generative objective can only
produce weak alignment for vision and language, making the aligned
vision-language model very hungry for the instruction fine-tuning data. In this
paper, we propose CG-VLM that applies both Contrastive and Generative alignment
objectives to effectively align the representation of ViT and LLM. Different
from image level and sentence level alignment in common contrastive learning
settings, CG-VLM aligns the image-patch level features and text-token level
embeddings, which, however, is very hard to achieve as no explicit patch-token
grounding relation is provided in standard image captioning datasets. To address
this issue, we propose to maximize the averaged similarity between pooled
image-patch features and text-token embeddings. Extensive experiments
demonstrate that the proposed CG-VLM produces strong vision-language alignment
and is an efficient instruction learner. For example, using only 10% of the
instruction tuning data, we reach 95% of the performance of the state-of-the-art
method LLaVA [29] on the zero-shot ScienceQA-Image benchmark.
Comment: 17 pages, 10 pages for main paper, 7 pages for supplementary
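One way to read the contrastive objective described in the abstract above is sketched below: the patch features of an image are mean-pooled, the pooled vector is compared with every token embedding of a caption, and the averaged similarity serves as the image-caption score in a standard InfoNCE loss. The pooling choice, the cosine similarity, the temperature value, and the image-to-text direction are assumptions; the abstract does not specify them, and this is not the CG-VLM implementation.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pair_score(patches, tokens):
    """Averaged similarity between the pooled patch feature and each token."""
    pooled = mean_pool(patches)
    return sum(cosine(pooled, t) for t in tokens) / len(tokens)


def contrastive_loss(batch_patches, batch_tokens, temperature=0.1):
    """Image-to-text InfoNCE: image i should score highest with caption i."""
    n = len(batch_patches)
    loss = 0.0
    for i in range(n):
        logits = [pair_score(batch_patches[i], batch_tokens[j]) / temperature
                  for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        loss += log_z - logits[i]
    return loss / n


# Toy 2-D features: image 0 and caption 0 point along x, image 1 and caption 1 along y.
patches = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
tokens = [[[1.0, 0.0], [0.8, 0.2]], [[0.0, 1.0], [0.2, 0.8]]]
print(contrastive_loss(patches, tokens))        # matched pairs: small loss
print(contrastive_loss(patches, tokens[::-1]))  # swapped captions: large loss
```

Averaging over tokens sidesteps the missing patch-token grounding: no individual patch has to be matched to an individual token, only the image as a whole to the caption's tokens on average.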
Frequency Enhanced Hybrid Attention Network for Sequential Recommendation
The self-attention mechanism, which has a strong capability for
modeling long-range dependencies, is one of the extensively used techniques in
the sequential recommendation field. However, many recent studies show
that current self-attention based models are low-pass filters and are
inadequate to capture high-frequency information. Furthermore, since the items
in the user behaviors are intertwined with each other, these models are
unable to distinguish the inherent periodicity obscured in the time domain.
In this work, we shift the perspective to the frequency domain, and propose a
novel Frequency Enhanced Hybrid Attention Network for Sequential
Recommendation, namely FEARec. In this model, we first improve the original
time domain self-attention in the frequency domain with a ramp structure so
that both low-frequency and high-frequency information can be explicitly
learned in our approach. Moreover, we additionally design a similar attention
mechanism via auto-correlation in the frequency domain to capture the periodic
characteristics and fuse the time and frequency level attention in a unified
model. Finally, both contrastive learning and frequency regularization are
utilized to ensure that multiple views are aligned in both the time domain and
frequency domain. Extensive experiments conducted on four widely used benchmark
datasets demonstrate that the proposed model performs significantly better than
the state-of-the-art approaches.
Comment: 11 pages, 7 figures, The 46th International ACM SIGIR Conference on
Research and Development in Information Retrieval
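The auto-correlation step mentioned in the abstract above rests on a standard identity (the Wiener-Khinchin theorem): the circular autocorrelation of a sequence is the inverse Fourier transform of its power spectrum. The sketch below applies that identity to a single real-valued sequence and reads off the strongest period. It only illustrates the underlying principle; FEARec's attention over queries and keys is not reproduced, and the function names are hypothetical. A plain O(n^2) DFT is used so that the example needs no external library.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def autocorrelation(x):
    """Circular autocorrelation computed as IDFT(|DFT(x)|^2)."""
    spectrum = dft(x)
    power = [c * c.conjugate() for c in spectrum]
    return [v.real for v in dft(power, inverse=True)]


def dominant_period(x):
    """Lag in 1..n/2 with the highest autocorrelation of the centered sequence."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    r = autocorrelation(centered)
    return max(range(1, len(x) // 2 + 1), key=lambda lag: r[lag])


# A behavior signal that repeats every 4 steps.
signal = [1.0, 0.0, 0.0, 0.0] * 3
print(dominant_period(signal))  # 4
```

Computing the correlation for all lags at once in the frequency domain is what makes periodicity cheap to detect; with a fast Fourier transform in place of the naive DFT the cost drops to O(n log n).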
Unveiling the medium-range order in glass models and its role in glass formation
The correlation between structure and glass formability in glassy systems is a long-standing puzzle. To solve this puzzle, many descriptors based on the short-range order (SRO) have been proposed. Here we show that the SRO, however, offers little help in explaining the glass formability and stability; instead it is the formation of medium-range order that stabilizes the glass against crystallization by suppressing the atomic rearrangement and compositional change. Our results provide a perspective for understanding the correlation between structure and stability in glasses.
RITA: Boost Autonomous Driving Simulators with Realistic Interactive Traffic Flow
High-quality traffic flow generation is the core module in building
simulators for autonomous driving. However, the majority of available
simulators are incapable of replicating traffic patterns that accurately
reflect the various features of real-world data while also simulating
human-like reactive responses to the tested autopilot driving strategies.
As a step toward addressing this problem, we propose Realistic
Interactive TrAffic flow (RITA) as an integrated component of existing driving
simulators to provide high-quality traffic flow for the evaluation and
optimization of the tested driving strategies. RITA is developed with
consideration of three key features, i.e., fidelity, diversity, and
controllability, and consists of two core modules called RITABackend and
RITAKit. RITABackend is built to support vehicle-wise control and provide
traffic generation models from real-world datasets, while RITAKit is developed
with easy-to-use interfaces for controllable traffic generation via
RITABackend. We demonstrate RITA's capacity to create diversified and
high-fidelity traffic simulations in several highly interactive highway
scenarios. The experimental findings demonstrate that our produced RITA traffic
flows exhibit all three key features, hence enhancing the completeness of
driving strategy evaluation. Moreover, we showcase the possibility for further
improvement of baseline strategies through online fine-tuning with RITA traffic
flows.
Comment: 8 pages, 5 figures, 3 tables
Expression of Claudin-9 (CLDN9) in breast cancer, the clinical significance in connection with its subcoat anchorage proteins ZO-1 and ZO-3 and impact on drug resistance
(1) Introduction: Claudin-9 (CLDN9) is a member of the claudin protein family, a critical transmembrane protein family for tight junctions that are implicated in the progression of numerous cancer types. The present study investigated the role that CLDN9, along with the subcoat proteins, Zonula Occludens (ZOs), plays in clinical breast cancer and the subsequent impact on drug response of patients. (2) Methods: CLDN9 protein and CLDN9 transcript were determined and correlated with clinical and pathological indicators, together with the status of hormonal receptors. The levels of CLDN9 transcript were also assessed against the therapeutic responses of the patients to chemotherapies by using a dataset from the TCGA database. Breast cancer cell models, representing different molecular subtypes of breast cancer, with differential expression of CLDN9 were created and used to assess the biological impact and response to chemotherapeutic drugs. (3) Results: Breast cancer tissues expressed significantly higher levels of CLDN9, with the high levels being associated with shorter survival. CLDN9 was significantly correlated with its anchorage proteins ZO-1 and ZO-3. Integrated expression of CLDN9, ZO-1 and ZO-3 formed a signature that was significantly linked to overall survival (OS) (p = 0.013) and relapse-free survival (RFS) (p = 0.024) in an independent manner. CLDN9 transcript was significantly higher in patients who were resistant to chemotherapies (p < 0.000001). The connection of CLDN9 to chemoresistance was particularly prominent in patients with ER-positive (ER(+)), Her-2-negative (Her-2(−)), ER(+)/Her-2(−) and triple-negative breast cancers (TNBCs), but not in patients with HER-2-positive tumors. In Her-2-negative MCF7 and MDA-MB-231 cancer cells, loss of CLDN9 significantly increased sensitivity to several chemotherapeutic drugs including paclitaxel, gemcitabine and methotrexate, which was not seen in Her-2(+) SKBR3 cells.
However, suppressing Her-2 using neratinib, a permanent Her-2 inhibitor, sensitized cellular response to these chemodrugs in cells with CLDN9 knockdown. (4) Conclusions: CLDN9 is an important prognostic indicator for patients with breast cancer and also a pivotal factor in assessing patient responses to chemotherapies. Her-2 is a negating factor for the treatment response prediction value by CLDN9 and negating Her-2 and CLDN9 may enhance breast cancer cellular response to chemotherapeutic drugs.
Light control of surface–bulk coupling by terahertz vibrational coherence in a topological insulator
The demand for disorder-tolerant quantum logic and spin electronics can be met by generating and controlling dissipationless spin currents protected by topology. Dirac fermions with helical spin-locking surface transport offer a way of achieving such a goal. Yet, surface-bulk coupling can lead to strong Dirac electron scattering with bulk carriers and phonons as well as impurities, assisted by such a dissipative channel, which results in “topological breakdown”. Here, we demonstrate that coherent lattice vibrations periodically driven by a single-cycle terahertz (THz) pulse can significantly suppress such a dissipative channel in topological insulators. This is achieved by reducing the phase space in the bulk available for Dirac fermions to scatter into during coherent lattice oscillations in Bi2Se3. This light-induced suppression manifests as a remarkable transition exclusively in surface transport, absent for bulk, above the THz electric fields for driving coherent phonons, which prolongs the surface transport lifetime. These results, together with simulations, identify the critical role of spin–orbit coupling for the “phase space contraction” mechanism that suppresses the surface-bulk coupling. Imposing vibrational quantum coherence into topological states of matter may become a universal light control principle for reinforcing the symmetry-protected helical transport.
Fibroblast Growth Factor 21 Deficiency Attenuates Experimental Colitis-Induced Adipose Tissue Lipolysis
Aims. Nutrient deficiencies are common in patients with inflammatory bowel disease (IBD). Adipose tissue plays a critical role in regulating energy balance. Fibroblast growth factor 21 (FGF21) is an important endocrine metabolic regulator with emerging beneficial roles in lipid homeostasis. We investigated the impact of FGF21 in experimental colitis-induced epididymal white adipose tissue (eWAT) lipolysis. Methods. Mice were given 2.5% dextran sulfate sodium (DSS) ad libitum for 7 days to induce colitis. The role of FGF21 was investigated using antibody neutralization or knockout (KO) mice. Lipolysis index and adipose lipolytic enzymes were determined. In addition, 3T3-L1 cells were pretreated with IL-6, followed by recombinant human FGF21 (rhFGF21) treatment; lipolysis was assessed. Results. DSS markedly decreased eWAT/body weight ratio and increased serum concentrations of free fatty acid (FFA) and glycerol, indicating increased adipose tissue lipolysis. eWAT intracellular lipolytic enzyme expression/activation was significantly increased. These alterations were significantly attenuated in FGF21 KO mice and by circulating FGF21 neutralization. Moreover, DSS treatment markedly increased serum IL-6 and FGF21 levels. IL-6 pretreatment was necessary for the stimulatory effect of FGF21 on adipose lipolysis in 3T3-L1 cells. Conclusions. Our results demonstrate that experimental colitis induces eWAT lipolysis via an IL-6/FGF21-mediated signaling pathway.
Metastatic Lymph Node 64 (MLN64) expression in gastric cancer, the clinical and molecular implications in drug resistance
Background/Aim: Metastatic Lymph Node 64 (MLN64) is often co-amplified with ERBB2 (HER2) and plays a role in the progression of breast and prostate cancers. The present study explored the expression of MLN64 in clinical gastric cancer in association with the ERBB family and its impact on drug resistance in patients. Materials and Methods: Two independent gastric cancer cohorts (n=324; n=87) were used to explore the expression profile of MLN64 in conjunction with ERBB family members in clinical gastric cancer and its association with neoadjuvant chemotherapy responses. Gastric cancer AGS and HCG27 cells with MLN64 knockdown were generated to determine the function of MLN64 in cell behavioral changes. Results: Gastric tumor tissues expressed significantly increased levels of MLN64 compared with normal tissues (p<0.01); however, MLN64 alone was a weak prognostic indicator. An integrated co-expression of MLN64, ERBB4, and NRG4 was a significant factor in assessing overall survival in both cohorts. MLN64 was a profound indicator of patient response to neoadjuvant chemotherapy. In vitro studies indicated a significant contribution of MLN64 to the response of gastric cancer cells to chemodrugs and Her-2 inhibitors. MLN64 knockdown also contributed to the adhesiveness and migration and suggested a possible mechanism mediated by the interaction between MLN64 and ERBBs. Conclusion: MLN64 is an indicator for patient response to neoadjuvant chemotherapies in gastric cancer. Together with the expression pattern of ERBB4, it is a poor prognostic factor in gastric cancer patients.