Learning Domain Invariant Prompt for Vision-Language Models
Prompt learning is one of the most effective and popular ways to adapt
powerful vision-language foundation models such as CLIP to downstream datasets by
tuning learnable prompt vectors with very few samples. However, although prompt
learning achieves excellent performance on in-domain data, it still faces the
major challenge of generalizing to unseen classes and domains. Some existing
prompt learning methods tackle this issue by adaptively generating different
prompts for different tokens or domains, but they neglect the ability of learned
prompts to generalize to unseen domains. In this paper, we propose a novel
prompt learning paradigm, called MetaPrompt, that directly generates domain-invariant
prompts that generalize to unseen domains. Specifically, a
dual-modality prompt tuning network is proposed to generate prompts for input
from both the image and text modalities. With a novel asymmetric contrastive loss,
the representation from the original pre-trained vision-language model acts as
supervision to enhance the generalization ability of the learned prompt. More
importantly, we propose a meta-learning-based prompt tuning algorithm that
explicitly constrains the task-specific prompt tuned for one domain or class to
also achieve good performance in another domain or class. Extensive experiments
on 11 datasets for base-to-new generalization and 4 datasets for domain
generalization demonstrate that our method consistently and significantly
outperforms existing methods.
Comment: 12 pages, 6 figures, 5 tables
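The abstract only describes the meta-learning constraint in words. As a rough, hypothetical illustration of the idea — a prompt adapted on one domain is then evaluated on a *different* domain, so the update favours domain-invariant parameters — here is a first-order, linear toy model. The learning rates, the linear "prompt" parameters, and the synthetic domains are all assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(prompt, X, y):
    # squared error of a linear "prompted" predictor (toy stand-in)
    return np.mean((X @ prompt - y) ** 2)

def grad(prompt, X, y):
    return 2 * X.T @ (X @ prompt - y) / len(y)

def meta_step(prompt, support, query, inner_lr=0.1, outer_lr=0.05):
    Xs, ys = support
    Xq, yq = query
    # inner step: adapt the prompt to the support domain
    adapted = prompt - inner_lr * grad(prompt, Xs, ys)
    # outer step: the adapted prompt is scored on the *other* domain,
    # so the update pushes toward domain-invariant parameters
    # (first-order approximation, second derivatives ignored)
    return prompt - outer_lr * grad(adapted, Xq, yq)

# two synthetic "domains": same underlying mapping, shifted inputs
w_true = np.array([1.0, -2.0])

def make_domain(shift):
    X = rng.normal(shift, 1.0, size=(64, 2))
    return X, X @ w_true

dom_a, dom_b = make_domain(0.0), make_domain(1.0)
prompt = np.zeros(2)
for _ in range(200):
    prompt = meta_step(prompt, dom_a, dom_b)  # adapt on A, score on B
    prompt = meta_step(prompt, dom_b, dom_a)  # adapt on B, score on A

print(loss(prompt, *dom_a), loss(prompt, *dom_b))
```

Because both domains share the same underlying mapping, the alternating episodic updates recover parameters that fit both; in the paper the same principle is applied to prompt vectors inside CLIP rather than a linear model.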
PromptTTS 2: Describing and Generating Voices with Text Prompt
Speech conveys more information than text alone, as the same word can be
uttered in various voices to convey diverse information. Compared with
traditional text-to-speech (TTS) methods that rely on speech prompts (reference
speech) for voice variability, using text prompts (descriptions) is more
user-friendly, since speech prompts can be hard to find or may not exist at all.
TTS approaches based on text prompts face two challenges: 1) the one-to-many
problem, where not all details of voice variability can be described in the
text prompt, and 2) the limited availability of text prompt datasets, since
vendors and costly data labeling are required to write text prompts for
speech. In this work, we introduce PromptTTS 2 to address these challenges with
a variation network that provides the variability information of voice not captured by
text prompts, and a prompt generation pipeline that utilizes large language
models (LLMs) to compose high-quality text prompts. Specifically, the variation
network predicts the representation extracted from the reference speech (which
contains full information about the voice) based on the text prompt representation.
The prompt generation pipeline generates text prompts for speech using a
speech understanding model to recognize voice attributes (e.g., gender, speed)
from speech and a large language model to formulate the text prompt based on the
recognition results. Experiments on a large-scale (44K-hour) speech dataset
demonstrate that, compared with previous works, PromptTTS 2 generates voices
more consistent with text prompts and supports sampling of diverse voice
variability, thereby offering users more choices for voice generation.
Additionally, the prompt generation pipeline produces high-quality prompts,
eliminating the large labeling cost. The demo page of PromptTTS 2 is available
at https://speechresearch.github.io/prompttts2.
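The two-stage prompt generation pipeline described above can be mocked in a few lines. In this sketch the speech understanding model is replaced by a fixed attribute dictionary and the LLM by a plain string template — both stand-ins, since neither model is specified in this listing; the function names and attribute keys are invented:

```python
def recognize_attributes(speech_path):
    # stand-in for the speech understanding model; a real system
    # would classify attributes (gender, speed, pitch, ...) from audio
    return {"gender": "female", "speed": "fast", "pitch": "high"}

def compose_prompt(attrs):
    # stand-in for the LLM that phrases attributes as a description
    return (f"A {attrs['gender']} voice speaking at a {attrs['speed']} "
            f"pace with a {attrs['pitch']} pitch.")

prompt = compose_prompt(recognize_attributes("sample.wav"))
print(prompt)
# → "A female voice speaking at a fast pace with a high pitch."
```

The point of the design is that labeled (speech, text-prompt) pairs are produced automatically, removing the human labeling cost the abstract mentions.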
Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech
Recently, leveraging BERT pre-training to improve the phoneme encoder in text
to speech (TTS) has drawn increasing attention. However, these works apply
pre-training with character-based units to enhance the TTS phoneme encoder,
which is inconsistent with TTS fine-tuning, which takes phonemes as input.
Pre-training with phonemes alone as input can alleviate the input mismatch but
lacks the ability to model rich representations and semantic information due to
the limited phoneme vocabulary. In this paper, we propose Mixed-Phoneme BERT, a
novel variant of the BERT model that uses mixed phoneme and sup-phoneme
representations to enhance the learning capability. Specifically, we merge
adjacent phonemes into sup-phonemes and combine the phoneme sequence and the
merged sup-phoneme sequence as the model input, which can enhance the model's
capacity to learn rich contextual representations. Experimental results
demonstrate that our proposed Mixed-Phoneme BERT significantly improves TTS
performance, with a 0.30 CMOS gain over the FastSpeech 2 baseline.
Mixed-Phoneme BERT achieves a 3x inference speedup with voice quality similar to
the previous TTS pre-trained model PnG BERT.
Comment: submitted to Interspeech 202
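The merge-and-combine step can be sketched with a toy, invented two-entry merge vocabulary. The paper learns sup-phoneme units from data (e.g., with a BPE-style procedure) and feeds both granularities to the model; everything below — the vocabulary, the `+` joiner, the `[SEP]`-style concatenation — is illustrative, not the paper's exact input scheme:

```python
# hypothetical learned merges; real sup-phonemes come from data
SUP_VOCAB = {("HH", "AH"), ("L", "OW")}

def to_sup_phonemes(phonemes):
    # greedily merge adjacent pairs found in the sup-phoneme vocab
    out, i = [], 0
    while i < len(phonemes):
        if i + 1 < len(phonemes) and (phonemes[i], phonemes[i + 1]) in SUP_VOCAB:
            out.append(phonemes[i] + "+" + phonemes[i + 1])
            i += 2
        else:
            out.append(phonemes[i])
            i += 1
    return out

phonemes = ["HH", "AH", "L", "OW"]           # "hello" in ARPAbet-like symbols
sup = to_sup_phonemes(phonemes)              # ["HH+AH", "L+OW"]
model_input = phonemes + ["[SEP]"] + sup     # both granularities combined
print(model_input)
```

The coarse sup-phoneme units give the encoder a larger effective vocabulary to learn contextual semantics from, while the phoneme sequence keeps the input consistent with TTS fine-tuning.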
Mitochondrial PKM2 deacetylation by procyanidin B2-induced SIRT3 upregulation alleviates lung ischemia/reperfusion injury
Apoptosis is a critical event in the pathogenesis of lung ischemia/reperfusion (I/R) injury. Sirtuin 3 (SIRT3), an important deacetylase predominantly localized in mitochondria, regulates diverse physiological processes, including apoptosis. However, the detailed mechanisms by which SIRT3 regulates lung I/R injury remain unclear. Many polyphenols strongly regulate the sirtuin family. In this study, we found that a polyphenol compound, procyanidin B2 (PCB2), activated SIRT3 in mouse lungs. Due to this effect, PCB2 administration attenuated histological lesions, relieved pulmonary dysfunction, and improved survival in the murine model of lung I/R injury. Additionally, this treatment inhibited hypoxia/reoxygenation (H/R)-induced A549 cell apoptosis and rescued Bcl-2 expression. Using Sirt3-knockout mice and specific SIRT3 knockdown in vitro, we further found that SIRT3 strongly protects against lung I/R injury. Sirt3 deficiency or enzymatic inactivation substantially aggravated lung I/R-induced pulmonary lesions, promoted apoptosis, and abolished PCB2-mediated protection. Mitochondrial pyruvate kinase M2 (PKM2) inhibits apoptosis by stabilizing Bcl-2. Here, we found that PKM2 accumulates and is hyperacetylated in mitochondria upon lung I/R injury. By screening the potential sites of PKM2 acetylation, we found that SIRT3 deacetylates the K433 residue of PKM2 in A549 cells. Transfection with a deacetylated mimic plasmid of PKM2 noticeably reduced apoptosis, while acetylated mimic transfection abolished the protective effect of PKM2. Furthermore, PKM2 knockdown or inhibition in vivo significantly abrogated the anti-apoptotic effects of SIRT3 upregulation. Collectively, this study provides the first evidence that the SIRT3/PKM2 pathway is a protective target for the suppression of apoptosis in lung I/R injury. Moreover, this study identifies K433 deacetylation of PKM2 as a novel modification that regulates its anti-apoptotic activity.
In addition, PCB2-mediated modulation of the SIRT3/PKM2 pathway may significantly protect against lung I/R injury, suggesting a novel prophylactic strategy for lung I/R injury.
Improving Daytime Planetary Boundary Layer Height Determination from CALIOP: Validation Based on Ground-Based Lidar Station
An integrated algorithm combining the advantages of the wavelet covariance method and the improved maximum variance method was developed to determine the planetary boundary layer height (PBLH) from Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) measurements, and an aerosol fraction threshold was applied in the integrated algorithm considering the applicability of the two methods. We compared the CALIOP retrievals with PBLH measurements derived from nine years of synchronous ground-based lidar observations in Lille, northern France. The results indicate a good correlation (R ≥ 0.79) between the PBLHs derived from CALIOP and the ground-based lidar under clear-sky conditions. The mean absolute differences of the PBLHs are 206 m and 106 m before and after the removal of the aloft aerosol layer, respectively. The results under cloudy-sky conditions show lower agreement (R = 0.48) than the comparisons performed under clear-sky conditions. Moreover, the spatial correlation of the PBLHs decreases with increasing distance between the CALIOP footprint and the Lille observation platform. Based on the above analysis, the PBLH can be effectively derived by the integrated algorithm under clear-sky conditions, while a larger mean absolute difference (527 m) exists under cloudy-sky conditions.
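The wavelet covariance method mentioned above locates the PBL top as the altitude where a step-shaped Haar wavelet covaries most strongly with the aerosol backscatter profile, i.e., at the sharpest transition between the aerosol-rich boundary layer and the clean free troposphere. A self-contained sketch on a synthetic profile — the profile shape, grid, and dilation length are all assumed values, not the paper's configuration:

```python
import numpy as np

z = np.arange(0.0, 3000.0, 15.0)       # altitude grid (m), assumed
pblh_true = 1200.0
# idealized backscatter: high inside the PBL, low above, with a
# smooth transition centered at pblh_true
beta = 1.0 - 0.8 / (1.0 + np.exp(-(z - pblh_true) / 50.0))

def haar_covariance(profile, z, b, a):
    # W(a, b) = (1/a) * sum_z profile(z) * h((z - b)/a) * dz,
    # where the Haar wavelet h is +1 just below b and -1 just above b,
    # within a window of total width a (the dilation)
    dz = z[1] - z[0]
    h = np.where((z >= b - a / 2) & (z < b), 1.0,
        np.where((z >= b) & (z <= b + a / 2), -1.0, 0.0))
    return float(np.sum(profile * h) * dz / a)

a = 300.0                              # dilation (m), assumed
candidates = z[(z > a) & (z < z.max() - a)]
W = np.array([haar_covariance(beta, z, b, a) for b in candidates])
pblh = candidates[np.argmax(W)]        # PBLH = altitude maximizing W
print(pblh)
```

The maximum variance method used in the other branch of the integrated algorithm instead picks the altitude of the largest signal variance; the paper's contribution is switching between the two based on an aerosol fraction threshold.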
MITA/STING and Its Alternative Splicing Isoform MRP Restrict Hepatitis B Virus Replication.
Efficient clearance of hepatitis B virus (HBV) requires the coordinated work of both the innate and adaptive immune responses. MITA/STING, an adapter protein of the innate immune signaling pathways, plays a key role in regulating innate and adaptive immune responses to DNA virus infection. Previously, we identified an alternatively spliced isoform of MITA/STING, called MITA-related protein (MRP), and found that MRP could specifically block MITA-mediated interferon (IFN) induction while retaining the ability to activate NF-κB. Here, we asked whether MITA/STING and MRP are able to control HBV replication. Both MITA/STING and MRP significantly inhibited HBV replication in vitro. MITA/STING overexpression stimulated the IRF3-IFN pathway, while MRP overexpression activated the NF-κB pathway, suggesting that these two isoforms may inhibit HBV replication through different mechanisms. Using a hydrodynamic injection (HI) mouse model, we found that HBV replication was reduced following injection of MITA/STING and MRP expression vectors in mice and was enhanced by the knockout of MITA/STING (MITA/STING-/-). The HBV-specific humoral and CD8+ T cell responses were impaired in MITA/STING-deficient mice, suggesting the participation of MITA/STING in the initiation of host adaptive immune responses. In summary, our data suggest that MITA/STING and MRP contribute to HBV control via modulation of the innate and adaptive responses.