Tipping points near a delayed saddle node bifurcation with periodic forcing
We consider the effect on tipping from an additive periodic forcing in a
canonical model with a saddle node bifurcation and a slowly varying bifurcation
parameter. Here tipping refers to the dramatic change in dynamical behavior
characterized by a rapid transition away from a previously attracting state. In
the absence of the periodic forcing, it is well-known that a slowly varying
bifurcation parameter produces a delay in this transition, beyond the
bifurcation point for the static case. Using a multiple scales analysis, we
consider the effect of amplitude and frequency of the periodic forcing relative
to the drifting rate of the slowly varying bifurcation parameter.
We show that a high frequency oscillation drives an earlier tipping when the
bifurcation parameter varies more slowly, with the advance of the tipping point
proportional to the square of the ratio of amplitude to frequency. In the low
frequency case the position of the tipping point is affected by the frequency,
amplitude and phase of the oscillation. The results are based on an analysis of
the local concavity of the trajectory, used for low frequencies both of the
same order as the drifting rate of the bifurcation parameter and for low
frequencies larger than the drifting rate. The tipping point location is
advanced with increased amplitude of the periodic forcing, with critical
amplitudes where there are jumps in the location, yielding significant advances
in the tipping point. We demonstrate the analysis for two applications with
saddle node-type bifurcations.
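The high-frequency claim above can be checked numerically with a rough sketch. The abstract does not state the model equations, so the form used here, dx/dt = a - x^2 + A sin(Ωt) with da/dt = -ε, is an assumed canonical saddle-node model, and the function name, parameter values and escape threshold are illustrative choices rather than the authors' setup.

```python
import math

def tipping_parameter(eps, amp=0.0, omega=1.0, a0=1.0, dt=1e-3, x_escape=-10.0):
    """Return the value of a at which x escapes below x_escape.

    Assumed model (not taken from the abstract):
        dx/dt = a - x**2 + amp*sin(omega*t),   a(t) = a0 - eps*t
    integrated with a fixed-step RK4 scheme, starting on the
    attracting branch x = sqrt(a0).
    """
    def f(t, x):
        return (a0 - eps * t) - x * x + amp * math.sin(omega * t)

    t, x = 0.0, math.sqrt(a0)
    while x > x_escape:
        k1 = f(t, x)
        k2 = f(t + dt / 2, x + dt * k1 / 2)
        k3 = f(t + dt / 2, x + dt * k2 / 2)
        k4 = f(t + dt, x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return a0 - eps * t

# Slow drift only: tipping is delayed past the static bifurcation at a = 0.
a_unforced = tipping_parameter(eps=0.01)
# Same drift plus a high-frequency oscillation: tipping occurs at a larger a.
a_forced = tipping_parameter(eps=0.01, amp=4.0, omega=10.0)
print(a_unforced, a_forced)
```

In this assumed model, averaging over the fast oscillation shifts the effective bifurcation by roughly A^2/(2Ω^2), which matches the abstract's statement that the advance scales with the square of the amplitude-to-frequency ratio.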
Can Brain Signals Reveal Inner Alignment with Human Languages?
Brain Signals, such as Electroencephalography (EEG), and human languages have
been widely explored independently for many downstream tasks, however, the
connection between them has not been well explored. In this study, we explore
the relationship and dependency between EEG and language. To study at the
representation level, we introduced MTAM, a Multimodal
Transformer Alignment Model, to observe coordinated
representations between the two modalities. We used various relationship
alignment-seeking techniques, such as Canonical Correlation Analysis and
Wasserstein Distance, as loss functions to transfigure features. On downstream
applications, sentiment analysis and relation detection, we achieved new
state-of-the-art results on two datasets, ZuCo and K-EmoCon. Our method
achieved an F1-score improvement of 1.7% on K-EmoCon and 9.3% on ZuCo
for sentiment analysis, and 7.4% on ZuCo for relation detection. In addition,
we provide interpretations of the performance improvement: (1) feature
distribution shows the effectiveness of the alignment module for discovering
and encoding the relationship between EEG and language; (2) alignment weights
show the influence of different language semantics as well as EEG frequency
features; (3) brain topographical maps provide an intuitive demonstration of
the connectivity in the brain regions. Our code is available at
https://github.com/Jason-Qiu/EEG_Language_Alignment.
Comment: EMNLP 2023 Finding
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Recent advancements in Large Language Models (LLMs) have drawn increasing
attention since the learned embeddings pretrained on large-scale datasets have
shown powerful ability in various downstream applications. However, whether the
learned knowledge by LLMs can be transferred to clinical cardiology remains
unknown. In this work, we aim to bridge this gap by transferring the knowledge
of LLMs to clinical Electrocardiography (ECG). We propose an approach for
cardiovascular disease diagnosis and automatic ECG diagnosis report generation.
We also introduce an additional loss function by Optimal Transport (OT) to
align the distribution between ECG and language embedding. The learned
embeddings are evaluated on two downstream tasks: (1) automatic ECG diagnosis
report generation, and (2) zero-shot cardiovascular disease detection. Our
approach is able to generate high-quality cardiac diagnosis reports and also
achieves competitive zero-shot classification performance even compared with
supervised baselines, which proves the feasibility of transferring knowledge
from LLMs to the cardiac domain.
Comment: EACL 202
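As a rough illustration of an optimal-transport alignment loss between two sets of embeddings, the sketch below computes an entropy-regularised OT cost with Sinkhorn iterations. The abstract does not say which OT solver or cost the authors use, so the Sinkhorn scheme, the squared Euclidean cost, the uniform weights and the toy 2-D points are all assumptions made here for illustration.

```python
import math

def sinkhorn_cost(xs, ys, reg=0.1, n_iter=200):
    """Entropy-regularised OT cost between two uniformly weighted point sets.

    Illustrative sketch only: Sinkhorn iterations with a squared Euclidean
    ground cost, both of which are assumptions rather than the paper's method.
    """
    n, m = len(xs), len(ys)
    # Pairwise squared Euclidean cost matrix.
    cost = [[sum((p - q) ** 2 for p, q in zip(x, y)) for y in ys] for x in xs]
    # Gibbs kernel derived from the cost.
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    a = [1.0 / n] * n  # uniform mass on the first set
    b = [1.0 / m] * m  # uniform mass on the second set
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternate scalings so the plan's marginals approach a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P[i][j] = u[i] * kernel[i][j] * v[j]; return <P, cost>.
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

# Toy stand-ins for embeddings from two modalities (hypothetical data).
signal_emb = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
text_emb_aligned = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
text_emb_shifted = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]

print(sinkhorn_cost(signal_emb, text_emb_aligned))
print(sinkhorn_cost(signal_emb, text_emb_shifted))
```

A training loop would add a cost like this to the task loss so that minimising it pulls the two embedding distributions together; the cost is small when the point sets coincide and grows as they drift apart.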
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
Multimodal image-text models have shown remarkable performance in the past
few years. However, evaluating robustness against distribution shifts is
crucial before adopting them in real-world applications. In this work, we
investigate the robustness of 12 popular open-sourced image-text models under
common perturbations on five tasks (image-text retrieval, visual reasoning,
visual entailment, image captioning, and text-to-image generation). In
particular, we propose several new multimodal robustness benchmarks by applying
17 image perturbation and 16 text perturbation techniques on top of existing
datasets. We observe that multimodal models are not robust to image and text
perturbations, especially to image perturbations. Among the tested perturbation
methods, character-level perturbations constitute the most severe distribution
shift for text, and zoom blur is the most severe shift for image data. We also
introduce two new robustness metrics (MMI for MultiModal Impact score
and MOR for Missing Object Rate) for proper evaluations of multimodal
models. We hope our extensive study sheds light on new directions for the
development of robust multimodal models. More details can be found on the
project webpage: https://MMRobustness.github.io.
Comment: Accepted by Journal of Data-centric Machine Learning Research (DMLR) 202
Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment
Multimedia summarization with multimodal output (MSMO) is a recently explored
application in language grounding. It plays an essential role in real-world
applications, e.g., automatically generating cover images and titles for news
articles or providing introductions to online videos. However, existing methods
extract features from the whole video and article and use fusion methods to
select the representative one, thus usually ignoring the critical structure and
varying semantics. In this work, we propose a Semantics-Consistent Cross-domain
Summarization (SCCS) model based on optimal transport alignment with visual and
textual segmentation. Specifically, our method first decomposes both video and
article into segments in order to capture the structural semantics,
respectively. Then SCCS follows a cross-domain alignment objective with optimal
transport distance, which leverages multimodal interaction to match and select
the visual and textual summary. We evaluated our method on three recent
multimodal datasets and demonstrated the effectiveness of our method in
producing high-quality multimodal summaries.
Converting ECG Signals to Images for Efficient Image-text Retrieval via Encoding
Automated interpretation of electrocardiograms (ECG) has garnered significant
attention with the advancements in machine learning methodologies. Despite the
growing interest in automated ECG interpretation using machine learning, most
current studies focus solely on classification or regression tasks and overlook
a crucial aspect of clinical cardio-disease diagnosis: the diagnostic report
generated by experienced human clinicians. In this paper, we introduce a novel
approach to ECG interpretation, leveraging recent breakthroughs in Large
Language Models (LLMs) and Vision-Transformer (ViT) models. Rather than
treating ECG diagnosis as a classification or regression task, we propose an
alternative method of automatically identifying the most similar clinical cases
based on the input ECG data. Also, since interpreting ECG as images is more
affordable and accessible, we process ECG as encoded images and adopt a
vision-language learning paradigm to jointly learn vision-language alignment
between encoded ECG images and ECG diagnosis reports. Encoding ECG into images
can result in an efficient ECG retrieval system, which will be highly practical
and useful in clinical applications. More importantly, our findings could serve
as a crucial resource for providing diagnostic services in regions where only
paper-printed ECG images are accessible due to past underdevelopment.
Comment: 26 page
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Multimodal summarization with multimodal output (MSMO) has emerged as a
promising research direction. Nonetheless, numerous limitations exist within
existing public MSMO datasets, including insufficient maintenance, data
inaccessibility, limited size, and the absence of proper categorization, which
pose significant challenges. To address these challenges and provide a
comprehensive dataset for this new direction, we have meticulously curated the
MMSum dataset. Our new dataset features (1) Human-validated summaries
for both video and textual content, providing superior human instruction and
labels for multimodal learning. (2) Comprehensively and meticulously arranged
categorization, spanning 17 principal categories and 170 subcategories to
encapsulate a diverse array of real-world scenarios. (3) Benchmark tests
performed on the proposed dataset to assess various tasks and methods,
including video summarization, text summarization, and
multimodal summarization. To champion accessibility and collaboration,
we will release the MMSum dataset and the data collection tool as
fully open-source resources, fostering transparency and accelerating future
developments. Our project website can be found
at https://mmsum-dataset.github.io/.
Comment: Project website: https://mmsum-dataset.github.io
Identification of Parkinson's disease PACE subtypes and repurposing treatments through integrative analyses of multimodal data.
Parkinson's disease (PD) is a serious neurodegenerative disorder marked by significant clinical and progression heterogeneity. This study aimed at addressing heterogeneity of PD through integrative analysis of various data modalities. We analyzed clinical progression data (≥5 years) of individuals with de novo PD using machine learning and deep learning, to characterize individuals' phenotypic progression trajectories for PD subtyping. We discovered three pace subtypes of PD exhibiting distinct progression patterns: the Inching Pace subtype (PD-I) with mild baseline severity and mild progression speed; the Moderate Pace subtype (PD-M) with mild baseline severity but advancing at a moderate progression rate; and the Rapid Pace subtype (PD-R) with the most rapid symptom progression rate. We found cerebrospinal fluid P-tau/α-synuclein ratio and atrophy in certain brain regions as potential markers of these subtypes. Analyses of genetic and transcriptomic profiles with network-based approaches identified molecular modules associated with each subtype. For instance, the PD-R-specific module suggested STAT3, FYN, BECN1, APOA1, NEDD4, and GATA2 as potential driver genes of PD-R. It also suggested neuroinflammation, oxidative stress, metabolism, PI3K/AKT, and angiogenesis pathways as potential drivers for rapid PD progression (i.e., PD-R). Moreover, we identified repurposable drug candidates by targeting these subtype-specific molecular modules using a network-based approach and cell line drug-gene signature data. We further estimated their treatment effects using two large-scale real-world patient databases; the real-world evidence we gained highlighted the potential of metformin in ameliorating PD progression. In conclusion, this work helps better understand clinical and pathophysiological complexity of PD progression and accelerate precision medicine.
A Genome Wide Association Study Identifies Common Variants Associated with Lipid Levels in the Chinese Population
Plasma lipid levels are important risk factors for cardiovascular disease and are influenced by genetic and environmental factors. Recent genome wide association studies (GWAS) have identified several lipid-associated loci, but these loci have been identified primarily in European populations. In order to identify genetic markers for lipid levels in a Chinese population and analyze the heterogeneity between Europeans and Asians, especially Chinese, we performed a meta-analysis of two genome wide association studies on four common lipid traits including total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL) and high-density lipoprotein cholesterol (HDL) in a Han Chinese population totaling 3,451 healthy subjects. Replication was performed in an additional 8,830 subjects of Han Chinese ethnicity. We replicated eight loci associated with lipid levels previously reported in a European population. The loci genome wide significantly associated with TC were near DOCK7, HMGCR and ABO; those genome wide significantly associated with TG were near APOA1/C3/A4/A5 and LPL; those genome wide significantly associated with LDL were near HMGCR, ABO and TOMM40; and those genome wide significantly associated with HDL were near LPL, LIPC and CETP. In addition, an additive genotype score of eight SNPs representing the eight loci that were found to be associated with lipid levels was associated with higher TC, TG and LDL levels (P = 5.52×10⁻¹⁶, 1.38×10⁻⁶ and 5.59×10⁻⁹, respectively). These findings suggest the cumulative effects of multiple genetic loci on plasma lipid levels. Comparisons with previous GWAS of lipids highlight heterogeneity in allele frequency and in effect size for some loci between Chinese and European populations. The results from our GWAS provided comprehensive and convincing evidence of the genetic determinants of plasma lipid levels in a Chinese population.
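The additive genotype score mentioned above can be illustrated with a minimal sketch. The abstract does not say whether the score is weighted, so this assumes the simplest unweighted form, a count of risk alleles across the eight loci; the SNP labels, alleles and genotypes are hypothetical placeholders, not the study's variants.

```python
def additive_genotype_score(genotype, risk_alleles):
    """Unweighted additive score: total number of risk alleles carried.

    Each locus contributes 0, 1 or 2 depending on how many copies of its
    risk allele appear in the two-letter genotype string. The unweighted
    form is an assumption; a study may instead weight loci by effect size.
    """
    return sum(genotype[snp].count(allele) for snp, allele in risk_alleles.items())

# Hypothetical placeholders for eight loci, each with risk allele "A".
risk_alleles = {f"snp{i}": "A" for i in range(1, 9)}

# Hypothetical individual: heterozygous everywhere except two loci.
person = {f"snp{i}": "AG" for i in range(1, 9)}
person["snp1"] = "AA"  # two risk alleles
person["snp2"] = "GG"  # no risk alleles

print(additive_genotype_score(person, risk_alleles))
```

With eight biallelic loci the score ranges from 0 to 16, and an association test would then regress each lipid trait on this score.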