Spatio-Temporal AU Relational Graph Representation Learning For Facial Action Units Detection
This paper presents our Facial Action Units (AUs) recognition submission to
the fifth Affective Behavior Analysis in-the-wild Competition (ABAW). Our
approach consists of three main modules: (i) a pre-trained facial
representation encoder which produces a strong facial representation from each
input face image in the input sequence; (ii) an AU-specific feature generator
that specifically learns a set of AU features from each facial representation;
and (iii) a spatio-temporal graph learning module that constructs a
spatio-temporal graph representation. This graph representation describes AUs
contained in all frames and predicts the occurrence of each AU based on both
the modeled spatial information within the corresponding face and the learned
temporal dynamics among frames. The experimental results show that our approach
outperformed the baseline and the spatio-temporal graph representation learning
allows our model to generate the best results among all ablated systems. Our
model ranked 4th in the AU recognition track at the 5th ABAW
Competition.
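As a rough illustration of what a spatio-temporal graph over AUs could look like, the sketch below builds an edge list over nodes of the form (frame, AU). The abstract does not specify the connectivity, so two assumptions are made here: AUs within one frame are fully connected (spatial edges), and each AU is linked to itself in the next frame (temporal edges). The function name `build_spatio_temporal_edges` is hypothetical and this is not the authors' implementation.

```python
from itertools import combinations


def build_spatio_temporal_edges(num_frames, num_aus):
    """Return undirected edges over nodes (frame, au).

    Assumed connectivity (not taken from the paper):
    spatial edges connect every pair of AUs inside one frame,
    temporal edges connect the same AU in consecutive frames.
    """
    edges = []
    for t in range(num_frames):
        # Spatial edges: fully connect the AU nodes of frame t.
        for a, b in combinations(range(num_aus), 2):
            edges.append(((t, a), (t, b)))
        # Temporal edges: link each AU to itself in the next frame.
        if t + 1 < num_frames:
            for a in range(num_aus):
                edges.append(((t, a), (t + 1, a)))
    return edges


edges = build_spatio_temporal_edges(num_frames=3, num_aus=4)
spatial = [e for e in edges if e[0][0] == e[1][0]]
temporal = [e for e in edges if e[0][0] != e[1][0]]
print(len(spatial), len(temporal))  # 18 8
```

With 3 frames and 4 AUs this gives 3 x 6 = 18 spatial edges and 2 x 4 = 8 temporal edges; a graph neural network would then pass messages along both kinds of edge.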
Scene Consistency Representation Learning for Video Scene Segmentation
A long-term video, such as a movie or TV show, is composed of various scenes,
each of which represents a series of shots sharing the same semantic story.
Spotting the correct scene boundary from the long-term video is a challenging
task, since a model must understand the storyline of the video to figure out
where a scene starts and ends. To this end, we propose an effective
Self-Supervised Learning (SSL) framework to learn better shot representations
from unlabeled long-term videos. More specifically, we present an SSL scheme to
achieve scene consistency, while exploring considerable data augmentation and
shuffling methods to boost the model generalizability. Instead of explicitly
learning the scene boundary features as in the previous methods, we introduce a
vanilla temporal model with less inductive bias to verify the quality of the
shot features. Our method achieves state-of-the-art performance on the task
of Video Scene Segmentation. Additionally, we suggest a fairer and more
reasonable benchmark to evaluate the performance of Video Scene Segmentation
methods. The code is made available. Comment: Accepted to CVPR 202
GRATIS: Deep Learning Graph Representation with Task-specific Topology and Multi-dimensional Edge Features
Graphs are powerful for representing various types of real-world data. The
topology (edges' presence) and edges' features of a graph decide the message
passing mechanism among vertices within the graph. While most existing
approaches only manually define a single-value edge to describe the
connectivity or strength of association between a pair of vertices,
task-specific and crucial relationship cues may be disregarded by such manually
defined topology and single-value edge features. In this paper, we propose the
first general graph representation learning framework (called GRATIS) which can
generate a strong graph representation with a task-specific topology and
task-specific multi-dimensional edge features from any arbitrary input. To
learn each edge's presence and multi-dimensional feature, our framework takes
both the corresponding vertex pair and their global contextual information
into consideration, enabling the generated graph representation to have a
globally optimal message passing mechanism for different down-stream tasks. The
principled investigation results achieved for various graph analysis tasks on
11 graph and non-graph datasets show that GRATIS not only substantially
enhances pre-defined graphs but also learns a strong graph representation for
non-graph data, with clear performance improvements on all tasks. In
particular, the learned topology and multi-dimensional edge features provide
complementary task-related cues for graph analysis tasks. Our framework is
effective, robust and flexible, and is a plug-and-play module that can be
combined with different backbones and Graph Neural Networks (GNNs) to generate
a task-specific graph representation from various graph and non-graph data. Our
code is made publicly available at
https://github.com/SSYSteve/Learning-Graph-Representation-with-Task-specific-Topology-and-Multi-dimensional-Edge-Features
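To make the idea of a task-specific topology with multi-dimensional edge features more concrete, here is a toy sketch. GRATIS learns both the edge presence and the edge features; the version below replaces the learned components with fixed rules purely for illustration. The cosine-similarity threshold, the mean-pooled global context, and the names `cosine` and `build_graph` are all assumptions and do not come from the paper or its repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(vertices, threshold=0.5):
    """Toy graph builder: edge presence plus a vector-valued edge feature."""
    dim = len(vertices[0])
    # Global context: mean of all vertex features (an assumed stand-in
    # for the learned global contextual representation).
    context = [sum(v[d] for v in vertices) / len(vertices) for d in range(dim)]
    edges = {}
    for i, vi in enumerate(vertices):
        for j, vj in enumerate(vertices):
            if i == j:
                continue
            # Topology: keep the edge only if the pair is similar enough.
            if cosine(vi, vj) < threshold:
                continue
            # Edge feature: one value per dimension, combining the vertex
            # pair with the global context instead of a single scalar weight.
            edges[(i, j)] = [vi[d] * vj[d] * context[d] for d in range(dim)]
    return edges


vertices = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
graph = build_graph(vertices, threshold=0.5)
print(sorted(graph))  # [(0, 1), (1, 0), (1, 2), (2, 1)]
```

Vertices 0 and 2 are orthogonal, so no edge is created between them, while each retained edge carries a two-dimensional feature rather than a single scalar.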
Ternary NiCoTi-layered double hydroxide nanosheets as a pH-responsive nanoagent for photodynamic/chemodynamic synergistic therapy
Combining photodynamic therapy (PDT) with chemodynamic therapy (CDT) has been proven to be a promising strategy to improve the treatment efficiency of cancer, because of the synergistic therapeutic effect arising between the two modalities. Herein, we report an inorganic nanoagent based on ternary NiCoTi-layered double hydroxide (NiCoTi-LDH) nanosheets to realize highly efficient photodynamic/chemodynamic synergistic therapy. The NiCoTi-LDH nanosheets exhibit oxygen vacancy-promoted electron-hole separation and photogenerated hole-induced O2-independent reactive oxygen species (ROS) generation under acidic conditions, realizing in situ pH-responsive PDT. Moreover, due to the effective conversion between Co3+ and Co2+ caused by photogenerated electrons, the NiCoTi-LDH nanosheets catalyze the release of hydroxyl radicals (∙OH) from H2O2 through Fenton reactions, resulting in CDT. Laser irradiation enhances the catalytic ability of the NiCoTi-LDH nanosheets to promote ROS generation, resulting in better performance than TiO2 nanoparticles at pH 6.5. In vitro and in vivo experimental results show conclusively that NiCoTi-LDH nanosheets plus irradiation lead to efficient cell apoptosis and significant inhibition of tumor growth. This study reports a new pH-responsive inorganic nanoagent with oxygen vacancy-promoted photodynamic/chemodynamic synergistic performance, offering a potentially appealing clinical strategy for selective tumor elimination.
Genome-wide detection of human variants that disrupt intronic branchpoints
The search for candidate variants underlying human disease in massively parallel sequencing data typically focuses on coding regions and essential splice sites, mostly ignoring noncoding variants. The RNA spliceosome recognizes intronic branchpoint (BP) motifs at the beginning of splicing and operates mostly within introns to define the exon-intron boundaries; however, BP variants have received little attention. We established a comprehensive genome-wide database and knowledgebase of BP and developed BPHunter for systematic and informative genome-wide detection of intronic variants that may disrupt BP and splicing, together with an effective strategy for prioritizing BP variant candidates. BPHunter not only constitutes an important resource for understanding BP, but should also drive discovery of BP variants in human genetic diseases and traits. Pre-messenger RNA splicing is initiated with the recognition of a single-nucleotide intronic branchpoint (BP) within a BP motif by spliceosome elements. Forty-eight rare variants in 43 human genes have been reported to alter splicing and cause disease by disrupting BP. However, until now, no computational approach was available to efficiently detect such variants in massively parallel sequencing data. We established a comprehensive human genome-wide BP database by integrating existing BP data and generating new BP data from RNA sequencing of lariat debranching enzyme DBR1-mutated patients and from machine-learning predictions. We characterized multiple features of BP in major and minor introns and found that BP and BP-2 (two nucleotides upstream of BP) positions exhibit a lower rate of variation in human populations and higher evolutionary conservation than the intronic background, while being comparable to the exonic background. We developed BPHunter as a genome-wide computational approach to systematically and efficiently detect intronic variants that may disrupt BP recognition.
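The core lookup implied by the BP and BP-2 positions described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not BPHunter itself: positions are treated as integer coordinates on the plus strand (so "upstream" means a smaller coordinate), the branchpoint list is made up, and the function name `annotate_variants` is hypothetical.

```python
def annotate_variants(variant_positions, branchpoints):
    """Label each variant as hitting a BP, a BP-2 position, or neither.

    Assumes plus-strand coordinates, so BP-2 lies at (BP position - 2).
    """
    bp_set = set(branchpoints)
    labels = {}
    for pos in variant_positions:
        if pos in bp_set:
            labels[pos] = "BP"
        elif pos + 2 in bp_set:
            # The variant sits two nucleotides upstream of a branchpoint.
            labels[pos] = "BP-2"
        else:
            labels[pos] = None
    return labels


# Made-up coordinates for illustration only.
known_bps = [1000, 2500]
labels = annotate_variants([1000, 998, 2499, 2498], known_bps)
print(labels)  # {1000: 'BP', 998: 'BP-2', 2499: None, 2498: 'BP-2'}
```

A real tool would also need to handle strand orientation, transcript annotation, and insertions or deletions, none of which are covered here.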
BPHunter retrospectively identified 40 of the 48 known pathogenic BP variants, in which we summarized a strategy for prioritizing BP variant candidates. The remaining eight variants all create AG-dinucleotides between the BP and acceptor site, which is the likely reason for missplicing. We demonstrated the practical utility of BPHunter prospectively by using it to identify a novel germline heterozygous BP variant of STAT2 in a patient with critical COVID-19 pneumonia and a novel somatic intronic 59-nucleotide deletion of ITPKB in a lymphoma patient, both of which were validated experimentally. BPHunter is publicly available from an