59 research outputs found

    Spatio-Temporal AU Relational Graph Representation Learning For Facial Action Units Detection

    This paper presents our Facial Action Units (AUs) recognition submission to the fifth Affective Behavior Analysis in-the-wild Competition (ABAW). Our approach consists of three main modules: (i) a pre-trained facial representation encoder which produces a strong facial representation from each input face image in the input sequence; (ii) an AU-specific feature generator that learns a set of AU features from each facial representation; and (iii) a spatio-temporal graph learning module that constructs a spatio-temporal graph representation. This graph representation describes AUs contained in all frames and predicts the occurrence of each AU based on both the modeled spatial information within the corresponding face and the learned temporal dynamics among frames. The experimental results show that our approach outperformed the baseline and that the spatio-temporal graph representation learning allows our model to generate the best results among all ablated systems. Our model ranked 4th in the AU recognition track at the 5th ABAW Competition.
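
    As a rough illustration of what a spatio-temporal AU graph could look like, the sketch below builds one node per (frame, AU) pair. The topology is an assumption made for illustration only: the abstract does not state how nodes are connected, so the fully connected spatial edges within a frame, the temporal edges between the same AU in consecutive frames, and the function name `build_st_graph` are all hypothetical.

```python
from itertools import combinations


def build_st_graph(num_frames, num_aus):
    """Return (nodes, edges) for a toy spatio-temporal AU graph.

    Assumed topology (not taken from the paper):
      * spatial edges: every pair of AUs within the same frame
      * temporal edges: the same AU in two consecutive frames
    """
    # One node per (frame index, AU index) pair.
    nodes = [(t, a) for t in range(num_frames) for a in range(num_aus)]
    edges = set()
    for t in range(num_frames):
        # Spatial edges inside frame t.
        for a, b in combinations(range(num_aus), 2):
            edges.add(((t, a), (t, b)))
        # Temporal edges from frame t to frame t + 1.
        if t + 1 < num_frames:
            for a in range(num_aus):
                edges.add(((t, a), (t + 1, a)))
    return nodes, edges
```

    In the described system the graph is learned rather than fixed by hand; this fixed layout only shows how spatial and temporal relations can share one graph structure.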

    Scene Consistency Representation Learning for Video Scene Segmentation

    A long-term video, such as a movie or TV show, is composed of various scenes, each of which represents a series of shots sharing the same semantic story. Spotting the correct scene boundary in a long-term video is a challenging task, since a model must understand the storyline of the video to figure out where a scene starts and ends. To this end, we propose an effective Self-Supervised Learning (SSL) framework to learn better shot representations from unlabeled long-term videos. More specifically, we present an SSL scheme to achieve scene consistency, while exploring considerable data augmentation and shuffling methods to boost the model generalizability. Instead of explicitly learning the scene boundary features as in previous methods, we introduce a vanilla temporal model with less inductive bias to verify the quality of the shot features. Our method achieves state-of-the-art performance on the task of Video Scene Segmentation. Additionally, we suggest a fairer and more reasonable benchmark to evaluate the performance of Video Scene Segmentation methods. The code is made available. Comment: Accepted to CVPR 202

    GRATIS: Deep Learning Graph Representation with Task-specific Topology and Multi-dimensional Edge Features

    Graphs are powerful for representing various types of real-world data. The topology (edges' presence) and edge features of a graph decide the message passing mechanism among vertices within the graph. While most existing approaches only manually define a single-value edge to describe the connectivity or strength of association between a pair of vertices, task-specific and crucial relationship cues may be disregarded by such manually defined topology and single-value edge features. In this paper, we propose the first general graph representation learning framework (called GRATIS) which can generate a strong graph representation with a task-specific topology and task-specific multi-dimensional edge features from any arbitrary input. To learn each edge's presence and multi-dimensional feature, our framework takes both the corresponding vertex pair and their global contextual information into consideration, enabling the generated graph representation to have a globally optimal message passing mechanism for different down-stream tasks. The principled investigation results achieved for various graph analysis tasks on 11 graph and non-graph datasets show that GRATIS not only largely enhances pre-defined graphs but also learns a strong graph representation for non-graph data, with clear performance improvements on all tasks. In particular, the learned topology and multi-dimensional edge features provide complementary task-related cues for graph analysis tasks. Our framework is effective, robust and flexible, and is a plug-and-play module that can be combined with different backbones and Graph Neural Networks (GNNs) to generate a task-specific graph representation from various graph and non-graph data. Our code is made publicly available at https://github.com/SSYSteve/Learning-Graph-Representation-with-Task-specific-Topology-and-Multi-dimensional-Edge-Features
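
    To make the contrast between a single-value edge and a multi-dimensional edge feature concrete, here is a minimal sketch. It is not the GRATIS implementation: GRATIS learns both edge presence and edge features, whereas this toy uses hand-written stand-ins. The cosine-similarity threshold, the element-wise product with a mean "global context" vector, and the names `mean_vector`, `cosine`, and `build_graph` are all assumptions made for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (stand-in for global context)."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(vertices, threshold=0.5):
    """Return {(i, j): edge_feature_vector} for a toy graph.

    Hypothetical rules (not learned, unlike GRATIS):
      * edge presence: cosine similarity of the vertex pair >= threshold
      * edge feature: element-wise product of both vertices and the context,
        so each edge carries a vector instead of a single scalar weight
    """
    context = mean_vector(vertices)
    edges = {}
    for i, vi in enumerate(vertices):
        for j, vj in enumerate(vertices):
            if i == j or cosine(vi, vj) < threshold:
                continue
            edges[(i, j)] = [a * b * c for a, b, c in zip(vi, vj, context)]
    return edges
```

    For example, `build_graph([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])` connects only the first two vertices, and each surviving edge carries a two-dimensional feature.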

    Ternary NiCoTi-layered double hydroxide nanosheets as a pH-responsive nanoagent for photodynamic/chemodynamic synergistic therapy

    Combining photodynamic therapy (PDT) with chemodynamic therapy (CDT) has been proven to be a promising strategy to improve the treatment efficiency of cancer, because of the synergistic therapeutic effect arising between the two modalities. Herein, we report an inorganic nanoagent based on ternary NiCoTi-layered double hydroxide (NiCoTi-LDH) nanosheets to realize highly efficient photodynamic/chemodynamic synergistic therapy. The NiCoTi-LDH nanosheets exhibit oxygen vacancy-promoted electron-hole separation and photogenerated hole-induced O2-independent reactive oxygen species (ROS) generation under acidic conditions, realizing in situ pH-responsive PDT. Moreover, due to the effective conversion between Co^{3+} and Co^{2+} caused by photogenerated electrons, the NiCoTi-LDH nanosheets catalyze the release of hydroxyl radicals (∙OH) from H2O2 through Fenton reactions, resulting in CDT. Laser irradiation enhances the catalytic ability of the NiCoTi-LDH nanosheets to promote ROS generation, resulting in a better performance than TiO_{2} nanoparticles at pH 6.5. In vitro and in vivo experimental results show conclusively that NiCoTi-LDH nanosheets plus irradiation lead to efficient cell apoptosis and significant inhibition of tumor growth. This study reports a new pH-responsive inorganic nanoagent with oxygen vacancy-promoted photodynamic/chemodynamic synergistic performance, offering a potentially appealing clinical strategy for selective tumor elimination.

    Genome-wide detection of human variants that disrupt intronic branchpoints

    The search for candidate variants underlying human disease in massively parallel sequencing data typically focuses on coding regions and essential splice sites, mostly ignoring noncoding variants. The RNA spliceosome recognizes intronic branchpoint (BP) motifs at the beginning of splicing and operates mostly within introns to define the exon-intron boundaries; however, BP variants have received little attention. We established a comprehensive genome-wide database and knowledgebase of BP and developed BPHunter for systematic and informative genome-wide detection of intronic variants that may disrupt BP and splicing, together with an effective strategy for prioritizing BP variant candidates. BPHunter not only constitutes an important resource for understanding BP, but should also drive discovery of BP variants in human genetic diseases and traits. Pre-messenger RNA splicing is initiated with the recognition of a single-nucleotide intronic branchpoint (BP) within a BP motif by spliceosome elements. Forty-eight rare variants in 43 human genes have been reported to alter splicing and cause disease by disrupting BP. However, until now, no computational approach was available to efficiently detect such variants in massively parallel sequencing data. We established a comprehensive human genome-wide BP database by integrating existing BP data and generating new BP data from RNA sequencing of lariat debranching enzyme DBR1-mutated patients and from machine-learning predictions. We characterized multiple features of BP in major and minor introns and found that BP and BP-2 (two nucleotides upstream of BP) positions exhibit a lower rate of variation in human populations and higher evolutionary conservation than the intronic background, while being comparable to the exonic background. We developed BPHunter as a genome-wide computational approach to systematically and efficiently detect intronic variants that may disrupt BP recognition.
    BPHunter retrospectively identified 40 of the 48 known pathogenic BP variants, from which we derived a strategy for prioritizing BP variant candidates. The remaining eight variants all create AG-dinucleotides between the BP and acceptor site, which is the likely reason for missplicing. We demonstrated the practical utility of BPHunter prospectively by using it to identify a novel germline heterozygous BP variant of STAT2 in a patient with critical COVID-19 pneumonia and a novel somatic intronic 59-nucleotide deletion of ITPKB in a lymphoma patient, both of which were validated experimentally. BPHunter is publicly available from an
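
    The observation about AG-dinucleotides created between the BP and the acceptor site can be illustrated with a small check. This is a sketch under stated assumptions, not BPHunter's algorithm: the function name `creates_ag_between_bp_and_acceptor`, the 0-based coordinates, the restriction to single-nucleotide substitutions, and the toy intron sequence are all hypothetical.

```python
def creates_ag_between_bp_and_acceptor(intron, bp_index, var_index, alt_base):
    """Report whether a substitution creates a new AG downstream of the BP.

    Assumptions (illustrative only):
      * `intron` is the 5'->3' intron sequence whose last two bases are the
        canonical acceptor AG
      * `bp_index` and `var_index` are 0-based positions within `intron`
      * only single-nucleotide substitutions are handled
    """
    intron = intron.upper()
    acceptor_start = len(intron) - 2  # index of the A in the acceptor AG
    # The variant must lie strictly between the BP and the acceptor AG.
    if not (bp_index < var_index < acceptor_start):
        return False
    mutated = intron[:var_index] + alt_base.upper() + intron[var_index + 1:]
    # Only the two dinucleotides overlapping the variant can change.
    for k in (var_index - 1, var_index):
        if k <= bp_index or k + 1 >= acceptor_start:
            continue  # must sit fully inside the BP-to-acceptor window
        if mutated[k:k + 2] == "AG" and intron[k:k + 2] != "AG":
            return True
    return False
```

    With the toy intron `"GTCTAACTTATTTTCAG"` and the BP at index 5, a T>G substitution at index 10 turns `AT` into `AG` and is flagged, whereas a T>C substitution at the same position is not.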