210 research outputs found

    Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search

    Text-based person search aims to retrieve the images of a person from an image database given a sentence describing that person, and it has great potential for applications such as video surveillance. Extracting the visual content that corresponds to the description is the key to this cross-modal matching problem. Moreover, correlated images and descriptions involve different granularities of semantic relevance, which previous methods usually ignore. To exploit the multilevel corresponding visual content, we propose a pose-guided multi-granularity attention network (PMA). First, we propose a coarse alignment network (CA) that selects the image regions related to the global description by similarity-based attention. To further capture the phrase-related visual body part, we propose a fine-grained alignment network (FA), which employs pose information to learn the latent semantic alignment between visual body parts and textual noun phrases. To verify the effectiveness of our model, we perform extensive experiments on the CUHK Person Description Dataset (CUHK-PEDES), which is currently the only available dataset for text-based person search. Experimental results show that our approach outperforms the state-of-the-art methods by 15% in terms of the top-1 metric. Comment: published in AAAI 2020 (oral)

    Federated Scheduling for Stochastic Parallel Real-time Tasks

    Federated scheduling is a strategy for scheduling parallel real-time tasks. It allocates a dedicated cluster of cores to each high-utilization task (utilization > 1), and it uses a multiprocessor scheduling algorithm to schedule and execute all low-utilization tasks sequentially on a shared cluster of the remaining cores. Prior work has shown that federated scheduling has the best known capacity augmentation bound of 2 for parallel tasks with implicit deadlines. In this paper, we explore the soft real-time performance of federated scheduling and address average-case workloads instead of worst-case values. In particular, we consider stochastic tasks, that is, tasks whose execution time and critical-path length are random variables. In this case, we use bounded expected tardiness as the schedulability criterion. We define a stochastic capacity augmentation bound and prove that federated scheduling algorithms guarantee the same bound of 2 for stochastic tasks. We present three federated mapping algorithms for core allocation. All of them guarantee bounded expected tardiness and provide the same capacity augmentation bound. In practice, however, we expect them to perform differently, both in the task sets they can schedule and in the actual tardiness they achieve. Therefore, we performed numerical evaluations using randomly generated task sets to understand the practical differences between the three algorithms.
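
    The split between dedicated and shared cores described in this abstract can be illustrated with a minimal sketch. It assumes the deterministic core-allocation rule from earlier federated scheduling work, n = ceil((C - L) / (D - L)), where C is the total work, L the critical-path length, and D the implicit deadline; it does not reproduce the three stochastic mapping algorithms of this paper. The names `Task` and `federated_allocate` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    work: float      # total execution time C (assumed deterministic here)
    span: float      # critical-path length L
    deadline: float  # implicit deadline D, equal to the period

    @property
    def utilization(self) -> float:
        return self.work / self.deadline


def federated_allocate(tasks, total_cores):
    """Split tasks into dedicated-core and shared-core groups.

    High-utilization tasks (utilization > 1) each receive
    ceil((C - L) / (D - L)) dedicated cores; all other tasks are
    left for the shared cluster made of the remaining cores.
    """
    dedicated = {}
    shared = []
    for task in tasks:
        if task.utilization > 1:
            if task.span >= task.deadline:
                raise ValueError(f"{task.name}: critical path exceeds deadline")
            dedicated[task.name] = math.ceil(
                (task.work - task.span) / (task.deadline - task.span)
            )
        else:
            shared.append(task.name)
    remaining = total_cores - sum(dedicated.values())
    if remaining < 0:
        raise ValueError("not enough cores for the high-utilization tasks")
    return dedicated, shared, remaining


if __name__ == "__main__":
    demo = [
        Task("A", work=30, span=5, deadline=10),
        Task("B", work=4, span=2, deadline=10),
        Task("C", work=12, span=4, deadline=8),
    ]
    print(federated_allocate(demo, total_cores=10))
```

    In this sketch the low-utilization tasks are only collected, not scheduled; any multiprocessor scheduling algorithm could then run them sequentially on the remaining cores.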

    Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses

    Recently, using deep neural networks to build open-domain dialogue models has become a hot topic. However, the responses generated by these models suffer from many problems: they are often not contextualized, and they tend to be generic and lacking in information content, which seriously damages the user's experience. Therefore, many studies try introducing more information into the dialogue models to make the generated responses more vivid and informative. Unlike them, this paper improves the quality of generated responses by learning the implicit pattern information between contexts and responses in the training samples. We first build an open-domain dialogue model based on a pre-trained language model (i.e., GPT-2). We then propose an improved scheduled sampling method for pre-trained models, by which the responses can be used to guide response generation in the training phase while avoiding the exposure bias problem. More importantly, we design a response-aware mechanism for mining the implicit pattern information between contexts and responses, so that the generated replies are more diverse and closer to human replies. Finally, we evaluate the proposed model (RAD) on the Persona-Chat and DailyDialog datasets, and the experimental results show that our model outperforms the baselines on most automatic and manual metrics.

    Touch and Go: Learning from Human-Collected Vision and Touch

    The ability to associate touch with sight is essential for tasks that require physically interacting with objects in the world. We propose a dataset with paired visual and tactile data called Touch and Go, in which human data collectors probe objects in natural environments using tactile sensors, while simultaneously recording egocentric video. In contrast to previous efforts, which have largely been confined to lab settings or simulated environments, our dataset spans a large number of "in the wild" objects and scenes. To demonstrate our dataset's effectiveness, we successfully apply it to a variety of tasks: 1) self-supervised visuo-tactile feature learning, 2) tactile-driven image stylization, i.e., making the visual appearance of an object more consistent with a given tactile signal, and 3) predicting future frames of a tactile signal from visuo-tactile inputs. Comment: Accepted by NeurIPS 2022 Track on Datasets and Benchmarks

    Single-cell transcriptome revealed dysregulated RNA-binding protein expression patterns and functions in human ankylosing spondylitis

    Objective: To explore the expression characteristics and regulatory patterns of RNA-binding proteins (RBPs) in different immune cell types in ankylosing spondylitis (AS), and to clarify the potential key role of RBPs in the occurrence and development of AS. Methods: PBMC sample data from scRNA-seq (HC, n = 29; AS, n = 10) and bulk RNA-seq (NC, n = 3; AS, n = 5) were selected for correlation analysis. Results: (1) Compared with the HC group, the numbers of B cells, DC (dendritic cells), CD14+ Mono and CD8+ T cells were increased in the AS group, while the numbers of platelets, CD8+ NKT, CD16+ Mono (non-classical monocytes), naive CD4+ T and NK cells were decreased. (2) Analysis of RBP genes in B cells identified some RBPs that play an important role in B cell differentiation and function, such as DDX3X, SFPQ, SRRM1 and UPF2; these may be related to B-cell receptor, IgA immunity, NOD-like receptor and other signaling pathways. (3) Analysis of RBP genes in CD8+ T cells identified some RBPs that play an important role in the immune regulation of CD8+ T cells, such as EIF2S3, EIF4B, HSPA5, MSL3, PABPC1 and SRSF7; these may be related to T cell receptor, TNF, IL17 and other signaling pathways. (4) Based on bulk RNA-seq, a comparison of HC and AS patients found that differentially expressed alternative splicing genes (RASGs) may play an important role in the occurrence and development of AS by participating in transcriptional regulation, protein phosphorylation and ubiquitination, DNA replication, angiogenesis, intracellular signal transduction and other related pathways. Conclusion: RBPs have specific expression characteristics in different immune cell types of AS patients and have important regulatory functions. Their abnormal expression and regulation may be closely related to the occurrence and development of AS.

    Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End Neural Network for 4D-MRI with Simultaneous Motion Estimation and Super-Resolution

    Four-dimensional magnetic resonance imaging (4D-MRI) is an emerging technique for tumor motion management in image-guided radiation therapy (IGRT). However, current 4D-MRI suffers from low spatial resolution and strong motion artifacts owing to the long acquisition time and patients' respiratory variations; these limitations, if not managed properly, can adversely affect treatment planning and delivery in IGRT. Herein, we developed a novel deep learning framework called the coarse-super-resolution-fine network (CoSF-Net) to achieve simultaneous motion estimation and super-resolution in a unified model. We designed CoSF-Net by fully exploiting the inherent properties of 4D-MRI, taking into account the limited and imperfectly matched training datasets. We conducted extensive experiments on multiple real patient datasets to verify the feasibility and robustness of the developed network. Compared with existing networks and three state-of-the-art conventional algorithms, CoSF-Net not only accurately estimated the deformable vector fields between the respiratory phases of 4D-MRI but also simultaneously improved the spatial resolution of 4D-MRI with enhanced anatomic features, yielding 4D-MR images with high spatiotemporal resolution.

    Unsupervised Melody-to-Lyric Generation

    Automatic melody-to-lyric generation is a task in which song lyrics are generated to go with a given melody. It is of significant practical interest and more challenging than unconstrained lyric generation, as the music imposes additional constraints on the lyrics. The training data is limited because most songs are copyrighted, resulting in models that underfit the complicated cross-modal relationship between melody and lyrics. In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. Specifically, we design a hierarchical lyric generation framework that first generates a song outline and then the complete lyrics. The framework enables disentanglement of training (based purely on text) from inference (melody-guided text generation) to circumvent the shortage of parallel data. We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints as guidance during inference. The two-step hierarchical design also enables content control via the lyric outline, a much-desired feature for democratizing collaborative song creation. Experimental results show that our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines, for example SongMASS, a SOTA model trained on a parallel dataset, with a 24% relative overall quality improvement based on human ratings. Comment: Accepted to ACL 23. arXiv admin note: substantial text overlap with arXiv:2305.0776