Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search
Text-based person search aims to retrieve the matching person images from
an image database given a sentence describing the person, and it has great
potential for applications such as video surveillance.
Extracting visual contents corresponding to the human description is the key to
this cross-modal matching problem. Moreover, correlated images and descriptions
involve different granularities of semantic relevance, which is usually ignored
in previous methods. To exploit the multilevel corresponding visual contents,
we propose a pose-guided multi-granularity attention network (PMA). First, we
propose a coarse alignment network (CA) that selects the image regions related
to the global description through similarity-based attention. To further capture
the phrase-related visual body parts, we propose a fine-grained alignment network
(FA), which employs pose information to learn the latent semantic alignment
between visual body parts and textual noun phrases. To verify the effectiveness
of our model, we perform extensive experiments on the CUHK Person Description
Dataset (CUHK-PEDES), which is currently the only available dataset for
text-based person search. Experimental results show that our approach
outperforms the state-of-the-art methods by 15% in terms of the top-1 metric.
Comment: published in AAAI 2020 (oral)
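The similarity-based attention mentioned in the abstract above can be illustrated with a minimal, generic sketch. This is not the PMA implementation: the function names, the use of cosine similarity, the `temperature` parameter, and the plain-list feature vectors are all assumptions made for illustration. It scores each image region against a sentence embedding, normalizes the scores with a softmax, and returns the attention-weighted region feature.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_attention(regions, text, temperature=1.0):
    """Weight region features by their similarity to a text embedding.

    regions: list of region feature vectors (hypothetical visual features)
    text: sentence embedding with the same dimensionality
    temperature: assumed softmax temperature; smaller values sharpen the weights
    """
    scores = [cosine(r, text) / temperature for r in regions]
    # Numerically stable softmax over the similarity scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of region features.
    dim = len(text)
    attended = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    return weights, attended


if __name__ == "__main__":
    toy_regions = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    toy_text = [1.0, 0.0]
    w, feat = similarity_attention(toy_regions, toy_text, temperature=0.5)
    print("weights:", w)
    print("attended feature:", feat)
```

In this sketch the region most similar to the description receives the largest weight; a learned model would replace the raw cosine score with trained projections.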
Federated Scheduling for Stochastic Parallel Real-time Tasks
Federated scheduling is a strategy to schedule parallel real-time tasks: it allocates a dedicated cluster of cores to each high-utilization task (utilization >1), and it uses a multiprocessor scheduling algorithm to schedule and execute all low-utilization tasks sequentially on a shared cluster of the remaining cores. Prior work has shown that federated scheduling has the best known capacity augmentation bound of 2 for parallel tasks with implicit deadlines. In this paper, we explore the soft real-time performance of federated scheduling and address average-case workloads instead of worst-case values. In particular, we consider stochastic tasks, that is, tasks for which execution time and critical-path length are random variables. In this case, we use bounded expected tardiness as the schedulability criterion. We define a stochastic capacity augmentation bound and prove that federated scheduling algorithms guarantee the same bound of 2 for stochastic tasks. We present three federated mapping algorithms for core allocation. All of them guarantee bounded expected tardiness and provide the same capacity augmentation bound; in practice, however, we expect them to perform differently, both in terms of the task sets they can schedule and the actual tardiness they guarantee. Therefore, we performed numerical evaluations using randomly generated task sets to understand the practical differences between the three algorithms.
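The core-allocation idea in the abstract above can be sketched as follows. This is a minimal illustration of the deterministic allocation rule from prior federated scheduling work, in which a high-utilization task with work C, critical-path length L and implicit deadline D receives ceil((C - L) / (D - L)) dedicated cores. It does not reproduce the paper's three stochastic mapping algorithms; the `Task` class, the function name, and the toy numbers are assumptions for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    work: float      # total execution time C
    span: float      # critical-path length L
    deadline: float  # implicit deadline D (equal to the period)


def federated_allocation(tasks, total_cores):
    """Split cores between dedicated clusters and one shared cluster.

    Returns (dedicated, low_utilization, shared_cores), where `dedicated`
    maps each high-utilization task name to its number of dedicated cores.
    This sketch only performs the allocation; it does not run a
    schedulability test for the low-utilization tasks on the shared cores.
    """
    dedicated = {}
    low_utilization = []
    for task in tasks:
        if task.span >= task.deadline:
            raise ValueError(f"{task.name}: critical path must be shorter than deadline")
        if task.work / task.deadline > 1:
            # High-utilization task: dedicated cluster of ceil((C - L) / (D - L)) cores.
            cores = math.ceil((task.work - task.span) / (task.deadline - task.span))
            dedicated[task.name] = cores
        else:
            # Low-utilization task: executed sequentially on the shared cluster.
            low_utilization.append(task.name)
    shared_cores = total_cores - sum(dedicated.values())
    if shared_cores < 0 or (low_utilization and shared_cores == 0):
        raise ValueError("not enough cores for this task set")
    return dedicated, low_utilization, shared_cores


if __name__ == "__main__":
    toy_tasks = [
        Task("A", work=30.0, span=4.0, deadline=10.0),  # utilization 3.0
        Task("B", work=3.0, span=2.0, deadline=10.0),   # utilization 0.3
    ]
    print(federated_allocation(toy_tasks, total_cores=8))
```

For stochastic tasks, one would presumably feed statistics of the work and critical-path distributions into a rule of this shape, but which statistics the paper uses is not stated in the abstract.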
Promoting Open-domain Dialogue Generation through Learning Pattern Information between Contexts and Responses
Recently, using deep neural networks to build open-domain dialogue
models has become a hot topic. However, the responses generated by these models
suffer from many problems: they are often not contextualized and tend to be
generic and low in information content, which seriously damages the
user's experience. Therefore, many studies try introducing more
information into the dialogue models to make the generated responses more vivid
and informative. Unlike them, this paper improves the quality of generated
responses by learning the implicit pattern information between contexts and
responses in the training samples. In this paper, we first build an open-domain
dialogue model based on the pre-trained language model (i.e., GPT-2). And then,
an improved scheduled sampling method is proposed for pre-trained models, by
which the responses can be used to guide the response generation in the
training phase while avoiding the exposure bias problem. More importantly, we
design a response-aware mechanism for mining the implicit pattern information
between contexts and responses so that the generated replies are more diverse
and approximate to human replies. Finally, we evaluate the proposed model (RAD)
on the Persona-Chat and DailyDialog datasets, and the experimental results show
that our model outperforms the baselines on most automatic and manual metrics.
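For readers unfamiliar with scheduled sampling, which the abstract above builds on, here is a minimal sketch of the standard technique, not the improved variant the paper proposes. At each decoding position the decoder input is the ground-truth token with some probability and the model's own earlier prediction otherwise. The function name, the `teacher_forcing_prob` parameter, and the toy tokens are assumptions for illustration.

```python
import random


def mix_decoder_inputs(gold, predicted, teacher_forcing_prob, rng=None):
    """Choose, per position, between the gold token and the model's prediction.

    gold: ground-truth response tokens
    predicted: tokens the model generated for the same positions
    teacher_forcing_prob: probability of feeding the gold token (assumed to be
        decayed over training so the model gradually sees its own outputs)
    rng: optional random.Random instance for reproducibility
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not 0.0 <= teacher_forcing_prob <= 1.0:
        raise ValueError("teacher_forcing_prob must be in [0, 1]")
    rng = rng or random.Random()
    return [
        g if rng.random() < teacher_forcing_prob else p
        for g, p in zip(gold, predicted)
    ]


if __name__ == "__main__":
    gold_tokens = ["i", "like", "green", "tea"]
    model_tokens = ["i", "love", "green", "coffee"]
    print(mix_decoder_inputs(gold_tokens, model_tokens, 0.5, random.Random(0)))
```

Exposing the model to its own predictions during training is what reduces the mismatch between training and inference known as exposure bias.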
Touch and Go: Learning from Human-Collected Vision and Touch
The ability to associate touch with sight is essential for tasks that require
physically interacting with objects in the world. We propose a dataset with
paired visual and tactile data called Touch and Go, in which human data
collectors probe objects in natural environments using tactile sensors, while
simultaneously recording egocentric video. In contrast to previous efforts,
which have largely been confined to lab settings or simulated environments, our
dataset spans a large number of "in the wild" objects and scenes. To
demonstrate our dataset's effectiveness, we successfully apply it to a variety
of tasks: 1) self-supervised visuo-tactile feature learning, 2) tactile-driven
image stylization, i.e., making the visual appearance of an object more
consistent with a given tactile signal, and 3) predicting future frames of a
tactile signal from visuo-tactile inputs.
Comment: Accepted by NeurIPS 2022 Track of Datasets and Benchmarks
Single-cell transcriptome revealed dysregulated RNA-binding protein expression patterns and functions in human ankylosing spondylitis
Objective: To explore the expression characteristics and regulatory patterns of RNA-binding proteins (RBPs) in different immune cell types in ankylosing spondylitis (AS), and to clarify the potential key role of RBPs in the occurrence and development of AS.
Methods: PBMC sample data from scRNA-seq (HC*29, AS*10) and bulk RNA-seq (NC*3, AS*5) were selected for correlation analysis.
Results: (1) Compared with the HC group, the numbers of B cells, DC (dendritic cells), CD14+ Mono and CD8+ T cells were increased in the AS group, while the numbers of platelets, CD8+ NKT, CD16+ Mono (non-classical monocytes), Native CD4+ T and NK cells were decreased. (2) Analysis of RBP genes in B cells identified RBPs that play an important role in B cell differentiation and function, such as DDX3X, SFPQ, SRRM1 and UPF2. (3) These RBPs may be related to the B-cell receptor, IgA immunity, NOD-like receptor and other signaling pathways. Analysis of RBP genes in CD8+ T cells identified RBPs that play an important role in the immune regulation of CD8+ T cells, such as EIF2S3, EIF4B, HSPA5, MSL3, PABPC1 and SRSF7; these may be related to the T cell receptor, TNF, IL17 and other signaling pathways. (4) Based on bulk RNA-seq, differentially expressed variable splicing genes (RASGs) between HC and AS patients may play an important role in the occurrence and development of AS by participating in transcriptional regulation, protein phosphorylation and ubiquitination, DNA replication, angiogenesis, intracellular signal transduction and other related pathways.
Conclusion: RBPs have specific expression characteristics in different immune cell types of AS patients and have important regulatory functions. Their abnormal expression and regulation may be closely related to the occurrence and development of AS.
Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End Neural Network for 4D-MRI with Simultaneous Motion Estimation and Super-Resolution
Four-dimensional magnetic resonance imaging (4D-MRI) is an emerging technique
for tumor motion management in image-guided radiation therapy (IGRT). However,
current 4D-MRI suffers from low spatial resolution and strong motion artifacts
owing to the long acquisition time and patients' respiratory variations; these
limitations, if not managed properly, can adversely affect treatment planning
and delivery in IGRT. Herein, we developed a novel deep learning framework
called the coarse-super-resolution-fine network (CoSF-Net) to achieve
simultaneous motion estimation and super-resolution in a unified model. We
designed CoSF-Net by fully exploiting the inherent properties of 4D-MRI, with
consideration of limited and imperfectly matched training datasets. We
conducted extensive experiments on multiple real patient datasets to verify the
feasibility and robustness of the developed network. Compared with existing
networks and three state-of-the-art conventional algorithms, CoSF-Net not only
accurately estimated the deformable vector fields between the respiratory
phases of 4D-MRI but also simultaneously improved the spatial resolution of
4D-MRI with enhanced anatomic features, yielding 4D-MR images with high
spatiotemporal resolution.
Unsupervised Melody-to-Lyric Generation
Automatic melody-to-lyric generation is a task in which song lyrics are
generated to go with a given melody. It is of significant practical interest
and more challenging than unconstrained lyric generation as the music imposes
additional constraints onto the lyrics. The training data is limited as most
songs are copyrighted, resulting in models that underfit the complicated
cross-modal relationship between melody and lyrics. In this work, we propose a
method for generating high-quality lyrics without training on any aligned
melody-lyric data. Specifically, we design a hierarchical lyric generation
framework that first generates a song outline and then the complete lyrics.
The framework enables disentanglement of training (based purely on text) from
inference (melody-guided text generation) to circumvent the shortage of
parallel data.
We leverage the segmentation and rhythm alignment between melody and lyrics
to compile the given melody into decoding constraints as guidance during
inference. The two-step hierarchical design also enables content control via
the lyric outline, a much-desired feature for democratizing collaborative song
creation. Experimental results show that our model can generate high-quality
lyrics that are more on-topic, singable, intelligible, and coherent than strong
baselines, for example SongMASS, a SOTA model trained on a parallel dataset,
with a 24% relative overall quality improvement based on human ratings.
Comment: Accepted to ACL 23. arXiv admin note: substantial text overlap with
arXiv:2305.0776