Capturing Evolution Genes for Time Series Data
The modeling of time series is becoming increasingly critical in a wide
variety of applications. Overall, data evolves by following different patterns,
which are generally caused by different user behaviors. Given a time series, we
define the evolution gene to capture the latent user behaviors and to describe
how the behaviors lead to the generation of time series. In particular, we
propose a uniform framework that recognizes different evolution genes of
segments by learning a classifier, and adopt an adversarial generator to
implement the evolution gene by estimating the segments' distribution.
Experimental results based on a synthetic dataset and five real-world datasets
show that our approach not only achieves good prediction results (e.g., an
average gain of +10.56% in F1), but can also provide explanations of
the results.
Comment: a preprint version. arXiv admin note: text overlap with
arXiv:1703.10155 by other authors
Attention-Based Capsule Networks with Dynamic Routing for Relation Extraction
A capsule is a group of neurons, whose activity vector represents the
instantiation parameters of a specific type of entity. In this paper, we
explore the capsule networks used for relation extraction in a multi-instance
multi-label learning framework and propose a novel neural approach based on
capsule networks with attention mechanisms. We evaluate our method on
different benchmarks and demonstrate that it improves the
precision of the predicted relations. In particular, we show that capsule
networks improve relation extraction for multiple entity pairs.
Comment: To be published in EMNLP 201
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
We propose a distantly supervised relation extraction approach for
long-tailed, imbalanced data which is prevalent in real-world settings. Here,
the challenge is to learn accurate "few-shot" models for classes existing at
the tail of the class distribution, for which little data is available.
Inspired by the rich semantic correlations between classes at the long tail and
those at the head, we take advantage of the knowledge from data-rich classes at
the head of the distribution to boost the performance of the data-poor classes
at the tail. First, we propose to leverage implicit relational knowledge among
class labels from knowledge graph embeddings and learn explicit relational
knowledge using graph convolution networks. Second, we integrate that
relational knowledge into the relation extraction model through a coarse-to-fine
knowledge-aware attention mechanism. Results on a
large-scale benchmark dataset show that our approach significantly
outperforms other baselines, especially for long-tail relations.
Comment: To be published in NAACL 201
Relation Adversarial Network for Low Resource Knowledge Graph Completion
Knowledge Graph Completion (KGC) has been proposed to improve Knowledge
Graphs by filling in missing connections via link prediction or relation
extraction. One of the main difficulties for KGC is a low resource problem.
Previous approaches assume sufficient training triples to learn versatile
vectors for entities and relations, or a satisfactory number of labeled
sentences to train a competent relation extraction model. However, low resource
relations are very common in KGs, and those newly added relations often do not
have many known samples for training. In this work, we aim at predicting new
facts under a challenging setting where only limited training instances are
available. We propose a general framework called Weighted Relation Adversarial
Network, which utilizes an adversarial procedure to help adapt
knowledge/features learned from high resource relations to different but
related low resource relations. Specifically, the framework takes advantage of
a relation discriminator to distinguish between samples from different
relations, and to help learn relation-invariant features that transfer more readily from
source relations to target relations. Experimental results show that the
proposed approach outperforms previous methods in low resource settings
for both link prediction and relation extraction.
Comment: WWW202
Multi-tissue integrative analysis of personal epigenomes
Evaluating the impact of genetic variants on transcriptional regulation is a central goal in biological science that has been constrained by reliance on a single reference genome. To address this, we constructed phased, diploid genomes for four cadaveric donors (using long-read sequencing) and systematically charted noncoding regulatory elements and transcriptional activity across more than 25 tissues from these donors. Integrative analysis revealed over a million variants with allele-specific activity, coordinated, locus-scale allelic imbalances, and structural variants impacting proximal chromatin structure. We relate the personal genome analysis to the ENCODE encyclopedia, annotating allele- and tissue-specific elements that are strongly enriched for variants impacting expression and disease phenotypes. These experimental and statistical approaches, and the corresponding EN-TEx resource, provide a framework for personalized functional genomics.
When Low Resource NLP Meets Unsupervised Language Model: Meta-Pretraining then Meta-Learning for Few-Shot Text Classification (Student Abstract)
Text classification tends to be difficult when data are scarce or when a model must adapt to unseen classes. In such challenging scenarios, recent studies have often used meta-learning to simulate the few-shot task, thereby neglecting implicit common linguistic features across tasks. This paper addresses these problems using meta-learning and unsupervised language models. Our approach is based on the insight that generalizing well from a few examples relies on both a generic model initialization and an effective strategy for adapting this model to newly arising tasks. We show that our approach is not only simple but also achieves state-of-the-art performance on a well-studied sentiment classification dataset. This suggests that pretraining could be a promising solution for few-shot learning in many other NLP tasks. The code and the dataset to replicate the experiments are available at https://github.com/zxlzr/FewShotNLP
BFG&MSF-net: boundary feature guidance and multi-scale fusion network for thyroid nodule segmentation
Accurately segmenting thyroid nodules in ultrasound images is crucial for computer-aided diagnosis. Despite the success of Convolutional Neural Networks (CNNs) and Transformers in natural image processing, they struggle with precise boundaries and small-object segmentation in ultrasound images. To address this, this paper proposes a novel BFG&MSF-Net model built on four newly designed modules: (1) a Boundary Feature Guidance Module (BFGM) for improving the capture of edge details; (2) a Multi-Scale Perception Fusion Module (MSPFM) for enhancing information capture by combining a novel Positional Blended Attention (PBA) with the Pyramid Squeeze Attention (PSA); (3) a Depthwise Separable Atrous Spatial Pyramid Pooling Module (DSASPPM), used in the bottleneck to improve the capture of contextual information; and (4) a Refinement Module (RM) that optimizes the low-level features for better organ and boundary identification. Evaluated on the TN3K and DDTI open-access datasets, BFG&MSF-Net reduces boundary segmentation errors and achieves superior segmentation performance compared to commonly used segmentation models and state-of-the-art models, which makes it a promising solution for accurate thyroid nodule segmentation in ultrasound images.