Style Transfer in Text: Exploration and Evaluation
Style transfer is an important problem in natural language processing (NLP).
However, progress in language style transfer lags behind other
domains, such as computer vision, mainly because of the lack of parallel data
and principled evaluation metrics. In this paper, we propose to learn style
transfer with non-parallel data. We explore two models to achieve this goal,
and the key idea behind the proposed models is to learn separate content
representations and style representations using adversarial networks. We also
propose novel evaluation metrics which measure two aspects of style transfer:
transfer strength and content preservation. We assess our models and the
evaluation metrics on two tasks: paper-news title transfer, and
positive-negative review transfer. Results show that the proposed content
preservation metric is highly correlated with human judgments, and the proposed
models are able to generate sentences with higher style transfer strength and
similar content preservation scores compared to an auto-encoder.
Comment: To appear in AAAI-1
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in real
world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning in heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology, and for its applicability to real-world
biomedical knowledge discovery.
Comment: 10 page
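As a rough illustration of how an edge-type transition matrix could steer node sampling on a heterogeneous graph, the sketch below runs a random walk in which the next edge is chosen with weight given by the transition from the previous edge's type. This is a simplification under stated assumptions, not the edge2vec implementation: the graph layout, the function name `edge_type_walk`, and the use of a fixed (rather than EM-trained) matrix are all hypothetical, and the abstract does not specify the walk procedure.

```python
import random


def edge_type_walk(graph, transition, start, length, rng):
    """Random walk biased by an edge-type transition matrix (illustrative only).

    graph:      dict mapping node -> list of (neighbor, edge_type) pairs
    transition: transition[a][b] is the weight of following an edge of
                type b right after an edge of type a (assumed given here;
                the abstract says it is trained with EM)
    """
    walk = [start]
    prev_type = None
    current = start
    while len(walk) < length:
        edges = graph.get(current, [])
        if not edges:
            break  # dead end: no outgoing edges
        if prev_type is None:
            # First step: no previous edge type, so sample uniformly.
            weights = [1.0] * len(edges)
        else:
            weights = [transition[prev_type][t] for _, t in edges]
        if sum(weights) == 0:
            break  # every continuation has zero weight
        current, prev_type = rng.choices(edges, weights=weights, k=1)[0]
        walk.append(current)
    return walk


# Toy graph with two edge types (0 and 1); names are made up.
toy_graph = {
    "a": [("b", 0)],
    "b": [("c", 0), ("d", 1)],
    "c": [("a", 0)],
    "d": [],
}
# Identity matrix: a walk always stays on the edge type it started with.
stay_on_type = [[1.0, 0.0], [0.0, 1.0]]
print(edge_type_walk(toy_graph, stay_on_type, "a", 4, random.Random(0)))
```

Walks produced this way would then be fed to a skip-gram style model trained with stochastic gradient descent to obtain node embeddings; that step is omitted here.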
Correlating neural and symbolic representations of language
Analysis methods which enable us to better understand the representations and
functioning of neural models of language are increasingly needed as deep
learning becomes the dominant approach in NLP. Here we present two methods
based on Representational Similarity Analysis (RSA) and Tree Kernels (TK) which
allow us to directly quantify how strongly the information encoded in neural
activation patterns corresponds to information represented by symbolic
structures such as syntax trees. We first validate our methods on the case of a
simple synthetic language for arithmetic expressions with clearly defined
syntax and semantics, and show that they exhibit the expected pattern of
results. We then apply our methods to correlate neural representations of
English sentences with their constituency parse trees.
Comment: ACL 201
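The core idea of Representational Similarity Analysis can be sketched in a few lines: compute pairwise similarities among the same items in two representation spaces, then correlate the two lists of similarities. The code below is a minimal sketch under assumptions, not the authors' code. The helper names (`cosine`, `pairwise_similarities`, `pearson`, `rsa_score`) are hypothetical, Pearson correlation is one possible choice of correlation, and the symbolic side would in practice use a tree kernel as its similarity function, which is not implemented here.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarities(items, sim):
    """Similarities for every unordered pair of items, in a fixed order."""
    return [sim(items[i], items[j])
            for i, j in combinations(range(len(items)), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rsa_score(items_a, sim_a, items_b, sim_b):
    """Correlate the pairwise-similarity structure of two representations.

    items_a[i] and items_b[i] must describe the same underlying stimulus,
    e.g. a neural activation vector and a parse tree for sentence i.
    """
    return pearson(pairwise_similarities(items_a, sim_a),
                   pairwise_similarities(items_b, sim_b))


# Toy check: rescaling vectors leaves cosine similarities unchanged,
# so the two spaces have identical similarity structure.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
rescaled = [[2.0 * x for x in v] for v in vectors]
print(rsa_score(vectors, cosine, rescaled, cosine))
```

Because each space supplies its own similarity function, the two representations never need to share a dimensionality or even a data type, which is what makes the comparison between activation vectors and trees possible.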