Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks
Representation learning on networks aims to derive a meaningful vector
representation for each node, thereby facilitating downstream tasks such as
link prediction, node classification, and node clustering. In heterogeneous
text-rich networks, this task is more challenging due to (1) presence or
absence of text: Some nodes are associated with rich textual information, while
others are not; (2) diversity of types: Nodes and edges of multiple types form
a heterogeneous network structure. As pretrained language models (PLMs) have
demonstrated their effectiveness in obtaining widely generalizable text
representations, a substantial amount of effort has been made to incorporate
PLMs into representation learning on text-rich networks. However, few of them
can jointly consider heterogeneous structure (network) information as well as
rich textual semantic information of each node effectively. In this paper, we
propose Heterformer, a Heterogeneous Network-Empowered Transformer that
performs contextualized text encoding and heterogeneous structure encoding in a
unified model. Specifically, we inject heterogeneous structure information into
each Transformer layer when encoding node texts. Meanwhile, Heterformer is
capable of characterizing node/edge type heterogeneity and encoding nodes with
or without texts. We conduct comprehensive experiments on three tasks (i.e.,
link prediction, node classification, and node clustering) on three large-scale
datasets from different domains, where Heterformer outperforms competitive
baselines significantly and consistently. Comment: KDD 2023. (Code: https://github.com/PeterGriffinJin/Heterformer)
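The core idea of the abstract, injecting heterogeneous structure information into the Transformer's text encoding, can be sketched as a toy attention step in which aggregated neighbor embeddings become a virtual "network token" that text tokens attend to. This is a minimal sketch under our own assumptions (mean-pooled neighbors, a single unprojected attention layer); the paper's actual architecture is more elaborate.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def network_empowered_attention(text_tokens, neighbor_embs, W_q, W_k, W_v):
    """Toy 'network-empowered' attention: prepend a virtual network token
    so every text token can attend to structure information."""
    # Mean pooling is our simplification; Heterformer learns this aggregation.
    net_token = neighbor_embs.mean(axis=0, keepdims=True)   # (1, d)
    seq = np.vstack([net_token, text_tokens])               # (1+L, d)
    q, k, v = seq @ W_q, seq @ W_k, seq @ W_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)
    out = attn @ v
    return out[1:]                                          # text tokens only, (L, d)
```

In the full model this mixing would happen inside every Transformer layer rather than once.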
Graph Embedding with Rich Information through Heterogeneous Network
Graph embedding has attracted increasing attention due to its critical
application in social network analysis. Most existing algorithms for graph
embedding rely only on the topology information and fail to use the copious
information in nodes as well as edges. As a result, their performance on many
tasks may not be satisfactory. In this paper, we propose a novel and general
framework of representation learning for graphs with rich text information
through constructing a bipartite heterogeneous network. Specifically, we design
a biased random walk to explore the constructed heterogeneous network with the
notion of flexible neighborhood. The efficacy of our method is demonstrated by
extensive comparison experiments with several baselines on various datasets. It
improves the Micro-F1 and Macro-F1 of node classification by 10% and 7% on the
Cora dataset. Comment: 9 pages, 7 figures, 4 tables
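The biased random walk with a "flexible neighborhood" described above can be sketched as follows. The edge typing (`struct` vs. `text` neighbors) and the bias parameter `alpha` are our own illustrative assumptions, not the paper's exact formulation.

```python
import random

def biased_walk(neighbors, start, length, alpha=0.5, rng=None):
    """Hypothetical biased walk on a graph whose nodes have both
    structural neighbors and text/attribute neighbors.
    With probability alpha the walk follows a structural edge;
    otherwise it hops through a text node, falling back when a
    neighborhood is empty."""
    rng = rng or random.Random(0)
    walk = [start]
    for _ in range(length - 1):
        cur = neighbors[walk[-1]]
        pool = cur['struct'] if (rng.random() < alpha and cur['struct']) \
               else (cur['text'] or cur['struct'])
        if not pool:
            break
        walk.append(rng.choice(pool))
    return walk
```

The resulting walks would then be fed to a skip-gram-style embedding learner, as in DeepWalk-style methods.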
Event based text mining for integrated network construction
The scientific literature is a rich and challenging data source for research in systems biology, providing numerous interactions between biological entities. Text mining techniques have been increasingly useful to extract such information from the literature in an automatic way, but up to now the main focus of text mining in the systems biology field has been restricted mostly to the discovery of protein-protein interactions. Here, we take this approach one step further, and use machine learning techniques combined with text mining to extract a much wider variety of interactions between biological entities. Each particular interaction type gives rise to a separate network, represented as a graph, all of which can be subsequently combined to yield a so-called integrated network representation. This provides a much broader view on the biological system as a whole, which can then be used in further investigations to analyse specific properties of the network.
Learning Semantic Program Embeddings with Graph Interval Neural Network
Learning distributed representations of source code has been a challenging
task for machine learning models. Earlier works treated programs as text so
that natural language methods can be readily applied. Unfortunately, such
approaches do not capitalize on the rich structural information possessed by
source code. Of late, Graph Neural Network (GNN) was proposed to learn
embeddings of programs from their graph representations. Due to the homogeneous
and expensive message-passing procedure, GNN can suffer from precision issues,
especially when dealing with programs rendered into large graphs. In this
paper, we present a new graph neural architecture, called Graph Interval Neural
Network (GINN), to tackle the weaknesses of the existing GNN. Unlike the
standard GNN, GINN generalizes from a curated graph representation obtained
through an abstraction method designed to aid models to learn. In particular,
GINN focuses exclusively on intervals for mining the feature representation of
a program; furthermore, GINN operates on a hierarchy of intervals for scaling
the learning to large graphs. We evaluate GINN for two popular downstream
applications: variable misuse prediction and method name prediction. Results
show in both cases GINN outperforms the state-of-the-art models by a
comfortable margin. We have also created a neural bug detector based on GINN to
catch null pointer dereference bugs in Java code. While learning from the same
9,000 methods extracted from 64 projects, the GINN-based bug detector
significantly outperforms the GNN-based bug detector on 13 unseen test
projects. Next, we deploy our trained GINN-based bug detector and Facebook
Infer to scan the codebases of 20 highly starred projects on GitHub. Through
our manual inspection, we confirm 38 bugs out of 102 warnings raised by the
GINN-based bug detector, compared to 34 bugs out of 129 warnings for Facebook
Infer. Comment: The abstract is simplified; for the full abstract, please refer
to the paper.
Learning Location from Shared Elevation Profiles in Fitness Apps: A Privacy Perspective
The extensive use of smartphones and wearable devices has facilitated many
useful applications. For example, with Global Positioning System (GPS)-equipped
smart and wearable devices, many applications can gather, process, and share
rich metadata, such as geolocation, trajectories, elevation, and time. For
example, fitness applications, such as Runkeeper and Strava, utilize the
information for activity tracking and have recently witnessed a boom in
popularity. Those fitness tracker applications have their own web platforms and
allow users to share activities on such platforms or even with other social
network platforms. To preserve the privacy of users while allowing sharing,
several of those platforms may allow users to disclose partial information,
such as the elevation profile for an activity, which supposedly would not leak
the location of the users. In this work, and as a cautionary tale, we create a
proof of concept where we examine the extent to which elevation profiles can be
used to predict the location of users. To tackle this problem, we devise three
plausible threat settings under which the city or borough of the targets can be
predicted. Those threat settings define the amount of information available to
the adversary to launch the prediction attacks. Establishing that simple
features of elevation profiles, e.g., spectral features, are insufficient, we
devise both natural language processing (NLP)-inspired text-like representation
and computer vision-inspired image-like representation of elevation profiles,
and we convert the problem at hand into text and image classification problems.
We use both traditional machine learning- and deep learning-based techniques
and achieve a prediction success rate ranging from 59.59\% to 99.80\%. The
findings are alarming, highlighting that sharing elevation information may have
significant location privacy risks. Comment: 16 pages, 12 figures, 10 tables; accepted for publication in IEEE
Transactions on Mobile Computing (October 2022). arXiv admin note:
substantial text overlap with arXiv:1910.0904
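The "NLP-inspired text-like representation" of an elevation profile mentioned in the abstract can be illustrated with a simple quantization scheme. The tokenization below (direction letter plus magnitude bucket) is our own assumption, not the paper's exact encoding.

```python
def elevation_to_tokens(profile, step=5.0):
    """Turn a sequence of elevation readings into text-like tokens:
    each successive change becomes a 'word' encoding its direction
    (U up, D down, F flat) and a magnitude bucket of `step` meters."""
    tokens = []
    for a, b in zip(profile, profile[1:]):
        delta = b - a
        sign = 'U' if delta > 0 else ('D' if delta < 0 else 'F')
        mag = int(abs(delta) // step)
        tokens.append(f"{sign}{mag}")
    return tokens
```

Token sequences like these could then be fed to any off-the-shelf text classifier to predict the city or borough, in the spirit of the attack the paper describes.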
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages
In many documents, such as semi-structured webpages, textual semantics are
augmented with additional information conveyed using visual elements including
layout, font size, and color. Prior work on information extraction from
semi-structured websites has required learning an extraction model specific to
a given template via either manually labeled or distantly supervised data from
that template. In this work, we propose a solution for "zero-shot" open-domain
relation extraction from webpages with a previously unseen template, including
from websites with little overlap with existing sources of knowledge for
distant supervision and websites in entirely new subject verticals. Our model
uses a graph neural network-based approach to build a rich representation of
text fields on a webpage and the relationships between them, enabling
generalization to new templates. Experiments show this approach provides a 31%
F1 gain over a baseline for zero-shot extraction in a new subject vertical. Comment: Accepted to ACL 202
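The graph-neural-network representation of text fields sketched in this abstract boils down to message passing over a page graph whose nodes are text fields and whose edges encode layout adjacency. A toy synchronous mean-aggregation step (the 50/50 mixing weight and untyped edges are our simplifications) might look like:

```python
import numpy as np

def field_gnn_step(feats, edges):
    """One message-passing step over a page graph: each text field's
    feature vector is mixed with the mean of its layout neighbors'
    features. `edges` is a list of undirected (i, j) pairs."""
    out = feats.copy()
    for i in range(len(feats)):
        nbrs = [j for a, b in edges
                  for j in ((b,) if a == i else (a,) if b == i else ())]
        if nbrs:
            out[i] = 0.5 * feats[i] + 0.5 * feats[nbrs].mean(axis=0)
    return out
```

Because the update depends only on the graph structure and not on any template-specific features, representations built this way can transfer to unseen templates, which is the property the paper exploits.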
CCL: Cross-modal Correlation Learning with Multi-grained Fusion by Hierarchical Network
Cross-modal retrieval has become a highlighted research topic for retrieval
across multimedia data such as image and text. A two-stage learning framework
is widely adopted by most existing methods based on Deep Neural Network (DNN):
The first learning stage is to generate separate representation for each
modality, and the second learning stage is to get the cross-modal common
representation. However, the existing methods have three limitations: (1) In
the first learning stage, they only model intra-modality correlation, but
ignore inter-modality correlation with rich complementary context. (2) In the
second learning stage, they only adopt shallow networks with single-loss
regularization, but ignore the intrinsic relevance of intra-modality and
inter-modality correlation. (3) Only original instances are considered while
the complementary fine-grained clues provided by their patches are ignored. For
addressing the above problems, this paper proposes a cross-modal correlation
learning (CCL) approach with multi-grained fusion by hierarchical network, and
the contributions are as follows: (1) In the first learning stage, CCL exploits
multi-level association with joint optimization to preserve the complementary
context from intra-modality and inter-modality correlation simultaneously. (2)
In the second learning stage, a multi-task learning strategy is designed to
adaptively balance the intra-modality semantic category constraints and
inter-modality pairwise similarity constraints. (3) CCL adopts multi-grained
modeling, which fuses the coarse-grained instances and fine-grained patches to
make cross-modal correlation more precise. Compared with 13 state-of-the-art
methods on 6 widely used cross-modal datasets, the experimental results show
that our CCL approach achieves the best performance. Comment: 16 pages,
accepted by IEEE Transactions on Multimedia
Robust Layout-aware IE for Visually Rich Documents with Pre-trained Language Models
Many business documents processed in modern NLP and IR pipelines are visually
rich: in addition to text, their semantics can also be captured by visual
traits such as layout, format, and fonts. We study the problem of information
extraction from visually rich documents (VRDs) and present a model that
combines the power of large pre-trained language models and graph neural
networks to efficiently encode both textual and visual information in business
documents. We further introduce new fine-tuning objectives to improve in-domain
unsupervised fine-tuning to better utilize large amounts of unlabeled in-domain
data. We experiment on real-world invoice and resume data sets and show that
the proposed method outperforms strong text-based RoBERTa baselines by 6.3%
absolute F1 on invoices and 4.7% absolute F1 on resumes. When evaluated in a
few-shot setting, our method requires up to 30x less annotation data than the
baseline to achieve the same level of performance at ~90% F1. Comment: 10 pages, to appear in SIGIR 2020 Industry Track
Spherical Paragraph Model
Representing texts as fixed-length vectors is central to many language
processing tasks. Most traditional methods build text representations based on
the simple Bag-of-Words (BoW) representation, which loses the rich semantic
relations between words. Recent advances in natural language processing have
shown that semantically meaningful representations of words can be efficiently
acquired by distributed models, making it possible to build text
representations based on a better foundation called the Bag-of-Word-Embedding
(BoWE) representation. However, existing text representation methods using BoWE
often lack sound probabilistic foundations or cannot well capture the semantic
relatedness encoded in word vectors. To address these problems, we introduce
the Spherical Paragraph Model (SPM), a probabilistic generative model based on
BoWE, for text representation. SPM has good probabilistic interpretability and
can fully leverage the rich semantics of words, the word co-occurrence
information as well as the corpus-wide information to help the representation
learning of texts. Experimental results on topical classification and sentiment
analysis demonstrate that SPM achieves new state-of-the-art performance on
several benchmark datasets. Comment: 10 pages
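Modeling L2-normalized word embeddings on the unit sphere, as SPM does, naturally leads to von Mises-Fisher (vMF) distributions. The snippet below scores a bag of word embeddings against a vMF mean direction; it is a sketch of the general idea only (the normalizing constant is omitted, and SPM's full generative model is richer than this).

```python
import numpy as np

def vmf_log_score(word_vecs, mu, kappa):
    """Unnormalized vMF log-score of a bag of word embeddings:
    project each L2-normalized word vector onto the unit mean
    direction mu and scale by the concentration kappa."""
    X = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    m = mu / np.linalg.norm(mu)
    return float(kappa * (X @ m).sum())
```

A document representation could then be obtained by fitting `mu` (and `kappa`) to maximize this score over the document's words.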
Contextualized Non-local Neural Networks for Sequence Learning
Recently, a large number of neural mechanisms and models have been proposed
for sequence learning, of which self-attention, as exemplified by the
Transformer model, and graph neural networks (GNNs) have attracted much
attention. In this paper, we propose an approach that combines and draws on the
complementary strengths of these two methods. Specifically, we propose
contextualized non-local neural networks (CN^3), which can both
dynamically construct a task-specific structure of a sentence and leverage rich
local dependencies within a particular neighborhood.
Experimental results on ten NLP tasks in text classification, semantic
matching, and sequence labeling show that our proposed model outperforms
competitive baselines and discovers task-specific dependency structures, thus
providing better interpretability to users. Comment: Accepted by AAAI201
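The combination the abstract describes, self-attention dynamically constructing a sentence graph that is then used for GNN-style aggregation, can be reduced to a few lines. This is a bare sketch of the shared idea (unprojected similarities, a single aggregation round), not the paper's actual layer.

```python
import numpy as np

def contextual_nonlocal_layer(H, tau=1.0):
    """Build a task-specific soft sentence graph from pairwise token
    similarities, then update each token by aggregating over it."""
    sims = H @ H.T / tau                              # (L, L) similarities
    A = np.exp(sims - sims.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)                # row-stochastic soft adjacency
    return A @ H                                      # one round of message passing
```

Inspecting the learned adjacency `A` is what gives models of this kind the interpretability mentioned above: rows with sharp peaks reveal the dependency structure the model discovered.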