Efficient Algorithms for Node Disjoint Subgraph Homeomorphism Determination
Recently, great effort has been devoted to research on the management
of large-scale graph-based data such as the WWW, social networks, and biological
networks. In the study of graph-based data management, the node disjoint subgraph
homeomorphism relation between graphs is more suitable than (sub)graph
isomorphism in many cases, especially when node skipping and
node mismatching are allowed. However, no efficient node disjoint subgraph
homeomorphism determination (ndSHD) algorithm has been available. In this
paper, we propose two computationally efficient ndSHD algorithms based on
state-space search with backtracking, which employ many heuristics to prune the
search space. Experimental results on synthetic data sets show that the
proposed algorithms are efficient, require relatively little time in most of the
test cases, scale to large or dense graphs, and can accommodate more
complex fuzzy matching cases.
Comment: 15 pages, 11 figures, submitted to DASFAA 200
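To make the ndSHD problem statement concrete, the following is a minimal brute-force sketch of state-space search with backtracking. It is not the paper's algorithm: the pruning heuristics mentioned in the abstract are not described here, so none are implemented. The function names (`simple_paths`, `route_edges`, `find_embedding`), the adjacency-dict representation, and the restriction to simple pattern graphs without self-loops are all assumptions made for illustration.

```python
from itertools import permutations


def simple_paths(adj, src, dst, blocked):
    """Yield simple paths src -> dst whose internal nodes avoid `blocked`."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt == dst:
                yield path + [dst]
            elif nxt not in blocked and nxt not in path:
                stack.append((nxt, path + [nxt]))


def route_edges(adj, edges, mapping, used, i=0):
    """Route pattern edges i.. as node-disjoint paths; backtrack on failure."""
    if i == len(edges):
        return []
    u, v = edges[i]
    for path in simple_paths(adj, mapping[u], mapping[v], used):
        inner = set(path[1:-1])  # internal nodes become unavailable
        rest = route_edges(adj, edges, mapping, used | inner, i + 1)
        if rest is not None:
            return [path] + rest
    return None  # no routing works for this node mapping


def find_embedding(pattern_nodes, pattern_edges, adj):
    """Try every injective node mapping (hypothetical baseline, no pruning)."""
    for image in permutations(list(adj), len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        paths = route_edges(adj, pattern_edges, mapping, set(image))
        if paths is not None:
            return mapping, paths
    return None


# Usage: a triangle pattern against a 6-cycle (undirected, stored symmetrically).
cycle = {1: [2, 6], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5, 1]}
triangle = [("a", "b"), ("b", "c"), ("c", "a")]
print(find_embedding(["a", "b", "c"], triangle, cycle))
```

The search is exponential in the worst case, which is why heuristics of the kind the abstract refers to matter in practice.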
Deep Short Text Classification with Knowledge Powered Attention
Short text classification is one of the important tasks in Natural Language
Processing (NLP). Unlike paragraphs or documents, short texts are more
ambiguous since they lack sufficient contextual information, which poses a
great challenge for classification. In this paper, we retrieve knowledge from
an external knowledge source to enhance the semantic representation of short
texts. We take conceptual information as a kind of knowledge and incorporate it
into deep neural networks. For the purpose of measuring the importance of
knowledge, we introduce attention mechanisms and propose deep Short Text
Classification with Knowledge powered Attention (STCKA). We utilize Concept
towards Short Text (C-ST) attention and Concept towards Concept Set (C-CS)
attention to acquire the weight of concepts from two aspects. We then classify a
short text with the help of conceptual information. Unlike traditional
approaches, our model acts like a human being who has an intrinsic ability to make
decisions based on observation (i.e., training data for machines) and pays more
attention to important knowledge. We also conduct extensive experiments on four
public datasets for different tasks. The experimental results and case studies
show that our model outperforms the state-of-the-art methods, justifying the
effectiveness of knowledge powered attention.
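The idea of weighting concepts from two aspects can be illustrated with a small sketch. The abstract does not give the scoring functions, so everything specific here is an assumption: dot-product scoring, the `importance_vec` query used for the concept-set side, and the fixed mixing coefficient `gamma` are hypothetical stand-ins, not the STCKA formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def concept_attention(text_vec, concept_vecs, importance_vec, gamma=0.5):
    """Combine two attention distributions over concepts (illustrative only).

    cst: relevance of each concept to the short text (assumed dot product).
    ccs: importance of each concept within the concept set (assumed
         dot product against a hypothetical learned query vector).
    """
    cst = softmax([dot(c, text_vec) for c in concept_vecs])
    ccs = softmax([dot(c, importance_vec) for c in concept_vecs])
    # A convex combination of two distributions is itself a distribution.
    weights = [gamma * a + (1 - gamma) * b for a, b in zip(cst, ccs)]
    dim = len(concept_vecs[0])
    pooled = [sum(w * c[d] for w, c in zip(weights, concept_vecs))
              for d in range(dim)]
    return weights, pooled


# Usage: the first concept is aligned with the text, so it gets more weight.
weights, pooled = concept_attention(
    text_vec=[1.0, 0.0],
    concept_vecs=[[1.0, 0.0], [0.0, 1.0]],
    importance_vec=[0.5, 0.5],
)
print(weights, pooled)
```

In a trained model the vectors and the mixing coefficient would be learned; here they are fixed only to keep the example self-contained.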
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation
A type description is a succinct noun compound which helps humans and machines
to quickly grasp the informative and distinctive information of an entity.
Entities in most knowledge graphs (KGs) still lack such descriptions, thus
calling for automatic methods to supplement such information. However, existing
generative methods either overlook the grammatical structure or make factual
mistakes in generated texts. To solve these problems, we propose a
head-modifier template-based method to ensure the readability and data fidelity
of generated type descriptions. We also propose a new dataset and two automatic
metrics for this task. Experiments show that our method improves substantially
compared with baselines and achieves state-of-the-art performance on both
datasets.
Comment: ACL 201
MAPS-KB: A Million-scale Probabilistic Simile Knowledge Base
The ability to understand and generate similes is an essential step toward
realizing human-level AI. However, there is still a considerable gap between
machine intelligence and human cognition in similes, since deep models based on
statistical distribution tend to favour high-frequency similes. Hence, a
large-scale symbolic knowledge base of similes is required, as it contributes
to the modeling of diverse yet unpopular similes while facilitating additional
evaluation and reasoning. To bridge the gap, we propose a novel framework for
large-scale simile knowledge base construction, as well as two probabilistic
metrics which enable an improved understanding of simile phenomena in natural
language. Overall, we construct MAPS-KB, a million-scale probabilistic simile
knowledge base, covering 4.3 million triplets over 0.4 million terms from 70 GB
corpora. We conduct sufficient experiments to justify the effectiveness and
necessity of the methods of our framework. We also apply MAPS-KB to three
downstream tasks to achieve state-of-the-art performance, further demonstrating
the value of MAPS-KB.
Comment: Accepted to AAAI 202