Graph-based Modelling of Concurrent Sequential Patterns
Structural relation patterns have been introduced recently to extend the search for complex patterns often hidden behind large sequences of data. This has motivated a novel approach to sequential patterns post-processing, and a corresponding data mining method was proposed for Concurrent Sequential Patterns (ConSP). This article refines the approach in the context of ConSP modelling, where a companion graph-based model is devised as an extension of previous work. Two new modelling methods are presented here together with a construction algorithm, to complete the transformation of concurrent sequential patterns to a ConSP-Graph representation. Customer orders data is used to demonstrate the effectiveness of ConSP mining, while synthetic sample data highlights the strength of the modelling technique, illuminating the theories developed.
Sequential Patterns Post-processing for Structural Relation Patterns Mining
Sequential patterns mining is an important data-mining technique used to identify frequently observed sequential
occurrence of items across ordered transactions over time. It has been extensively studied in the literature, and there
exists a diversity of algorithms. However, more complex structural patterns are often hidden behind sequences.
This article begins with the introduction of a model for the representation of sequential patterns—Sequential
Patterns Graph—which motivates the search for new structural relation patterns. An integrative framework for
the discovery of these patterns–Postsequential Patterns Mining–is then described which underpins the postprocessing
of sequential patterns. A corresponding data-mining method based on sequential patterns postprocessing
is proposed and shown to be effective in the search for concurrent patterns. From experiments conducted on three
component algorithms, it is demonstrated that sequential patterns-based concurrent patterns mining provides
an efficient method for structural knowledge discovery.
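As a rough illustration of what "frequently observed sequential occurrence of items" means, the sketch below computes the support of a candidate pattern over a small set of ordered transactions. It is a simplification and not the method of the article: each transaction is assumed to hold a single item, and the names `is_subsequence`, `support`, and the `orders` data are hypothetical.

```python
from typing import Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the pattern's items appear in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: Sequence[Sequence[str]]) -> float:
    """Fraction of sequences in the database that contain the pattern."""
    if not database:
        return 0.0
    hits = sum(1 for sequence in database if is_subsequence(pattern, sequence))
    return hits / len(database)


# Hypothetical customer order histories, one list per customer.
orders = [
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread"],
    ["bread", "butter", "eggs"],
]

print(support(["bread", "eggs"], orders))  # 0.75
```

A pattern counts as frequent when its support reaches a user-chosen minimum threshold; post-processing methods such as the one described above start from the set of patterns that pass that threshold.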
Improving Neural Relation Extraction with Implicit Mutual Relations
Relation extraction (RE) aims at extracting the relation between two entities
from the text corpora. It is a crucial task for Knowledge Graph (KG)
construction. Most existing methods predict the relation between an entity pair
by learning the relation from the training sentences, which contain the
targeted entity pair. In contrast to existing distant supervision approaches
that suffer from insufficient training corpora to extract relations, our
proposal of mining implicit mutual relations from massive unlabeled corpora
transfers the semantic information of entity pairs into the RE model, which is
more expressive and semantically plausible. After constructing an entity
proximity graph based on the implicit mutual relations, we preserve the
semantic relations of entity pairs via embedding each vertex of the graph into
a low-dimensional space. As a result, we can easily and flexibly integrate the
implicit mutual relations and other entity information, such as entity types,
into the existing RE methods.
Our experimental results on the New York Times dataset and a Google Distant
Supervision dataset suggest that our proposed neural RE framework provides a
promising improvement for the RE task, and significantly outperforms the
state-of-the-art methods. Moreover, the component for mining implicit mutual
relations is flexible enough to significantly improve the performance of both
CNN-based and RNN-based RE models.
Comment: 12 pages
Sentence Bag Graph Formulation for Biomedical Distant Supervision Relation Extraction
We introduce a novel graph-based framework for alleviating key challenges in
distantly-supervised relation extraction and demonstrate its effectiveness in
the challenging and important domain of biomedical data. Specifically, we
propose a graph view of sentence bags referring to an entity pair, which
enables message-passing based aggregation of information related to the entity
pair over the sentence bag. The proposed framework alleviates the common
problem of noisy labeling in distantly supervised relation extraction and also
effectively incorporates inter-dependencies between sentences within a bag.
Extensive experiments on two large-scale biomedical relation datasets and the
widely utilized NYT dataset demonstrate that our proposed framework
significantly outperforms the state-of-the-art methods for biomedical distant
supervision relation extraction while also providing excellent performance for
relation extraction in the general text mining domain.
Bayesian model selection in statistical construction of justification
Argumentation mining involves identification of an attack relation between natural language sentences. Bayesian inference characterizing argument-based reasoning addresses this issue by calculating the posterior distribution over attack relations given acceptability statuses of arguments. This paper discusses the use of Bayesian model selection where graph-theoretic properties impose restrictions on the graph structure of attack relations.
Discovering Relations by Entity Search in Lightweight Semantic Text Graphs
Entity search is becoming a popular alternative to full-text search. Recently Google released its entity search based on confirmed, human-generated data such as Wikipedia. In spite of these developments, the task of entity discovery, search, or relation search in unstructured text remains a major challenge in the fields of information retrieval and information extraction. This paper tries to address that challenge, focusing specifically on entity relation discovery. This is achieved by processing unstructured text using simple information extraction methods, building lightweight semantic graphs, and reusing them for entity relation discovery by applying algorithms from graph theory. An important part is also user interaction with semantic graphs, which can significantly improve information extraction results and entity relation search. Entity relations can be discovered by various text mining methods, but the advantage of the presented method lies in the similarity between the lightweight semantics extracted from a text and the information networks available as structured data. Both graph structures have similar properties, so similar relation discovery algorithms can be applied. In addition, we can benefit from the integration of such graph data. We provide both relevance and performance evaluations of the approach and showcase it in several use case applications.
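One way to picture relation discovery over a lightweight semantic graph is sketched below, under loud assumptions: entity extraction is taken as already done, entities that co-occur in a sentence are linked by an undirected edge, and a breadth-first search stands in for the "algorithms from graph theory" mentioned above. The functions `build_graph` and `shortest_path` and the sample entities are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_graph(sentences):
    """Link every pair of entities that co-occur in the same sentence."""
    graph = defaultdict(set)
    for entities in sentences:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of entities on a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical per-sentence entity lists produced by a simple extractor.
sentences = [["Alice", "Acme"], ["Acme", "Berlin"], ["Bob", "Berlin"]]
graph = build_graph(sentences)
print(shortest_path(graph, "Alice", "Bob"))  # ['Alice', 'Acme', 'Berlin', 'Bob']
```

The intermediate entities on the path suggest how two entities might be related, even though no single sentence mentions both of them.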
Relation Embedding for Personalised POI Recommendation
Point-of-Interest (POI) recommendation is one of the most important
location-based services helping people discover interesting venues or services.
However, the extreme user-POI matrix sparsity and the varying spatio-temporal
context pose challenges for POI systems and degrade the quality of POI
recommendations. To this end, we propose a translation-based relation embedding
for POI recommendation. Our approach encodes the temporal and geographic
information, as well as semantic contents effectively in a low-dimensional
relation space by using Knowledge Graph Embedding techniques. To further
alleviate the issue of user-POI matrix sparsity, a combined matrix
factorization framework is built on a user-POI graph to enhance the inference
of dynamic personal interests by exploiting the side-information. Experiments
on two real-world datasets demonstrate the effectiveness of our proposed model.
Comment: 12 pages, 3 figures. Accepted at the 24th Pacific-Asia Conference on
Knowledge Discovery and Data Mining (PAKDD 2020).
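The abstract does not state the scoring function, so the following is only a minimal sketch of the general translation-based idea, assuming a TransE-style form in which a user vector plus a relation vector should land near the vector of a suitable POI. The embeddings, the relation meaning, and the names `translation_score` and `rank_pois` are all hypothetical.

```python
import math


def translation_score(head, relation, tail):
    """Euclidean distance between (head + relation) and tail; lower is better."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_pois(user, relation, pois):
    """Return POI names ordered from best (smallest distance) to worst."""
    return sorted(pois, key=lambda name: translation_score(user, relation, pois[name]))


# Hypothetical 2-dimensional embeddings; real models learn these from data.
user = [0.0, 1.0]
evening_nearby = [1.0, 0.0]  # assumed relation encoding a time/location context
pois = {"museum": [4.0, 5.0], "cafe": [1.0, 1.0]}

print(translation_score(user, evening_nearby, pois["cafe"]))    # 0.0
print(translation_score(user, evening_nearby, pois["museum"]))  # 5.0
print(rank_pois(user, evening_nearby, pois))                    # ['cafe', 'museum']
```

Encoding context as a relation vector lets the same user be matched to different POIs depending on time or place, which is the intuition behind folding spatio-temporal information into the relation space.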
Put three and three together: Triangle-driven community detection
Community detection has arisen as one of the most relevant topics in the field of graph data mining due to its applications in many fields such as biology, social networks, or network traffic analysis. Although the existing metrics used to quantify the quality of a community work well in general, under some circumstances they fail to capture this notion correctly. The main reason is that these metrics consider the internal community edges as a set, but ignore how these actually connect the vertices of the community. We propose the Weighted Community Clustering (WCC), which is a new community metric that takes the triangle instead of the edge as the minimal structural motif indicating the presence of a strong relation in a graph. We theoretically analyse WCC in depth and formally prove, by means of a set of properties, that the maximization of WCC guarantees communities with cohesion and structure. In addition, we propose Scalable Community Detection (SCD), a community detection algorithm based on WCC, which is designed to be fast and scalable on SMP machines, showing experimentally that WCC correctly captures the concept of community in social networks using real datasets. Finally, using ground-truth data, we show that SCD provides better quality than the best disjoint community detection algorithms of the state of the art while performing faster.
Peer reviewed. Postprint (author's final draft).
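To make the triangle-as-motif idea concrete, the sketch below counts the triangles a vertex closes in the whole graph and the triangles it closes using only members of a candidate community. This is not the WCC metric itself, whose definition is not given in the abstract; the function `triangles_of` and the toy graph are hypothetical.

```python
def triangles_of(vertex, adjacency, within=None):
    """Count triangles containing `vertex`, optionally restricted to a vertex set."""
    neighbours = adjacency[vertex]
    if within is not None:
        neighbours = neighbours & within
    ordered = sorted(neighbours)
    count = 0
    for i, u in enumerate(ordered):
        for w in ordered[i + 1:]:
            # u and w are both neighbours of `vertex`; an edge u-w closes a triangle.
            if w in adjacency[u]:
                count += 1
    return count


# Toy undirected graph: two triangles (a, b, c) and (c, d, e) sharing vertex c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
community = {"a", "b", "c"}

print(triangles_of("c", adjacency))             # 2
print(triangles_of("c", adjacency, community))  # 1
```

A vertex whose triangles lie mostly inside the community is tied to it through closed structure and not just through a count of edges, which is the distinction the abstract draws between triangle-based and edge-based quality metrics.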