Mining Frequent Neighborhood Patterns in Large Labeled Graphs
Over the years, frequent subgraphs have been an important class of target
patterns in the pattern mining literature, where most work deals with
databases holding many graph transactions, e.g., the chemical structures of
compounds. These methods rely heavily on the downward-closure property (DCP)
of the support measure to prune candidate patterns efficiently.
When switching to the emerging scenario of single-graph databases, such as the
Google Knowledge Graph and the Facebook social graph, the traditional support
measure becomes trivial (either 0 or 1). However, to the best of our
knowledge, all attempts to redefine support for a single graph have resulted
in measures that either lose the DCP or are no longer semantically intuitive.
This paper targets pattern mining in the single-graph setting. We resolve
the DCP-versus-intuitiveness dilemma by shifting the mining target from
frequent subgraphs to frequent neighborhoods. A neighborhood is a specific
topological pattern in which a vertex is embedded, and the pattern is frequent
if it is shared by a large portion (above a given threshold) of the vertices.
We show that the new patterns not only retain the DCP but also carry semantics
as significant as those of subgraph patterns. Experiments on real-life
datasets demonstrate the feasibility of our algorithms on relatively large
graphs, as well as their capability to mine interesting knowledge not
discovered by prior work.
Word Sense Disambiguation: A Structured Learning Perspective
This paper explores the application of structured learning methods (SLMs) to word sense disambiguation (WSD). On one hand, SLMs can encode the semantic dependencies between polysemous words in a sentence; on the other hand, SLMs have achieved notable success in natural language processing, so applying them to WSD is a natural idea. However, the characteristics of WSD raise many theoretical and practical problems for SLMs. Starting from a method based on the hidden Markov model, this paper proposes, for the first time, a comprehensive and unified solution for WSD based on the maximum entropy Markov model, the conditional random field, and the tree-structured conditional random field, and reduces the time complexity and running time of the proposed methods to a reasonable level through beam search, approximate training, and parallel training. Each model upgrade brings a performance improvement: introducing one-step dependencies improves performance by 1-5 percent, adopting non-independent features improves it by 2-3 percent, and extending the underlying structure to the dependency parse tree improves it by about 1 percent. On the English all-words WSD dataset of Senseval-2004, the method based on the tree-structured conditional random field significantly outperforms the best participating system. Nevertheless, almost all machine learning methods, SLMs included, suffer from data sparseness due to the scarcity of sense-tagged data. Besides improving structured learning methods to suit the characteristics of WSD, another way to improve disambiguation performance is to mine disambiguation knowledge from sources such as Wikipedia and parallel corpora, thereby alleviating the knowledge acquisition bottleneck of WSD.
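The beam search mentioned above is a standard way to cut the cost of decoding a sequence labeler, which is how the paper reduces the running time of its Markov-chain models. The sketch below uses toy log-scores and label names, not the paper's MEMM/CRF/TCRF models:

```python
def beam_decode(scores, trans, beam_size=2):
    """Beam-search decoding for a sequence labeler.

    scores[t][y]   : emission log-score of label y at position t
    trans[y1][y2]  : transition log-score from label y1 to y2

    Only the top `beam_size` partial sequences are kept at each
    step, trading exact Viterbi decoding for linear-in-beam cost.
    (Toy sketch; labels and scores are illustrative.)
    """
    beam = [((), 0.0)]
    for emit in scores:
        cand = []
        for seq, s in beam:
            for y, e in emit.items():
                tr = trans[seq[-1]][y] if seq else 0.0
                cand.append((seq + (y,), s + e + tr))
        cand.sort(key=lambda x: x[1], reverse=True)
        beam = cand[:beam_size]  # prune to the best partial paths
    return beam[0]

scores = [{"s1": 0.0, "s2": -1.0},
          {"s1": -2.0, "s2": -0.5},
          {"s1": -0.1, "s2": -3.0}]
trans = {"s1": {"s1": -0.2, "s2": -1.5},
         "s2": {"s1": -0.3, "s2": -0.2}}
best_seq, best_score = beam_decode(scores, trans)
print(best_seq)  # ('s2', 's2', 's1')
```

With a beam of 2 over 2 labels this happens to recover the exact best path; in WSD the label set is the (large) sense inventory of each word, which is where the pruning pays off.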
User Profile Modeling for Content Recommendation on Twitter
This work investigates different mechanisms for inferring the semantics of Twitter messages in order to model user profiles. Natural language processing methods are introduced and analyzed to propose different ways of inferring users' interests from their tweets. These strategies are then compared by analyzing their behavior when recommending messages from other users. Sociedad Argentina de Informática e Investigación Operativa (SADIO)
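The profile-and-recommend pipeline the abstract describes can be sketched minimally. The bag-of-words profile and cosine-similarity ranking below are illustrative assumptions; the work compares several NLP strategies and does not necessarily use this one:

```python
import math
from collections import Counter

def tf_profile(tweets):
    """Build a term-frequency profile from a user's tweets.
    (Illustrative bag-of-words model; a hypothetical stand-in
    for the interest-inference strategies compared in the paper.)"""
    return Counter(w.lower() for t in tweets for w in t.split())

def cosine(p, q):
    """Cosine similarity between two term-frequency profiles."""
    dot = sum(p[w] * q[w] for w in p)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

user = tf_profile(["machine learning on graphs", "graph mining papers"])

# Rank candidate messages from other users by similarity to the profile.
candidates = ["new graph mining tutorial", "best pizza recipes"]
ranked = sorted(candidates,
                key=lambda m: cosine(user, tf_profile([m])),
                reverse=True)
print(ranked[0])  # the graph-mining message
```

The recommendation step is then just taking the top-ranked candidates, and swapping `tf_profile` for a different interest-inference method changes the comparison without touching the ranking code.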