
34 research outputs found

Measuring vertex centrality in co-occurrence graphs for online social tag recommendation

Author: Cantador Iván
Jose Joemon M.
Vallet Weadon David Jordi
Publication venue: Robert Jäschke
Publication date: 01/01/2009

Also published online by CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073): Proceedings of the ECML PKDD (The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases) Discovery Challenge 2009, Bled, Slovenia, September 7, 2009. We present a social tag recommendation model for collaborative bookmarking systems. This model receives as input a bookmark of a web page or scientific publication, and automatically suggests a set of social tags useful for annotating the bookmarked document. By analysing and processing the bookmark's textual contents (document title, URL, abstract and descriptions), we extract a set of keywords, forming a query that is launched against an index and retrieves a number of similar tagged bookmarks. Afterwards, we take the social tags of these bookmarks and build their global co-occurrence sub-graph. The tags (vertices) of this reduced graph that have the highest vertex centrality constitute our recommendations [...] This research was supported by the European Commission under contracts FP6-027122-SALERO, FP6-033715-MIAUCE and FP6-045032 SEMEDIA. The expressed content is the view of the authors but not necessarily the view of the SALERO, MIAUCE and SEMEDIA projects as a whole.
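The graph step described in this abstract can be illustrated with a minimal Python sketch. It assumes the similar bookmarks have already been retrieved and uses weighted degree as the centrality measure; the abstract does not say which centrality measure the authors use, so that choice, the function name `recommend_tags`, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def recommend_tags(tagged_bookmarks, top_k=3):
    """Rank tags by weighted degree in the tag co-occurrence graph.

    tagged_bookmarks: list of tag lists, one per retrieved bookmark.
    Weighted degree is an assumed stand-in for "vertex centrality".
    """
    # Edge weight = number of bookmarks in which two tags appear together.
    weights = defaultdict(int)
    for tags in tagged_bookmarks:
        for a, b in combinations(sorted(set(tags)), 2):
            weights[(a, b)] += 1

    # Weighted degree centrality: sum of the weights of a vertex's edges.
    centrality = defaultdict(int)
    for (a, b), w in weights.items():
        centrality[a] += w
        centrality[b] += w

    # Highest centrality first; ties are broken alphabetically.
    ranked = sorted(centrality, key=lambda t: (-centrality[t], t))
    return ranked[:top_k]


# Hypothetical tags of four retrieved bookmarks.
retrieved = [
    ["python", "graph", "tutorial"],
    ["python", "graph"],
    ["python", "web"],
    ["graph", "theory"],
]
print(recommend_tags(retrieved, top_k=2))
```

Tags that never co-occur with another tag get no edges in this sketch and are therefore never recommended.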

Biblos-e Archivo

Mining Characteristic Patterns for Comparative Music Corpus Analysis

Author: Conklin Darrell
Neubarth Kerstin
Publication venue: 'MDPI AG'
Publication date: 14/03/2020

A core issue of computational pattern mining is the identification of interesting patterns. When mining music corpora organized into classes of songs, patterns may be of interest because they are characteristic, describing prevalent properties of classes, or because they are discriminant, capturing distinctive properties of classes. Existing work in computational music corpus analysis has focused on discovering discriminant patterns. This paper studies characteristic patterns, investigating the behavior of different pattern interestingness measures in balancing coverage and discriminability of classes in top-k pattern mining and in individual top-ranked patterns. Characteristic pattern mining is applied to the collection of Native American music by Frances Densmore, and the discovered patterns are shown to be supported by Densmore’s own analyses.

Multidisciplinary Digital Publishing Institute

Archivo Digital para la Docencia y la Investigación

Metadata impact on research paper similarity

Author: Cornelis Chris
Hurtado Martín Germán
Naessens Helga
Schockaert Steven
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

While collaborative filtering and citation analysis have been well studied for research paper recommender systems, content-based approaches typically restrict themselves to straightforward application of the vector space model. However, various types of metadata containing potentially useful information are usually available as well. Our work explores several methods to exploit this information in combination with different similarity measures.

Crossref

Online Research @ Cardiff

Ghent University Academic Bibliography

International Evaluation of Research and Doctoral Training at the University of Helsinki 2005-2010 : RC-Specific Evaluation of ALKO - Algorithms and Data Analysis

Publication date: 01/01/2012

Helsingin yliopiston digitaalinen arkisto

I valori fondanti dell'Unione europea tra prassi giurisprudenziale e meccanismi di controllo politico

Author: Nicolosi Salvo
Publication venue: G. Giappichelli Editore srl
Publication date: 01/01/2013

Ghent University Academic Bibliography

Scalable and efficient multi-label classification for evolving data streams

Author: A. Appice
A. Bifet
A. Clare
A. M. Ráez
Albert Bifet
Bernhard Pfahringer
E. Ikonomovska
E. Spyromitros-Xioufis
G. Tsoumakas
G. Widmer
Geoff Holmes
J. Demšar
J. Fürnkranz
J. Gama
J. Read
Jesse Read
K. Crammer
K. Dembczyński
M. Hall
M. L. Zhang
N. C. Oza
N. Cesa-Bianchi
P. Domingos
R. E. Schapire
S. Godbole
W. Cheng
W. Qu
X. Kong
Y. N. Law
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Tackling scalability issues in mining path patterns from knowledge graphs: a preliminary study

Author: Bresso Emmanuel
Couceiro Miguel
Coulet Adrien
Monnin Pierre
Napoli Amedeo
Smaïl-Tabbone Malika
Publication date: 07/08/2020

Features mined from knowledge graphs are widely used within multiple knowledge discovery tasks such as classification or fact-checking. Here, we consider a given set of vertices, called seed vertices, and focus on mining their associated neighboring vertices, paths, and, more generally, path patterns that involve classes of ontologies linked with knowledge graphs. Due to the combinatorial nature and the increasing size of real-world knowledge graphs, the task of mining these patterns immediately entails scalability issues. In this paper, we address these issues by proposing a pattern mining approach that relies on a set of constraints (e.g., support or degree thresholds) and the monotonicity property. As our motivation comes from the mining of real-world knowledge graphs, we illustrate our approach with PGxLOD, a biomedical knowledge graph.
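How a support threshold and monotonicity keep such a search tractable can be sketched in Python as follows. This is a simplified illustration, not the authors' algorithm: patterns here are plain predicate sequences starting at a seed vertex (ontology classes are ignored), support is the number of seeds from which a pattern can be followed, and the graph, vertex names, and function name are hypothetical. Because extending a path can never increase its support, any pattern below the threshold is dropped before it is extended.

```python
def mine_path_patterns(graph, seeds, min_support, max_length):
    """Level-wise mining of frequent predicate sequences from seed vertices.

    graph: dict mapping a vertex to a list of (predicate, neighbor) pairs.
    Returns a dict mapping each frequent pattern (tuple of predicates)
    to its support (number of seeds the pattern can be followed from).
    """
    # The frontier maps a pattern to {seed: set of vertices the path ends on}.
    frontier = {(): {s: {s} for s in seeds}}
    frequent = {}
    for _ in range(max_length):
        candidates = {}
        for pattern, ends_by_seed in frontier.items():
            for seed, ends in ends_by_seed.items():
                for vertex in ends:
                    for predicate, neighbor in graph.get(vertex, []):
                        ext = pattern + (predicate,)
                        candidates.setdefault(ext, {}).setdefault(seed, set()).add(neighbor)
        # Prune: only patterns meeting the support threshold are kept and
        # extended further, since longer patterns cannot have higher support.
        frontier = {p: e for p, e in candidates.items() if len(e) >= min_support}
        frequent.update({p: len(e) for p, e in frontier.items()})
    return frequent


# Hypothetical toy graph; names are illustrative and not taken from PGxLOD.
toy_graph = {
    "drug1": [("targets", "geneA"), ("treats", "disease1")],
    "drug2": [("targets", "geneB")],
    "drug3": [("treats", "disease1"), ("interactsWith", "drug1")],
    "geneA": [("associatedWith", "disease1")],
    "geneB": [("associatedWith", "disease2")],
}
patterns = mine_path_patterns(toy_graph, ["drug1", "drug2", "drug3"], min_support=2, max_length=2)
print(patterns)
```

In the toy run, `interactsWith` is followed from only one seed, so it is pruned at length 1 and none of its extensions are ever generated.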

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Pattern mining for label ranking

Author: Pinho Rebelo de Sá C.F.
Publication date: 16/12/2016

Preferences have always been present in many tasks in our daily lives. Buying the right car, choosing a suitable house or even deciding on the food to eat are trivial examples of decisions that reveal information, explicitly or implicitly, about our preferences. The recent trend of collecting increasing amounts of data also holds for preference data. Extracting and modeling preferences can provide us with invaluable information about the choices of groups or individuals. In areas like e-commerce, which typically deal with decisions from thousands of users, the acquisition of preferences can be a difficult task. For these reasons, artificial intelligence (in particular, machine learning) methods have been increasingly important to the discovery and automatic learning of models about preferences. In this Ph.D. project, several approaches were analyzed and proposed to deal with the label ranking (LR) problem, most of which focused on pattern mining methods. Algorithms and the Foundations of Software technology

Leiden University Scholarly Publications

Pattern Mining for Label Ranking

Author: Cláudio Frederico Pinho Rebelo de Sá
Publication date: 16/12/2016

Leiden University Scholarly Publications

Repositório Aberto da Universidade do Porto

Scalable Multi-label Classification

Author: Geoff Holmes
Jesse Read
Supervised Bernhard Pfahringer
Publication venue: 'University of Waikato'
Publication date: 01/01/2010

Multi-label classification is relevant to many domains, such as text, image and other media, and bioinformatics. Researchers have already noticed that in multi-label data, correlations exist between labels, and a variety of approaches, drawing inspiration from many spheres of machine learning, have been able to model these correlations. However, data sources from the real world are growing ever larger and the multi-label task is particularly sensitive to this due to the complexity associated with multiple labels and the correlations between them. Consequently, many methods do not scale up to large problems. This thesis deals with scalable multi-label classification: methods which exhibit high predictive performance, but are also able to scale up to larger problems. The first major contribution is the pruned sets method, which is able to model label correlations directly for high predictive performance, but reduces overfitting and complexity over related methods by pruning and subsampling label sets, and can thus scale up to larger datasets. The second major contribution is the classifier chains method, which models correlations with a chain of binary classifiers. The use of binary models allows for scalability to even larger datasets. Pruned sets and classifier chains are robust with respect to both the variety and scale of data that they can deal with, and can be incorporated into other methods. In an ensemble scheme, these methods are able to compete with state-of-the-art methods in terms of predictive performance as well as scale up to large datasets of hundreds of thousands of training examples. This thesis also puts a special emphasis on multi-label evaluation, introducing a new evaluation measure and studying threshold calibration.
With one of the largest and most varied collections of multi-label datasets in the literature, extensive experimental evaluation shows the advantage of these methods, both in terms of predictive performance, and computational efficiency and scalability.
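The pruning and subsampling of label sets mentioned in this abstract can be sketched roughly in Python. This is an approximation built only from the abstract's description: the parameters `p` (pruning threshold) and `b` (maximum number of reintroduced subsets), the subset ranking rule, and the function name are assumptions rather than the thesis's exact definitions.

```python
from collections import Counter


def pruned_sets_transform(examples, p=1, b=2):
    """Return (features, label_set) pairs in which every label set is frequent.

    examples: list of (features, labels) pairs; labels is an iterable of labels.
    p: a label set is kept only if it occurs more than p times (assumed).
    b: at most b frequent subsets are reintroduced per pruned example (assumed).
    """
    counts = Counter(frozenset(labels) for _, labels in examples)
    frequent = {s for s, c in counts.items() if c > p}

    transformed = []
    for features, labels in examples:
        label_set = frozenset(labels)
        if label_set in frequent:
            transformed.append((features, label_set))
            continue
        # Subsample: replace a rare label set with frequent proper subsets
        # of it, preferring larger and then more common subsets.
        subsets = sorted(
            (s for s in frequent if s < label_set),
            key=lambda s: (-len(s), -counts[s], sorted(s)),
        )
        transformed.extend((features, s) for s in subsets[:b])
    return transformed


# Hypothetical training data: feature placeholders paired with label sets.
data = [
    ("x1", {"a", "b"}),
    ("x2", {"a", "b"}),
    ("x3", {"c"}),
    ("x4", {"c"}),
    ("x5", {"a", "b", "c"}),
    ("x6", {"d"}),
]
result = pruned_sets_transform(data, p=1, b=2)
print(result)
```

Each remaining distinct label set can then be treated as one class of an ordinary single-label classifier, which is what keeps the number of classes manageable.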

CiteSeerX

Research Commons@Waikato