Improving tag recommendation using social networks

Author: Rae Adam
Sigurbjörnsson Börkur
van Zwol Roelof
Publication date: 01/04/2010

In this paper we address the task of recommending additional tags to partially annotated media objects, in our case images. We propose an extendable framework that can recommend tags using a combination of different personalised and collective contexts. We combine information from four contexts: (1) all the photos in the system, (2) a user's own photos, (3) the photos of a user's social contacts, and (4) the photos posted in the groups of which a user is a member. Variants of methods (1) and (2) have been proposed in previous work, but the use of (3) and (4) is novel. For each context we use the same probabilistic model, and a Borda Count based aggregation approach merges the recommendations from the different contexts into a unified ranking of recommended tags. We evaluate our system using a large set of real-world data from Flickr. We show that by using personalised contexts we can significantly improve tag recommendation compared to using collective knowledge alone. We also analyse our experimental results to explore the capabilities of our system with respect to a user's social behaviour.
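
The Borda Count aggregation named in this abstract can be illustrated with a minimal sketch. The abstract does not give the scoring details, so the point scheme below (a tag at position p in a list of length n earns n - p points, absent tags earn nothing) and the function name `borda_aggregate` are assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict


def borda_aggregate(rankings, top_k=None):
    """Merge several ranked tag lists into one ranking with a Borda Count.

    Assumed scheme: in a list of length n, the tag at 0-based position p
    earns n - p points; a tag missing from a list earns 0 from that list.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, tag in enumerate(ranking):
            scores[tag] += n - position
    # Highest total first; ties are broken alphabetically for determinism.
    merged = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return merged if top_k is None else merged[:top_k]


# Hypothetical per-context recommendations (e.g. collective, personal, contacts).
contexts = [
    ["sunset", "beach", "sea"],
    ["beach", "holiday", "sunset"],
    ["beach", "sea", "sand"],
]
merged_tags = borda_aggregate(contexts)
```

Here `beach` ranks first because it scores well in all three hypothetical contexts, even though it tops only two of them.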

Open Research Online (The Open University)

Building user interest profiles from wikipedia clusters

Author: Jones Gareth J.F.
Min Jinming
Publication date: 28/07/2011

Users of search systems are often reluctant to explicitly build profiles to indicate their search interests. Thus automatically building user profiles is an important research area for personalized search. One difficult component of doing this is accessing a knowledge system which provides broad coverage of user search interests. In this work, we describe a method to build category-ID-based user profiles from a user's historical search data. Our approach makes significant use of Wikipedia as an external knowledge resource.

Irish Universities

DCU Online Research Access Service

Fast redshift clustering with the Baire (ultra) metric

Author: Contreras Pedro
Murtagh Fionn
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 20/04/2011

The Baire metric induces an ultrametric on a dataset and is of linear computational complexity, contrasted with the standard quadratic time agglomerative hierarchical clustering algorithm. We apply the Baire distance to spectrometric and photometric redshifts from the Sloan Digital Sky Survey using, in this work, about half a million astronomical objects. We want to know how well the (more costly to determine) spectrometric redshifts can predict the (more easily obtained) photometric redshifts, i.e. we seek to regress the spectrometric on the photometric redshifts, and we develop a clusterwise nearest neighbor regression procedure for this. Comment: 14 pages, 6 figures
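
As a rough illustration of why a Baire-style distance allows linear-time clustering, the sketch below uses one common formulation: the distance between two values is `base ** -k`, where k is the length of their longest common prefix of decimal digits. The base of 2, the truncation to four digits, and all function names are assumptions made here; the abstract itself does not specify them.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_DOWN


def to_digits(x, precision=4):
    """Truncate x in [0, 1) to `precision` decimal digits, as a digit string."""
    step = Decimal(10) ** -precision
    truncated = Decimal(str(x)).quantize(step, rounding=ROUND_DOWN)
    return str(truncated).split(".")[1]


def common_prefix_length(a, b):
    """Number of leading digits shared by digit strings a and b."""
    k = 0
    for digit_a, digit_b in zip(a, b):
        if digit_a != digit_b:
            break
        k += 1
    return k


def baire_distance(a, b, base=2.0):
    """Assumed form of the distance: base ** -(shared prefix length)."""
    return base ** -common_prefix_length(a, b)


def prefix_clusters(digit_strings, depth):
    """One pass over the data: values sharing a prefix land in one bucket."""
    clusters = defaultdict(list)
    for value in digit_strings:
        clusters[value[:depth]].append(value)
    return dict(clusters)


# Hypothetical redshift-like values in [0, 1).
values = [to_digits(v) for v in (0.1234567, 0.1239, 0.1301, 0.5234)]
clusters = prefix_clusters(values, depth=2)
```

Each value is assigned to its bucket by a single dictionary lookup, so the grouping costs one pass over the data, and buckets at increasing `depth` nest inside one another to form a hierarchy.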

arXiv.org e-Print Archive

Crossref

Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler

Author: Grefenstette Gregory
Muchemi Lawrence
Publication date: 24/05/2016

Specialized dictionaries are used to understand concepts in specific domains, especially where those concepts are not part of the general vocabulary, or have meanings that differ from ordinary language. The first step in creating a specialized dictionary involves detecting the characteristic vocabulary of the domain in question. Classical methods for detecting this vocabulary involve gathering a domain corpus, calculating statistics on the terms found there, and then comparing these statistics to a background or general language corpus. Terms which are found significantly more often in the specialized corpus than in the background corpus are candidates for the characteristic vocabulary of the domain. Here we present two tools, a directed crawler and a distributional semantics package, that can be used together, circumventing the need for a background corpus. Both tools are available on the web.
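
The classical corpus-comparison method described in this abstract (not the authors' crawler and Word2vec tools) might be sketched as follows. The smoothed log-ratio score, the `min_count` threshold, and the function name `characteristic_terms` are illustrative assumptions; real systems often use a significance test instead.

```python
import math
from collections import Counter


def characteristic_terms(domain_tokens, background_tokens, min_count=2, top_k=10):
    """Rank domain terms by how much more frequent they are than in the background.

    Score: log of the ratio of add-one-smoothed relative frequencies
    (an assumed, simple statistic chosen for illustration).
    """
    domain = Counter(domain_tokens)
    background = Counter(background_tokens)
    n_domain = sum(domain.values())
    n_background = sum(background.values())
    vocab_size = len(set(domain) | set(background))

    scored = []
    for term, count in domain.items():
        if count < min_count:
            continue  # too rare in the domain corpus to judge
        p_domain = (count + 1) / (n_domain + vocab_size)
        p_background = (background[term] + 1) / (n_background + vocab_size)
        scored.append((math.log(p_domain / p_background), term))

    # Highest score first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_k]]


# Toy corpora, invented for this example.
domain_corpus = "the stent was placed in the artery and the stent held".split()
background_corpus = "the cat sat on the mat and the dog sat in the sun".split()
candidates = characteristic_terms(domain_corpus, background_corpus)
```

In this toy run `stent` outranks `the`, since `the` is about as frequent in the background corpus as in the domain corpus.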

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Supporting polyrepresentation in a quantum-inspired geometrical retrieval framework

Author: Frommholz Ingo
Ingwersen Peter
Lalmas Mounia
Larsen Birger
Piwowarski Benjamin
Van Rijsbergen Keith
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

The relevance of a document has many facets, going beyond the usual topical one, which have to be considered to satisfy a user's information need. Multiple representations of documents, like user-given reviews or the actual document content, can give evidence towards certain facets of relevance. In this respect polyrepresentation of documents, where such evidence is combined, is a crucial concept to estimate the relevance of a document. In this paper, we discuss how a geometrical retrieval framework inspired by quantum mechanics can be extended to support polyrepresentation. We show by example how different representations of a document can be modelled in a Hilbert space, similar to physical systems known from quantum mechanics. We further illustrate how these representations are combined by means of the tensor product to support polyrepresentation, and discuss the case that representations of documents are not independent from a user's point of view. Besides giving a principled framework for polyrepresentation, the potential of this approach is to capture and formalise the complex interdependent relationships that the different representations can have between each other.

CiteSeerX

Crossref

Copenhagen University Research Information System

VBN

University of Bedfordshire Repository

Text Classification: A Sequential Reading Approach

Author: Denoyer Ludovic
Dulac-Arnold Gabriel
Gallinari Patrick
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

We propose to model the text classification process as a sequential decision process. In this process, an agent learns to classify documents into topics while reading the document sentences sequentially and learns to stop as soon as enough information has been read to decide. The proposed algorithm is based on modelling Text Classification as a Markov Decision Process and learns by using Reinforcement Learning. Experiments on four different classical mono-label corpora show that the proposed approach performs comparably to classical SVM approaches for large training sets, and better for small training sets. In addition, the model automatically adapts its reading process to the quantity of training information provided. Comment: ECIR201

arXiv.org e-Print Archive

A language model for jointly ranking relevant entities in a heterogeneous information network (Modèle de langue pour l'ordonnancement conjoint d'entités pertinentes dans un réseau d'informations hétérogènes)

Author: Bahsoun Wahiba
Ben Jabeur Lamjed
Soulier Laure
Tamine Lynda
Publication venue: HAL CCSD
Publication date: 29/05/2012

In this paper, we propose a new model, called BibRank, which aims to jointly rank the heterogeneous resources of a bibliographic network, documents and authors, according to their degree of relevance to a query. The model relies on propagating scores between entities, taking into account both the structure of the network and the topic of the query. In addition, the model introduces two indicators of topical proximity between connected entities, depending on the type of the entities linked. For relations between homogeneous entities, the indicator detects marginal citations, whereas for relations between heterogeneous entities it uses two sources of evidence: the topic of the document and the expertise of the author. Experiments conducted on the CiteSeerX bibliographic network show the effectiveness of the proposed ranking model.

Scientific Publications of the University of Toulouse II Le Mirail

A social model for Literature Access: Towards a weighted social network of authors

Author: Ben Jabeur Lamjed
Boughanem Mohand
Tamine Lynda
Publication venue: Centre de hautes études internationales d'Informatique Documentaire (C.I.D.)
Publication date: 01/01/2010

This paper presents a novel retrieval approach for literature access based on social network analysis. In fact, we investigate a social model where authors represent the main entities and relationships are extracted from co-author and citation links. Moreover, we define a weighting model for social relationships which takes into account the authors' positions in the social network and their mutual collaborations. Assigned weights express influence, knowledge transfer and shared interest between authors. Furthermore, we estimate document relevance by combining the document-query similarity and the document social importance derived from corresponding authors. To evaluate the effectiveness of our model, we conduct a series of experiments on a scientific document dataset that includes textual content and social data extracted from the academic social network CiteULike. Final results show that the proposed model improves the retrieval effectiveness and outperforms traditional and social information retrieval baselines.

CiteSeerX

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes

Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries

Author: Ananiadou S
Thompson P
Wang X
Publication date: 01/01/2012

The University of Manchester - Institutional Repository