68 research outputs found

DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging

Author: Chen Sheng
Mehdad Yashar
Pappu Aasish
Soni Akshay
Publication date: 01/01/2017

In this work, document tagging refers to tagging news articles or blog posts with relevant tags from a predefined collection. Accurate tagging of articles can benefit several downstream applications such as recommendation and search. We propose a novel yet simple approach called DocTag2Vec to accomplish this task. We substantially extend Word2Vec and Doc2Vec, two popular models for learning distributed representations of words and documents. In DocTag2Vec, we simultaneously learn the representations of words, documents, and tags in a joint vector space during training, and employ a simple k-nearest neighbor search to predict tags for unseen documents. In contrast to previous multi-label learning methods, DocTag2Vec deals directly with raw text instead of a provided feature vector and, in addition, enjoys advantages such as learning tag representations and the ability to handle newly created tags. To demonstrate the effectiveness of our approach, we conduct experiments on several datasets and show promising results against state-of-the-art methods.

Comment: 10 pages
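The prediction step described in this abstract, a k-nearest neighbor search over tags in a joint vector space, can be illustrated with a minimal sketch. This is not the authors' implementation: the vectors, the tag names, the use of cosine similarity, and the function name `predict_tags` are all assumptions made for illustration, and the embedding training itself is not shown.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict_tags(doc_vec, tag_vecs, k=2):
    """Return the k tags whose vectors lie closest to the document vector.

    Hypothetical helper: assumes documents and tags already share one space.
    """
    ranked = sorted(tag_vecs, key=lambda t: cosine(doc_vec, tag_vecs[t]), reverse=True)
    return ranked[:k]

# Toy 2-dimensional embeddings (made up for this example).
tag_vecs = {
    "sports": [1.0, 0.0],
    "football": [0.8, 0.6],
    "politics": [0.0, 1.0],
}
doc_vec = [1.0, 0.2]  # embedding of an unseen document

print(predict_tags(doc_vec, tag_vecs, k=2))
```

Because tags are looked up by proximity rather than by a fixed classifier output layer, a newly created tag only needs a vector in the same space to become predictable, which matches the advantage the abstract mentions.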

arXiv.org e-Print Archive

Crossref

Recommended from our members

Improving tag recommendation using social networks

Author: Rae Adam
Sigurbjörnsson Börkur
van Zwol Roelof
Publication date: 01/04/2010

In this paper we address the task of recommending additional tags to partially annotated media objects, in our case images. We propose an extendable framework that can recommend tags using a combination of different personalised and collective contexts. We combine information from four contexts: (1) all the photos in the system, (2) a user's own photos, (3) the photos of a user's social contacts, and (4) the photos posted in the groups of which a user is a member. Variants of methods (1) and (2) have been proposed in previous work, but the use of (3) and (4) is novel. For each of the contexts we use the same probabilistic model, and a Borda Count based aggregation approach merges the recommendations from the different contexts into a unified ranking of recommended tags. We evaluate our system using a large set of real-world data from Flickr. We show that by using personalised contexts we can significantly improve tag recommendation compared to using collective knowledge alone. We also analyse our experimental results to explore the capabilities of our system with respect to a user's social behaviour.
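The Borda Count aggregation mentioned in this abstract can be sketched as follows. This is a generic Borda count under stated assumptions, not the paper's exact scoring: each context contributes a ranked tag list, a tag at position `pos` in a list of length `n` earns `n - pos` points, and ties are broken alphabetically. The context names and tags are invented for the example.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Merge several ranked tag lists into one ranking by summed Borda points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, tag in enumerate(ranking):
            scores[tag] += n - pos  # top of a list earns the most points
    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(scores, key=lambda t: (-scores[t], t))

# Hypothetical per-context recommendations for one photo.
contexts = {
    "collective": ["beach", "sea", "sun"],
    "personal": ["sea", "holiday", "beach"],
    "contacts": ["sea", "beach", "party"],
    "groups": ["beach", "sea", "sunset"],
}

unified = borda_aggregate(contexts.values())
print(unified)
```

A rank-based scheme like this needs only the order of tags within each context, so the per-context scores do not have to be calibrated against each other.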

Open Research Online (The Open University)

Automatic tagging and geotagging in video collections and communities

Author: Jones Gareth J.F.
Larson Martha
Serdyukov Pavel
Soleymani Mohammad
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2011

Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition, and the data set released. For each task, a reference algorithm is presented that was used within MediaEval 2010, and comments are included on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.

Irish Universities

DCU Online Research Access Service

Automated Text Abstraction from Documents and Webpages Metadata using Probabilistic Clustering Algorithms

Author: Silja Joy, Asst Prof. Nisha J. R.
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/06/2016

Annotations are comments, notes, explanations, tags, or other types of external remarks. An annotation can be added to a whole text document, to a few portions of a document, or to a webpage. Annotation supports effective information retrieval. Webpage metadata is data related to a website; it is machine-understandable information about web resources or other tags. Collaborative annotation relies on user-created tags to annotate new objects. These tags are user-created labels for entities and allow users to organize and index the contents. Tagging is the act of adding keywords to objects. A significant amount of work is involved in coming up with tags for text documents or other resources such as webpages, images, and videos. The proposed Automated Annotation System (AAS) uses algorithms such as K-Means and Distributed Hash Table (DHT) to automatically create attributes or annotations from documents or from webpage metadata. The proposed annotation technique processes metadata and/or text to come up with annotations efficiently, rather than relying on manually understanding the metadata or analyzing the text.

International Journal on Recent and Innovation Trends in Computing and Communication

Tag-Aware Recommender Systems: A State-of-the-art Survey

Author: Tao Zhou
Yi-Cheng Zhang
Zi-Ke Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/02/2012

In the past decade, Social Tagging Systems have attracted increasing attention from both the physical and computer science communities. Besides studying the underlying structure and dynamics of tagging systems, many efforts have been devoted to unifying tagging information in order to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress on tag-aware recommender systems, emphasizing the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and topic-based methods. Finally, we outline some other tag-related works and future challenges of tag-aware recommendation algorithms.

Comment: 19 pages, 3 figures

arXiv.org e-Print Archive

Crossref

RERO DOC Digital Library

Adaptive Technique for Document Annotation to Identify Attributes of Interest

Author: Nupoor Gade, Prof. Rugraj
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/10/2016

Many application domains generate and share information that describes their products and services. Such descriptions contain unstructured information, so it is difficult to find the useful metadata. Information extraction algorithms are very expensive or inaccurate when operating on such unstructured information. This paper proposes an adaptive technique for the document annotation process to retrieve the useful information. The approach is based on the Collaborative Adaptive Data Sharing (CADS) platform for document annotation. CADS uses the query workload to direct the annotation process. A key feature of CADS is that it identifies the important data attributes of the application. It then uses this information to direct data insertion and querying.

International Journal on Recent and Innovation Trends in Computing and Communication

Propagating fine-grained topic labels in news snippets

Author: Eugénio Oliveira
Jorge Teixeira
Luís Sarmento
Sérgio Nunes
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

We propose an unsupervised method for propagating automatically extracted fine-grained topic labels among news items to improve their topic description for a subsequent text classification procedure. This method compares vector representations of news items and assigns to each news item the label of its closest neighbour with a different topic label. The results show that high precision can be achieved in propagating the top-ranked topic label, and that 2-gram and 3-gram feature representations optimize the precision.
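The propagation rule described in this abstract, where each news item receives the label of its closest neighbour that carries a different topic label, can be sketched as below. This is a minimal illustration, not the authors' system: the toy vectors stand in for the n-gram feature representations, cosine similarity is an assumed choice of closeness measure, and `propagate_labels` is a hypothetical name.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def propagate_labels(items):
    """For each (vector, label) item, find the label of the most similar
    item whose label differs. Returns (own_label, propagated_label) pairs;
    propagated_label is None when no differently labelled item exists."""
    result = []
    for i, (vec, label) in enumerate(items):
        best_label, best_sim = None, -1.0
        for j, (other_vec, other_label) in enumerate(items):
            if i == j or other_label == label:
                continue  # skip the item itself and same-label neighbours
            sim = cosine(vec, other_vec)
            if sim > best_sim:
                best_label, best_sim = other_label, sim
        result.append((label, best_label))
    return result

# Toy feature vectors and topic labels (made up for this example).
news = [
    ([1.0, 0.0], "election"),
    ([0.9, 0.1], "parliament"),
    ([0.0, 1.0], "football"),
]

print(propagate_labels(news))
```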

Crossref

Repositório Aberto da Universidade do Porto