Music recommendation: audio neighbourhoods to discover music in the long tail.
Millions of people use online music services every day, and recommender systems are essential for browsing these music collections. Users are looking for high-quality recommendations, but also want to discover tracks and artists that they do not already know, newly released tracks, and the more niche music found in the 'long tail' of online music. Tag-based recommenders are not effective in this 'long tail' because relatively few people listen to these tracks, so tagging tends to be sparse. However, similarity neighbourhoods in audio space can provide additional tag knowledge that is useful to augment sparse tagging. A new recommender exploits the combined knowledge, from audio and tagging, using a hybrid representation that extends the track's tag-based representation by adding semantic knowledge extracted from the tags of similar music tracks. A user evaluation and a larger experiment using Last.fm user data both show that the new hybrid recommender provides better quality recommendations than using only tags, together with a higher level of discovery of unknown and niche music. This approach of augmenting the representation for items that have missing information, with corresponding information from similar items in a complementary space, offers opportunities beyond content-based music recommendation.
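The idea of extending a sparse tag representation with tags borrowed from audio neighbours can be sketched as follows. This is a minimal illustration, not the recommender described in the abstract: the function name `augment_tags`, the `blend` factor, the similarity-weighted sum and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def augment_tags(tags, neighbours, k=2, blend=0.5):
    """Return hybrid tag vectors: a track's own tags plus weighted neighbour tags.

    tags:       track -> {tag: weight}
    neighbours: track -> [(other_track, audio_similarity), ...]
    k, blend:   hypothetical parameters (neighbourhood size, neighbour influence)
    """
    hybrid = {}
    for track, own in tags.items():
        vec = defaultdict(float, own)
        # Keep only the k most similar tracks in audio space.
        nearest = sorted(neighbours.get(track, []),
                         key=lambda pair: pair[1], reverse=True)[:k]
        for other, similarity in nearest:
            for tag, weight in tags.get(other, {}).items():
                # Borrowed tags are scaled by audio similarity and by blend.
                vec[tag] += blend * similarity * weight
        hybrid[track] = dict(vec)
    return hybrid


# Toy data: an untagged long-tail track with two well-tagged audio neighbours.
tags = {
    "popular_a": {"rock": 1.0, "guitar": 0.5},
    "popular_b": {"rock": 0.8, "indie": 0.6},
    "niche_c": {},
}
neighbours = {"niche_c": [("popular_a", 0.9), ("popular_b", 0.5)]}
hybrid = augment_tags(tags, neighbours)
```

In this sketch the untagged track ends up with a non-empty tag vector, so a tag-based similarity measure can retrieve it, while tracks without listed neighbours keep their original tags unchanged.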
Content-awareness and graph-based ranking for tag recommendation in folksonomies
Tag recommendation algorithms aid the social tagging process in many user-driven document indexing applications, such as social bookmarking and publication sharing websites. This thesis gives an overview of existing tag recommendation methods and proposes novel approaches that address the new document problem and the task of ranking tags. The focus is on graph-based methods such as FolkRank that apply weight spreading algorithms to a graph representation of the folksonomy. In order to suggest tags for previously untagged documents, extensions are presented that introduce content into the recommendation process as an additional information source. To address the problem of ranking tags, an in-depth analysis of graph models as well as ranking algorithms is conducted. Implicit assumptions made by the widely-used graph model of the folksonomy are highlighted and an improved model is proposed that captures the characteristics of the social tagging data more accurately. Additionally, issues in the tag rank computation of FolkRank are analysed and an adapted weight spreading approach for social tagging data is presented. Moreover, the applicability of conventional weight spreading methods to data from the social tagging domain is examined in detail. Finally, indications of implicit negative feedback in the data structure of folksonomies are analysed and novel approaches of identifying negative relationships are presented. By exploiting the three-dimensional characteristics of social tagging data the proposed metrics are based on stronger evidence and provide reliable measures of negative feedback.
Including content into the tag recommendation process leads to a significant increase in recommendation accuracy on real-world datasets. The proposed adaptations to graph models and ranking algorithms result in more accurate and computationally less expensive recommenders. Moreover, new insights into the fundamental characteristics of social tagging data are revealed and a novel data interpretation that takes negative feedback into account is proposed.
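Weight spreading over a folksonomy graph can be illustrated with a small FolkRank-style sketch. It assumes the commonly described setup (an undirected graph over users, tags and resources, a PageRank-like iteration with a preference vector, and a ranking by the difference between a personalised and a baseline run); it is not the adapted approach proposed in the thesis. The function names, the damping value, the preference boost and the toy data are assumptions.

```python
from collections import defaultdict


def build_graph(assignments):
    """Undirected weighted graph over user, tag and resource nodes."""
    graph = defaultdict(lambda: defaultdict(float))
    for user, tag, resource in assignments:
        nodes = [("u", user), ("t", tag), ("r", resource)]
        for a in nodes:
            for b in nodes:
                if a != b:
                    graph[a][b] += 1.0  # co-occurrence count as edge weight
    return graph


def spread(graph, preference, damping=0.7, iterations=50):
    """Iterate w <- damping * A w + (1 - damping) * p with normalised p."""
    nodes = list(graph)
    total = sum(preference.get(n, 0.0) for n in nodes)
    p = {n: preference.get(n, 0.0) / total for n in nodes}
    w = dict(p)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * p[n] for n in nodes}
        for n in nodes:
            out = sum(graph[n].values())
            for m, weight in graph[n].items():
                nxt[m] += damping * w[n] * weight / out
        w = nxt
    return w


def recommend_tags(graph, user, resource, top_n=3):
    """Rank tags by personalised weight minus baseline weight."""
    uniform = {node: 1.0 for node in graph}
    boosted = dict(uniform)
    for node in (("u", user), ("r", resource)):
        if node in graph:
            boosted[node] += len(graph)  # assumed size of the preference boost
    baseline = spread(graph, uniform)
    personal = spread(graph, boosted)
    scores = {node[1]: personal[node] - baseline[node]
              for node in graph if node[0] == "t"}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Toy (user, tag, resource) assignments.
assignments = [
    ("alice", "python", "doc1"), ("alice", "code", "doc1"),
    ("bob", "python", "doc2"), ("bob", "snakes", "doc2"),
    ("carol", "code", "doc3"),
]
graph = build_graph(assignments)
ranked = recommend_tags(graph, "bob", "doc2")
```

A resource that has never been tagged has no node in this graph and receives no boost, which is one way to see why the new document problem calls for an additional information source such as content.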
The impact of image descriptions on user tagging behavior: A study of the nature and functionality of crowdsourced tags
Crowdsourcing has emerged as a way to harvest social wisdom from thousands of volunteers to perform a series of tasks online. However, little research has been devoted to exploring the impact of various factors, such as the content of a resource or crowdsourcing interface design, on user tagging behavior. Although images' titles and descriptions are frequently available in image digital libraries, it is not clear whether they should be displayed to crowdworkers engaged in tagging. This paper focuses on offering insight to the curators of digital image libraries who face this dilemma by examining (i) how descriptions influence users' tagging behavior and (ii) how this relates to (a) the nature of the tags, (b) the emergent folksonomy, and (c) the findability of the images in the tagging system. We compared two different methods for collecting image tags from crowdworkers on Amazon Mechanical Turk: with and without image descriptions. Several properties of generated tags were examined from different perspectives: diversity, specificity, reusability, quality, similarity, descriptiveness, and so on. In addition, the study was carried out to examine the impact of image description on supporting users' information seeking with a tag cloud interface. The results showed that the properties of tags are affected by the crowdsourcing approach. Tags from the "with description" condition are more diverse and more specific than tags from the "without description" condition, while the latter has a higher tag reuse rate. A user study also revealed that different tag sets provided different support for search. Tags produced "with description" shortened the path to the target results, whereas tags produced "without description" increased user success in the search task.
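A comparison of tag diversity and reuse between two conditions could be computed along the following lines. The abstract does not define its metrics, so the definitions here are assumptions: diversity is taken as the number of distinct normalised tags, and the reuse rate as the share of assignments that repeat an already used tag. The function name and the toy tag lists are hypothetical.

```python
from collections import Counter


def tag_set_summary(assignments):
    """Summarise a list of tag assignments (assumed metric definitions)."""
    counts = Counter(tag.strip().lower() for tag in assignments)
    total = sum(counts.values())
    distinct = len(counts)
    reuse_rate = (total - distinct) / total if total else 0.0
    return {"total": total, "distinct": distinct, "reuse_rate": reuse_rate}


# Toy tag lists for one image under the two collection conditions.
with_description = ["Eiffel Tower", "Paris", "wrought iron", "1889"]
without_description = ["tower", "Tower", "city", "tower"]

summary_with = tag_set_summary(with_description)
summary_without = tag_set_summary(without_description)
```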
Changing Higher Education Learning with Web 2.0 and Open Education Citation, Annotation, and Thematic Coding Appendices
Appendices of citations, annotations and themes for research conducted on four websites: Delicious, Wikipedia, YouTube, and Facebook.
Tag based Bayesian latent class models for movies: economic theory reaches out to big data science
For the past 50 years, cultural economics has developed as an independent research specialism. At its core are the creative industries and the peculiar economics associated with them, central to which is a tension that arises from the notion that creative goods need to be experienced before an assessment can be made about the utility they deliver to the consumer. In this they differ from the standard private good that forms the basis of demand theory in economic textbooks, in which utility is known ex ante. Furthermore, creative goods are typically complex in composition and subject to heterogeneous and shifting consumer preferences. In response to this, models of linear optimization, rational addiction and Bayesian learning have been applied to better understand consumer decision-making, belief formation and revision. While valuable, these approaches do not lend themselves to forming verifiable hypotheses for the critical reason that they bypass an essential aspect of creative products: namely, that of novelty. In contrast, computer science, and more specifically recommender theory, embraces creative products as a study object. Being items of online transactions, users of creative products share opinions on a massive scale and in doing so generate a flow of data-driven research. Not limited by the multiple assumptions made in economic theory, data analysts deal with this type of commodity in a less constrained way, incorporating the variety of item characteristics, as well as their co-use by agents. They apply statistical techniques supporting big data, such as clustering, latent class analysis or singular value decomposition.
This thesis draws on both disciplines, comparing models, methods and data sets. Based upon movie consumption, the work contrasts bottom-up versus top-down approaches, individual versus collective data, and distance measures versus utility-based comparisons. Rooted in Bayesian latent class models, a synthesis is formed, supported by random utility theory and recommender algorithm methods. The Bayesian approach makes explicit the experience-good nature of creative goods by formulating the prior uncertainty of users towards both movie features and preferences. The latent class method thus infers the heterogeneous aspect of preferences, while its dynamic variant, the latent Markov model, gets around one of the main paradoxes in studying creative products: how to analyse taste dynamics when confronted with a good that is novel at each decision point. This study of preference pattern formation for creative goods draws on individual-level data, mainly movie-user-rating and movie-user-tag triplets collected from the MovieLens recommender system and made available as open data for research by the GroupLens research team.