Semantic modelling of user interests based on cross-folksonomy analysis
The continued increase in Web usage, in particular participation in folksonomies, reveals a trend towards a more dynamic and interactive Web where individuals can organise and share resources. Tagging has emerged as the de-facto standard for the organisation of such resources, providing a versatile and reactive knowledge management mechanism that users find easy to use and understand. It is common nowadays for users to have multiple profiles in various folksonomies, thus distributing their tagging activities. In this paper, we present a method for the automatic consolidation of user profiles across two popular social networking sites, and subsequent semantic modelling of their interests utilising Wikipedia as a multi-domain model. We evaluate how much can be learned from such sites, and in which domains the knowledge acquired is focussed. Results show that far richer interest profiles can be generated for users when multiple tag-clouds are combine
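The consolidation step described above can be illustrated with a minimal sketch. It assumes each tag cloud is a mapping from tag to usage count and that tags are matched by a naive normalisation (lower-casing and dropping separators). The function names, the normalisation rule and the sample data are hypothetical and are not taken from the paper, and the Wikipedia-based semantic modelling step is not covered.

```python
from collections import Counter
from typing import Mapping


def normalise(tag: str) -> str:
    """Lower-case a tag and drop non-alphanumeric characters so variants compare equal."""
    return "".join(ch for ch in tag.lower() if ch.isalnum())


def merge_tag_clouds(*clouds: Mapping[str, int]) -> Counter:
    """Combine several tag clouds into one profile by summing counts per normalised tag."""
    merged = Counter()
    for cloud in clouds:
        for tag, count in cloud.items():
            merged[normalise(tag)] += count
    return merged


# Hypothetical tag clouds for one user on two sites (illustrative data only).
site_a = {"Semantic-Web": 3, "python": 5}
site_b = {"semanticweb": 2, "Photography": 4}

profile = merge_tag_clouds(site_a, site_b)
print(profile.most_common())
```

In this toy example the two spellings of "semantic web" collapse into a single entry, so the merged profile holds three tags instead of four.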
Machine Learning of User Profiles: Representational Issues
As more information becomes available electronically, tools for finding
information of interest to users become increasingly important. The goal of
the research described here is to build a system for generating comprehensible
user profiles that accurately capture user interest with minimum user
interaction. The research described here focuses on the importance of a
suitable generalization hierarchy and representation for learning profiles
which are predictively accurate and comprehensible. In our experiments we
evaluated both traditional features based on weighted term vectors as well as
subject features corresponding to categories which could be drawn from a
thesaurus. Our experiments, conducted in the context of a content-based
profiling system for on-line newspapers on the World Wide Web (the IDD News
Browser), demonstrate the importance of a generalization hierarchy and the
promise of combining natural language processing techniques with machine
learning (ML) to address an information retrieval (IR) problem. Comment: 6 page
On content-based recommendation and user privacy in social-tagging systems
Recommendation systems and content-filtering approaches based on annotations and ratings essentially rely on users expressing their preferences and interests through their actions in order to provide personalised content. This collective activity has been named social tagging, and it is one of the most popular activities in which users engage online. Although it has opened new possibilities for application interoperability on the semantic web, it also poses new privacy threats: it consists of describing online or offline resources with free-text labels (i.e. tags), and therefore exposes the user's profile and activity to privacy attacks. Users, as a result, may wish to adopt a privacy-enhancing strategy in order not to reveal their interests completely. Tag forgery is a privacy-enhancing technology that consists of generating tags for categories or resources that do not reflect the user's actual preferences. Because it modifies the user's profile, tag forgery may have a negative impact on the quality of the recommendation system, thus protecting user privacy to a certain extent but at the expense of utility. The impact of tag forgery on content-based recommendation is therefore investigated in a real-world application scenario where different forgery strategies are evaluated, and the consequent loss in utility is measured and compared. Peer Reviewed. Postprint (author’s final draft
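The trade-off between forged tags and utility can be illustrated with a small sketch. It assumes a user profile is a tag-frequency vector and uses cosine similarity between the real and the apparent (forged) profile as a stand-in for utility. Both the similarity measure and the round-robin forgery strategy are assumptions made for illustration, not the strategies or metrics evaluated in the paper.

```python
import math
from collections import Counter


def cosine(p: Counter, q: Counter) -> float:
    """Cosine similarity between two tag-frequency profiles."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def forge(profile: Counter, decoy_tags: list, n_forged: int) -> Counter:
    """Return the apparent profile after adding n_forged decoy tags in round-robin order."""
    apparent = Counter(profile)
    if not decoy_tags:
        return apparent
    for i in range(n_forged):
        apparent[decoy_tags[i % len(decoy_tags)]] += 1
    return apparent


# Hypothetical profile and decoy tags (illustrative data only).
real = Counter({"jazz": 6, "vinyl": 2})
apparent = forge(real, ["football", "cooking"], n_forged=4)

similarity = cosine(real, apparent)
print(f"similarity to real profile: {similarity:.3f}")
```

The more decoy tags are added, the further the apparent profile drifts from the real one: the user reveals less, and a content-based recommender has less accurate input to work with.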
Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings
In this paper we present a novel interactive multimodal learning system,
which facilitates search and exploration in large networks of social multimedia
users. It allows the analyst to identify and select users of interest, and to
find similar users in an interactive learning setting. Our approach is based on
novel multimodal representations of users, words and concepts, which we
simultaneously learn by deploying a general-purpose neural embedding model. We
show these representations to be useful not only for categorizing users, but
also for automatically generating user and community profiles. Inspired by
traditional summarization approaches, we create the profiles by selecting
diverse and representative content from all available modalities, i.e. the
text, image and user modality. The usefulness of the approach is evaluated
using artificial actors, which simulate user behavior in a relevance feedback
scenario. Multiple experiments were conducted in order to evaluate the quality
of our multimodal representations, to compare different embedding strategies,
and to determine the importance of different modalities. We demonstrate the
capabilities of the proposed approach on two different multimedia collections
originating from the violent online extremism forum Stormfront and the
microblogging platform Twitter, which are particularly interesting due to the
high semantic level of the discussions they feature
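Finding similar users from a set of selected users can be sketched as a nearest-neighbour search in the embedding space. The sketch assumes user embeddings are already available as fixed-length vectors and ranks the remaining users by cosine similarity to the mean of the selected ones. The function names and the toy vectors are hypothetical. The paper's embedding model and its interactive learning loop are not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_similar(embeddings, selected):
    """Rank unselected users by similarity to the centroid of the selected users."""
    dim = len(next(iter(embeddings.values())))
    query = [sum(embeddings[u][i] for u in selected) / len(selected) for i in range(dim)]
    candidates = [u for u in embeddings if u not in selected]
    return sorted(candidates, key=lambda u: cosine(embeddings[u], query), reverse=True)


# Hypothetical two-dimensional user embeddings (illustrative data only).
embeddings = {
    "alice": [0.9, 0.1],
    "bob": [0.8, 0.2],
    "carol": [0.1, 0.9],
    "dave": [0.7, 0.4],
}

ranking = rank_similar(embeddings, ["alice"])
print(ranking)  # ['bob', 'dave', 'carol']
```

In a relevance feedback setting, users the analyst marks as relevant would be added to `selected` and the ranking recomputed.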
Multi-dimensional clustering in user profiling
User profiling has attracted an enormous number of technological methods and
applications. With the increasing number of products and services, user profiling
has created opportunities to catch the attention of the user as well as to achieve
high user satisfaction. Providing users with what they want, when and how they want it,
depends largely on understanding them. The user profile is the representation of
the user and holds the information about the user. These profiles are the
outcome of user profiling.
Personalization is the adaptation of the services to meet the user’s needs and
expectations. Therefore, the knowledge about the user leads to a personalized
user experience. In user profiling applications the major challenge is to build and
handle user profiles. In the literature there are two main user profiling methods:
collaborative and content-based. Apart from these traditional profiling
methods, a number of classification and clustering algorithms have been used
to classify user-related information to create user profiles. However, the profiling
achieved through these works lacks accuracy. This is because
all information within the profile has the same influence during profiling, even
though some of it is irrelevant.
In this thesis, a primary aim is to provide an insight into the concept of user
profiling. For this purpose a comprehensive background study of the literature
was conducted and summarized in this thesis. Furthermore, existing user
profiling methods as well as the classification and clustering algorithms were investigated. Being one of the objectives of this study, the use of these
algorithms for user profiling was examined. A number of classification and
clustering algorithms, such as Bayesian Networks (BN) and Decision Trees
(DTs) have been simulated using user profiles and their classification accuracy
performances were evaluated. Additionally, a novel clustering algorithm for
user profiling, namely Multi-Dimensional Clustering (MDC), has been proposed.
The MDC is a modified version of the Instance Based Learner (IBL) algorithm.
In IBL every feature has an equal effect on the classification regardless of its
relevance. MDC differs from the IBL by assigning weights to feature values to
distinguish the effect of the features on clustering. Existing feature weighting
methods, for instance Cross Category Feature (CCF), have also been
investigated. In this thesis, three feature value weighting methods have been
proposed for the MDC. These methods are: MDC weight method by Cross
Clustering (MDC-CC), MDC weight method by Balanced Clustering (MDC-BC)
and MDC weight method by changing the Lower-limit to Zero (MDC-LZ). All of
these weighted MDC algorithms have been tested and evaluated. Additional
simulations were carried out with existing weighted and non-weighted IBL
algorithms (i.e. K-Star and Locally Weighted Learning (LWL)) in order to
demonstrate the performance of the proposed methods. Furthermore, a real-life scenario is implemented to show how the MDC can be used for user
profiling to improve personalized service provisioning in mobile environments.
The experiments presented in this thesis were conducted by using user profile
datasets that reflect the user’s personal information, preferences and interests.
The simulations with existing classification and clustering algorithms (e.g. Bayesian Networks (BN), Naïve Bayesian (NB), Lazy learning of Bayesian
Rules (LBR), Iterative Dichotomiser 3 (ID3)) were performed on the WEKA
(version 3.5.7) machine learning platform. WEKA serves as a workbench to
work with a collection of popular learning schemes implemented in Java. In
addition, the MDC-CC, MDC-BC and MDC-LZ have been implemented as a
Java application in NetBeans IDE 6.1 Beta and in MATLAB. Finally, the real-life
scenario is implemented as a Java Mobile Application (Java ME) on NetBeans
IDE 7.1. All simulation results were evaluated based on the error rate and
accuracy
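The difference between equal and weighted feature influence in instance-based learning can be sketched with a one-nearest-neighbour classifier over categorical features. This sketch assigns one weight per feature for simplicity, whereas the thesis describes weights on feature values. The weights, feature names and data below are hypothetical, and the code does not implement MDC-CC, MDC-BC or MDC-LZ.

```python
def weighted_distance(a, b, weights):
    """Sum the weights of the features on which two instances disagree."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def classify(instance, training, weights):
    """Return the label of the nearest training instance (1-NN)."""
    _, label = min(training, key=lambda item: weighted_distance(instance, item[0], weights))
    return label


# Hypothetical user profiles: (occupation, favourite content, device) -> segment.
training = [
    (("student", "music", "android"), "A"),
    (("retired", "news", "ios"), "B"),
]
query = ("student", "news", "ios")

# Equal weights: every feature counts the same, as in plain IBL.
equal = classify(query, training, [1, 1, 1])

# Assumed weights that treat occupation as far more relevant than the rest.
weighted = classify(query, training, [5, 1, 1])

print(equal, weighted)  # B A
```

With equal weights the query matches the second profile on two of three features and is labelled B. Once occupation dominates the distance, the same query is labelled A.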