69 research outputs found

    Tag disambiguation based on social network information

    Within 20 years the Web has grown from a tool for scientists at CERN into a global information space. While returning to its roots as a read/write tool, it is entering a more social and participatory phase. The result is a new, improved version called the Social Web, where users are responsible for generating and sharing content on the global information space and are also accountable for replicating the information. This collaborative activity can be observed in two of the most widely practised Social Web services: social network sites and social tagging systems. Users annotate their interests and inclinations with free-form keywords while they share them with their social connections. Although these keywords (tags) assist information organization and retrieval, they suffer from polysemy. In this study we exploit social network sites to address the issue of ambiguity in social tagging. Moreover, we propose that homophily in social network sites can be a useful aspect in disambiguating tags. We extracted the ‘Likes’ of 20 Facebook users and employed them in disambiguating tags on Flickr. Classifiers are generated on the retrieved clusters from Flickr using the K-Nearest-Neighbour algorithm, and their degree of similarity with user keywords is then calculated. As tag disambiguation techniques lack gold standards for evaluation, we asked the users to indicate the contexts and used them as ground truth while examining the results. We analyse the performance of our approach by quantitative methods and report successful results. Our proposed method is able to classify images with an accuracy of 6 out of 10 (on average). Qualitative analysis reveals some factors that affect the findings and that, if addressed, can produce more precise results.
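
The matching step described in this abstract (comparing the keywords of each Flickr cluster with a user's own keywords) can be illustrated with a minimal sketch. This is not the authors' K-Nearest-Neighbour pipeline: it swaps in plain Jaccard overlap as the similarity measure, and the function names, cluster labels and keywords are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disambiguate(user_keywords, clusters):
    """Return the name of the cluster whose keywords best match the user's.

    clusters maps a hypothetical sense label to the tags that co-occur
    with the ambiguous tag in that cluster.
    """
    return max(clusters, key=lambda name: jaccard(user_keywords, clusters[name]))


# Hypothetical clusters for the ambiguous tag "jaguar".
clusters = {
    "animal": ["zoo", "cat", "wildlife", "jungle"],
    "car": ["car", "auto", "racing", "classic"],
}
user_likes = ["racing", "formula1", "car"]
sense = disambiguate(user_likes, clusters)
```

Here `sense` is the label of the best-overlapping cluster; a fuller version would replace Jaccard overlap with the classifier-based similarity the abstract mentions.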

    Utilising semantic technologies for intelligent indexing and retrieval of digital images

    The proliferation of digital media has led to a huge interest in classifying and indexing media objects for generic search and usage. In particular, we are witnessing colossal growth in digital image repositories that are difficult to navigate using free-text search mechanisms, which often return inaccurate matches because they rely in principle on statistical analysis of query keyword recurrence in the image annotation or surrounding text. In this paper we present a semantically-enabled image annotation and retrieval engine that is designed to satisfy the requirements of the commercial image collections market in terms of both accuracy and efficiency of the retrieval process. Our search engine relies on methodically structured ontologies for image annotation, thus allowing for more intelligent reasoning about the image content and subsequently obtaining a more accurate set of results and a richer set of alternatives matching the original query. We also show how our well-analysed and designed domain ontology contributes to the implicit expansion of user queries as well as the exploitation of lexical databases for explicit semantic-based query expansion.

    Discovering the Impact of Knowledge in Recommender Systems: A Comparative Study

    Recommender systems use user profiles and appropriate filtering techniques to help users find more relevant information within large volumes of information. User profiles play an important role in the success of the recommendation process since they model and represent the actual user needs. However, a comprehensive literature review of recommender systems has revealed no concrete study on the role and impact of knowledge in user profiling and filtering approaches. In this paper, we review the most prominent recommender systems in the literature and examine the impact of knowledge extracted from different sources. We find that semantic information from the user context has a substantial impact on the performance of knowledge-based recommender systems. Finally, we propose some new directions for improving knowledge-based profiles. Comment: 14 pages, 3 tables; International Journal of Computer Science & Engineering Survey (IJCSES) Vol.2, No.3, August 201

    A concept–relationship acquisition and inference approach for hierarchical taxonomy construction from tags

    Author names used in this publication: W. M. Wang; C. F. Cheung; Adela S. M. Lau. 2009-2010 > Academic research: refereed > Publication in refereed journal. Accepted Manuscript. Publishe

    Ontology learning from folksonomies.

    Chen, Wenhao. Thesis (M.Phil.)--Chinese University of Hong Kong, 2010. Includes bibliographical references (p. 63-70). Abstracts in English and Chinese.
    Chapter 1: Introduction
        1.1 Ontologies and Folksonomies
        1.2 Motivation
            1.2.1 Semantics in Folksonomies
            1.2.2 Ontologies with basic level concepts
            1.2.3 Context and Context Effect
        1.3 Contributions
        1.4 Structure of the Thesis
    Chapter 2: Background Study
        2.1 Semantic Web
        2.2 Ontology
        2.3 Folksonomy
        2.4 Cognitive Psychology
            2.4.1 Category (Concept)
            2.4.2 Basic Level Categories (Concepts)
            2.4.3 Context and Context Effect
        2.5 F1 Evaluation Metric
        2.6 State of the Art
            2.6.1 Ontology Learning
            2.6.2 Semantics in Folksonomy
    Chapter 3: Ontology Learning from Folksonomies
        3.1 Generating Ontologies with Basic Level Concepts from Folksonomies
            3.1.1 Modeling Instances and Concepts in Folksonomies
            3.1.2 The Metric of Basic Level Categories (Concepts)
            3.1.3 Basic Level Concepts Detection Algorithm
            3.1.4 Ontology Generation Algorithm
        3.2 Evaluation
            3.2.1 Data Set and Experiment Setup
            3.2.2 Quantitative Analysis
            3.2.3 Qualitative Analysis
    Chapter 4: Context Effect on Ontology Learning from Folksonomies
        4.1 Context-aware Basic Level Concepts Detection
            4.1.1 Modeling Context in Folksonomies
            4.1.2 Context Effect on Category Utility
            4.1.3 Context-aware Basic Level Concepts Detection Algorithm
        4.2 Evaluation
            4.2.1 Data Set and Experiment Setup
            4.2.2 Result Analysis
    Chapter 5: Potential Applications
        5.1 Categorization of Web Resources
        5.2 Applications of Ontologies
    Chapter 6: Conclusion and Future Work
        6.1 Conclusion
        6.2 Future Work
    Bibliography

    Multi-dimensional clustering in user profiling

    User profiling has attracted an enormous number of technological methods and applications. With the increasing number of products and services, user profiling has created opportunities to catch the attention of the user as well as to achieve high user satisfaction. Providing users with what they want, when and how they want it, depends largely on understanding them. The user profile is the representation of the user and holds the information about the user. These profiles are the outcome of user profiling. Personalization is the adaptation of services to meet the user’s needs and expectations. Therefore, knowledge about the user leads to a personalized user experience. In user profiling applications the major challenge is to build and handle user profiles. In the literature there are two main user profiling methods: collaborative and content-based. Apart from these traditional profiling methods, a number of classification and clustering algorithms have been used to classify user-related information to create user profiles. However, the profiling achieved through these works is lacking in terms of accuracy. This is because all information within the profile has the same influence during profiling, even though some of it is irrelevant user information. In this thesis, a primary aim is to provide an insight into the concept of user profiling. For this purpose a comprehensive background study of the literature was conducted and is summarized in this thesis. Furthermore, existing user profiling methods as well as the classification and clustering algorithms were investigated. As one of the objectives of this study, the use of these algorithms for user profiling was examined. A number of classification and clustering algorithms, such as Bayesian Networks (BN) and Decision Trees (DTs), have been simulated using user profiles, and their classification accuracy was evaluated.
Additionally, a novel clustering algorithm for user profiling, namely Multi-Dimensional Clustering (MDC), has been proposed. The MDC is a modified version of the Instance Based Learner (IBL) algorithm. In IBL every feature has an equal effect on the classification regardless of its relevance. MDC differs from IBL by assigning weights to feature values to distinguish the effect of the features on clustering. Existing feature weighting methods, for instance Cross Category Feature (CCF), have also been investigated. In this thesis, three feature value weighting methods have been proposed for the MDC. These methods are: MDC weight method by Cross Clustering (MDC-CC), MDC weight method by Balanced Clustering (MDC-BC) and MDC weight method by changing the Lower-limit to Zero (MDC-LZ). All of these weighted MDC algorithms have been tested and evaluated. Additional simulations were carried out with existing weighted and non-weighted IBL algorithms (i.e. K-Star and Locally Weighted Learning (LWL)) in order to demonstrate the performance of the proposed methods. Furthermore, a real-life scenario is implemented to show how the MDC can be used for user profiling to improve personalized service provisioning in mobile environments. The experiments presented in this thesis were conducted using user profile datasets that reflect the user’s personal information, preferences and interests. The simulations with existing classification and clustering algorithms (e.g. Bayesian Networks (BN), Naïve Bayesian (NB), Lazy learning of Bayesian Rules (LBR), Iterative Dichotomiser 3 (ID3)) were performed on the WEKA (version 3.5.7) machine learning platform. WEKA serves as a workbench for working with a collection of popular learning schemes implemented in Java. In addition, the MDC-CC, MDC-BC and MDC-LZ have been implemented on NetBeans IDE 6.1 Beta as a Java application and in MATLAB.
Finally, the real-life scenario is implemented as a Java Mobile Application (Java ME) on NetBeans IDE 7.1. All simulation results were evaluated based on the error rate and accuracy.
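
The contrast this abstract draws between an unweighted instance-based learner and a weighted one can be made concrete with a small sketch. It is only an illustration under stated assumptions: it uses one weight per feature rather than the per-feature-value weights of MDC, it does not reproduce MDC-CC, MDC-BC or MDC-LZ, and the profile attributes, labels and weights are hypothetical.

```python
from collections import Counter


def weighted_mismatch(a, b, weights):
    """Sum the weights of the features on which two categorical profiles differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def classify(query, instances, labels, weights, k=1):
    """Nearest-neighbour vote using the weighted mismatch distance."""
    ranked = sorted(
        range(len(instances)),
        key=lambda i: weighted_mismatch(query, instances[i], weights),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical profiles: (interest, occupation, device).
instances = [("music", "student", "ios"), ("sports", "engineer", "android")]
labels = ["A", "B"]
query = ("music", "engineer", "android")

equal = classify(query, instances, labels, weights=[1, 1, 1])     # every feature counts the same
weighted = classify(query, instances, labels, weights=[3, 1, 1])  # interest dominates
```

With equal weights the query is closest to the second profile (one mismatch against two), so `equal` is "B". Once the interest feature is weighted more heavily, the first profile becomes the nearest neighbour and `weighted` is "A".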

    Analysis of user profile in social networks

    Master's dissertation in Informatics Engineering. This work aims to create and identify user profiles through users' actions on social networks. The identification determines, in a specific way, which profile each user has by linking the following dimensions and their sets of variables: sociodemographic characteristics (gender, age, education, economic activity status indicator and occupational class); the specific type of aggregate practices conducted over the internet (study, work, services, search for information, communication and entertainment); the context of use of social networks (home, school, workplace or other); frequency of use (daily, frequent or sporadic); and, finally, the range of motivations of users of social networks (professional, informational, recreational or other). After a careful analysis of these dimensions, we are able to separate the different types of users solely by analysing the sets of variables that are associated with each other. This analysis also deepens knowledge about the various uses of social networks, and may be useful to the market in that it provides substantive information concerning the forms of articulation between the social characteristics of users and their activities, schemes and contexts of use.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Learning to select for information retrieval

    The effective ranking of documents in search engines is based on various document features, such as the frequency of the query terms in each document, the length, or the authoritativeness of each document. In order to obtain a better retrieval performance, instead of using a single or a few features, there is a growing trend to create a ranking function by applying a learning to rank technique on a large set of features. Learning to rank techniques aim to generate an effective document ranking function by combining a large number of document features. Different ranking functions can be generated by using different learning to rank techniques or on different document feature sets. While the generated ranking function may be uniformly applied to all queries, several studies have shown that different ranking functions favour different queries, and that the retrieval performance can be significantly enhanced if an appropriate ranking function is selected for each individual query. This thesis proposes Learning to Select (LTS), a novel framework that selectively applies an appropriate ranking function on a per-query basis, regardless of the given query's type and the number of candidate ranking functions. In the learning to select framework, the effectiveness of a ranking function for an unseen query is estimated from the available neighbouring training queries. The proposed framework employs a classification technique (e.g. k-nearest neighbour) to identify neighbouring training queries for an unseen query by using a query feature. In particular, a divergence measure (e.g. Jensen-Shannon), which determines the extent to which a document ranking function alters the scores of an initial ranking of documents for a given query, is proposed for use as a query feature. The ranking function which performs the best on the identified training query set is then chosen for the unseen query. 
The proposed framework is thoroughly evaluated on two different TREC retrieval tasks (namely, Web search and ad hoc search tasks) and on two large standard LETOR feature sets, which contain as many as 64 document features, deriving conclusions concerning the key components of LTS, namely the query feature and the identification of neighbouring queries. Two different types of experiments are conducted. The first one is to select an appropriate ranking function from a number of candidate ranking functions. The second one is to select multiple appropriate document features from a number of candidate document features, for building a ranking function. Experimental results show that our proposed LTS framework is effective in both selecting an appropriate ranking function and selecting multiple appropriate document features, on a per-query basis. In addition, the retrieval performance is further enhanced when increasing the number of candidates, suggesting the robustness of the learning to select framework. This thesis also demonstrates how the LTS framework can be deployed to other search applications. These applications include the selective integration of a query independent feature into a document weighting scheme (e.g. BM25), the selective estimation of the relative importance of different query aspects in a search diversification task (the goal of the task is to retrieve a ranked list of documents that provides a maximum coverage for a given query, while avoiding excessive redundancy), and the selective application of an appropriate resource for expanding and enriching a given query for document search within an enterprise. The effectiveness of the LTS framework is observed across these search applications, and on different collections, including a large scale Web collection that contains over 50 million documents. This suggests the generality of the proposed learning to select framework. 
The main contributions of this thesis are the introduction of the LTS framework and the proposed use of divergence measures as query features for identifying similar queries. In addition, this thesis draws insights from a large set of experiments, involving four different standard collections, four different search tasks and large document feature sets. This illustrates the effectiveness, robustness and generality of the LTS framework in tackling various retrieval applications
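
The selection procedure summarised in this abstract can be sketched roughly as follows. This is a simplified reading rather than the thesis implementation: it assumes non-negative document scores, computes one Jensen-Shannon feature per candidate ranking function, finds neighbours separately for each candidate, and estimates effectiveness as the mean over the k nearest training queries. The names `bm25` and `ltr` and all numbers are hypothetical.

```python
import math


def normalise(scores):
    """Turn non-negative scores with a positive sum into a probability distribution."""
    total = sum(scores)
    return [s / total for s in scores]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(initial_scores, reranked_scores):
    """Jensen-Shannon divergence between two score lists over the same documents."""
    p, q = normalise(initial_scores), normalise(reranked_scores)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def select_ranking_function(query_features, training, k=2):
    """Pick the candidate with the best mean effectiveness on its k nearest training queries.

    query_features: {function name: divergence feature of the unseen query}
    training: {function name: [(divergence feature, effectiveness), ...]}
    """
    best_name, best_estimate = None, -math.inf
    for name, feature in query_features.items():
        neighbours = sorted(training[name], key=lambda fe: abs(fe[0] - feature))[:k]
        estimate = sum(eff for _, eff in neighbours) / len(neighbours)
        if estimate > best_estimate:
            best_name, best_estimate = name, estimate
    return best_name


# Hypothetical training data: (feature, effectiveness) per training query.
training = {
    "bm25": [(0.1, 0.30), (0.2, 0.35), (0.8, 0.10)],
    "ltr": [(0.1, 0.20), (0.2, 0.25), (0.9, 0.55)],
}
choice_a = select_ranking_function({"bm25": 0.75, "ltr": 0.8}, training)
choice_b = select_ranking_function({"bm25": 0.15, "ltr": 0.1}, training)
```

In this toy data the first unseen query resembles training queries on which the hypothetical `ltr` function did well, so `choice_a` is "ltr", while the second resembles queries that favoured `bm25`, so `choice_b` is "bm25".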