481 research outputs found

    Exploring The Value Of Folksonomies For Creating Semantic Metadata

    Finding good keywords to describe resources is an ongoing problem: typically such words are selected manually from a thesaurus of terms, or they are created using automatic keyword extraction techniques. Folksonomies are an increasingly well-populated source of unstructured tags describing web resources. This paper explores the value of folksonomy tags as a potential source of keyword metadata by examining the relationship between folksonomies, community-produced annotations, and keywords extracted by machines. The experiment was carried out in two ways: subjectively, by asking two human indexers to evaluate the quality of the keywords generated by both systems; and automatically, by measuring the percentage of overlap between the folksonomy tag set and the machine-generated keyword set. The results show that the folksonomy tags agree more closely with the human-generated keywords than the automatically generated ones do. The results also show that the trained indexers preferred the semantics of folksonomy tags to those of automatically extracted keywords. These results can be taken as evidence of the strong relationship between folksonomies and the human indexer's mindset, demonstrating that folksonomies used in the del.icio.us bookmarking service are a potential source for generating semantic metadata to annotate web resources.
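    The automatic part of the evaluation described in this abstract, the percentage of overlap between two keyword sets, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `overlap_percentage`, the case-insensitive matching, and the choice of the folksonomy set as the denominator are all assumptions, since the abstract does not specify them.

```python
def overlap_percentage(folksonomy_tags, machine_keywords):
    """Return the percentage of folksonomy tags that also occur in the
    machine-generated keyword set.

    Assumptions (not stated in the abstract): matching is exact after
    lowercasing and trimming, and the folksonomy set is the denominator.
    """
    tags = {t.strip().lower() for t in folksonomy_tags if t.strip()}
    keywords = {k.strip().lower() for k in machine_keywords if k.strip()}
    if not tags:
        return 0.0  # avoid division by zero when there are no tags
    return 100.0 * len(tags & keywords) / len(tags)


# Hypothetical example data, invented for illustration only.
tags = ["Semantic", "web", "metadata", "tagging"]
keywords = ["metadata", "web", "ontology"]
print(overlap_percentage(tags, keywords))  # 2 of 4 tags match
```

    Using a different denominator (for example the union of both sets, which gives the Jaccard index) would change the reported percentage, so the choice should be stated alongside any result.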

    A lightweight web video model with content and context descriptions for integration with linked data

    The rapid increase of video data on the Web has created an urgent need for effective representation, management and retrieval of web videos. Recently, many studies have addressed the ontological representation of videos, using either domain-dependent or generic schemas such as MPEG-7, MPEG-4, and COMM. In spite of their extensive coverage and sound theoretical grounding, these schemas are yet to be widely adopted by users. The two main possible reasons are the complexities involved and a lack of tool support. We propose a lightweight video content model for content-context description and integration. The uniqueness of the model is that it tries to model the emerging social context to describe and interpret the video. Our approach is grounded in exploiting easily extractable, evolving contextual metadata and in the availability of existing data on the Web. This enables representational homogeneity and a firm basis for information integration among semantically enabled data sources. The model uses many existing schemas to describe various ontology classes and shows the scope for interlinking with the Linked Data cloud.

    Study of result presentation and interaction for aggregated search

    The World Wide Web has always attracted researchers and commercial search engine companies because of the enormous amount of information available on it. Searching the Web has become an integral part of today's world, and many people rely on it when looking for information. The amount and diversity of information available on the Web have also increased dramatically. As a result, researchers and search engine companies are making constant efforts to make this information effectively accessible to people. Not only has the amount and diversity of information available online increased, but users are now often seeking information on broader topics. Users seeking information on broad topics gather it from various information sources (e.g., image, video, news, blog). For such information requests, not only web results but also results from different document genres and multimedia content are becoming relevant. For instance, users looking for information on "Glasgow" might be interested in web results about Glasgow, a map of Glasgow, images of Glasgow, news about Glasgow, and so on. Aggregated search aims to provide access to this diverse information in a unified manner by aggregating results from different information sources on a single result page, making information gathering easier for broad topics. This thesis explores aggregated search from the users' perspective. It focuses first and foremost on understanding and describing the phenomena related to the users' search process in the context of aggregated search. The goal is to participate in building theories and in understanding constraints, as well as to provide insights into the interface design space. In building this understanding, the thesis focuses on click behavior, information need, source relevance, and the dynamics of search intents.
    The understanding comes partly from conducting user studies and partly from analyzing search engine log data. While the thematic (or topical) relevance of documents is important, this thesis argues that the "source type" (source-orientation) may also be an important dimension of the relevance space to investigate in aggregated search. Relevance is therefore multi-dimensional (topical and source-oriented) within the context of aggregated search. Results from the study suggest that source-orientation was a significant factor in an aggregated search scenario, which adds another dimension to the relevance space within that scenario. The thesis further presents an effective method that combines rule-based and machine learning techniques to identify the source-orientation behind a user query. Furthermore, analysis of log data from a search engine company, together with user study experiments, identifies several design issues that may arise in the aggregated search interface. To address these issues, suitable design guidelines that can be beneficial from the interface perspective are also suggested. To conclude, the aim of this thesis is to explore the emerging field of aggregated search from the users' perspective, since it is very important for front-end technologies. An additional goal is to provide empirical evidence for the influence of aggregated search on users' searching behavior, and to identify some of the key challenges of aggregated search. During this work several aspects of aggregated search are uncovered. Furthermore, this thesis provides a foundation for future research in aggregated search and highlights potential research directions.

    Automatic tagging and geotagging in video collections and communities

    Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition and the data set released. For each task, a reference algorithm that was used within MediaEval 2010 is presented, and comments are included on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.

    I'm In Ur Bookmarks, Stealin' Ur Tags!: Closed Communities and Their Influence On Consistent Vocabularies In User Developed Folksonomies

    Metadata technology that allows users to create and modify their own personal descriptive metadata for World Wide Web pages has also given rise to communities of web users with similar interests, registered at sites such as Delicious, who are refining their own content vocabularies. This research examined these vocabularies to determine whether trends, patterns and unspoken vocabulary policies exist amongst the users. The study extracted data from Delicious' URL history pages and analyzed the data via content analysis. The research found that vocabulary consistency exists within the community, even though the data were generated individually and independently. The analysis was based specifically on type and content descriptor identifiers for the Stargate: Atlantis fandom, which is a community of fans of the Sci Fi television show.

    Top-N Recommendation on Graphs

    Recommender systems play an increasingly important role in online applications to help users find what they need or prefer. Collaborative filtering algorithms that generate predictions by analyzing the user-item rating matrix perform poorly when the matrix is sparse. To alleviate this problem, this paper proposes a simple recommendation algorithm that fully exploits the similarity information among users and items and the intrinsic structural information of the user-item matrix. The proposed method constructs a new representation that preserves affinity and structure information in the user-item rating matrix and then performs the recommendation task. To capture proximity information about users and items, two graphs are constructed. The manifold learning idea is used to constrain the new representation to be smooth on these graphs, so as to enforce user and item proximities. Our model is formulated as a convex optimization problem, for which we need to solve only the well-known Sylvester equation. We carry out extensive empirical evaluations on six benchmark datasets to show the effectiveness of this approach. Comment: CIKM 201
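    The abstract does not state the objective function, so the following is only an illustrative sketch of how a graph-smoothness constraint can reduce to a Sylvester equation. All symbols here are assumptions introduced for illustration: $R$ is the user-item rating matrix, $X$ the new representation of the same shape, $L_u$ and $L_i$ symmetric Laplacians of the user and item graphs, and $\alpha, \beta \ge 0$ hypothetical regularization weights.

```latex
% Assumed graph-regularized objective (not taken from the paper):
\min_{X}\; \lVert X - R \rVert_F^2
  + \alpha \,\operatorname{tr}\!\left(X^{\top} L_u X\right)
  + \beta  \,\operatorname{tr}\!\left(X L_i X^{\top}\right)

% Setting the gradient to zero:
2\,(X - R) + 2\alpha\, L_u X + 2\beta\, X L_i = 0

% which rearranges into the Sylvester form A X + X B = C:
(I + \alpha L_u)\, X + X\,(\beta L_i) = R
```

    Because graph Laplacians are positive semidefinite, this assumed objective is convex and its stationary point is the global minimizer; the paper's actual formulation may differ in its terms and weights.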

    When the System Becomes Your Personal Docent: Curated Book Recommendations

    Curation is the act of selecting, organizing, and presenting content, most often guided by professional or expert knowledge. While many popular applications have attempted to emulate this process by turning users into curators, we focus on a recommendation system that can leverage multiple data sources to accomplish the curation task. We introduce QBook, a recommender that acts as a personal docent by identifying and suggesting books tailored to the various preferences of each individual user. The goal of the designed system is to address several limitations often associated with recommenders in order to provide diverse and personalized book recommendations that can foster trust and system effectiveness and improve the decision-making process. QBook considers multiple perspectives, from analyzing user reviews, user historical data, and items' metadata, to considering experts' reviews and constantly evolving users' preferences, to enhance the recommendation process as well as the quality and usability of the suggestions. QBook pairs each generated suggestion with an explanation that (i) showcases why a particular book was recommended and (ii) helps users decide which items, among the ones recommended, will best suit their individual interests. Empirical studies conducted using the Amazon/LibraryThing benchmark corpus demonstrate the correctness of the proposed methodology and QBook's ability to outperform baseline and state-of-the-art methodologies for book recommendations.