46,262 research outputs found

    Adding Context to Social Tagging Systems

    Get PDF
    Many of the features of Web 2.0 encourage users to actively interact with each other. Social tagging systems represent one of the good examples that reflect this trend on the Web. The primary purpose of social tagging systems is to facilitate shared access to resources. Our focus in this paper is on the attempts to overcome some of the limitations in social tagging systems such as the flat structure of folksonomies and the absence of semantics in terms of information retrieval. We propose and develop an integrated approach, social tagging systems with directory facility, which can overcome the limitations of both traditional taxonomies and folksonomies. Our preliminary experiments indicate that this approach is promising and that the context provided by the directory facility improves the precision of information retrieval. As well, our synonym detection algorithm is capable of finding synonyms in social tagging systems without any external inputs

    POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset

    Full text link
    Many services that perform information retrieval for Points of Interest (POI) utilize a Lucene-based setup with spatial filtering. While this type of system is easy to implement it does not make use of semantics but relies on direct word matches between a query and reviews leading to a loss in both precision and recall. To study the challenging task of semantically enriching POIs from unstructured data in order to support open-domain search and question answering (QA), we introduce a new dataset POIReviewQA. It consists of 20k questions (e.g."is this restaurant dog friendly?") for 1022 Yelp business types. For each question we sampled 10 reviews, and annotated each sentence in the reviews whether it answers the question and what the corresponding answer is. To test a system's ability to understand the text we adopt an information retrieval evaluation by ranking all the review sentences for a question based on the likelihood that they answer this question. We build a Lucene-based baseline model, which achieves 77.0% AUC and 48.8% MAP. A sentence embedding-based model achieves 79.2% AUC and 41.8% MAP, indicating that the dataset presents a challenging problem for future research by the GIR community. The result technology can help exploit the thematic content of web documents and social media for characterisation of locations

    Semantics-driven event clustering in Twitter feeds

    Get PDF
    Detecting events using social media such as Twitter has many useful applications in real-life situations. Many algorithms which all use different information sources - either textual, temporal, geographic or community features - have been developed to achieve this task. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic information can also be used to drive the actual event detection, which is less covered by academic research. We therefore supplemented an existing baseline event clustering algorithm with semantic information about the tweets in order to improve its performance. This paper lays out the details of the semantics-driven event clustering algorithms developed, discusses a novel method to aid in the creation of a ground truth for event detection purposes, and analyses how well the algorithms improve over baseline. We find that assigning semantic information to every individual tweet results in just a worse performance in F1 measure compared to baseline. If however semantics are assigned on a coarser, hashtag level the improvement over baseline is substantial and significant in both precision and recall

    Metadata Augmentation for Semantic- and Context- Based Retrieval of Digital Cultural Objects

    Get PDF
    Cultural objects are increasingly stored and generated in digital form, yet effective methods for their indexing and retrieval still remain an open area of research. The main problem arises from the disconnection between the content-based indexing approach used by computer scientists and the description-based approach used by information scientists. There is also a lack of representational schemes that allow the alignment of the semantics and context with keywords and low-level features that can be automatically extracted from the content of these cultural objects. This paper presents an integrated approach to address these problems, taking advantage of both computer science and information science approaches. The focus is on the rationale and conceptual design of the system and its various components. In particular, we discuss techniques for augmenting commonly used metadata with visual features and domain knowledge to generate high-level abstract metadata which in turn can be used for semantic and context-based indexing and retrieval. We use a sample collection of Vietnamese traditional woodcuts to demonstrate the usefulness of this approach

    Developing a Formal Model for Mind Maps

    Get PDF
    Mind map is a graphical technique, which is used to represent words, concepts, tasks or other connected items or arranged around central topic or idea. Mind maps are widely used, therefore exist plenty of software programs to create or edit them, while there is none format for the model representation, neither a standard format. This paper presents and effort to propose a formal mind map model aiming to describe the structure, content, semantics and social connections. The structure describes the basic mind map graph consisted of a node set, an edge set, a cloud set and a graphical connections set. The content includes the set of the texts and objects linked to the nodes. The social connections are the mind maps of other users, which form the neighborhood of the mind map owner in a social networking system. Finally, the mind map semantics is any true logic connection between mind map textual parts and a concept. Each of these elements of the model is formally described building the suggested mind map model. Its establishment will support the application of algorithms and methods towards their information extraction
    • …
    corecore