20,960 research outputs found

    KERT: Automatic Extraction and Ranking of Topical Keyphrases from Content-Representative Document Titles

    We introduce KERT (Keyphrase Extraction and Ranking by Topic), a framework for topical keyphrase generation and ranking. By shifting from the unigram-centric traditional methods of unsupervised keyphrase extraction to a phrase-centric approach, we are able to directly compare and rank phrases of different lengths. We construct a topical keyphrase ranking function which implements the four criteria that represent high quality topical keyphrases (coverage, purity, phraseness, and completeness). The effectiveness of our approach is demonstrated on two collections of content-representative titles in the domains of Computer Science and Physics. Comment: 9 pages.

    Information-theoretic measures of music listening behaviour

    We present an information-theoretic approach to the measurement of users’ music listening behaviour and selection of music features. Existing ethnographic studies of music use have guided the design of music retrieval systems but are typically qualitative and exploratory in nature. We introduce the SPUD dataset, comprising 10,000 hand-made playlists, with user and audio stream metadata. With this, we illustrate the use of entropy for analysing music listening behaviour, e.g. identifying when a user changed music retrieval system. We then develop an approach to identifying music features that reflect users’ criteria for playlist curation, rejecting features that are independent of user behaviour. The dataset and the code used to produce it are made available. The techniques described support a quantitative yet user-centred approach to the evaluation of music features and retrieval systems, without assuming objective ground truth labels.
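As a rough illustration of the entropy idea in this abstract, the sketch below computes Shannon entropy over consecutive windows of a listening log. It is not the authors' code: the function names, the non-overlapping window scheme, and the toy `plays` list are all assumptions made for the example. A sharp change in entropy between windows is the kind of signal that might indicate a change in listening behaviour.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    # Sum p * log2(1/p) over every distinct value observed.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def windowed_entropy(items, window):
    """Entropy of each consecutive, non-overlapping window of `window` items."""
    return [
        shannon_entropy(items[i:i + window])
        for i in range(0, len(items) - window + 1, window)
    ]


# Hypothetical log of artist labels: one artist on repeat, then four distinct artists.
plays = ["A", "A", "A", "A", "A", "B", "C", "D"]
print(windowed_entropy(plays, 4))  # [0.0, 2.0]
```

The first window contains a single artist, so its entropy is 0 bits; the second contains four equally frequent artists, giving 2 bits.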

    Enhancing multi-source content delivery in content-centric networks with fountain coding

    Fountain coding has been considered especially suitable for lossy environments, such as wireless networks, as it provides redundancy while reducing coordination overheads between sender(s) and receiver(s). As such, it presents beneficial properties for multi-source and/or multicast communication. In this paper we investigate improving the efficiency of multi-source content delivery in the context of Content-Centric Networking (CCN) through the use of fountain codes. In particular, we examine whether the combination of fountain coding with the in-network caching capabilities of CCN can further improve performance. We also present an enhancement of CCN's Interest forwarding mechanism that aims at minimizing duplicate transmissions that may occur in a multi-source transmission scenario, where all available content providers and caches with matching (cached) content transmit data packets simultaneously. Our simulations indicate that the use of fountain coding in CCN is a valid approach that further increases network performance compared to traditional schemes.

    A Survey of Location Prediction on Twitter

    Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on a daily basis. Due to the world-wide coverage of its users and real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts are spent on dealing with new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim at offering an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions. Comment: Accepted to TKDE. 30 pages, 1 figure.

    Current Challenges and Visions in Music Recommender Systems Research

    Music recommender systems (MRS) have experienced a boom in recent years, thanks to the emergence and success of online streaming services, which nowadays make available almost all music in the world at the user's fingertips. While today's MRS considerably help users to find interesting music in these huge catalogs, MRS research is still facing substantial challenges. In particular, when it comes to building, incorporating, and evaluating recommendation strategies that integrate information beyond simple user-item interactions or content-based descriptors and dig deep into the very essence of listener needs, preferences, and intentions, MRS research becomes a big endeavor and related publications are quite sparse. The purpose of this trends and survey article is twofold. We first identify and shed light on what we believe are the most pressing challenges MRS research is facing, from both academic and industry perspectives. We review the state of the art towards solving these challenges and discuss its limitations. Second, we detail possible future directions and visions we contemplate for the further evolution of the field. The article should therefore serve two purposes: giving the interested reader an overview of current challenges in MRS research and providing guidance for young researchers by identifying interesting, yet under-researched, directions in the field.

    The aceToolbox: low-level audiovisual feature extraction for retrieval and classification

    In this paper we present an overview of a software platform that has been developed within the aceMedia project, termed the aceToolbox, that provides global and local low-level feature extraction from audio-visual content. The toolbox is based on the MPEG-7 eXperimental Model (XM), with extensions to provide descriptor extraction from arbitrarily shaped image segments, thereby supporting local descriptors reflecting real image content. We describe the architecture of the toolbox and provide an overview of the descriptors supported to date. We also briefly describe the segmentation algorithm provided. We then demonstrate the usefulness of the toolbox in the context of two different content processing scenarios: similarity-based retrieval in large collections and scene-level classification of still images.

    InSPeCT: Integrated Surveillance for Port Container Traffic

    This paper describes a fully operational content-indexing and management system, designed for monitoring and profiling freight-based vehicular traffic in a seaport environment. The 'InSPeCT' system captures video footage of passing vehicles and uses tailored OCR to index the footage according to vehicle license plates and freight codes. In addition to real-time functionality such as alerting, the system provides advanced search techniques for the efficient retrieval of records, where each vehicle is profiled according to multi-angled video, context information, and links to external information sources. The system is currently being piloted at a busy national seaport, and feedback from port officials indicates that it is extremely useful in supplementing their existing transportation-security structures.