23,720 research outputs found

    KERT: Automatic Extraction and Ranking of Topical Keyphrases from Content-Representative Document Titles

    We introduce KERT (Keyphrase Extraction and Ranking by Topic), a framework for topical keyphrase generation and ranking. By shifting from the traditional unigram-centric methods of unsupervised keyphrase extraction to a phrase-centric approach, we are able to directly compare and rank phrases of different lengths. We construct a topical keyphrase ranking function that implements the four criteria representing high-quality topical keyphrases (coverage, purity, phraseness, and completeness). The effectiveness of our approach is demonstrated on two collections of content-representative titles in the domains of Computer Science and Physics. Comment: 9 page

    Semantic user profiling techniques for personalised multimedia recommendation

    Due to the explosion of news materials available through broadcast and other channels, there is an increasing need for personalised news video retrieval. In this work, we introduce a semantic-based user modelling technique to capture users’ evolving information needs. Our approach exploits implicit user interaction to capture long-term user interests in a profile. The organised interests are used to retrieve and recommend news stories to the users. In this paper, we exploit the Linked Open Data Cloud to identify similar news stories that match the users’ interests. We evaluate various recommendation parameters by introducing a simulation-based evaluation scheme.

    A Bootstrapping architecture for time expression recognition in unlabelled corpora via syntactic-semantic patterns

    In this paper we describe a semi-supervised approach, based on bootstrapping, to the extraction of time expression mentions in large unlabelled corpora. Bootstrapping techniques rely on a relatively small number of initial human-supplied examples (termed “seeds”) of the type of entity or concept to be learned, in order to capture an initial set of patterns or rules from the unlabelled text that extract the supplied data. In turn, the learned patterns are employed to find new potential examples, and the process is repeated to grow the set of patterns and (optionally) the set of examples. To prevent the learned pattern set from producing spurious results, it becomes essential to implement a ranking and selection procedure to filter out “bad” patterns and, depending on the case, new candidate examples. Therefore, the type of patterns employed (knowledge representation) as well as the ranking and selection procedure are paramount to the quality of the results. We present a complete bootstrapping algorithm for recognition of time expressions, with a special emphasis on the type of patterns used (a combination of semantic and morpho-syntactic elements) and the ranking and selection criteria. Bootstrapping techniques have previously been employed with limited success for several NLP problems, both of recognition and classification, but their application to time expression recognition is, to the best of our knowledge, novel. As of this writing, the described architecture is in the final stages of implementation, with experimentation and evaluation already underway. Postprint (published version)
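
The bootstrapping loop described in the abstract above (seeds, pattern induction, ranking and selection, new examples) can be illustrated with a minimal sketch. This is not the authors' architecture: the pattern type (the single token preceding a mention), the scoring rule (share of a pattern's extractions that are already known examples), and the names `bootstrap`, `rounds`, and `min_score` are all assumptions chosen for brevity.

```python
from collections import defaultdict


def bootstrap(sentences, seeds, rounds=5, min_score=0.5):
    """Toy bootstrapping loop (illustrative assumption, not the paper's method).

    A "pattern" is the single token that precedes a mention, and a pattern's
    score is the fraction of the tokens it extracts that are already known
    examples.
    """
    examples = set(seeds)
    patterns = set()

    # Record, for every token, the set of tokens that directly follow it.
    extracted = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for left, word in zip(tokens, tokens[1:]):
            extracted[left].add(word)

    for _ in range(rounds):
        # Ranking and selection: keep only patterns whose extractions are
        # mostly known examples, which filters out spurious patterns.
        good = {
            left
            for left, words in extracted.items()
            if len(words & examples) / len(words) >= min_score
        }
        # Apply the selected patterns to find new candidate examples.
        new_examples = set().union(*(extracted[p] for p in good))
        if good <= patterns and new_examples <= examples:
            break  # nothing new was learned, so stop early
        patterns |= good
        examples |= new_examples

    return patterns, examples


corpus = [
    "we met on monday",
    "we met on friday",
    "she left on friday",
    "she left in march",
    "it rained in april",
]
patterns, examples = bootstrap(corpus, {"monday", "march"})
print(sorted(patterns))  # ['in', 'on']
print(sorted(examples))  # ['april', 'friday', 'march', 'monday']
```

In this toy corpus the seeds "monday" and "march" induce the patterns "on" and "in", which in turn extract "friday" and "april". A pattern that mostly extracts unknown tokens falls below `min_score` and is discarded, which is the role the abstract assigns to the ranking and selection procedure.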

    Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR) 2007

    These are the proceedings of the Workshop on Semantic Content Acquisition and Representation, held in conjunction with NODALIDA 2007 on May 24, 2007, in Tartu, Estonia.