396 research outputs found

    SportsAnno: what do you think?

    Get PDF
    The automatic summarisation of sports video is of growing importance with the increased availability of on-demand content. Consumers who are unable to view events live often have a desire to watch a summary which allows then to quickly come to terms with all that has happened during a sporting event. Sports forums show that it is not only summaries that are desirable but also the opportunity to share one’s own point of view and discuss the opinions with a community of similar users. In this paper we give an overview of the ways in which annotations have been used to augment existing visual media. We present SportsAnno, a system developed to summarise World Cup 2006 matches and provide a means for open discussion of events within these matches

    Indexing, browsing and searching of digital video

    Get PDF
    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver

    The Físchlár-News-Stories system: personalised access to an archive of TV news

    Get PDF
    The “Físchlár” systems are a family of tools for capturing, analysis, indexing, browsing, searching and summarisation of digital video information. Físchlár-News-Stories, described in this paper, is one of those systems, and provides access to a growing archive of broadcast TV news. Físchlár-News-Stories has several notable features including the fact that it automatically records TV news and segments a broadcast news program into stories, eliminating advertisements and credits at the start/end of the broadcast. Físchlár-News-Stories supports access to individual stories via calendar lookup, text search through closed captions, automatically-generated links between related stories, and personalised access using a personalisation and recommender system based on collaborative filtering. Access to individual news stories is supported either by browsing keyframes with synchronised closed captions, or by playback of the recorded video. One strength of the Físchlár-News-Stories system is that it is actually used, in practice, daily, to access news. Several aspects of the Físchlár systems have been published before, bit in this paper we give a summary of the Físchlár-News-Stories system in operation by following a scenario in which it is used and also outlining how the underlying system realises the functions it offers

    POLIS: a probabilistic summarisation logic for structured documents

    Get PDF
    PhDAs the availability of structured documents, formatted in markup languages such as SGML, RDF, or XML, increases, retrieval systems increasingly focus on the retrieval of document-elements, rather than entire documents. Additionally, abstraction layers in the form of formalised retrieval logics have allowed developers to include search facilities into numerous applications, without the need of having detailed knowledge of retrieval models. Although automatic document summarisation has been recognised as a useful tool for reducing the workload of information system users, very few such abstraction layers have been developed for the task of automatic document summarisation. This thesis describes the development of an abstraction logic for summarisation, called POLIS, which provides users (such as developers or knowledge engineers) with a high-level access to summarisation facilities. Furthermore, POLIS allows users to exploit the hierarchical information provided by structured documents. The development of POLIS is carried out in a step-by-step way. We start by defining a series of probabilistic summarisation models, which provide weights to document-elements at a user selected level. These summarisation models are those accessible through POLIS. The formal definition of POLIS is performed in three steps. We start by providing a syntax for POLIS, through which users/knowledge engineers interact with the logic. This is followed by a definition of the logics semantics. Finally, we provide details of an implementation of POLIS. The final chapters of this dissertation are concerned with the evaluation of POLIS, which is conducted in two stages. Firstly, we evaluate the performance of the summarisation models by applying POLIS to two test collections, the DUC AQUAINT corpus, and the INEX IEEE corpus. This is followed by application scenarios for POLIS, in which we discuss how POLIS can be used in specific IR tasks

    A framework for responsive content adaptation in electronic display networks

    Get PDF
    Recent trends show an increase in the availability and functionality of handheld devices, wireless network technology, and electronic display networks. We propose the novel integration of these technologies to provide wireless access to content delivered to large-screen display systems. Content adaptation is used as a method of reformatting web pages to display more appropriately on handheld devices, and to remove unwanted content. A framework is presented that facilitates content adaptation, implemented as an adaptation layer, which is extended to provide personalization of adaptation settings and response to network conditions. The framework is implemented as a proxy server for a wireless network, and handles HTML and XML documents. Once a document has been requested by a user, the HTML/XML is retrieved and parsed, creating a Document Object Model tree representation. It is then altered according to the user’s personal settings or predefined settings, based on current network usage and the network resources available. Three adaptation techniques were implemented; spatial representation, which generates an image map of the document, text summarization, which creates a tree view representation of a document, and tag extraction, which replaces specific tags with links. Three proof-of-concept systems were developed in order to test the robustness of the framework. A system for use with digital slide shows, a digital signage system, and a generalized system for use with the internet were implemented. Testing was performed by accessing sample web pages through the content adaptation proxy server. Tag extraction works correctly for all HTML and XML document structures, whereas spatial representation and text summarization are limited to a controlled subset. Results indicate that the adaptive system has the ability to reduce average bandwidth usage, by decreasing the amount of data on the network, thereby allowing a greater number of users access to content. This suggests that responsive content adaptation has a positive influence on network performance metrics

    Sentence classification experiments for legal text summarisation

    Get PDF
    Abstract. We describe experiments in building a classifier which determines the rhetorica

    The HOLJ corpus: supporting summarisation of legal texts

    Get PDF
    We describe an XML-encoded corpus of texts in the legal domain which was gathered for an automatic summarisation project. We describe two distinct layers of annotation: manual annotation of the rhetorical status of sentences and an entirely automatic annotation process incorporating a host of individual linguistic processors. The manual rhetorical status annotation has been developed as training and testing material for a summarisation system based on the work of Teufel and Moens, while the automatic layer of annotation encodes linguistic information as features for a machine learning approach to rhetorical status classification. 1 Project Overvie

    Smartbook: Semantics Inside

    Get PDF
    This paper presents a vision for the future of the e-books which entails further development of technologies that will facilitate the creation and use of a new generation of "smart" books: e-books that are evolving, highly interactive, customisable, adaptable, intelligent, and furnished with a rich set of collaborative authoring and reading support services. The proposed set of tools will be integrated into an intelligent framework for collaborative book authoring and experiencing called SmartBook. The paper promotes the idea that the semantic technologies, intensively developed recently in connection with the Semantic Web initiative, can be incorporated in the book and become the key factor of making it "smarter"

    Summarising Legal Texts: Sentential Tense and Argumentative Roles

    Get PDF
    We report on the SUM project which applies automatic summarisation techniques to the legal domain. We pursue a methodology based on Teufel and Moens (2002) where sentences are classified according to their argumentative role. We describe some experiments with judgments of the House of Lords where we have performed automatic linguistic annotation of a small sample set in order to explore correlations between linguistic features and argumentative roles. We use state-of-the-art NLP techniques to perform the linguistic annotation using XML-based tools and a combination of rulebased and statistical methods. We focus here on the predictive capacity of tense and aspect features for a classifier

    Automated PDF highlighting to support faster curation of literature for Parkinson's and Alzheimer's disease

    Get PDF
    Neurodegenerative disorders such as Parkinson’s and Alzheimer’s disease are devastating and costly illnesses, a source of major global burden. In order to provide successful interventions for patients and reduce costs, both causes and pathological processes need to be understood. The ApiNATOMY project aims to contribute to our understanding of neurodegenerative disorders by manually curating and abstracting data from the vast body of literature amassed on these illnesses. As curation is labour-intensive, we aimed to speed up the process by automatically highlighting those parts of the PDF document of primary importance to the curator. Using techniques similar to those of summarisation, we developed an algorithm that relies on linguistic, semantic and spatial features. Employing this algorithm on a test set manually corrected for tool imprecision, we achieved a macro F1-measure of 0.51, which is an increase of 132% compared to the best bag-of-words baseline model. A user based evaluation was also conducted to assess the usefulness of the methodology on 40 unseen publications, which reveals that in 85% of cases all highlighted sentences are relevant to the curation task and in about 65% of the cases, the highlights are sufficient to support the knowledge curation task without needing to consult the full text. In conclusion, we believe that these are promising results for a step in automating the recognition of curation-relevant sentences. Refining our approach to pre-digest papers will lead to faster processing and cost reduction in the curation process
    corecore