244,462 research outputs found

    Personal digital libraries: Keeping track of academic reading material

    Get PDF
    This paper discusses optionsfor tracking academic reading material and introduces a personal digital library solution. We combined and extended the open source projects Zotero and Greenstone such that material can be easily downloaded and ingested into the combined system. Our prototype system has been explored in a small user study

    The TREC2001 video track: information retrieval on digital video information

    Get PDF
    The development of techniques to support content-based access to archives of digital video information has recently started to receive much attention from the research community. During 2001, the annual TREC activity, which has been benchmarking the performance of information retrieval techniques on a range of media for 10 years, included a ”track“ or activity which allowed investigation into approaches to support searching through a video library. This paper is not intended to provide a comprehensive picture of the different approaches taken by the TREC2001 video track participants but instead we give an overview of the TREC video search task and a thumbnail sketch of the approaches taken by different groups. The reason for writing this paper is to highlight the message from the TREC video track that there are now a variety of approaches available for searching and browsing through digital video archives, that these approaches do work, are scalable to larger archives and can yield useful retrieval performance for users. This has important implications in making digital libraries of video information attainable

    Version Control in Online Software Repositories

    No full text
    Software version control repositories provide a uniform and stable interface to manage documents and their version histories. Unfortunately, Open Source systems, for example, CVS, Subversion, and GNU Arch are not well suited to highly collaborative environments and fail to track semantic changes in repositories. We introduce document provenance as our Description Logic framework to track the semantic changes in software repositories and draw interesting results about their historic behaviour using a rule-based inference engine. To support the use of this framework, we have developed our own online collaborative tool, leveraging the fluency of the modern WikiWikiWeb

    TRECVID: benchmarking the effectiveness of information retrieval tasks on digital video

    Get PDF
    Many research groups worldwide are now investigating techniques which can support information retrieval on archives of digital video and as groups move on to implement these techniques they inevitably try to evaluate the performance of their techniques in practical situations. The difficulty with doing this is that there is no test collection or any environment in which the effectiveness of video IR or video IR sub-tasks, can be evaluated and compared. The annual series of TREC exercises has, for over a decade, been benchmarking the effectiveness of systems in carrying out various information retrieval tasks on text and audio and has contributed to a huge improvement in many of these. Two years ago, a track was introduced which covers shot boundary detection, feature extraction and searching through archives of digital video. In this paper we present a summary of the activities in the TREC Video track in 2002 where 17 teams from across the world took part

    Natural language processing

    Get PDF
    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems

    Applying digital content management to support localisation

    Get PDF
    The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) is seeking to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how these technologies can support localisation and localised content before concluding with some impressions of future directions in DCM

    Criteria and Procedures for the Promotion and Tenure of Library Faculty

    Get PDF

    Enriching Existing Test Collections with OXPath

    Full text link
    Extending TREC-style test collections by incorporating external resources is a time consuming and challenging task. Making use of freely available web data requires technical skills to work with APIs or to create a web scraping program specifically tailored to the task at hand. We present a light-weight alternative that employs the web data extraction language OXPath to harvest data to be added to an existing test collection from web resources. We demonstrate this by creating an extended version of GIRT4 called GIRT4-XT with additional metadata fields harvested via OXPath from the social sciences portal Sowiport. This allows the re-use of this collection for other evaluation purposes like bibliometrics-enhanced retrieval. The demonstrated method can be applied to a variety of similar scenarios and is not limited to extending existing collections but can also be used to create completely new ones with little effort.Comment: Experimental IR Meets Multilinguality, Multimodality, and Interaction - 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11-14, 201

    Text categorization and similarity analysis: similarity measure, literature review

    Get PDF
    Document classification and provenance has become an important area of computer science as the amount of digital information is growing significantly. Organisations are storing documents on computers rather than in paper form. Software is now required that will show the similarities between documents (i.e. document classification) and to point out duplicates and possibly the history of each document (i.e. provenance). Poor organisation is common and leads to situations like above. There exists a number of software solutions in this area designed to make document organisation as simple as possible. I'm doing my project with Pingar who are a company based in Auckland who aim to help organise the growing amount of unstructured digital data. This reports analyses the existing literature in this area with the aim to determine what already exists and how my project will be different from existing solutions

    Indexing, browsing and searching of digital video

    Get PDF
    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver
    • 

    corecore