    Building a Document Genre Corpus: a Profile of the KRYS I Corpus

    This paper describes the KRYS I corpus (http://www.krys-corpus.eu/Info.html), consisting of documents classified into 70 genre classes. It has been constructed as part of an effort to automate document genre classification as distinct from topic detection. Previously there has been very little work on building corpora of texts which have been classified using a non-topical genre palette. This is partly because genre as a concept is rooted in philosophy, rhetoric and literature, and is highly complex and domain dependent in its interpretation ([11]). The usefulness of genre in everyday information search is only now starting to be recognised, and there is no genre classification schema that has been consolidated to have applicable value in this direction. By presenting here our experiences in constructing the KRYS I corpus, we hope to shed light on information gathering and seeking behaviour and the role of genre in these activities, as well as a way forward for creating a better corpus for testing automated genre classification tasks and the application of these tasks to other domains.

    A survey of comics research in computer science

    Graphic novels such as comics and manga are well known all over the world. The digital transition has started to change the way people read comics, more and more on smartphones and tablets and less and less on paper. In recent years, a wide variety of research about comics has been proposed and might change the way comics are created, distributed and read in future years. Early work focuses on low-level document image analysis: comic books are complex, containing text, drawings, balloons, panels, onomatopoeia, etc. Different fields of computer science, such as multimedia, artificial intelligence and human-computer interaction, have covered research about user interaction and content generation, with different sets of values. We propose in this paper to review the previous research about comics in computer science, to state what has been done and to give some insights about the main outlooks.

    Video information retrieval using objects and ostensive relevance feedback

    In this paper, we present a brief overview of current approaches to video information retrieval (IR) and we highlight their limitations and drawbacks in terms of satisfying user needs. We then describe a method of incorporating object-based relevance feedback into video IR which we believe opens up new possibilities for helping users find information in video archives. Following this, we describe our own work on shot retrieval from video archives, which uses object detection, object-based relevance feedback and a variation of relevance feedback called ostensive RF, which is particularly appropriate for this type of retrieval.

    An ontology-based framework for the automated analysis and interpretation of comic books' images

    Since the beginning of the twenty-first century, the cultural industry has been through a massive and historical mutation induced by the rise of digital technologies. The comic book industry keeps looking for the right solution and has not yet produced anything as convincing as the music or movie industries have. A lot of energy has been spent so far on transferring printed material to digital supports. The specificities of those supports are not always exploited to the best of their capabilities, while they could potentially be used to create new reading conventions. In spite of the needs induced by the large amount of data created since the beginning of comics history, content indexing has been left behind. It is indeed quite a challenge to index such a composition of textual and visual information. While a growing number of researchers are working on comic books' image analysis from a low-level point of view, only a few are tackling the issue of representing the content at a high semantic level. We propose in this article a framework to handle the content of a comic book, to support the automatic extraction of its visual components and to formalize the semantics of the domain's codes. We tested our framework over two applications: 1) the unsupervised content discovery of comic books' images, 2) its capabilities to handle complex layouts and to produce a respectful browsing experience for the digital comics reader.

    Guidelines for the presentation and visualisation of lifelog content

    Lifelogs offer rich, voluminous sources of personal and social data for which visualisation is ideally suited to providing access, overview, and navigation. We explore, through examples of our visualisation work within the domain of lifelogging, the major axes on which lifelogs operate and, therefore, on which their visualisations should be contingent. We also explore the concept of ‘events’ as a way to significantly reduce the complexity of the lifelog for presentation and make it more human-oriented. Finally, we present some guidelines and goals which should be considered when designing presentation modes for lifelog content.

    Decomposing Complex Queries for Tip-of-the-tongue Retrieval

    When re-finding items, users who forget or are uncertain about identifying details often rely on creative strategies for expressing their information needs -- complex queries that describe content elements (e.g., book characters or events), information beyond the document text (e.g., descriptions of book covers), or personal context (e.g., when they read a book). This retrieval setting, called tip of the tongue (TOT), is especially challenging for models heavily reliant on lexical and semantic overlap between query and document text. In this work, we introduce a simple yet effective framework for handling such complex queries by decomposing the query into individual clues, routing those as sub-queries to specialized retrievers, and ensembling the results. This approach allows us to take advantage of off-the-shelf retrievers (e.g., CLIP for retrieving images of book covers) or incorporate retriever-specific logic (e.g., date constraints). We show that our framework incorporating query decompositions into retrievers can improve gold book recall by up to 7% relative gain for Recall@5 on a new collection of 14,441 real-world query-book pairs from an online community for resolving TOT inquiries.

    Search and Display

    New literacies and future educational culture

    The paper draws attention to three developments that are crucial to online education. First, the new literacy required by group discussion in writing, i.e. by computer‐mediated communication ('e‐talk'), is discussed. Educators are urged to delimit and structure their courses so that online conversations in writing are successfully framed for effective discourse. Second, new literacy arising from the merging of multimedia with text is considered. It is maintained that this will enhance communication, not debase it. Third, the way that increasing ease of information retrieval is eroding boundaries between traditional disciplines is discussed. It is argued that this may create new difficulties in education. The paper recommends various ways of overcoming the problems that arise from the three developments.