    Introducing a corpus of conversational stories. Construction and annotation of the Narrative Corpus

    Although widely seen as critical both in terms of its frequency and its social significance as a prime means of encoding and perpetuating moral stance and configuring self and identity, conversational narrative has received little attention in corpus linguistics. In this paper we describe the construction and annotation of a corpus that is intended to advance the linguistic theory of this fundamental mode of everyday social interaction: the Narrative Corpus (NC). The NC contains narratives extracted from the demographically sampled sub-corpus of the British National Corpus (BNC) (XML version). It includes more than 500 narratives, socially balanced in terms of participant sex, age, and social class. We describe the extraction techniques, selection criteria, and sampling methods used in constructing the NC. Further, we describe four levels of annotation implemented in the corpus: speaker (social information on speakers), text (text IDs, title, type of story, type of embedding, etc.), textual components (pre-/post-narrative talk, narrative, and narrative-initial/final utterances), and utterance (participation roles, quotatives, and reporting modes). A brief rationale is given for each level of annotation, and possible avenues of research facilitated by the annotation are sketched out.

    Bridging the gap between social tagging and semantic annotation: E.D. the Entity Describer

    Semantic annotation enables the development of efficient computational methods for analyzing and interacting with information, thus maximizing its value. With the already substantial and constantly expanding data generation capacity of the life sciences, as well as the concomitant increase in the knowledge distributed in scientific articles, new ways to produce semantic annotations of this information are crucial. While automated techniques certainly facilitate the process, manual annotation remains the gold standard in most domains. In this manuscript, we describe a prototype mass-collaborative semantic annotation system that, by distributing the annotation workload across the broad community of biomedical researchers, may help to produce the volume of meaningful annotations needed by modern biomedical science. We present E.D., the Entity Describer, a mashup of the Connotea social tagging system, an index of semantic web-accessible controlled vocabularies, and a new public RDF database for storing social semantic annotations.

    Exploiting Social Annotation for Automatic Resource Discovery

    Information integration applications, such as mediators or mashups, that require access to information resources currently rely on users manually discovering and integrating them in the application. Manual resource discovery is a slow process, requiring the user to sift through results obtained via keyword-based search. Although search methods have advanced to include evidence from a document's contents, its metadata, and the contents and link structure of the referring pages, they still do not adequately cover information sources, often called "the hidden Web", that dynamically generate documents in response to a query. The recently popular social bookmarking sites, which allow users to annotate and share metadata about various information sources, provide rich evidence for resource discovery. In this paper, we describe a probabilistic model of the user annotation process in the social bookmarking system del.icio.us. We then use the model to automatically find resources relevant to a particular information domain. Our experimental results on data obtained from del.icio.us show that this approach is a promising method for helping automate the resource discovery task. Comment: 6 pages, submitted to AAAI07 workshop on Information Integration on the We

    ELAN as flexible annotation framework for sound and image processing detectors

    Annotation of digital recordings in humanities research is still, to a large extent, a process that is performed manually. This paper describes the first pattern recognition based software components developed in the AVATecH project and their integration in the annotation tool ELAN. AVATecH (Advancing Video/Audio Technology in Humanities Research) is a project that involves two Max Planck Institutes (Max Planck Institute for Psycholinguistics, Nijmegen; Max Planck Institute for Social Anthropology, Halle) and two Fraunhofer Institutes (Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS, Sankt Augustin; Fraunhofer Heinrich-Hertz-Institute, Berlin) and that aims to develop and implement audio and video technology for semi-automatic annotation of heterogeneous media collections as they occur in multimedia based research. The highly diverse nature of the digital recordings stored in the archives of both Max Planck Institutes poses a huge challenge to most of the existing pattern recognition solutions and is a motivation to make such technology available to researchers in the humanities.

    Developing 21st Century Skills with Online Curation and Social Annotation

    Includes bibliographical references.