31,190 research outputs found

    Multimedia search without visual analysis: the value of linguistic and contextual information

    Get PDF
    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features

    A lightweight web video model with content and context descriptions for integration with linked data

    Get PDF
    The rapid increase of video data on the Web has warranted an urgent need for effective representation, management and retrieval of web videos. Recently, many studies have been carried out for ontological representation of videos, either using domain dependent or generic schemas such as MPEG-7, MPEG-4, and COMM. In spite of their extensive coverage and sound theoretical grounding, they are yet to be widely used by users. Two main possible reasons are the complexities involved and a lack of tool support. We propose a lightweight video content model for content-context description and integration. The uniqueness of the model is that it tries to model the emerging social context to describe and interpret the video. Our approach is grounded on exploiting easily extractable evolving contextual metadata and on the availability of existing data on the Web. This enables representational homogeneity and a firm basis for information integration among semantically-enabled data sources. The model uses many existing schemas to describe various ontology classes and shows the scope of interlinking with the Linked Data cloud

    Argumentation Mining in User-Generated Web Discourse

    Full text link
    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task.Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

    Text-based Semantic Annotation Service for Multimedia Content in the Esperonto project

    Get PDF
    Within the Esperonto project, an integration of NLP, ontologies and other knowledge bases, is being performed with the goal to implement a semantic annotation service that upgrades the actual Web towards the emerging Semantic Web. Research is being currently conducted on how to apply the Esperonto semantic annotation service to text material associated with still images in web pages. In doing so, the project will allow for semantic querying of still images in the web, but also (automatically) create a large set of text-based semantic annotations of still images, which can be used by the Multimedia community in order to support the task of content indexing of image material, possibly combining the Esperonto type of annotations with the annotations resulting from image analysis

    Integrated content presentation for multilingual and multimedia information access

    Get PDF
    For multilingual and multimedia information retrieval from multiple potentially distributed collections generating the output in the form of standard ranked lists may often mean that a user has to explore the contents of many lists before finding sufficient relevant or linguistically accessible material to satisfy their information need. In some situations delivering an integrated multilingual multimedia presentation could enable the user to explore a topic allowing them to select from among a range of available content based on suitably chosen displayed metadata. A presentation of this type has similarities with the outputs of existing adaptive hypermedia systems. However, such systems are generated based on “closed” content with sophisticated user and domain models. Extending them to “open” domain information retrieval applications would raise many issues. We present an outline exploration of what will form a challenging new direction for research in multilingual information access

    Automated speech and audio analysis for semantic access to multimedia

    Get PDF
    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large vocabulary speech recognition, key word spotting and speaker classification. The applicability of techniques will be discussed from a media crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives

    CHORUS Deliverable 3.4: Vision Document

    Get PDF
    The goal of the CHORUS Vision Document is to create a high level vision on audio-visual search engines in order to give guidance to the future R&D work in this area and to highlight trends and challenges in this domain. The vision of CHORUS is strongly connected to the CHORUS Roadmap Document (D2.3). A concise document integrating the outcomes of the two deliverables will be prepared for the end of the project (NEM Summit)

    Organizational challenges of the semantic web in digital libraries

    Get PDF
    The Semantic Web initiative holds large promises for the future. There is, however, a considerable gap in Semantic Web research between the contributions in the technological field and the research done in the organizational field. This paper examines, from a socio-technical point of view the impact of Semantic Web technology on the strategic, organizational and technological levels. Building on a comprehensive case study at the National Library in Norway our findings indicate that the highest impact will be at the organizational level. The reason is mainly because inter-organizational and cross-organizational structures have to be established to address the problems of ontology engineering, and a development framework for ontology engineering in digital libraries must be examined

    Topic-dependent sentiment analysis of financial blogs

    Get PDF
    While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on more subjective sources of information, blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full documentclassification and that word-based approaches perform better than sentence-based or paragraph-based approaches
    corecore