750 research outputs found

    Multimedia search without visual analysis: the value of linguistic and contextual information

    Get PDF
    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features

    Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search

    Full text link
    Despite substantial interest in applications of neural networks to information retrieval, neural ranking models have only been applied to standard ad hoc retrieval tasks over web pages and newswire documents. This paper proposes MP-HCNN (Multi-Perspective Hierarchical Convolutional Neural Network) a novel neural ranking model specifically designed for ranking short social media posts. We identify document length, informal language, and heterogeneous relevance signals as features that distinguish documents in our domain, and present a model specifically designed with these characteristics in mind. Our model uses hierarchical convolutional layers to learn latent semantic soft-match relevance signals at the character, word, and phrase levels. A pooling-based similarity measurement layer integrates evidence from multiple types of matches between the query, the social media post, as well as URLs contained in the post. Extensive experiments using Twitter data from the TREC Microblog Tracks 2011--2014 show that our model significantly outperforms prior feature-based as well and existing neural ranking models. To our best knowledge, this paper presents the first substantial work tackling search over social media posts using neural ranking models.Comment: AAAI 2019, 10 page

    Content And Multimedia Database Management Systems

    Get PDF
    A database management system is a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases for various applications. The main characteristic of the ‘database approach’ is that it increases the value of data by its emphasis on data independence. DBMSs, and in particular those based on the relational data model, have been very successful at the management of administrative data in the business domain. This thesis has investigated data management in multimedia digital libraries, and its implications on the design of database management systems. The main problem of multimedia data management is providing access to the stored objects. The content structure of administrative data is easily represented in alphanumeric values. Thus, database technology has primarily focused on handling the objects’ logical structure. In the case of multimedia data, representation of content is far from trivial though, and not supported by current database management systems

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    A Conceptual Representation of Documents and Queries for Information Retrieval Systems by Using Light Ontologies

    Get PDF
    International audienceThis article presents a vector space model approach to representing documents and queries, based on concepts instead of terms and using WordNet as a light ontology. Such representation reduces information overlap with respect to classic semantic expansion techniques. Experiments carried out on the MuchMore benchmark and on the TREC-7 and TREC-8 Ad-hoc collections demonstrate the effectiveness of the proposed approach
    corecore