Information extraction from multimedia web documents: an open-source platform and testbed
The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques has been integrated into a single but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval.
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.
Retrieval Models based on Linguistic Features of Verbose Queries
Natural language expressions are more familiar to users than choosing keywords for queries. Given that, people can use natural language expressions to represent their sophisticated information needs. Instead of listing keywords, verbose queries are expressed in a grammatically well-formed phrase or sentence in which terms are used together to represent the more specific meanings of a concept, and the relationships of these concepts are expressed by function words.
The goal of this thesis is to investigate methods of using the semantic and syntactic features of natural language queries to maximize the effectiveness of search. For this purpose, we propose the synchronous framework in which we use syntactic parsing techniques for modeling term dependencies. We use the Generative Relevance Hypothesis (GRH) to evaluate valid variations in dependence relationships between queries and documents. This is one of the first results demonstrating that dependency parsing can be used to improve retrieval effectiveness.
We propose a method for classifying concepts in verbose queries as key concepts and secondary concepts that are used in the statistical translation model for query term expansion. Key concepts are the most important terms of queries. We use key concepts as the context for translating terms. Although secondary concepts are not as important as key concepts, they are still important because they provide clues about what kinds of information users are looking for. Using concept classification results, we elaborate a translation model in which the key concepts of queries are used as the context of translation. The secondary concepts of queries are used to selectively apply the translation model to query terms.
We define the important new task of focused retrieval of answer passages that aims to immediately provide answers for users' information needs while the length of answer passage should be suitable for restricted search environments such as mobile devices and voice-based search systems.
Generating queries from user-selected text
People browsing the web or reading a document may see text passages that describe a topic of interest, and want to know more about it by searching. Manually formulating a query from that text can be difficult, however, and an effective search is not guaranteed. In this paper, to address this scenario, we propose a learning-based approach which generates effective queries from the content of an arbitrary user-selected text passage. Specifically, the approach extracts and selects representative chunks (noun phrases or named entities) from the content (a text passage) using a rich set of features. We carry out experiments showing that the selected chunks can be effectively used to generate queries both in a TREC environment, where weights and query structure can be directly incorporated, and with a “black-box” web search engine, where query structure is more limited.
Bridging the gap within text-data analytics: a computer environment for data analysis in linguistic research
Since computer technology became widely available at universities during the last quarter of the twentieth century, language researchers have been successfully employing software to analyse usage patterns in corpora. However, although there has been a proliferation of software for different disciplines within text-data analytics, e.g. corpus linguistics, statistics, natural language processing and text mining, this article demonstrates that any computer environment intended to support advanced linguistic research more effectively should be grounded on a user-centred approach to holistically integrate cross-disciplinary methods and techniques in a linguist-friendly manner. To this end, I examine not only the tasks that are derived from linguists' needs and goals but also the technologies that appropriately deal with the properties of linguistic data. This research results in the implementation of DAMIEN, an online workbench designed to conduct linguistic experiments on corpora.