Unsupervised Visual and Textual Information Fusion in Multimedia Retrieval - A Graph-based Point of View
Multimedia collections are more than ever growing in size and diversity.
Effective multimedia retrieval systems are thus critical to access these
datasets from the end-user perspective and in a scalable way. We are interested
in repositories of image/text multimedia objects and we study multimodal
information fusion techniques in the context of content based multimedia
information retrieval. We focus on graph based methods, which have proven to
provide state-of-the-art performance. We examine two such methods in
particular: cross-media similarities and random walk based scores. From a
theoretical viewpoint, we propose a unifying graph based framework which
encompasses the two aforementioned approaches. Our proposal allows us to
highlight the core features one should consider when using a graph based
technique for the combination of visual and textual information. We compare
cross-media and random walk based results using three different real-world
datasets. From a practical standpoint, our extended empirical analysis allows us
to provide insights and guidelines about the use of graph based methods for
multimodal information fusion in content based multimedia information
retrieval.
Comment: An extended version of the paper: Visual and Textual Information
Fusion in Multimedia Retrieval using Semantic Filtering and Graph based
Methods, by J. Ah-Pine, G. Csurka and S. Clinchant, submitted to ACM
Transactions on Information System
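The two graph based fusion ideas named in the abstract above can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' formulation: the function names, the top-k seeding, the damping factor and the toy similarity matrix are all hypothetical. It assumes textual relevance scores for each document and a symmetric visual similarity matrix with strictly positive row sums.

```python
def top_k(scores, k):
    """Keep the k largest scores and zero out the rest."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [s if i in keep else 0.0 for i, s in enumerate(scores)]


def cross_media_scores(text_scores, visual_sim, k):
    """One-step propagation: the top-k textual matches vote for their visual neighbours."""
    seeds = top_k(text_scores, k)
    n = len(text_scores)
    return [sum(seeds[i] * visual_sim[i][j] for i in range(n)) for j in range(n)]


def random_walk_scores(text_scores, visual_sim, damping=0.5, steps=20):
    """Random walk with restart on the visual graph, restarting from the textual scores."""
    n = len(text_scores)
    total = sum(text_scores)
    prior = [s / total for s in text_scores]  # restart distribution (assumes total > 0)
    # Row-normalise the similarity matrix into transition probabilities.
    trans = [[v / sum(row) for v in row] for row in visual_sim]
    scores = prior[:]
    for _ in range(steps):
        scores = [
            (1 - damping) * prior[j]
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Hypothetical toy data: document 2 has no textual match but looks like document 0.
text_scores = [0.9, 0.1, 0.0]
visual_sim = [
    [1.0, 0.2, 0.8],
    [0.2, 1.0, 0.1],
    [0.8, 0.1, 1.0],
]
cm = cross_media_scores(text_scores, visual_sim, k=1)
rw = random_walk_scores(text_scores, visual_sim)
```

In this toy setting, document 2 receives a non-zero score from both functions even though its textual score is zero, because it is visually close to the best textual match. That transfer of evidence between modalities is the behaviour the sketch is meant to show.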
Strategies for Searching Video Content with Text Queries or Video Examples
The large number of user-generated videos uploaded to the Internet
every day has led to many commercial video search engines, which mainly rely on
text metadata for search. However, metadata is often lacking for user-generated
videos, so these videos are unsearchable by current search engines.
Therefore, content-based video retrieval (CBVR) tackles this metadata-scarcity
problem by directly analyzing the visual and audio streams of each video. CBVR
encompasses multiple research topics, including low-level feature design,
feature fusion, semantic detector training and video search/reranking. We
present novel strategies in these topics to enhance CBVR in both accuracy and
speed under different query inputs, including pure textual queries and query by
video examples. Our proposed strategies have been incorporated into our
submission for the TRECVID 2014 Multimedia Event Detection evaluation, where
our system outperformed other submissions in both text queries and video
example queries, thus demonstrating the effectiveness of our proposed
approaches
End-to-end Learning for Short Text Expansion
Effectively making sense of short texts is a critical task for many real
world applications such as search engines, social media services, and
recommender systems. The task is particularly challenging as a short text
contains very sparse information, often too sparse for a machine learning
algorithm to pick up useful signals. A common practice for analyzing short text
is to first expand it with external information, which is usually harvested
from a large collection of longer texts. In the literature, short text expansion
has been done with all kinds of heuristics. We propose an end-to-end solution
that automatically learns how to expand short text to optimize a given learning
task. A novel deep memory network is proposed to automatically find relevant
information from a collection of longer documents and reformulate the short
text through a gating mechanism. Using short text classification as a
demonstration task, we show that the deep memory network significantly
outperforms classical text expansion methods with comprehensive experiments on
real world data sets.
Comment: KDD'201
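The idea of retrieving from a memory of longer documents and then gating the result into the short text can be illustrated as follows. This is a minimal sketch, not the network proposed in the paper: it assumes texts are already embedded as fixed-length vectors, uses dot-product attention, and replaces the learned gate with a hypothetical hand-set sigmoid. All names and parameters here are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def expand_short_text(short_vec, memory, gate_bias=0.0):
    """Attend over the memory, then blend the retrieved vector into the short text.

    Returns the expanded vector and the attention weights over the memory.
    In a trained model the gate would be learned; here it is a fixed sigmoid
    of the agreement between the short text and the retrieved vector.
    """
    weights = softmax([dot(short_vec, doc) for doc in memory])
    dim = len(short_vec)
    retrieved = [sum(w * doc[d] for w, doc in zip(weights, memory)) for d in range(dim)]
    gate = 1.0 / (1.0 + math.exp(-(dot(short_vec, retrieved) + gate_bias)))
    expanded = [gate * r + (1.0 - gate) * s for s, r in zip(short_vec, retrieved)]
    return expanded, weights


# Hypothetical 2-dimensional embeddings: one short text and two longer documents.
short_vec = [1.0, 0.0]
memory = [[0.9, 0.1], [0.0, 1.0]]
expanded, weights = expand_short_text(short_vec, memory)
```

The attention weights sum to one and favour the memory entry most similar to the short text. The gate decides how much of the retrieved content replaces the original representation, which is the role the abstract assigns to the gating mechanism.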
Multimedia Answering and Retrieval System based on CQA with Media Query Generation
Question answering systems (QAS) have recently received attention from the information retrieval, machine learning, information extraction and natural language processing communities. The goal of a QAS is to retrieve the answer to a question rather than full documents. Such a system is built from modules for question processing, document processing and answer processing. A QAS does not work properly when its main module, question processing, fails to categorize questions correctly. To overcome this limitation, community question answering (CQA) has gained popularity. Compared with QAS and automated QA sites, CQA sites are more effective. The drawback of community question answering systems is that they provide only textual answers. In this paper, we propose a scheme that enriches textual answers with multimedia data. The scheme consists of three components: selection of the answer medium, query generation for multimedia search, and selection and presentation of multimedia data. The approach automatically determines which type of media information should be added to a textual answer, and then automatically collects data from the web to supplement the answer. By processing an available dataset of QA pairs and adding them to a pool, our approach enables a new multimedia question answering (MMQA) approach in which users find multimedia answers by matching their questions with those in the pool. Users can therefore rely on MMQA to answer questions from web information in different media formats (text, video and image), as selected by the users
Combining Multimodal External Resources for Event-based News Video Retrieval and Question Answering
Ph.D DOCTOR OF PHILOSOPH
Feature Extraction and Duplicate Detection for Text Mining: A Survey
Text mining, also known as Intelligent Text Analysis, is an important research area. It is very difficult to focus on the most appropriate information due to the high dimensionality of data. Feature extraction is one of the important data reduction techniques for discovering the most important features. Processing massive amounts of data stored in an unstructured form is a challenging task. Several pre-processing methods and algorithms are needed to extract useful features from huge amounts of data. The survey covers different text summarization, classification and clustering methods for discovering useful features, as well as the discovery of query facets, which are multiple groups of words or phrases that explain and summarize the content covered by a query, thereby reducing the time taken by the user. When dealing with a collection of text documents, it is also very important to filter out duplicate data. Once duplicates are deleted, it is recommended to replace the removed duplicates. Hence we also review the literature on duplicate detection and data fusion (remove and replace duplicates). The survey presents existing text mining techniques for extracting relevant features, detecting duplicates and replacing duplicate data to deliver fine-grained knowledge to the user
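The duplicate filtering step mentioned in the survey abstract can be illustrated with one common technique, word shingling combined with Jaccard similarity. This is a minimal sketch and not a method taken from the survey: the shingle size, the threshold and the function names are assumptions, and the sketch covers detection only, not the data fusion step that replaces removed duplicates.

```python
def shingles(text, n=3):
    """Return the set of word n-grams of a lower-cased text."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(docs, threshold=0.8, n=3):
    """Keep a document only if it is not a near duplicate of one already kept."""
    kept, kept_shingles = [], []
    for doc in docs:
        current = shingles(doc, n)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(doc)
            kept_shingles.append(current)
    return kept


# Hypothetical toy collection: the second document differs from the first only in case.
docs = [
    "text mining extracts useful features from data",
    "Text mining extracts useful features from data",
    "duplicate detection filters repeated documents",
]
unique_docs = deduplicate(docs)
```

The pairwise comparison is quadratic in the number of kept documents, which is acceptable for a small illustration. Larger collections usually call for an approximate scheme such as MinHash, which is outside the scope of this sketch.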
IntentSearch: capturing user intention for internet image search.
Liu, Ke.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (leaves 41-46).
Abstracts in English and Chinese.
Chapter 1 --- Introduction
Chapter 2 --- Related Work
Chapter 2.1 --- Keyword Expansion
Chapter 2.2 --- Content-based Image Search and Visual Expansion
Chapter 3 --- Algorithm
Chapter 3.1 --- Overview
Chapter 3.2 --- Visual Distance Calculation
Chapter 3.2.1 --- Visual Features
Chapter 3.2.2 --- Adaptive Weight Schema
Chapter 3.3 --- Keyword Expansion
Chapter 3.4 --- Visual Query Expansion
Chapter 3.5 --- Image Pool Expansion
Chapter 3.6 --- Textual Feature Combination
Chapter 4 --- Experimental Evaluation
Chapter 4.1 --- Dataset
Chapter 4.2 --- Experiment One: Evaluation with Ground Truth
Chapter 4.2.1 --- Precisions on Different Steps
Chapter 4.2.2 --- Accuracy of Keyword Expansion
Chapter 4.3 --- Experiment Two: User Study
Chapter 5 --- Conclusion