Data Modeling and Hybrid Query for Video Database
Video data management is important since the effective use of video in multimedia
applications is often impeded by the difficulty in cataloging and managing video data.
Major aspects of video data management include data modelling, indexing and querying.
Modelling is concerned with representing the structural properties of video as well as its
content. A video data model should be expressive enough to capture several
characteristics inherent to video. Depending on the underlying data model, video can
be indexed by text describing its semantics or by low-level visual features such as
colour. It is not reasonable to assume that all types of multimedia data can be described
sufficiently with words alone. Although query by text annotations complements query
by low-level features, query formulation in existing systems is still done separately.
Existing systems do not support combining these two types of queries, since there
are essential differences between querying multimedia data and querying traditional databases.
These differences cause us to consider new types of queries.

The purpose of this research is to model video data in a way that allows users to formulate
queries using a hybrid query mechanism. In this research, we define a video data model
that captures the hierarchical structure and contents of video. Based on this data model,
we design and develop a Video Database System (VDBS). We compared query
formulation using single query types against the hybrid query type, and the hybrid
type produced better results than either single type alone. We extend the Structured Query Language
(SQL) to support video functions and design a visual query interface for hybrid
queries, which combine exact and similarity-based queries.
Our research contributions include a video data model that captures the hierarchical
structure of video (sequence, scene, shot and key frame), as well as high-level concepts
(object, activity, event) and low-level visual features (colour, texture, shape and
location). By introducing video functions, the extended SQL supports queries on video
segments, high-level semantic concepts, and low-level visual features. The hybrid query formulation
has allowed the combination of query by text and query by example in a single query
statement. We have designed a visual query interface that would facilitate the hybrid
query formulation. In addition we have proposed a video database system architecture
that includes shot detection, annotation and query formulation modules. Future work
will consider the implementation and integration of these modules with other attributes of
video data, such as spatio-temporal relations and object motion.
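The hybrid query mechanism described above can be illustrated with a minimal sketch. All names and data here are invented for illustration and are not the VDBS implementation: a shot table carries a text annotation (for the exact part of the query) and a colour histogram (for the similarity part), and a hybrid query applies both predicates in one statement, ranking the survivors by visual similarity.

```python
import math

# Hypothetical in-memory shot table: each shot has a text annotation and a
# low-level colour histogram (all values here are illustrative, not real data).
SHOTS = [
    {"id": 1, "annotation": "goal celebration", "histogram": [0.7, 0.2, 0.1]},
    {"id": 2, "annotation": "goal kick",        "histogram": [0.1, 0.8, 0.1]},
    {"id": 3, "annotation": "crowd shot",       "histogram": [0.6, 0.3, 0.1]},
]

def euclidean(a, b):
    """Distance between two colour histograms (lower = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hybrid_query(keyword, example_histogram, max_distance):
    """One query combining an exact text predicate with a similarity
    predicate, in the spirit of a hybrid SELECT with a video function."""
    hits = [
        s for s in SHOTS
        if keyword in s["annotation"]  # exact, annotation-based part
        and euclidean(s["histogram"], example_histogram) <= max_distance  # similarity part
    ]
    # rank the surviving shots by visual similarity to the example frame
    hits.sort(key=lambda s: euclidean(s["histogram"], example_histogram))
    return [s["id"] for s in hits]

# "Find shots annotated 'goal' that look like this example frame"
result = hybrid_query("goal", [0.65, 0.25, 0.1], max_distance=0.5)
```

A query-by-text-only system would return shots 1 and 2; the similarity predicate prunes shot 2, whose colours differ strongly from the example, which is the kind of precision gain a hybrid query offers over either single query type.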
viSQLizer: Using visualization for learning SQL
Structured Query Language (SQL) is used for interaction between database technology and its users. In higher education, students often struggle with understanding the underlying logic of SQL, and thus have trouble understanding how and why a result table is created from a query. A prototype of a visual learning tool for SQL, viSQLizer, has been developed to determine whether visualizations could help students create a mental model and thus enhance their understanding of the underlying logic of SQL. Through the use of animation and decomposition, our results indicate that visualizations might give students a better understanding of the underlying logic, and that students gain the same learning outcome through visualizations as when using an online tutorial with explanatory text and exercises. Feedback from professors and students, gathered in interviews and experiments, indicates that the tool could be used by professors as a visualization tool in lectures and by students as a practical tool; not as a replacement of, but as an addition to, traditional teaching methods.
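The decomposition idea behind such a tool can be sketched as follows. This is not viSQLizer's code; it is a minimal illustration, with an invented table, of how a query can be evaluated one logical step at a time so that each intermediate table could be shown to a learner.

```python
# Invented sample table for the walkthrough.
STUDENTS = [
    {"name": "Ana",  "points": 82},
    {"name": "Bo",   "points": 45},
    {"name": "Cleo", "points": 91},
]

def run_query_with_trace(rows):
    """Evaluate
        SELECT name FROM students WHERE points >= 60 ORDER BY points DESC
    step by step, recording each intermediate result so it could be
    animated for a learner."""
    steps = []
    after_from = list(rows)                       # FROM: start with all rows
    steps.append(("FROM", after_from))
    after_where = [r for r in after_from if r["points"] >= 60]  # WHERE: filter rows
    steps.append(("WHERE", after_where))
    after_order = sorted(after_where, key=lambda r: r["points"], reverse=True)  # ORDER BY: sort
    steps.append(("ORDER BY", after_order))
    after_select = [r["name"] for r in after_order]  # SELECT: project columns last
    steps.append(("SELECT", after_select))
    return steps

trace = run_query_with_trace(STUDENTS)
result = trace[-1][1]  # the final result table
```

The point the visualization makes is that SELECT is logically evaluated last, after FROM, WHERE and ORDER BY, which is exactly the kind of underlying logic students report struggling with.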
Visual exploration and retrieval of XML document collections with the generic system X2
This article reports on the XML retrieval system X2 which has been developed at the University of Munich over the last five years. In a typical session with X2, the user
first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically.
After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2, which distinguishes it from other visual query systems for XML, is that it supports various degrees of detail in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.
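The "structural summary" a user browses in the first step can be sketched as the set of distinct element paths in the collection. This is not X2's actual summary structure, just a minimal illustration over an invented document using the standard library XML parser.

```python
import xml.etree.ElementTree as ET

# Invented sample collection; a structural summary lists the distinct
# root-to-element paths so a user can pick elements before composing a query.
DOC = """
<articles>
  <article><title>XML retrieval</title><body><p>text</p><p>more</p></body></article>
  <article><title>Visual queries</title><body><p>text</p></body></article>
</articles>
"""

def structural_summary(xml_text):
    """Return the sorted set of element paths occurring in the document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        paths.add(path)          # duplicates collapse: summary stays small
        for child in elem:
            walk(child, path)

    walk(root, "")
    return sorted(paths)

summary = structural_summary(DOC)
```

Even though the sample collection has two articles and three paragraphs, the summary has only five entries; this collapsing is what makes such a summary a practical browsing aid for large collections.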
FVQA: Fact-based Visual Question Answering
Visual Question Answering (VQA) has attracted a lot of attention in both
Computer Vision and Natural Language Processing communities, not least because
it offers insight into the relationships between two important sources of
information. Current datasets, and the models built upon them, have focused on
questions which are answerable by direct analysis of the question and image
alone. The set of such questions that require no external information to answer
is interesting, but very limited. It excludes questions which require common
sense, or basic factual knowledge to answer, for example. Here we introduce
FVQA, a VQA dataset which requires, and supports, much deeper reasoning. FVQA
only contains questions which require external information to answer.
We thus extend a conventional visual question answering dataset, which
contains image-question-answer triplets, with additional
image-question-answer-supporting-fact tuples. The supporting fact is
represented as a structural triplet, such as <Cat, CapableOf, ClimbingTrees>.
We evaluate several baseline models on the FVQA dataset, and describe a novel
model which is capable of reasoning about an image on the basis of supporting
facts.
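The role of a supporting fact can be sketched with a toy lookup. This is not the paper's model, only an illustration (with an invented knowledge base and a pre-detected visual concept) of how a structural triplet lets a system answer a question that the image alone cannot.

```python
# Toy knowledge base of supporting facts, stored as
# (subject, relation, object) structural triplets; entries are illustrative.
FACTS = [
    ("cat", "CapableOf", "climbing trees"),
    ("umbrella", "UsedFor", "keeping dry"),
    ("fire", "HasProperty", "hot"),
]

def answer_with_fact(detected_concept, relation):
    """Given a concept detected in the image and the relation the question
    asks about, return the answer together with the supporting fact, so the
    reasoning behind the answer stays inspectable."""
    for subj, rel, obj in FACTS:
        if subj == detected_concept and rel == relation:
            return obj, (subj, rel, obj)
    return None, None

# e.g. "What can the animal in the image do?" with 'cat' detected visually
answer, fact = answer_with_fact("cat", "CapableOf")
```

Returning the fact alongside the answer mirrors the dataset's design goal: the external information needed to answer is made explicit rather than left implicit in model weights.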
DCU and UTA at ImageCLEFPhoto 2007
Dublin City University (DCU) and University of Tampere (UTA) participated in the ImageCLEF 2007 photographic ad-hoc retrieval task with several monolingual and bilingual
runs. Our approach was language independent: text retrieval based on fuzzy s-gram query translation was combined with visual retrieval. Data fusion between text and image content
was performed using unsupervised query-time weight generation approaches. Our baseline was a combination of dictionary-based query translation and visual retrieval, which achieved the best result. The best mixed-modality runs using fuzzy s-gram translation achieved on average around 83% of the performance of the baseline. Performance was more similar when only top-rank precision levels of P10 and P20 were considered. This suggests that fuzzy s-gram query translation combined with visual retrieval is a cheap alternative for cross-lingual image retrieval where only a small number of relevant items are required. Both sets of results emphasize the merit of our query-time weight generation schemes for data fusion, with the fused runs exhibiting marked performance increases over the single modalities; this is achieved without the use of any prior training data.
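The general shape of unsupervised query-time fusion can be sketched as follows. The weighting scheme below (normalise each modality's scores, then weight by the separation of its top scores as a crude confidence signal) is an illustrative stand-in, not the exact scheme used in the runs; document ids and scores are invented.

```python
def normalise(scores):
    """Min-max normalise one modality's scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def fuse(text_scores, image_scores):
    """Weighted-sum fusion of text and image retrieval scores, with weights
    derived from the score lists themselves at query time (no training data)."""
    t, i = normalise(text_scores), normalise(image_scores)

    def separation(norm):
        # gap between the top two scores: a crude per-query confidence signal
        top = sorted(norm.values(), reverse=True)
        return (top[0] - top[1]) if len(top) > 1 else 1.0

    wt, wi = separation(t) + 1e-9, separation(i) + 1e-9
    total = wt + wi
    wt, wi = wt / total, wi / total  # weights sum to 1

    docs = set(t) | set(i)
    fused = {d: wt * t.get(d, 0.0) + wi * i.get(d, 0.0) for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

ranking = fuse({"d1": 9.0, "d2": 3.0, "d3": 1.0},
               {"d2": 0.8, "d1": 0.7, "d3": 0.1})
```

Here the text run separates its top documents sharply while the image run does not, so the fused ranking leans on the text scores; deriving such weights per query, from the scores alone, is what makes the approach usable without prior training data.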