    Data Modeling and Hybrid Query for Video Database

    Video data management is important since the effective use of video in multimedia applications is often impeded by the difficulty of cataloguing and managing video data. Major aspects of video data management include data modelling, indexing and querying. Modelling is concerned with representing the structural properties of video as well as its content, and a video data model should be expressive enough to capture several characteristics inherent to video. Depending on the underlying data model, video can be indexed by text describing its semantics or by low-level visual features such as colour. It is not reasonable to assume that all types of multimedia data can be described sufficiently with words alone. Although query by text annotation complements query by low-level features, query formulation in existing systems is still done separately: existing systems do not support the combination of these two query types, since there are essential differences between querying multimedia data and querying traditional databases. These differences lead us to consider new types of queries. The purpose of this research is to model video data in a way that allows users to formulate queries through a hybrid query mechanism. We define a video data model that captures the hierarchical structure and contents of video, and based on this data model we design and develop a Video Database System (VDBS). We compared query formulation using single query types against the hybrid query type, and the hybrid type produced better results. We extend the Structured Query Language (SQL) to support video functions and design a visual query interface for hybrid queries, which combine exact and similarity-based predicates. Our contributions include a video data model that captures the hierarchical structure of video (sequence, scene, shot and key frame) as well as high-level concepts (object, activity, event) and low-level visual features (colour, texture, shape and location). By introducing video functions, the extended SQL supports queries on video segments, semantics and low-level visual features. The hybrid query formulation allows query by text and query by example to be combined in a single query statement, and we have designed a visual query interface that facilitates such formulation. In addition, we propose a video database system architecture that includes shot detection, annotation and query formulation modules. Further work will consider the implementation and integration of these modules, along with other attributes of video data such as spatio-temporal relationships and object motion.
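    As a purely illustrative sketch of what such a hybrid statement could look like, the query below combines an exact predicate over semantic annotations with a similarity-based, query-by-example predicate over a shot's key frame. The video functions (CONTAINS_EVENT, SIMILARITY, IMAGE_FILE) and the schema are hypothetical, not the syntax defined by the thesis:

        -- Hypothetical extended-SQL hybrid query: exact text-annotation
        -- predicate plus query-by-example colour similarity on key frames.
        SELECT   shot_id, start_time, end_time
        FROM     shots
        WHERE    CONTAINS_EVENT(shot_id, 'goal')          -- exact (semantic annotation)
          AND    SIMILARITY(key_frame,
                            IMAGE_FILE('example.jpg'),
                            'colour') > 0.8               -- similarity (query by example)
        ORDER BY SIMILARITY(key_frame,
                            IMAGE_FILE('example.jpg'),
                            'colour') DESC;

    The point of the single-statement form is that the exact and similarity conditions are evaluated together, rather than in two separate query interfaces.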

    viSQLizer: Using visualization for learning SQL

    Structured Query Language (SQL) is used for interaction between database technology and its users. In higher education, students often struggle to understand the underlying logic of SQL, and thus have trouble understanding how and why a result table is created from a query. A prototype of a visual learning tool for SQL, viSQLizer, has been developed to determine whether visualizations could help students create a mental model and thus enhance their understanding of the underlying logic of SQL. Through the use of animation and decomposition, our results indicate that visualizations may give students a better understanding of this underlying logic, and that students gain the same learning outcome through visualizations as when using an online tutorial with explanatory text and exercises. Feedback from professors and students, gathered in interviews and experiments, indicates that the tool could be used by professors as a visualization aid in lectures and by students as a practical tool; not as a replacement for, but as an addition to, traditional teaching methods.
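    The "underlying logic" students struggle with is largely SQL's logical evaluation order, which differs from the written order of clauses; this is exactly the kind of decomposition such a tool can animate step by step. The query below is an illustrative example of our own, not one taken from the viSQLizer study:

        -- Written order vs. logical evaluation order (numbered comments):
        SELECT   dept, AVG(salary) AS avg_sal   -- 5. project and aggregate
        FROM     employees                      -- 1. take the source table
        WHERE    hire_date < '2020-01-01'       -- 2. filter individual rows
        GROUP BY dept                           -- 3. form groups
        HAVING   COUNT(*) > 5                   -- 4. filter whole groups
        ORDER BY avg_sal DESC;                  -- 6. sort the final result

    Animating the intermediate table produced at each numbered step makes it visible how and why the result table comes to look the way it does.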

    Visual exploration and retrieval of XML document collections with the generic system X2

    This article reports on the XML retrieval system X2, which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semi-automatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these views on the data to be retrieved. Another salient characteristic of X2, which distinguishes it from other visual query systems for XML, is that it supports varying degrees of detail in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.

    FVQA: Fact-based Visual Question Answering

    Visual Question Answering (VQA) has attracted a lot of attention in both the Computer Vision and Natural Language Processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of such questions that require no external information to answer is interesting, but very limited: it excludes, for example, questions which require common sense or basic factual knowledge to answer. Here we introduce FVQA, a VQA dataset which requires, and supports, much deeper reasoning. FVQA only contains questions which require external information to answer. We thus extend a conventional visual question answering dataset, which contains image-question-answer triplets, to image-question-answer-supporting-fact tuples. The supporting fact is represented as a structural triplet, such as <Cat, CapableOf, ClimbingTrees>. We evaluate several baseline models on the FVQA dataset, and describe a novel model which is capable of reasoning about an image on the basis of supporting facts.
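    A minimal relational sketch of the tuple structure described above may help; this is our own illustrative framing (the dataset is not distributed as SQL, and all table and column names here are hypothetical):

        -- Conventional VQA data: image-question-answer triplets.
        CREATE TABLE vqa_triplets (
            image_id  TEXT,
            question  TEXT,
            answer    TEXT
        );

        -- FVQA extension: each record additionally carries a supporting
        -- fact, itself a structural (subject, relation, object) triplet,
        -- e.g. (Cat, CapableOf, ClimbingTrees).
        CREATE TABLE fvqa_tuples (
            image_id      TEXT,
            question      TEXT,
            answer        TEXT,
            fact_subject  TEXT,   -- e.g. 'Cat'
            fact_relation TEXT,   -- e.g. 'CapableOf'
            fact_object   TEXT    -- e.g. 'ClimbingTrees'
        );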

    DCU and UTA at ImageCLEFPhoto 2007

    Dublin City University (DCU) and the University of Tampere (UTA) participated in the ImageCLEF 2007 photographic ad-hoc retrieval task with several monolingual and bilingual runs. Our approach was language independent: text retrieval based on fuzzy s-gram query translation was combined with visual retrieval. Data fusion between the text and image content was performed using unsupervised query-time weight generation approaches. Our baseline was a combination of dictionary-based query translation and visual retrieval, which achieved the best result. The best mixed-modality runs using fuzzy s-gram translation achieved on average around 83% of the performance of the baseline, and performance was more similar when only the top-rank precision levels of P10 and P20 were considered. This suggests that fuzzy s-gram query translation combined with visual retrieval is a cheap alternative for cross-lingual image retrieval where only a small number of relevant items are required. Both sets of results emphasize the merit of our query-time weight generation schemes for data fusion, with the fused runs exhibiting marked performance increases over the single modalities; this is achieved without the use of any prior training data.
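    The fusion step itself can be pictured as a CombSUM-style weighted linear combination of the per-modality scores. The sketch below assumes two hypothetical per-modality result tables and fixed stand-in weights (0.6/0.4); it illustrates the general technique only, not the specific query-time weight generation scheme used in the DCU/UTA runs:

        -- Weighted linear fusion of text and visual retrieval scores.
        -- text_runs(doc_id, score) and visual_runs(doc_id, score) are
        -- hypothetical tables; 0.6 and 0.4 stand in for the weights
        -- that would be generated at query time.
        SELECT   COALESCE(t.doc_id, v.doc_id)   AS doc_id,
                 0.6 * COALESCE(t.score, 0)
               + 0.4 * COALESCE(v.score, 0)     AS fused_score
        FROM     text_runs t
        FULL OUTER JOIN visual_runs v ON t.doc_id = v.doc_id
        ORDER BY fused_score DESC;

    The COALESCE calls keep documents retrieved by only one modality in the fused ranking, with a zero score contribution from the missing modality.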