
    Highly efficient low-level feature extraction for video representation and retrieval.

    PhD thesis. Witnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on their content and the rich semantics involved. Current content-based video indexing and retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval, to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on prediction information extracted directly from compressed-domain features and on robust, scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm, which runs in real time while maintaining high precision and recall in the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates a hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking video clips with a limited lexicon of related keywords.
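As a rough illustration of the temporal-analysis step described above, the following sketch flags shot boundaries from colour-histogram differences between consecutive frames. This is an assumption for illustration only: the thesis works on compressed-domain prediction information, whereas this minimal pixel-domain version (function name, bin count, and threshold all hypothetical) only conveys the general idea of thresholding a frame-to-frame dissimilarity signal.

```python
import numpy as np

def detect_shot_boundaries(frames, threshold=0.4):
    """Flag frame indices where the colour-histogram difference to the
    previous frame exceeds a threshold (a crude shot-boundary cue)."""
    boundaries = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=16, range=(0, 256))
        hist = hist / hist.sum()  # normalise so frame size does not matter
        if prev_hist is not None:
            # L1 distance between normalised histograms lies in [0, 2]
            if np.abs(hist - prev_hist).sum() > threshold:
                boundaries.append(i)
        prev_hist = hist
    return boundaries
```

Key frames could then be sampled from within each detected segment, e.g. the frame closest to the segment's mean histogram.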

    Text-to-picture tools, systems, and approaches: a survey

    Text-to-picture systems attempt to facilitate high-level, user-friendly communication between humans and computers while promoting understanding of natural language. These systems interpret a natural language text and transform it into a visual format as pictures or images that are either static or dynamic. In this paper, we aim to identify current difficulties and the main problems faced by prior systems, and in particular, we seek to investigate the feasibility of automatic visualization of Arabic story text through multimedia. Hence, we analyzed a number of well-known text-to-picture systems, tools, and approaches. We showed their constituent steps, such as knowledge extraction, mapping, and image layout, as well as their performance and limitations. We also compared these systems based on a set of criteria, mainly natural language processing, natural language understanding, and input/output modalities. Our survey showed that currently emerging techniques in natural language processing tools and computer vision have made promising advances in analyzing general text and understanding images and videos. Furthermore, important remarks and findings have been deduced from these prior works, which would help in developing an effective text-to-picture system for learning and educational purposes. © 2019, The Author(s). This work was made possible by NPRP grant #10-0205-170346 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

    Complex query learning in semantic video search

    PhD thesis, Doctor of Philosophy.

    Video retrieval using objects and ostensive relevance feedback

    The thesis discusses and evaluates a model of video information retrieval that incorporates a variation of relevance feedback and facilitates object-based interaction and ranking. Video and image retrieval systems suffer from poor retrieval performance compared to text-based information retrieval systems, mainly due to the poor discrimination power of the visual features that provide the search index. Relevance feedback is an iterative approach in which the user provides the system with relevant and non-relevant judgements of the results, and the system re-ranks the results based on those judgements. Relevance feedback for video retrieval can help overcome the poor discrimination power of the features, with the user essentially pointing the system in the right direction. The ostensive relevance feedback approach discussed in this work weights user judgements based on the order in which they are made, with newer judgements weighted higher than older ones. The main aim of the thesis is to explore the benefit of ostensive relevance feedback for video retrieval, with a secondary aim of exploring the effectiveness of object retrieval. A user experiment has been developed in which three video retrieval system variants are evaluated on a corpus of video content. The first system applies standard relevance feedback weighting, while the second and third apply ostensive relevance feedback with variations in the decay weight. In order to evaluate object retrieval effectively, animated video content provides the corpus for the evaluation experiment, as animated content offers the highest performance for object detection and extraction.
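The recency weighting at the heart of ostensive relevance feedback can be sketched in a few lines. This is a minimal illustration, not the thesis's actual formulation: the function name, the geometric decay scheme, and the idea of averaging judged items' feature vectors into a query profile are all assumptions made here for clarity.

```python
import numpy as np

def ostensive_profile(judged_vectors, decay=0.5):
    """Combine feature vectors of relevance-judged items into a query
    profile, weighting newer judgements more heavily than older ones.

    judged_vectors: list of feature vectors, oldest judgement first.
    decay: factor in (0, 1]; the k-th newest item gets weight decay**k.
    """
    n = len(judged_vectors)
    # newest judgement gets weight 1; each step back multiplies by decay
    weights = np.array([decay ** (n - 1 - i) for i in range(n)])
    weights /= weights.sum()
    return np.average(np.array(judged_vectors), axis=0, weights=weights)
```

With `decay=1.0` this reduces to standard (uniform) relevance feedback, which matches the experimental design above of comparing a standard variant against ostensive variants with different decay weights.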

    Inter-query Learning in Content-based Image Retrieval

    Computer Science.

    Self-supervised Face Representation Learning

    This thesis investigates fine-tuning deep face features in a self-supervised manner for discriminative face representation learning, wherein we develop methods to automatically generate pseudo-labels for training a neural network. Most importantly, solving this problem helps us to advance the state of the art in representation learning and can benefit a variety of practical downstream tasks. Fortunately, there is a vast amount of video on the internet that machines can use to learn an effective representation. We present methods that can learn a strong face representation from large-scale data in the form of images or videos. While learning a good representation with a deep learning algorithm usually requires a large-scale dataset with manually curated labels, we propose self-supervised approaches that generate pseudo-labels by exploiting the temporal structure of the video data and similarity constraints, obtaining supervision from the data itself. We aim to learn a representation that exhibits small distances between samples from the same person and large inter-person distances in feature space. Metric learning can achieve this, as it comprises a pull term, pulling data points from the same class closer, and a push term, pushing data points from a different class further away. Metric learning for improving feature quality is useful but requires some form of external supervision to provide labels for same or different pairs. In the case of face clustering in TV series, we may obtain this supervision from tracks and other cues. The tracking acts as a form of high-precision clustering (grouping detections within a shot) and is used to automatically generate positive and negative pairs of face images. Inspired by this, we propose two variants of discriminative approaches: Track-supervised Siamese network (TSiam) and Self-supervised Siamese network (SSiam).
    In TSiam, we utilize the tracking supervision to obtain the pairs; additionally, we include negative training pairs for singleton tracks -- tracks that are not temporally co-occurring. As supervision from tracking may not always be available, to enable metric learning without any supervision we propose an effective approach, SSiam, that can generate the required pairs automatically during training. In SSiam, we leverage dynamic generation of positive and negative pairs based on sorting distances (i.e. ranking) on a subset of frames, and do not have to rely solely on video/track-based supervision. Next, we present Clustering-based Contrastive Learning (CCL), a new clustering-based representation learning approach that utilizes automatically discovered partitions obtained from a clustering algorithm (FINCH) as weak supervision, along with inherent video constraints, to learn discriminative face features. As annotating datasets is costly and difficult, using label-free weak supervision obtained from a clustering algorithm as a proxy learning task is promising. Through our analysis, we show that creating positive and negative training pairs from clustering predictions helps to improve the performance of video face clustering. We then propose face grouping on graphs (FGG), a method for unsupervised fine-tuning of deep face feature representations. We utilize a graph structure with positive and negative edges over a set of face tracks, based on the temporal structure of the video data and similarity-based constraints. Using graph neural networks, the features communicate over the edges, allowing each track's feature to exchange information with its neighbors, and thus pushing each representation in a direction in feature space that groups all representations of the same person together and separates representations of different persons.
    Having developed these methods to generate weak labels for face representation learning, we next propose to encode face tracks in videos into compact yet effective descriptors, complementing the previous methods towards learning a more powerful face representation. Specifically, we propose Temporal Compact Bilinear Pooling (TCBP) to encode the temporal segments of videos into a compact descriptor. TCBP can capture interactions between each element of the feature representation and every other over a long-range temporal context. We integrated our previous methods TSiam, SSiam, and CCL with TCBP and demonstrated that TCBP has excellent capabilities for learning a strong face representation. We further show that TCBP has exceptional transfer abilities to applications such as multimodal video clip representation that jointly encodes images, audio, video, and text, and to video classification. All of these contributions are demonstrated on benchmark video clustering datasets: The Big Bang Theory, Buffy the Vampire Slayer, and Harry Potter 1. We provide extensive evaluations on these datasets, achieving a significant boost in performance over the base features and in comparison to state-of-the-art results.
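The SSiam-style idea of mining pairs by sorting distances can be illustrated with a small sketch. This is a simplified assumption-laden version (function name and nearest/farthest rule chosen here for brevity): the actual method works on sorted distances over subsets of frames during training, while this sketch just takes each sample's nearest neighbour as a pseudo-positive and its farthest as a pseudo-negative.

```python
import numpy as np

def generate_pairs(features):
    """Pseudo-pair generation by ranking distances: for every sample,
    treat its nearest neighbour (excluding itself) as a positive pair and
    its farthest neighbour as a negative pair, under Euclidean distance."""
    X = np.asarray(features, dtype=float)
    # pairwise squared Euclidean distances, shape (n, n)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)   # a sample cannot pair with itself
    positives = [(i, int(np.argmin(d2[i]))) for i in range(len(X))]
    np.fill_diagonal(d2, -np.inf)  # exclude self from the farthest search too
    negatives = [(i, int(np.argmax(d2[i]))) for i in range(len(X))]
    return positives, negatives
```

These pairs would then feed the pull and push terms of the metric-learning loss described above.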

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher-dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at a fixed time to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a supernova.
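A first-order Markovian propagation of a spatial model can be sketched very simply. This is purely illustrative and not the paper's model: here each "spatial model" is reduced to a mean and covariance per snapshot, and the hypothetical blending factor `alpha` links each fit to the previous stage, which is the essence of first-order (one-step-memory) propagation.

```python
import numpy as np

def propagate_model(snapshots, alpha=0.7):
    """First-order Markov propagation of a spatial model: the model at
    time t is a convex blend of the fresh fit to snapshot t and the model
    carried over from t-1 (each 'model' here is just a mean and covariance)."""
    models = []
    prev = None
    for points in snapshots:
        mu = points.mean(axis=0)           # fresh spatial fit at time t
        cov = np.cov(points, rowvar=False)
        if prev is not None:
            # blend with the model propagated from the previous stage
            mu = alpha * mu + (1 - alpha) * prev[0]
            cov = alpha * cov + (1 - alpha) * prev[1]
        prev = (mu, cov)
        models.append(prev)
    return models
```

A richer probabilistic model (e.g. a mixture over the manifold) would be propagated the same way, parameters at time t informing the fit at t+1.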

    Social impact retrieval: measuring author influence on information retrieval

    The increased presence of technologies collectively referred to as Web 2.0 means the entire process of new media production and dissemination has moved away from an author-centric approach. Casual web users and browsers are increasingly able to play a more active role in the information creation process. This means that the traditional ways in which information sources are validated and scored must adapt accordingly. In this thesis we propose a new way to look at a user's contributions to the network in which they are present, using these interactions to provide a measure of authority and centrality for the user. This measure is then used to attribute a query-independent interest score to each of the contributions the author makes, enabling us to provide other users with relevant information which has been of greatest interest to a community of like-minded users. This is done through the development of two algorithms: AuthorRank and MessageRank. We present two real-world user experiments focused on multimedia annotation and browsing systems that we built; these systems were novel in themselves, bringing together video and text browsing as well as free-text annotation. Using these systems as examples of real-world applications for our approaches, we then look at a larger-scale experiment based on the author and citation networks of a ten-year period (1997-2007) of the ACM SIGIR conference on information retrieval. We use the citation context of SIGIR publications as a proxy for annotations, constructing large social networks between authors. Against these networks we show the effectiveness of incorporating user-generated content, or annotations, to improve information retrieval.
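Authority scores of the AuthorRank/MessageRank kind are typically computed with PageRank-style power iteration over the interaction graph. The sketch below is a generic, hedged illustration of that family of methods, not the thesis's actual algorithms; the function name, the adjacency convention, and the dangling-node handling are assumptions made here.

```python
import numpy as np

def author_rank(adj, damping=0.85, iters=100):
    """PageRank-style authority score over an author interaction graph.

    adj[i][j] = 1 if author i cites/interacts with author j.
    Returns a score vector that sums to 1."""
    A = np.asarray(adj, dtype=float)
    n = A.shape[0]
    out = A.sum(axis=1, keepdims=True)
    # row-normalise; authors with no outgoing links distribute uniformly
    P = np.where(out > 0, A / np.where(out == 0, 1, out), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * r @ P
    return r
```

Scores computed this way can then be used as the query-independent interest component described above, combined with a standard query-dependent retrieval score.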

    Model-driven development of content-based image retrieval systems on the basis of object-relational database management systems

    In this thesis, the model-driven software development paradigm is employed to support the development of content-based image retrieval systems (CBIRS) for different application domains. Modelling techniques, based on an adaptable conceptual framework model, are proposed for deriving the components of a concrete CBIRS. Transformation techniques are defined to automatically implement the derived application-specific models in an object-relational database management system. A set of criteria assuring the quality of the transformation is derived from the theory of information-capacity preservation applied in database design.

    A survey of the application of soft computing to investment and financial trading
