Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher accuracy rate and lower computational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared. Comment: Published 201
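The matching step described above, comparing extracted feature vectors with a fixed similarity or dissimilarity measure, can be sketched as follows. This is a minimal illustration and not a method taken from the review: the gallery identities, the three-bin colour histograms, and the choice of Euclidean and Bhattacharyya distances are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def bhattacharyya(p, q):
    """Bhattacharyya distance between two normalised histograms."""
    bc = sum(math.sqrt(x * y) for x, y in zip(p, q))
    return -math.log(bc) if bc > 0 else math.inf

def rank_gallery(probe, gallery, distance=euclidean):
    """Return gallery identities sorted from most to least similar to the probe."""
    return sorted(gallery, key=lambda pid: distance(probe, gallery[pid]))

# Hypothetical normalised colour histograms, one per gallery identity.
gallery = {
    "person_a": [0.7, 0.2, 0.1],
    "person_b": [0.1, 0.3, 0.6],
    "person_c": [0.4, 0.4, 0.2],
}
probe = [0.6, 0.3, 0.1]

ranking_euclid = rank_gallery(probe, gallery)
ranking_bhatt = rank_gallery(probe, gallery, distance=bhattacharyya)
print(ranking_euclid)
```

Approaches that learn an optimised metric would replace the `distance` argument with a trained function; the ranking step itself stays the same.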
Highly efficient low-level feature extraction for video representation and retrieval.
PhD
Witnessing the omnipresence of digital video media, the research community has
raised the question of its meaningful use and management. Stored in immense
multimedia databases, digital videos need to be retrieved and structured in an
intelligent way, relying on the content and the rich semantics involved. Current
Content Based Video Indexing and Retrieval systems face the problem of the semantic
gap between the simplicity of the available visual features and the richness of user
semantics.
This work focuses on the issues of efficiency and scalability in video indexing and
retrieval to facilitate a video representation model capable of semantic annotation. A
highly efficient algorithm for temporal analysis and key-frame extraction is developed.
It is based on the prediction information extracted directly from the compressed domain
features and the robust scalable analysis in the temporal domain. Furthermore,
a hierarchical quantisation of the colour features in the descriptor space is presented.
Derived from the extracted set of low-level features, a video representation model that
enables semantic annotation and contextual genre classification is designed.
Results demonstrate the efficiency and robustness of the temporal analysis algorithm,
which runs in real time while maintaining high precision and recall in the detection task.
Adaptive key-frame extraction and summarisation achieve a good overview of the
visual content, while the colour quantisation algorithm efficiently creates a hierarchical
set of descriptors. Finally, the video representation model, supported by the genre
classification algorithm, achieves excellent results in an automatic annotation system by
linking the video clips with a limited lexicon of related keywords.
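The idea of a hierarchical set of colour descriptors can be illustrated with a simple coarse-to-fine quantisation. This sketch is not the thesis algorithm, whose details are not given in the abstract: it assumes 8-bit RGB pixels and uses uniform bit truncation, so that every bin at one level has exactly one parent bin at the coarser level. The function names and sample pixels are hypothetical.

```python
def quantise(pixel, bits):
    """Keep the top `bits` bits of each 8-bit channel."""
    shift = 8 - bits
    return tuple(c >> shift for c in pixel)

def hierarchical_histograms(pixels, levels=(1, 2, 3)):
    """Build one colour histogram per quantisation level (coarse to fine)."""
    hists = {}
    for bits in levels:
        hist = {}
        for p in pixels:
            key = quantise(p, bits)
            hist[key] = hist.get(key, 0) + 1
        hists[bits] = hist
    return hists

def parent(bin_key):
    """Map a bin to its parent bin one level coarser."""
    return tuple(c >> 1 for c in bin_key)

# Hypothetical pixels: two reds, one blue, one olive tone.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (130, 120, 10)]
hists = hierarchical_histograms(pixels)
print(hists[1])
print(hists[2])
```

Because child bins aggregate exactly into their parents, a coarse level can be used for fast filtering and a finer level for re-ranking.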
MAC-REALM: A video content feature extraction and modelling framework
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A consequence of the "data deluge" is the exponential increase in digital video footage, while the ability to find relevant video clips diminishes. Traditional text-based search engines are no longer optimal for searching, as they cannot provide a granular search of the content inside video footage. To search video in a content-based manner, the content features of the video need to be extracted and modelled into a content model, which can then act as a searchable proxy for the video content. This thesis focuses on the extraction of syntactic and semantic content features and on content modelling, using machine-driven processes with little or no user interaction. Our abstract framework design extracts syntactic and semantic content features and compiles them into an integrated content model. The framework integrates a four-plane strategy that consists of: a pre-processing plane that removes redundant data and filters the media to improve the feature extraction properties of the media; a syntactic feature extraction plane that extracts low-level syntactic features and mid-level syntactic features that have semantic attributes; a semantic relationship analysis and linkage plane, where the spatial and temporal relationships of all the content features are defined; and finally a content modelling stage, where the syntactic and semantic content features are integrated into a content model. Each of the four planes can be split into three layers, namely: the content layer, where the content to be processed is stored; the application layer, where the content is converted into content descriptions; and the MPEG-7 layer, where content descriptions are serialised. Using MPEG-7 standards to produce the content model will provide wide-ranging interoperability, while facilitating granular multi-content type searches.
The framework aims to "bridge" the semantic gap by integrating the syntactic and semantic content features from extraction through to modelling. The design of the framework has been implemented in a prototype called MAC-REALM, which has been tested and evaluated for its effectiveness in extracting and modelling content features. Conclusions are drawn about the research output as a whole and whether it has met the objectives. Finally, future work is presented on how concept detection and crowd sourcing can be used with MAC-REALM.
Region-based Multimedia Indexing and Retrieval Framework
Many systems have been proposed for automatic description and indexing of digital data, for later retrieval. One such content-based indexing-and-retrieval system, and the one used as a framework in this thesis, is the MUVIS system, which was developed at Tampere University of Technology in Finland. Moreover, Content-based Image Retrieval (CBIR) utilising frame-based and region-based features has been a dynamic research area in the past years. Several systems have been developed using their specific segmentation, feature extraction, and retrieval methods.
In this thesis, a framework for modelling a regionalised CBIR system is presented. The framework does not specify or fix the segmentation and local feature extraction methods, which are instead considered as "black boxes" so as to allow the application of any segmentation method and visual descriptor. The proposed framework adopts a grouping approach in order to correct possible over-segmentation faults, and a spatial feature called region proximity is introduced to describe the topology of regions in a visual scene using a block-based approach.
Using the MUVIS system, a prototype system of the proposed framework is implemented as a region-based feature extraction module, which integrates simple colour segmentation and region-based feature description based on colour and texture. The spatial region proximity feature represents regions and describes their topology by a novel metric proposed in this thesis, based on the block-based approach and average distance calculation.
After the region-based feature extraction step, a feature vector is formed which holds information about all image regions with their local low-level and spatial properties. During the retrieval process, those feature vectors are used for computing the (dis-)similarity distances between two images, taking into account each of their individual components. In this case a many-to-one matching scheme between regions characterised by a similarity maximisation approach is integrated into a query-by-example scheme.
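A many-to-one matching scheme driven by similarity maximisation can be sketched as below. This is an assumed reading of the abstract rather than the thesis implementation: each query region is matched to its most similar target region (so several query regions may share one target region), and the image-level score is taken as the mean of those best similarities. The similarity function `1 / (1 + distance)` and the two-dimensional region descriptors are hypothetical.

```python
import math

def region_similarity(r1, r2):
    """Similarity in (0, 1], derived from Euclidean distance between descriptors."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 / (1.0 + d)

def match_regions(query_regions, target_regions):
    """For each query region, pick the target region that maximises similarity."""
    matches = []
    for qi, q in enumerate(query_regions):
        best = max(range(len(target_regions)),
                   key=lambda ti: region_similarity(q, target_regions[ti]))
        matches.append((qi, best, region_similarity(q, target_regions[best])))
    return matches

def image_similarity(query_regions, target_regions):
    """Mean of the best per-region similarities (assumed aggregation rule)."""
    matches = match_regions(query_regions, target_regions)
    return sum(s for _, _, s in matches) / len(matches)

# Hypothetical region descriptors for a query image and a database image.
query = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
target = [[0.85, 0.15], [0.0, 1.0]]

pairs = [(qi, ti) for qi, ti, _ in match_regions(query, target)]
print(pairs, image_similarity(query, target))
```

In a query-by-example setting, `image_similarity` would be evaluated against every database image and the results sorted in descending order.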
Retrieval performance is evaluated between frame-based feature combination and the proposed framework with two different grouping approaches. Experiments are carried out on synthetic and natural image databases and the results indicate that a promising retrieval performance can be obtained as long as a reasonable segmentation quality is obtained. The integration of the region proximity feature further improves the retrieval performance especially for divisible, object-based image content.
Finally, frame-based and region-based texture extraction schemes are compared to evaluate the effect of a region on the texture description and retrieval performance utilising the proposed framework. Results show that significant degradation in retrieval performance occurs with region-based texture descriptors compared with the frame-based approaches.
CTex - an adaptive unsupervised segmentation algorithm based on color-texture coherence
This paper presents the development of an unsupervised image segmentation framework (referred to as CTex) that is based on the adaptive inclusion of color and texture in the process of data partition. An important contribution of this work consists of a new formulation for the extraction of color features that evaluates the input image in a multispace color representation. To achieve this, we have used the opponent characteristics of the RGB and YIQ color spaces where the key component was the inclusion of the self-organizing map (SOM) network in the computation of the dominant colors and estimation of the optimal number of clusters in the image. The texture features are computed using a multichannel texture decomposition scheme based on Gabor filtering. The major contribution of this work resides in the adaptive integration of the color and texture features in a compound mathematical descriptor with the aim of identifying the homogeneous regions in the image. This integration is performed by a novel adaptive clustering algorithm that enforces the spatial continuity during the data assignment process. A comprehensive qualitative and quantitative performance evaluation has been carried out and the experimental results indicate that the proposed technique is accurate in capturing the color and texture characteristics when applied to complex natural images.
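A multichannel Gabor decomposition applies a bank of filters at several scales and orientations. The sketch below builds such a bank using the standard real-valued Gabor kernel (a Gaussian envelope multiplied by a cosine carrier). The kernel size, wavelengths, number of orientations, and sigma are assumptions chosen for illustration and are not the parameters used in CTex.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a Gabor kernel on an odd size x size grid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_bank(size=7, wavelengths=(4.0, 8.0), orientations=4, sigma=2.0):
    """One kernel per (wavelength, orientation index) channel."""
    return {
        (lam, k): gabor_kernel(size, lam, k * math.pi / orientations, sigma)
        for lam in wavelengths
        for k in range(orientations)
    }

bank = gabor_bank()
print(len(bank), "channels")
```

Convolving an image with each kernel in `bank` yields one response map per channel; the per-pixel responses can then serve as a texture feature vector.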
Hierarchical colour image segmentation by leveraging RGB channels independently
In this paper, we introduce a hierarchical colour image segmentation based on cuboid partitioning using simple statistical features of the pixel intensities in the RGB channels. Estimating the difference between any two colours is a challenging task. As most colour models are not perceptually uniform, an alternative strategy is in high demand. To address this issue, we present a new colour distance measure, based on the inconsistency of pixel intensities of an image, that is more compliant with human perception. Constructing a reliable set of superpixels from an image is fundamental for further merging. As cuboid partitioning is a superior candidate for producing superpixels, we apply agglomerative merging to the outcome of our proposed cuboid partitioning to yield the final segmentation results. The proposed cuboid-segmentation-based algorithm significantly outperforms not only quadtree-based segmentation but also existing state-of-the-art segmentation algorithms in terms of segmentation quality on the benchmark datasets used in image segmentation. © 2019, Springer Nature Switzerland AG
Multimedia: information representation and access
[About the book]
Information retrieval (IR) is a complex human activity supported by sophisticated systems. Information science has contributed much to the design and evaluation of previous generations of IR systems and to our general understanding of how such systems should be designed. Yet, due to the increasing success and diversity of IR systems, many recent textbooks concentrate on IR systems themselves and ignore the human side of searching for information. This book is the first text to provide an information science perspective on IR.
Geological and structural analysis of the Hwange area-Northwest Zimbabwe: using remotely sensed data and geographic information systems (GIS)
There is a continuous need to locate more targets for coal exploration and evaluation of geological structures in the north-west coalfields in Zimbabwe. Conventional methods of analysing geological structures and field mapping are being hindered by inaccessibility of some areas and thick covers of Recent sediments. Remote sensing has been found to be a valuable method of identifying lithologic units and geological structures in the area. Integration of the remotely sensed data in a 2D GIS resulted in recognition of spatial relationships between lithologic units, geological structures, coal seams and vegetation patterns. The Hwange area constitutes the western part of the Mid-Zambezi Karoo basin. The area consists of a wide spectrum of rocks ranging from Precambrian gneisses, Proterozoic schists and granulites, and Karoo sediments to Tertiary and Recent sands. The area has been affected by a number of faults and shears, some of which post-date the Karoo sediments. These faults are an expression of the major tectonic events associated with this area. Some of the faults have been attributed to the effects of the Zambezi Rift System. Fault zones in the area, such as the Deka, Entuba and Inyantue Zones, have been recognised as part of this system, and these divide the Lower Karoo rocks into different coalfields. To try and evaluate the outcrop patterns and geological structures in the Hwange area, all the available geological and structural data were captured in a spatial database. The diversity of data incorporated in the spatial database demanded a structured database design approach. The Entity-Relationship model was used to conceptualise the geological data of the Hwange area. This model was transformed into the Relational Model that formed the implementation model of the database.
Landsat 5 TM data covering the area from the Zimbabwean winter (20 June 1984), path 172, row 73, were also analysed for the information required to locate Karoo rift faults and the distribution of lithologic units associated with coal. The use of directional filters in the E-W and NE-SW directions and vegetation reflection characteristics during the dry season (June 1984) proved very effective in mapping fractures in the Karoo rocks. Landsat TM image enhancement techniques such as principal components analysis, edge enhancement, decorrelation stretching and band ratios, and colour composites made following these techniques, allowed mapping of different lithological units and discrimination between Karoo rocks and the crystalline basement rocks. Lineament analysis defined E-W, ENE-WSW, NE-SW and NW-SE conjugate sets of lineaments. The first three sets are related to the regional fracture zones of the Zambezi Rift System. The Entuba fault zone was found to be associated with most of the fractures affecting the Hwange coalfields. These have a dominant NE-SW and ENE-WSW trend in the Western Areas, Wankie Concession, Chaba, Entuba and Sinamatella coalfields. The E-W trending fracture set is dominated by joint sets in the Karoo basalt covering the north-west portion of the Hwange Coalfields. These show no relationship with the linear features of the Zambezi Rift System. The NW-SE trending lineaments are dominantly developed on tilted bedding planes in the Karoo rocks as well as a few sparse joints in the Karoo basalt. Overlaying enhanced Landsat TM images on mapped faults and lithology data in a GIS revealed a number of features along the Entuba zone which were not previously known. The south-western part of the Entuba inlier was shown to consist of a synformal fold plunging to the south and bound on both sides by strike-slip faults.
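Grouping digitised lineaments into trend sets such as E-W, ENE-WSW, NE-SW and NW-SE comes down to computing each segment's azimuth and binning it. The following sketch assumes map coordinates with x increasing east and y increasing north, and uses eight equal 22.5-degree sectors; the sector boundaries and the sample segments are assumptions and are not taken from the thesis.

```python
import math
from collections import Counter

# Sector names in order of increasing azimuth, each centred on a multiple of 22.5 degrees.
SETS = ["N-S", "NNE-SSW", "NE-SW", "ENE-WSW", "E-W", "WNW-ESE", "NW-SE", "NNW-SSE"]

def azimuth(x1, y1, x2, y2):
    """Trend in degrees clockwise from north, folded into [0, 180)."""
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 180.0

def trend_set(az):
    """Assign an azimuth to one of eight 22.5-degree trend sectors."""
    return SETS[int(((az + 11.25) % 180.0) // 22.5)]

# Hypothetical lineament segments as (x1, y1, x2, y2).
segments = [
    (0, 0, 10, 0),    # due east
    (0, 0, 10, 10),   # north-east
    (0, 0, 10, 4),    # east-north-east
    (0, 0, -5, 5),    # north-west
    (3, 3, -7, 3),    # due west, same trend as due east
]

labels = [trend_set(azimuth(*s)) for s in segments]
counts = Counter(labels)
print(counts)
```

The resulting counts correspond to the petals of a rose diagram; weighting each segment by its length instead of counting it once is a common variation.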
Several kinematic indicators, such as displacement of sedimentary strata, have shown that the Entuba fault displays right-lateral strike-slip coupled with dip-slip movement. Proximity analysis using borehole data (depth to top and bottom of a coal seam) showed that most of the lineaments in the area are normal faults which have caused considerable displacements of the main coal seam. Comparison of seam depth across most of these faults within coalfields and from one field to another shows that local and regional variations in depth of the main seam are primarily a function of vertical displacements along the faults, over and above variations in the morphology of the pre-Karoo floor. The Entuba field was found to have the greatest vertical variations over very short distances across faults, with depths varying from 60 m to 520 m from west to east over distances of less than 500 m. This part of the field has been partly affected by extensive normal faults, some of which can be traced for more than 10 km. In the Hwange area, the Karoo rocks have been down-faulted into a rift margin which is in turn divided into smaller fault blocks by intra-rift faulting. The shape of the fault blocks is further controlled by the orientation of the post-Karoo faults, which have also down-faulted the main coal seam. Exploration activity in the area should also seek to establish the locations of these faults to help further decipher variations in depths of coal seams.
The COST292 experimental framework for TRECVID 2007
In this paper, we give an overview of the four tasks submitted to TRECVID 2007 by COST292. In the shot boundary (SB) detection task, four SB detectors have been developed and the results are merged using two merging algorithms. The framework developed for the high-level feature extraction task comprises four systems. The first system transforms a set of low-level descriptors into the semantic space using Latent Semantic Analysis and utilises neural networks for feature detection. The second system uses a Bayesian classifier trained with a "bag of subregions". The third system uses a multi-modal classifier based on SVMs and several descriptors. The fourth system uses two image classifiers based on ant colony optimisation and particle swarm optimisation respectively. The system submitted to the search task is an interactive retrieval application combining retrieval functionalities in various modalities with a user interface supporting automatic and interactive search over all queries submitted. Finally, the rushes task submission is based on a video summarisation and browsing system comprising two different interest curve algorithms and three features.
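The abstract does not say how the four shot boundary detectors' outputs are merged, so the sketch below shows one generic possibility, a tolerance-based majority vote, purely as an illustration. The function names, the frame tolerance, the vote threshold, and the sample detections are all hypothetical and should not be read as the COST292 merging algorithms.

```python
def _vote(cluster, min_votes):
    """Keep a cluster only if enough distinct detectors contributed to it."""
    detectors = {d for _, d in cluster}
    if len(detectors) < min_votes:
        return []
    frames = sorted(f for f, _ in cluster)
    return [frames[len(frames) // 2]]  # middle frame represents the cluster

def merge_boundaries(detections, tolerance=2, min_votes=2):
    """Merge per-detector boundary lists by clustering nearby frames and voting.

    detections: one list of frame indices per detector.
    tolerance:  maximum gap between consecutive frames in the same cluster.
    """
    tagged = sorted((f, d) for d, frames in enumerate(detections) for f in frames)
    merged, cluster = [], []
    for frame, det in tagged:
        if cluster and frame - cluster[-1][0] > tolerance:
            merged.extend(_vote(cluster, min_votes))
            cluster = []
        cluster.append((frame, det))
    if cluster:
        merged.extend(_vote(cluster, min_votes))
    return merged

# Hypothetical outputs of four detectors (frame indices of detected cuts).
detections = [
    [100, 250, 400],
    [101, 251],
    [99, 400, 600],
    [102, 252, 401],
]
merged = merge_boundaries(detections, tolerance=2, min_votes=3)
print(merged)
```

Raising `min_votes` trades recall for precision: the isolated detection at frame 600 is dropped here because only one detector reported it.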