Medical Image Modality Classification using Feature Weighted Clustering Approach.
Medical image retrieval systems are an area of great importance to healthcare providers.
Multimedia: information representation and access
[About the book]
Information retrieval (IR) is a complex human activity supported by sophisticated systems. Information science has contributed much to the design and evaluation of previous generations of IR system development and to our general understanding of how such systems should be designed and yet, due to the increasing success and diversity of IR systems, many recent textbooks concentrate on IR systems themselves and ignore the human side of searching for information. This book is the first text to provide an information science perspective on IR
Automatic object classification for surveillance videos.
PhD thesis. The recent popularity of surveillance video systems, especially those located in urban
scenarios, demands the development of visual techniques for monitoring purposes.
A primary step towards intelligent surveillance video systems consists of automatic
object classification, which remains an open research problem and is the keystone
for the development of more specific applications.
Typically, object representation is based on inherent visual features. However,
psychological studies have demonstrated that human beings can routinely categorise
objects according to their behaviour. The gap between the features
automatically extracted by a computer, such as appearance-based
features, and the concepts unconsciously perceived by human beings but
unattainable for machines, such as behaviour features, is commonly known
as the semantic gap. Consequently, this thesis proposes to narrow the semantic gap
and bring together machine and human understanding for object classification.
Thus, a Surveillance Media Management framework is proposed to automatically detect and
classify objects by analysing the physical properties inherent in their appearance
(machine understanding) and the behaviour patterns which require a higher level of
understanding (human understanding). Finally, a probabilistic multimodal fusion
algorithm bridges the gap by performing an automatic classification that considers both
machine and human understanding.
The performance of the proposed Surveillance Media Management framework
has been thoroughly evaluated on outdoor surveillance datasets. The experiments
conducted demonstrated that the combination of machine and human understanding
substantially enhanced the object classification performance. Finally, the inclusion
of human reasoning and understanding provides the essential information to bridge
the semantic gap towards smart surveillance video systems.
Multi Voxel Descriptor for 3D Texture Retrieval
In this paper, we present new feature descriptors that exploit voxels for a 3D textured-model retrieval system in which models vary by geometric shape, by texture, or by both. First, we perform pose normalisation so that arbitrary 3D models share the same orientation. We then map the structure of each 3D model into voxels, so that all models have the same dimensions. From these voxels we capture information in several ways. First, we build a binary voxel histogram and a color voxel histogram. Second, we compute the distance from the centre voxel to the other voxels and generate a histogram. We also compute the Fourier transform in spectral space. To capture texture features, we apply a voxel tetra pattern. Finally, we merge all features by linear combination. For the experiments, we use standard evaluation measures: Nearest Neighbor (NN), First Tier (FT), Second Tier (ST) and Average Dynamic Recall (ADR). The SHREC 2014 dataset and its evaluation program are used to verify the proposed method. The experimental results show that the proposed method is more accurate than several state-of-the-art methods.
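As a rough illustration of the centre-distance histogram mentioned above, the following Python sketch maps a point set into a fixed voxel grid and histograms the distances of occupied voxels from the grid centre. The function names, the grid size, the number of bins and the normalisation are all assumptions made for illustration; the abstract does not specify them, and pose normalisation is assumed to have been applied beforehand.

```python
import math

def voxelize(points, grid=8):
    """Map 3D points to the set of occupied cells in a grid x grid x grid volume."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    # One common scale for all axes, so the model's proportions are preserved.
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    occupied = set()
    for p in points:
        idx = tuple(
            min(grid - 1, int((p[i] - mins[i]) / extent * grid)) for i in range(3)
        )
        occupied.add(idx)
    return occupied

def centre_distance_histogram(occupied, grid=8, bins=4):
    """Normalised histogram of distances from the grid centre to occupied voxels."""
    centre = (grid - 1) / 2.0
    # Distance from the centre to a corner voxel (guarded for a 1-voxel grid).
    max_dist = math.sqrt(3) * centre or 1.0
    hist = [0] * bins
    for voxel in occupied:
        d = math.sqrt(sum((c - centre) ** 2 for c in voxel))
        hist[min(bins - 1, int(d / max_dist * bins))] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical input: the eight corners of a unit cube.
cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
print(centre_distance_histogram(voxelize(cube)))
```

Because every model is mapped to the same grid, histograms of this kind have a fixed length and can be compared directly between models.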
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
View point robust visual search technique
In this thesis, we have explored visual search techniques for images taken from different view
points and have tried to enhance the matching capability under view point changes. We have proposed
homography based back-projection as a post-processing stage of Compact Descriptors for
Visual Search (CDVS), the new MPEG standard; moreover, we have defined the affine adapted
scale space based affine detection, which steers the Gaussian scale space to capture the features
from affine transformed images; we have also developed the corresponding gradient based affine
descriptor. Using these proposed techniques, the robustness of image retrieval to affine transformations
has been significantly improved.
The first chapter of this thesis introduces the background on visual search.
In the second chapter, we propose a homography based back-projection used as the post-processing
stage of CDVS to improve the resilience to view point changes. The theory behind
this proposal is that each perspective projection of the image of a 2D object can be simulated as an
affine transformation. Each pair of affine transformations is mathematically related by a homography
matrix. Given that matrix, the image can be back-projected to simulate the image from another
view point. In this way, truly matching images can then be declared as matches because the perspective
distortion has been reduced by the back-projection. An accurate homography estimation
from images taken from different view points requires at least 4 correspondences, which could be offered
by the CDVS pipeline. In this way, the homography based back-projection can be used to scrutinize
the images with not enough matched keypoints. If they contain some homography relations,
the perspective distortion can then be reduced by exploiting the few provided correspondences. In the
experiments, this technique has proved quite effective, especially for images of 2D objects.
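The requirement of at least 4 correspondences follows from the 8 degrees of freedom of a homography: each correspondence contributes two linear equations. The Python sketch below is one minimal way to solve for the matrix from exactly four point pairs and to project a point through it. It is not the thesis's implementation; the function names are hypothetical, the entry h33 is fixed to 1 (which assumes the true h33 is non-zero), and a practical pipeline would normally use more correspondences with a robust estimator.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def estimate_homography(src, dst):
    """3x3 homography (h33 fixed to 1) mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, unknowns h11..h32.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def project(h, point):
    """Apply homography h to a 2D point, including the perspective division."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Hypothetical correspondences: a unit square seen under perspective distortion.
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(0, 0), (2, 0), (2.5, 2), (0.5, 1.5)]
H = estimate_homography(src, dst)
print(project(H, (0.5, 0.5)))
```

Back-projecting a whole image amounts to applying `project` (or the inverse mapping) to every pixel coordinate and resampling, which is omitted here.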
The third chapter introduces the scale space, which is the core of feature detection
in scale invariant visual search techniques. Scale space, which is made up of a series of Gaussian
blurred images, represents the image structures at different levels of detail. The Gaussian smoothed
images in the scale space result in feature detection not being invariant to affine transformations.
That is the reason why scale invariant visual search techniques are sensitive to affine transformations.
Thus, in this chapter, we propose an affine adapted scale space, which employs affine
steered Gaussian filters to smooth the images. This scale space is flexible with respect to different affine transformations
and represents well the image structures from different view points. With the help of
this structure, the features from different view points can be well captured.
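One common way to picture an affine steered Gaussian filter is as an isotropic Gaussian whose covariance has been transformed by a linear map A, giving covariance sigma^2 * A * A^T. The Python sketch below samples such a kernel. This is the standard affine Gaussian form and is only assumed to resemble the filters used in the thesis; the function name, the parameter `radius` and the normalisation to unit sum are illustrative choices.

```python
import math

def affine_gaussian_kernel(a, sigma=1.0, radius=3):
    """Sampled 2D Gaussian with covariance sigma^2 * A * A^T, normalised to sum to 1.

    `a` is a 2x2 matrix ((a11, a12), (a21, a22)); the result is indexed kernel[y][x].
    """
    (a11, a12), (a21, a22) = a
    # Covariance of an isotropic Gaussian after applying the linear map A.
    s11 = sigma ** 2 * (a11 * a11 + a12 * a12)
    s12 = sigma ** 2 * (a11 * a21 + a12 * a22)
    s22 = sigma ** 2 * (a21 * a21 + a22 * a22)
    det = s11 * s22 - s12 * s12
    if det <= 0:
        raise ValueError("A must be invertible")
    # Entries of the inverse covariance matrix.
    i11, i12, i22 = s22 / det, -s12 / det, s11 / det
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            q = i11 * x * x + 2 * i12 * x * y + i22 * y * y
            row.append(math.exp(-0.5 * q))
        kernel.append(row)
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

# With A = identity this is the usual isotropic Gaussian; stretching x widens it along x.
stretched = affine_gaussian_kernel(((2.0, 0.0), (0.0, 1.0)))
print(stretched[3][4] > stretched[4][3])
```

Convolving an image with kernels of this family for a set of matrices A and scales sigma would give one possible affine Gaussian scale space.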
In practice, the scale invariant visual search techniques have employed a pyramid structure
to speed up the construction. By employing the affine Gaussian scale space principles, we also
propose two structures to build the affine Gaussian scale space. The structure of the affine Gaussian
scale space is similar to the pyramid structure because of the similar sampling and cascading
properties. Conversely, the affine Laplacian of Gaussian (LoG) structure is completely different.
The Laplacian operator is hard to deform under an affine transformation. Unlike
the general LoG construction, which applies a simple Laplacian operation to the scale space, the affine
LoG can only be obtained by affine LoG convolution and cascade implementations on the affine
scale space. Using our proposed structures, both the affine Gaussian scale space and the affine LoG can
be constructed.
We have also explored the affine scale space implementation in the frequency domain. In the second
chapter, we will also explore the spectrum of Gaussian image smoothing under the affine transformation,
and propose two structures. Generally speaking, the implementation in the frequency domain is
more robust to affine transformations at the expense of a higher computational complexity.
It makes sense to adopt an affine descriptor for affine invariant visual search. In the fourth
chapter, we will propose an affine invariant feature descriptor based on the affine gradient. Currently,
the state of the art feature descriptors, including SIFT and the Gradient Location and Orientation Histogram
(GLOH), are based on the histogram of the image gradient around the detected features. If
the image gradient is calculated as the difference of adjacent pixels, it will not be affine invariant.
Thus, in that chapter, we first propose an affine gradient which will contribute affine
invariance to the descriptor. This affine gradient will be calculated directly from the derivative of the
affine Gaussian blurred images. To simplify the processing, we will also create the corresponding
affine Gaussian derivative filters for different detected scales to quickly generate the affine gradient.
With this affine gradient, we can apply the same scheme as the SIFT descriptor to generate the
gradient histogram. By normalizing the histogram, the affine descriptor can then be formed. This
affine descriptor is not only affine invariant but also rotation invariant, because the direction of the
area used to form the histogram is determined by the main direction of the gradient around the features.
In practice, this affine descriptor is fully affine invariant and its performance for image matching is
extremely good.
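The histogram step described above can be sketched in Python as follows. This is a deliberately simplified stand-in, not the SIFT scheme itself: it builds a single magnitude-weighted orientation histogram from a list of gradient vectors, aligns it to the dominant orientation and L2-normalises it, whereas SIFT uses a grid of such histograms with additional weighting. The gradient samples are assumed to come from an upstream step (in the thesis, the affine gradient), and the function name and bin count are hypothetical.

```python
import math

def orientation_histogram(gradients, bins=8):
    """Magnitude-weighted orientation histogram, aligned to its dominant bin
    and L2-normalised. `gradients` is a list of (gx, gy) pairs."""
    hist = [0.0] * bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        angle = math.atan2(gy, gx) % (2 * math.pi)
        hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    # Rotate the histogram so the dominant orientation becomes bin 0;
    # this gives invariance to rotations by multiples of the bin width.
    k = hist.index(max(hist))
    hist = hist[k:] + hist[:k]
    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm else hist

# Hypothetical gradient samples around a feature, and the same samples rotated by 90 degrees.
samples = [(1.0, 0.1), (1.0, 0.1), (0.1, 1.0)]
rotated = [(-gy, gx) for gx, gy in samples]
print(orientation_histogram(samples))
print(orientation_histogram(rotated))
```

The two printed histograms agree, which illustrates why aligning to the main gradient direction makes the descriptor insensitive to rotation.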
In the conclusions chapter, we draw some conclusions and describe some future work.
- …