148 research outputs found
DTW-Radon-based Shape Descriptor for Pattern Recognition
In this paper, we present a pattern recognition method that uses dynamic programming (DP) for the alignment of Radon features. The key characteristic of the method is to use dynamic time warping (DTW) to match corresponding pairs of the Radon features for all possible projections. Thanks to DTW, we avoid compressing the feature matrix into a single vector, which would otherwise lose information. To reduce the number of possible matchings, we rely on an initial normalisation based on the pattern orientation. A comprehensive study is made using major state-of-the-art shape descriptors over several public datasets of shapes such as graphical symbols (both printed and hand-drawn), handwritten characters and footwear prints. In all tests, the method proves its generic behaviour by providing better recognition performance. Overall, we validate that our method is robust to shapes deformed by distortion, degradation and occlusion
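The matching idea described in this abstract can be illustrated with a minimal sketch. It assumes each shape is already represented as a list of 1D Radon projection profiles, one per angle, and that orientation normalisation has already aligned the angle indices of the two shapes. The function names `dtw_distance` and `projection_set_distance`, the absolute-difference local cost, and the plain sum over angles are assumptions made for illustration, not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping between two 1D sequences.

    Uses the absolute difference as local cost (an assumption) and the
    standard three-way recurrence; runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again (stretch b)
                cost[i][j - 1],      # b[j-1] matched again (stretch a)
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def projection_set_distance(profiles_a, profiles_b):
    """Sum DTW distances over projections taken at the same angle index.

    Hypothetical helper: assumes both shapes were orientation-normalised
    so that profiles_a[k] and profiles_b[k] correspond to the same angle.
    """
    if len(profiles_a) != len(profiles_b):
        raise ValueError("both shapes need the same number of projections")
    return sum(dtw_distance(p, q) for p, q in zip(profiles_a, profiles_b))
```

Because each projection profile is warped separately, profiles of different lengths can be compared without first flattening the feature matrix into a single fixed-length vector.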
Appearance modeling under geometric context for object recognition in videos
Object recognition is a very important high-level task in
surveillance applications. This dissertation focuses on building
appearance models for object recognition and exploring the
relationship between shape and appearance for two key types of
objects, human and vehicle. The dissertation proposes a generic
framework that models the appearance while incorporating certain
geometric prior information, or the so-called geometric
context. Then under this framework, special methods are developed
for recognizing humans and vehicles based on their appearance and
shape attributes in surveillance videos.
The first part of the dissertation presents a unified framework
based on a general definition of geometric transform (GeT) which is
applied to modeling object appearances under geometric context. The
GeT models the appearance by applying designed functionals over
certain geometric sets. GeT unifies the Radon transform, the trace
transform, image warping, etc. Moreover, five novel types of GeTs are
introduced and applied to fingerprinting the appearance inside a
contour. They include GeT based on level sets, GeT based on shape
matching, GeT based on feature curves, GeT invariant to occlusion,
and a multi-resolution GeT (MRGeT) that combines both shape and
appearance information.
The second part focuses on how to use the GeT to build appearance
models for objects like walking humans, which have articulated
motion of body parts. This part also illustrates the application of
GeT for object recognition, image segmentation, video retrieval, and
image synthesis. The proposed approach produces promising results
when applied to automatic body part segmentation and fingerprinting
the appearance of a human and body parts despite the presence of
non-rigid deformations and articulated motion.
It is very important to understand the 3D structure of vehicles in
order to recognize them. To reconstruct the 3D model of a vehicle,
the third part presents a factorization method for structure from
planar motion. Experimental results show that the algorithm
is accurate and fairly robust to noise and inaccurate calibration.
Differences and the dual relationship between planar motion and
planar object are also clarified in this part. Based on our method,
a fully automated vehicle reconstruction system has been designed
Shape-based invariant features extraction for object recognition
The emergence of new technologies enables the generation of large quantities of digital information, including images, which leads to an ever-increasing number of digital images. Automatic systems for image retrieval have therefore become a necessity. These systems consist of techniques for query specification and retrieval of images from an image collection. The most frequent and most common means of image retrieval is indexing using textual keywords. But for some special application domains, and faced with the huge quantity of images, keywords are no longer sufficient or practical. Moreover, images are rich in content; to overcome these difficulties, some approaches have been proposed based on visual features derived directly from the content of the image: these are the content-based image retrieval (CBIR) approaches. They allow users to search for the desired image by specifying image queries: a query can be an example, a sketch or visual features (e.g., colour, texture and shape). Once the features have been defined and extracted, retrieval becomes a task of measuring similarity between image features. An important property of these features is invariance under the various deformations that the observed image could undergo. In this chapter, we present a number of existing methods for CBIR applications. We also describe some measures that are commonly used for similarity measurement. Finally, as an application example, we present a specific approach that we are developing, illustrating the topic with experimental results
Preserving Trustworthiness and Confidentiality for Online Multimedia
Technology advancements in areas of mobile computing, social networks, and cloud computing have rapidly changed the way we communicate and interact. The wide adoption of media-oriented mobile devices such as smartphones and tablets enables people to capture information in various media formats, and offers them a rich platform for media consumption. The proliferation of online services and social networks makes it possible to store personal multimedia collections online and share them with family and friends anytime, anywhere. Considering the increasing impact of digital multimedia and the trend of cloud computing, this dissertation explores the problem of how to evaluate trustworthiness and preserve confidentiality of online multimedia data.
The dissertation consists of two parts. The first part examines the problem of evaluating trustworthiness of multimedia data distributed online. Given the digital nature of multimedia data, editing and tampering with multimedia content becomes very easy. Therefore, it is important to analyze and reveal the processing history of a multimedia document in order to evaluate its trustworthiness. We propose a new forensic technique called "Forensic Hash", which draws synergy between two related research areas of image hashing and non-reference multimedia forensics. A forensic hash is a compact signature capturing important information from the original multimedia document to assist forensic analysis and reveal the processing history of a multimedia document under question. Our proposed technique is shown to have the advantage of being compact and offering efficient and accurate answers to forensic questions that cannot be easily answered by conventional forensic techniques. The answers that we obtain from the forensic hash provide valuable information on the trustworthiness of online multimedia data.
The second part of this dissertation addresses the confidentiality issue of multimedia data stored with online services. The emerging cloud computing paradigm makes it attractive to store private multimedia data online for easy access and sharing. However, the potential of cloud services cannot be fully reached unless the issue of how to preserve confidentiality of sensitive data stored in the cloud is addressed. In this dissertation, we explore techniques that enable confidentiality-preserving search of encrypted multimedia, which can play a critical role in secure online multimedia services. Techniques from image processing, information retrieval, and cryptography are jointly and strategically applied to allow efficient rank-ordered search over encrypted multimedia database and at the same time preserve data confidentiality against malicious intruders and service providers. We demonstrate high efficiency and accuracy of the proposed techniques and provide a quantitative comparative study with conventional techniques based on heavy-weight cryptography primitives
Rotation Recovery from Spherical Images without Correspondences
This paper addresses the problem of rotation estimation directly from images defined on the sphere and without correspondence. The method is particularly useful for the alignment of large rotations and has potential impact on 3D shape alignment. The foundation of the method lies in the fact that the spherical harmonic coefficients undergo a unitary mapping when the original image is rotated. The correlation between two images is a function of rotations and we show that it has an SO(3)-Fourier transform equal to the pointwise product of spherical harmonic coefficients of the original images. The resolution of the rotation space depends on the bandwidth we choose for the harmonic expansion and the rotation estimate is found through a direct search in this 3D discretized space. A refinement of the rotation estimate can be obtained from the conservation of harmonic coefficients in the rotational shift theorem. A novel decoupling of the shift theorem with respect to the Euler angles is presented and exploited in an iterative scheme to refine the initial rotation estimates. Experiments show the suitability of the method for large rotations and the dependence of the method on bandwidth and the choice of the spherical harmonic coefficients
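The correlation statement in this abstract can be written out explicitly. The following is a sketch under one common convention; the symbols $f$, $h$ (the two spherical images), $\hat f_{lm}$, $\hat h_{ln}$ (their spherical harmonic coefficients), $D^{l}_{mn}$ (Wigner D-matrix elements) and the bandwidth $B$ are notation assumed here rather than taken from the paper, and conjugations and normalisation factors differ between conventions.

```latex
% Rotated image: (\Lambda_R h)(\omega) = h(R^{-1}\omega), whose coefficients
% transform unitarily: \widehat{(\Lambda_R h)}_{lm} = \sum_{n} D^{l}_{mn}(R)\, \hat h_{ln}.
C(R) = \int_{S^2} f(\omega)\, \overline{h(R^{-1}\omega)}\, d\omega
     = \sum_{l=0}^{B-1} \sum_{m=-l}^{l} \sum_{n=-l}^{l}
       \hat f_{lm}\, \overline{\hat h_{ln}}\, \overline{D^{l}_{mn}(R)},
\qquad
R^{\ast} = \arg\max_{R \in SO(3)} C(R).
```

Read this way, the SO(3)-Fourier coefficients of $C$ at degree $l$ are the pointwise products $\hat f_{lm}\,\overline{\hat h_{ln}}$, so $C$ can be evaluated on a discretised rotation grid by an inverse SO(3) Fourier transform, and the grid resolution is tied to the chosen bandwidth $B$.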
Diffeomorphic Transformations for Time Series Analysis: An Efficient Approach to Nonlinear Warping
The proliferation and ubiquity of temporal data across many disciplines have
sparked interest in similarity, classification and clustering methods
specifically designed to handle time series data. A core issue when dealing
with time series is determining their pairwise similarity, i.e., the degree to
which a given time series resembles another. Traditional distance measures such
as the Euclidean are not well-suited due to the time-dependent nature of the
data. Elastic metrics such as dynamic time warping (DTW) offer a promising
approach, but are limited by their computational complexity,
non-differentiability and sensitivity to noise and outliers. This thesis
proposes novel elastic alignment methods that use parametric and diffeomorphic
warping transformations as a means of overcoming the shortcomings of DTW-based
metrics. The proposed method is differentiable and invertible, well-suited for
deep learning architectures, robust to noise and outliers, computationally
efficient, and is expressive and flexible enough to capture complex patterns.
Furthermore, a closed-form solution was developed for the gradient of these
diffeomorphic transformations, which allows an efficient search in the
parameter space, leading to better solutions at convergence. Leveraging the
benefits of these closed-form diffeomorphic transformations, this thesis
proposes a suite of advancements that include: (a) an enhanced temporal
transformer network for time series alignment and averaging, (b) a
deep-learning based time series classification model to simultaneously align
and classify signals with high accuracy, (c) an incremental time series
clustering algorithm that is warping-invariant, scalable and can operate under
limited computational and time resources, and finally, (d) a normalizing flow
model that enhances the flexibility of affine transformations in coupling and
autoregressive layers.
Comment: PhD Thesis, defended at the University of Navarra on July 17, 2023.
277 pages, 8 chapters, 1 appendi
Motion correction of PET/CT images
Indiana University-Purdue University Indianapolis (IUPUI)
Advances in health care technology help physicians make more accurate diagnoses about the health conditions of their patients. Positron Emission Tomography/Computed Tomography (PET/CT) is one of the many tools currently used to diagnose health and disease in patients. PET/CT explorations are typically used to detect cancer, heart disease, and disorders of the central nervous system. Since PET/CT studies can take 60 minutes or more, it is impossible for patients to remain motionless throughout the scanning process. These movements create motion-related artifacts which alter the quantitative and qualitative results produced by the scanning process. The patient's motion results in image blurring, a reduction in the image signal-to-noise ratio, and reduced image contrast, which could lead to misdiagnoses.
In the literature, software- and hardware-based techniques have been studied to implement motion correction over medical files. Techniques based on an external motion tracking system are preferred by researchers because they offer better accuracy. This thesis proposes a motion correction system that uses 3D affine registrations using particle swarm optimization and an off-the-shelf Microsoft Kinect camera to eliminate or reduce errors caused by the patient's motion during a medical imaging study
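To make the optimisation step mentioned in this abstract concrete, here is a minimal particle swarm optimisation sketch. It is deliberately reduced to recovering a 2D translation between two small point sets instead of a full 3D affine registration, and it does not involve any Kinect data. The function names `pso_minimize` and `translation_cost`, the swarm parameters (inertia 0.7, acceleration coefficients 1.5), and the toy point sets are all assumptions for illustration, not the thesis's system.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=30, n_iters=200, seed=0):
    """Global-best particle swarm optimisation over a box-initialised swarm.

    Parameters are illustrative assumptions: inertia w=0.7, c1=c2=1.5.
    Returns (best_position, best_cost).
    """
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's personal best
    best_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Toy registration problem: target is source shifted by (2, -1).
SOURCE = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
TARGET = [(x + 2.0, y - 1.0) for x, y in SOURCE]


def translation_cost(t):
    """Sum of squared distances after translating SOURCE by t = (tx, ty)."""
    return sum((x + t[0] - u) ** 2 + (y + t[1] - v) ** 2
               for (x, y), (u, v) in zip(SOURCE, TARGET))


estimate, residual = pso_minimize(translation_cost, dim=2, bounds=(-5.0, 5.0))
```

A full affine registration would replace the two translation parameters with the twelve parameters of a 3D affine map and the point-set cost with an image similarity measure; the swarm update itself would stay the same.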