76 research outputs found

    Minimal information to determine affine shape equivalence.


    Shape description and matching using integral invariants on eccentricity transformed images

    Matching occluded and noisy shapes is a problem frequently encountered in medical image analysis and, more generally, in computer vision. To keep track of changes inside the breast, for example, it is important for a computer-aided detection system to establish correspondences between regions of interest. Shape transformations, computed both with integral invariants (II) and with geodesic distance, yield signatures that are invariant to isometric deformations, such as bending and articulations. Integral invariants describe the boundaries of planar shapes. However, they provide no information about where a particular feature lies on the boundary with regard to the overall shape structure. Conversely, eccentricity transforms (Ecc) can match shapes by signatures of geodesic distance histograms based on information from inside the shape, but they ignore the boundary information. We describe a method that combines the boundary signature of a shape obtained from II with structural information from the Ecc to yield results that improve on those of either method used separately.
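As a rough illustration of the eccentricity transform named in this abstract (not the authors' implementation), the sketch below assigns to each pixel of a binary shape its largest geodesic distance to any other pixel of the shape, then histograms those values. It assumes a 4-connected pixel grid with unit step length, which is only one possible discretization of geodesic distance; the function names `geodesic_distances`, `eccentricity_transform`, and `ecc_histogram` are hypothetical.

```python
from collections import Counter, deque


def geodesic_distances(mask, start):
    """BFS distances from `start` to every shape pixel reachable inside the shape."""
    rows, cols = len(mask), len(mask[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        # Assumption: 4-connectivity, each step has length 1.
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]
            if inside and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def eccentricity_transform(mask):
    """Map each shape pixel to its largest geodesic distance to any reachable shape pixel."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    return {p: max(geodesic_distances(mask, p).values()) for p in pixels}


def ecc_histogram(mask):
    """Histogram of eccentricity values, used here as a simple shape signature."""
    return Counter(eccentricity_transform(mask).values())


# An L-shaped region and a straight bar with the same number of pixels (1 = inside).
l_shape = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
bar = [[1, 1, 1, 1, 1]]

ecc = eccentricity_transform(l_shape)
same_signature = ecc_histogram(l_shape) == ecc_histogram(bar)
```

In this toy case the bent and the straight shape produce the same histogram, which is the kind of invariance to bending that the abstract attributes to geodesic-distance signatures. The histogram alone does not say where on the boundary a feature lies.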

    Affine equivalences of trigonometric curves

    We provide an efficient algorithm to detect whether two given trigonometric curves, i.e. two parametrized curves whose components are truncated Fourier series, in any dimension, are affinely equivalent, i.e. whether there exists an affine mapping transforming one of the curves onto the other. If the coefficients of the parametrizations are known exactly (the exact case), the algorithm boils down to univariate gcd computation, so it is efficient and fast. If the coefficients of the parametrizations are known with finite precision, e.g. floating point numbers (the approximate case), the univariate gcd computation is replaced by the computation of singular values of an appropriate matrix. Our experiments show that the method works well, even for high degrees. Agencia Estatal de Investigación

    Differential invariant signatures and flows in computer vision : a symmetry group approach

    Includes bibliographical references (p. 40-44). Supported by the National Science Foundation (DMS-9204192, DMS-8811084, ECS-9122106), by the Air Force Office of Scientific Research (AFOSR-90-0024), and by the Rothschild Foundation-Yad Hanadiv and by Image Evolutions, Ltd. Peter J. Olver, Guillermo Sapiro, Allen Tannenbaum.

    Study Of Human Activity In Video Data With An Emphasis On View-invariance

    The perception and understanding of human motion and action is an important area of research in computer vision that plays a crucial role in various applications such as surveillance, HCI, ergonomics, etc. In this thesis, we focus on the recognition of actions in the case of varying viewpoints and different and unknown camera intrinsic parameters. The challenges to be addressed include perspective distortions, differences in viewpoints, anthropometric variations, and the large degrees of freedom of articulated bodies. In addition, we are interested in methods that require little or no training. The current solutions to action recognition usually assume that there is a huge dataset of actions available so that a classifier can be trained. However, this means that in order to define a new action, the user has to record a number of videos from different viewpoints with varying camera intrinsic parameters and then retrain the classifier, which is not very practical from a development point of view. We propose algorithms that overcome these challenges and require just a few instances of the action from any viewpoint with any intrinsic camera parameters. Our first algorithm is based on the rank constraint on the family of planar homographies associated with triplets of body points. We represent an action as a sequence of poses, and decompose each pose into triplets. The pose transition is therefore broken down into a set of movements of body point planes. In this way, we transform the non-rigid motion of the body points into a rigid motion of body point planes. We use the fact that the family of homographies associated with two identical poses would have rank 4 to gauge the similarity of the pose between two subjects, observed by different perspective cameras and from different viewpoints. This method requires only one instance of the action. We then show that it is possible to extend the concept of triplets to line segments. In particular, we establish that if we look at the movement of line segments instead of triplets, we have more redundancy in the data, thus leading to better results. We demonstrate this concept on “fundamental ratios.” We decompose a human body pose into line segments instead of triplets and look at the set of movements of line segments. This method needs only three instances of the action. If a larger dataset is available, we can also apply weighting on line segments for better accuracy. The last method is based on the concept of “projective depth”. Given a plane, we can find the depth of a point relative to that plane. We propose three different ways of using “projective depth”: (i) Triplets: the three points of a triplet, along with the epipole, define a plane, and the movement of points relative to these body planes can be used to recognize actions; (ii) Ground plane: if we are able to extract the ground plane, we can find the “projective depth” of the body points with respect to it, so the problem of action recognition translates to curve matching; and (iii) Mirror person: we can use the mirror view of the person to extract mirror symmetric planes. This method also needs only one instance of the action. Extensive experiments are reported on testing view invariance, robustness to noisy localization and occlusions of body points, and action recognition. The experimental results are very promising and demonstrate the efficiency of our proposed invariants.

    Geometry-driven feature detection

    Matching images taken from different viewpoints is a fundamental step for many computer vision applications including 3D reconstruction, scene recognition, virtual reality, robot localization, etc. The typical approaches detect feature keypoints based on local properties to achieve robustness to viewpoint changes, and establish correspondences between keypoints to recover the 3D geometry or determine the similarity between images. The complexity of perspective distortion challenges the detection of viewpoint invariant features; the lack of 3D geometric information about local features makes their matching inefficient. In this thesis, I explore feature detection based on 3D geometric information for improved projective invariance. The main novel research contributions of this thesis are as follows. First, I give a projective invariant feature detection method that exploits 3D structures recovered from simple stereo matching. By leveraging the rich geometric information of the detected features, I present an efficient 3D matching algorithm to handle large viewpoint changes. Second, I propose a compact high-level feature detector that robustly extracts repetitive structures in urban scenes, which allows efficient wide-baseline matching. I further introduce a novel single-view reconstruction approach to recover the 3D dense geometry of the repetition-based features.

    Projective shapes: topology and means

    The projective shape of an object consists of the geometric information that is invariant under different camera views. When an object is described as a configuration of k points or “landmarks” in real projective space RP(d), the set of projective shapes can be defined as the set RP(d)^k / PGL(d) of equivalence classes of configurations under the component-wise action of projective transformations. Equipped with the quotient topology, the space of projective shapes is topologically ill-behaved, just as in the cases of similarity and affine shapes. In particular, it is neither a manifold nor metrizable. In this thesis the topological structure of projective shape space is analysed in detail in search of a reasonable topological subspace that is convenient enough for the application of mathematical tools. Further, it is shown that the topological subspace of Tyler regular shapes introduced by Kent and Mardia fulfills all required properties except for some numbers of landmarks k and dimensions d. Then, using Tyler standardization, Procrustes distances and Riemannian structures can be defined on the subspace of Tyler regular shapes. For one of these Procrustes distances, a projective mean shape is defined by using the more general concept of Fréchet means. Since the computation of the corresponding sample mean is rather intricate, a new mean is introduced and discussed.
    The projective shape of an object is the geometric information that is invariant under projective transformations. It arises naturally when objects are reconstructed from photographs taken with uncalibrated cameras. If an object is described as a point set or configuration of landmarks in d-dimensional real projective space RP(d), then the set of projective shapes is the quotient space RP(d)^k / PGL(d) and is thus canonically equipped with the quotient topology. For topological reasons, however, many mathematical tools cannot be applied on this topological space of projective shapes, a phenomenon that also occurs in similar form for the spaces of similarity and affine shapes. This thesis examines the topology of projective shape space thoroughly, in search of a reasonable topological subspace with properties sufficient for the application of statistical methods. One example of such a well-behaved subspace is the space of Tyler regular shapes, which was already considered by Kent and Mardia. Their results are extended in this thesis. Although this subspace is not optimally chosen for some dimensions d and numbers of landmarks k, the so-called Tyler standardization of these shapes provides both embeddings into metric spaces and a Riemannian metric on this subspace. For one of these embeddings, the corresponding Fréchet population and sample means are defined. While the consistency of this mean is easy to show, the computation of the extrinsic mean is numerically demanding. As a substitute, a further population and sample mean is defined whose computation avoids these problems.
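To make "invariant under projective transformations" concrete in the simplest setting, the sketch below checks that the cross-ratio of four landmarks on the projective line is unchanged when an invertible 2×2 matrix acts on their homogeneous coordinates. This is a standard textbook invariant chosen for illustration, not a construction taken from the thesis; the helper names `det2`, `cross_ratio`, and `apply_projective`, the sample points, and the matrix are all assumptions.

```python
from fractions import Fraction


def det2(p, q):
    """2x2 determinant of two points given in homogeneous coordinates (x, w)."""
    return p[0] * q[1] - p[1] * q[0]


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four distinct points of the projective line."""
    return Fraction(det2(a, c) * det2(b, d), det2(a, d) * det2(b, c))


def apply_projective(m, p):
    """Apply an invertible 2x2 matrix m to the homogeneous point p."""
    return (m[0][0] * p[0] + m[0][1] * p[1], m[1][0] * p[0] + m[1][1] * p[1])


# Four landmarks at affine positions 0, 1, 3, 7 (hypothetical sample data).
points = [(0, 1), (1, 1), (3, 1), (7, 1)]
# A projective map with determinant 2*3 - 1*1 = 5, hence invertible.
m = ((2, 1), (1, 3))
moved = [apply_projective(m, p) for p in points]

before = cross_ratio(*points)
after = cross_ratio(*moved)
```

Each 2×2 determinant is scaled by det(m) under the map, so the factors cancel in the ratio and `before` equals `after`. The two landmark configurations therefore differ only by a projective transformation and represent the same projective shape.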

    View-Invariance in Visual Human Motion Analysis

    This thesis makes contributions towards the solutions to two problems in the area of visual human motion analysis: human action recognition and human body pose estimation. Although there has been a substantial amount of research addressing these two problems in the past, the important issue of viewpoint invariance in the representation and recognition of poses and actions has received relatively little attention, and forms a key goal of this thesis. Drawing on results from 2D projective invariance theory and 3D mutual invariants, we present three different approaches, of varying degrees of generality, for human action representation and recognition. A detailed analysis of the approaches reveals key challenges, which are circumvented by enforcing spatial and temporal coherency constraints. An extensive performance evaluation of the approaches on 2D projections of motion capture data and manually segmented real image sequences demonstrates that, in addition to viewpoint changes, the approaches cope well with varying speeds of execution of actions (and hence different frame rates of the video), different subjects, and minor variabilities in the spatiotemporal dynamics of the action. Next, we present a method for recovering the body-centric coordinates of key joints and parts of a canonically scaled human body, given an image of the body and the point correspondences of specific body joints in an image. This problem is difficult to solve because of body articulation and perspective effects. To make the problem tractable, previous researchers have resorted to restricting the camera model or requiring an unrealistic number of point correspondences, both of which are more restrictive than necessary. We present a solution for the general case of a perspective uncalibrated camera. Our method requires that the torso does not twist considerably, an assumption that is usually satisfied for many poses of the body. We evaluate the quantitative performance of the method on synthetic data and the qualitative performance of the method on real images taken with unknown cameras and viewpoints. Both these evaluations show the effectiveness of the method at recovering the pose of the human body.