63 research outputs found

    On Motion Parameterizations in Image Sequences from Fixed Viewpoints

    This dissertation addresses the problem of parameterizing object motion within a set of images taken with a stationary camera. We develop data-driven methods across all image scales: characterizing motion observed at the scale of individual pixels, along extended structures such as roads, and as whole-image deformations such as lungs deforming over time. The primary contributions include: a) fundamental studies of the relationship between the spatio-temporal image derivatives accumulated at a pixel and the object motions at that pixel; b) data-driven approaches to parameterize breath motion and reconstruct lung CT data volumes; and c) defining and offering initial results for a new class of Partially Unsupervised Manifold Learning (PUML) problems, which often arise in medical imagery. Specifically, we create energy functions that measure how consistent a given velocity vector is with observed spatio-temporal image derivatives. These energy functions are used to fit parametric snake models to roads using velocity constraints. We create an automatic data-driven technique for finding the breath phase of lung CT scans that can replace the external belt measurements currently in clinical use. This approach is extended to automatically build a full deformation model of a CT lung volume during breathing, or of heart MRI during breathing and heartbeat. Additionally, motivated by real use cases, we address a scenario in which a dataset is collected along with meta-data that describes some, but not all, aspects of the dataset. We create an embedding that displays the remaining variability in a dataset after accounting for variability related to the meta-data.
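
    As a rough illustration of the pixel-level energy functions described above (a minimal sketch under the standard brightness-constancy assumption, not the dissertation's exact formulation; all names and numbers are hypothetical), a candidate velocity can be scored against the spatio-temporal derivatives accumulated at a pixel:

        # Minimal sketch: score how consistent a candidate velocity (vx, vy) is with the
        # spatio-temporal derivatives (Ix, Iy, It) accumulated at a pixel, via the
        # brightness-constancy residual Ix*vx + Iy*vy + It.
        import numpy as np

        def velocity_energy(Ix, Iy, It, vx, vy):
            """Sum of squared brightness-constancy residuals over a pixel's samples."""
            residual = Ix * vx + Iy * vy + It
            return float(np.sum(residual ** 2))

        # Hypothetical derivatives accumulated over several frames at one pixel.
        Ix = np.array([0.9, 1.1, 1.0])
        Iy = np.array([0.1, 0.0, -0.1])
        It = np.array([-1.8, -2.2, -2.0])

        print(velocity_energy(Ix, Iy, It, vx=2.0, vy=0.0))   # low energy: consistent velocity
        print(velocity_energy(Ix, Iy, It, vx=-2.0, vy=0.0))  # high energy: inconsistent velocity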

    NaRPA: Navigation and Rendering Pipeline for Astronautics

    This paper presents the Navigation and Rendering Pipeline for Astronautics (NaRPA), a novel ray-tracing-based computer graphics engine to model and simulate light transport for space-borne imaging. NaRPA incorporates lighting models with attention to atmospheric and shading effects for the synthesis of space-to-space and ground-to-space virtual observations. In addition to image rendering, the engine also possesses point cloud, depth, and contour map generation capabilities to simulate passive and active vision-based sensors and to facilitate the design, testing, and verification of visual navigation algorithms. The physically based rendering capabilities of NaRPA and the efficacy of the proposed rendering algorithm are demonstrated using applications in representative space-based environments. A key demonstration includes NaRPA as a tool for generating stereo imagery and its application to 3D coordinate estimation using triangulation. Another prominent application is a novel differentiable rendering approach for image-based attitude estimation, proposed to highlight the efficacy of the NaRPA engine for simulating vision-based navigation and guidance operations. Comment: 49 pages, 22 figures.
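
    As a hedged sketch of the stereo-imagery demonstration mentioned above, the snippet below shows plain two-view linear (DLT) triangulation for 3D coordinate estimation; the camera matrices and pixel coordinates are illustrative stand-ins, not NaRPA outputs:

        # Minimal sketch of two-view triangulation (linear DLT).
        import numpy as np

        def triangulate(P1, P2, x1, x2):
            """Recover a 3D point from pixel correspondences x1, x2 and 3x4 projections P1, P2."""
            A = np.vstack([
                x1[0] * P1[2] - P1[0],
                x1[1] * P1[2] - P1[1],
                x2[0] * P2[2] - P2[0],
                x2[1] * P2[2] - P2[1],
            ])
            _, _, Vt = np.linalg.svd(A)
            X = Vt[-1]
            return X[:3] / X[3]  # dehomogenize

        # Two hypothetical cameras: identity pose and a 1-unit baseline along x.
        K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
        P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])

        X_true = np.array([0.3, -0.2, 5.0, 1.0])
        x1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
        x2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
        print(triangulate(P1, P2, x1, x2))  # ~ [0.3, -0.2, 5.0]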

    Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data

    Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniform handling and analyzing of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data, we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy, and its hallmark will be ‘team science’.
    http://deepblue.lib.umich.edu/bitstream/2027.42/134522/1/13742_2016_Article_117.pd

    Dimensionality reduction and sparse representations in computer vision

    The proliferation of camera-equipped devices, such as netbooks, smartphones and game stations, has led to a significant increase in the production of visual content. This visual information could be used for understanding the environment and offering a natural interface between users and their surroundings. However, the massive amounts of data and the high computational cost associated with them encumber the transfer of sophisticated vision algorithms to real-life systems, especially ones that exhibit resource limitations such as restrictions in available memory, processing power and bandwidth. One approach to tackling these issues is to generate compact and descriptive representations of image data by exploiting inherent redundancies. We propose the investigation of dimensionality reduction and sparse representations in order to accomplish this task. In dimensionality reduction, the aim is to reduce the dimensions of the space in which image data reside in order to allow resource-constrained systems to handle them and, ideally, provide a more insightful description. This goal is achieved by exploiting the inherent redundancies that many classes of images, such as faces under different illumination conditions and objects from different viewpoints, exhibit. We explore the description of natural images by low-dimensional non-linear models called image manifolds and investigate the performance of computer vision tasks such as recognition and classification using these low-dimensional models. In addition to dimensionality reduction, we study a novel approach to representing images as a sparse linear combination of dictionary examples. We investigate how sparse image representations can be used for a variety of tasks including low-level image modeling and higher-level semantic information extraction. Using tools from dimensionality reduction and sparse representation, we propose the application of these methods in three hierarchical image layers, namely low-level features, mid-level structures and high-level attributes. Low-level features are image descriptors that can be extracted directly from the raw image pixels and include pixel intensities, histograms, and gradients. In the first part of this work, we explore how various techniques in dimensionality reduction, ranging from traditional image compression to the recently proposed Random Projections method, affect the performance of computer vision algorithms such as face detection and face recognition. In addition, we discuss a method that is able to increase the spatial resolution of a single image, without using any training examples, according to the sparse representations framework. In the second part, we explore mid-level structures, including image manifolds and sparse models, which are produced by abstracting information from low-level features and offer compact modeling of high-dimensional data. We propose novel techniques for generating more descriptive image representations and investigate their application in face recognition and object tracking. In the third part of this work, we propose the investigation of a novel framework for representing the semantic contents of images. This framework employs high-level semantic attributes that aim to bridge the gap between the visual information of an image and its textual description by utilizing low-level features and mid-level structures. This innovative paradigm offers revolutionary possibilities, including recognizing the category of an object from purely textual information without providing any explicit visual example.
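
    To make the Random Projections idea concrete, here is a minimal sketch (illustrative dimensions, not the dissertation's experiments): high-dimensional descriptors are projected through a random Gaussian matrix, which approximately preserves pairwise distances (Johnson-Lindenstrauss):

        # Minimal sketch of dimensionality reduction via random projections.
        import numpy as np

        rng = np.random.default_rng(0)
        d, k, n = 4096, 128, 200            # original dim, reduced dim, number of descriptors
        X = rng.standard_normal((n, d))     # stand-in for raw image features

        R = rng.standard_normal((d, k)) / np.sqrt(k)   # random projection matrix
        Y = X @ R                                      # compressed representations

        # Pairwise distances are roughly preserved after projection.
        i, j = 3, 17
        print(np.linalg.norm(X[i] - X[j]), np.linalg.norm(Y[i] - Y[j]))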

    Manifold Based Deep Learning: Advances and Machine Learning Applications

    Manifolds are topological spaces that are locally Euclidean and find applications in dimensionality reduction, subspace learning, visual domain adaptation, clustering, and more. In this dissertation, we propose a framework for linear dimensionality reduction called proxy matrix optimization (PMO) that uses the Grassmann manifold for optimizing over orthogonal matrix manifolds. PMO is an iterative and flexible method that finds the lower-dimensional projections for various linear dimensionality reduction methods by changing the objective function. PMO is suitable for Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), Maximum Autocorrelation Factors (MAF), and Locality Preserving Projections (LPP). We extend PMO to incorporate robust Lp-norm versions of PCA and LDA, which use fractional p-norms, making them more robust to noisy data and outliers. The PMO method is designed to be realized as a layer in a neural network for maximum benefit. In order to do so, incremental versions of PCA, LDA, and LPP are included in the PMO framework for problems where the data is not all available at once. Next, we explore the topic of domain shift in visual domain adaptation by combining concepts from spherical manifolds and deep learning. We investigate domain shift, which quantifies how well a model trained on a source domain adapts to a similar target domain, with a metric called Spherical Optimal Transport (SpOT). We adopt the spherical manifold along with an orthogonal projection loss to obtain the features from the source and target domains. We then use optimal transport with the cosine distance between the features as a way to measure the gap between the domains. We show, in our experiments with domain adaptation datasets, that SpOT does better than existing measures for quantifying domain shift and demonstrates a better correlation with the gain of transfer across domains.
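
    A minimal sketch of the SpOT-style domain-shift measure described above (synthetic features, and a simplification in which equal-sized, uniformly weighted feature sets reduce optimal transport to an assignment problem; not the dissertation's implementation):

        # Minimal sketch: unit-sphere features, cosine-distance cost, optimal-transport gap.
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def spherical_ot_gap(src, tgt):
            src = src / np.linalg.norm(src, axis=1, keepdims=True)   # project to unit sphere
            tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
            cost = 1.0 - src @ tgt.T                                 # cosine-distance matrix
            rows, cols = linear_sum_assignment(cost)                 # exact OT for uniform weights
            return cost[rows, cols].mean()

        rng = np.random.default_rng(0)
        source = rng.standard_normal((64, 32))
        target = source + 0.3 * rng.standard_normal((64, 32))        # mildly shifted domain
        far_target = rng.standard_normal((64, 32))                   # unrelated domain
        print(spherical_ot_gap(source, target), spherical_ot_gap(source, far_target))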

    Relative Pose Estimation Using Non-overlapping Multicamera Clusters

    This thesis considers the Simultaneous Localization and Mapping (SLAM) problem using a set of perspective cameras arranged such that there is no overlap in their fields of view. With the known and fixed extrinsic calibration of each camera within the cluster, a novel real-time pose estimation system is presented that is able to accurately track the motion of a camera cluster relative to an unknown target object or environment and concurrently generate a model of the structure, using only image-space measurements. A new parameterization for point feature position using a spherical coordinate update is presented which isolates system parameters dependent on global scale, allowing the shape parameters of the system to converge despite the scale parameters remaining uncertain. Furthermore, a flexible initialization scheme is proposed which allows the optimization to converge accurately using only the measurements from the cameras at the first time step. An analysis is presented identifying the configurations of the cluster motions and target structure geometry for which the optimization solution becomes degenerate and the global scale is ambiguous. Results are presented that not only confirm the previously known critical motions for a two-camera cluster, but also provide a complete description of the degeneracies related to the point feature constellations. The proposed algorithms are implemented and verified in experiments with a camera cluster constructed using multiple perspective cameras mounted on a quadrotor vehicle and augmented with tracking markers to collect high-precision ground-truth motion measurements from an optical indoor positioning system. The accuracy and performance of the proposed pose estimation system are confirmed for various motion profiles in both indoor and challenging outdoor environments.
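
    A minimal sketch of a spherical-coordinate feature parameterization of the kind described above, in which the bearing angles carry the shape information and only the range term depends on global scale (the anchor frame and names are illustrative, not the thesis code):

        # Minimal sketch: a point feature stored as bearing angles plus a range.
        import numpy as np

        def point_from_spherical(anchor, theta, phi, rho):
            """3D point = anchor + rho * unit bearing(theta, phi); only rho carries scale."""
            bearing = np.array([
                np.cos(phi) * np.cos(theta),
                np.cos(phi) * np.sin(theta),
                np.sin(phi),
            ])
            return anchor + rho * bearing

        anchor = np.zeros(3)   # e.g. the cluster pose where the feature was first observed
        p = point_from_spherical(anchor, theta=0.4, phi=0.1, rho=2.5)
        p_scaled = point_from_spherical(anchor, theta=0.4, phi=0.1, rho=2.5 * 1.7)
        print(p, p_scaled)     # same direction; only the scale-dependent range differs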

    Learning Equivariant Representations

    State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being shift-equivariance: because a filter slides over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images, where semantic content is independent of absolute pixel position. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present.
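
    The shift-equivariance property that the thesis generalizes can be checked numerically; the following minimal 1D sketch (circular shifts and correlation, purely illustrative) verifies that convolving a shifted signal equals shifting the convolved signal:

        # Minimal check: convolution commutes with shifts (shift-equivariance).
        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.standard_normal(32)          # input signal
        w = rng.standard_normal(5)           # convolution filter

        def circular_conv(signal, kernel):
            n = len(signal)
            return np.array([np.dot(np.roll(signal, -i)[:len(kernel)], kernel) for i in range(n)])

        shift = 7
        lhs = circular_conv(np.roll(x, shift), w)    # shift the input, then convolve
        rhs = np.roll(circular_conv(x, w), shift)    # convolve, then shift the output
        print(np.allclose(lhs, rhs))                 # True: the response shifts by the same amount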

    Spacecraft Trajectory Planning for Optimal Observability using Angles-Only Navigation

    This work leverages existing techniques in angles-only navigation to develop optimal range observability maneuvers and trajectory planning methods for spacecraft under constrained relative motion. The resulting contribution is a guidance method for impulsive rendezvous and proximity operations valid for elliptic orbits of arbitrary eccentricity. The system dynamics describe the relative motion of an arbitrary number of maneuvering (chaser) spacecraft about a single non-cooperative resident space object (RSO). The chaser spacecraft motion is constrained in terms of 1) the collision bounds of the RSO, 2) maximum fuel usage, 3) eclipse avoidance, and 4) optical sensor field-of-view restrictions. When more than one chaser is present, additional constraints include 1) collision avoidance between formation members, and 2) formation longevity via fuel usage balancing. Depending on the type of planetary orbit, quasi-circular or elliptic, the relative motion dynamics are approximated using a linear time-invariant or a linear time-varying system, respectively. The proposed method uses two distinct parameterizations corresponding to each system type to reduce the optimization problem from 12 to 2 variables in Cartesian space, thus simplifying an otherwise intractable optimization problem.
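
    For the quasi-circular case, the linear time-invariant relative-motion model referred to above is conventionally the Clohessy-Wiltshire (Hill) system; the sketch below propagates a relative state with it (illustrative values, and possibly not the dissertation's exact formulation):

        # Minimal sketch: Clohessy-Wiltshire LTI relative motion. State is
        # [x, y, z, vx, vy, vz] of the chaser relative to the RSO; n is the RSO's mean motion.
        import numpy as np
        from scipy.linalg import expm

        def cw_state_matrix(n):
            A = np.zeros((6, 6))
            A[0:3, 3:6] = np.eye(3)
            A[3, 0] = 3 * n**2     # radial:     x'' = 3*n^2*x + 2*n*y'
            A[3, 4] = 2 * n
            A[4, 3] = -2 * n       # along-track: y'' = -2*n*x'
            A[5, 2] = -n**2        # cross-track: z'' = -n^2*z
            return A

        n = 0.0011                 # rad/s, roughly a low-Earth circular orbit (illustrative)
        x0 = np.array([100.0, 0.0, 0.0, 0.0, -2 * n * 100.0, 0.0])   # bounded relative orbit
        Phi = expm(cw_state_matrix(n) * 600.0)                        # 10-minute state transition
        print(Phi @ x0)                                               # relative state after 10 minutes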