49 research outputs found

    TOWARD 3D RECONSTRUCTION OF STATIC AND DYNAMIC OBJECTS

    The goal of image-based 3D reconstruction is to construct a spatial understanding of the world from a collection of images. For applications that seek to model generic real-world scenes, it is important that the reconstruction methods used are able to characterize both static scene elements (e.g. trees and buildings) and dynamic objects (e.g. cars and pedestrians). However, due to many inherent ambiguities in the reconstruction problem, recovering this 3D information with accuracy, robustness, and efficiency is a considerable challenge. To advance the research frontier for image-based 3D modeling, this dissertation focuses on three challenging problems in static scene and dynamic object reconstruction. We first target the problem of static scene depthmap estimation from crowd-sourced datasets (i.e. photos collected from the Internet). While achieving high-quality depthmaps using images taken in a controlled environment is already a difficult task, heterogeneous crowd-sourced data presents a unique set of challenges for multi-view depth estimation, including varying illumination and occasional occlusions. We propose a depthmap estimation method that demonstrates high accuracy, robustness, and scalability on a large number of photos collected from the Internet. Compared to static scene reconstruction, the problem of dynamic object reconstruction from monocular images is fundamentally ambiguous unless additional assumptions are imposed. This is because a single observation of an object is insufficient for valid 3D triangulation, which typically requires concurrent observations of the object from multiple viewpoints. Assuming that dynamic objects of the same class (e.g. all the pedestrians walking on a sidewalk) move along a common path in the real world, we develop a method that estimates the 3D positions of the dynamic objects from unstructured monocular images.
Experiments on both synthetic and real datasets illustrate the solvability of the problem and the effectiveness of our approach. Finally, we address the problem of dynamic object reconstruction from a set of unsynchronized videos capturing the same dynamic event. This problem is of great interest because, due to the increased availability of portable capture devices, captures using multiple unsynchronized videos are common in the real world. To resolve the challenges that arise from non-concurrent captures and unknown temporal overlap among video streams, we propose a self-expressive dictionary learning framework, where the dictionary entries are defined as the collection of temporally varying structures. Experiments demonstrate the effectiveness of this approach on the previously unsolved problem.
    Doctor of Philosophy
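
The abstract above notes that valid 3D triangulation typically needs concurrent observations from more than one viewpoint. A minimal sketch of why, using the standard midpoint method for two viewing rays: with a single ray the depth along it is unconstrained, while two non-parallel rays pin down a point. The function name and the ray parameterization (camera center plus direction) are assumptions made for illustration, not the dissertation's method.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Return the 3D point closest to two viewing rays (midpoint method).

    Each ray is c + t * d, where c is a camera center and d is the
    viewing direction toward the observed object (hypothetical inputs).
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]      # offset between camera centers
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b                     # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")

    # Ray parameters that minimize the distance between the two rays.
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = [o + t1 * k for o, k in zip(c1, d1)]
    p2 = [o + t2 * k for o, k in zip(c2, d2)]
    # The estimate is the midpoint of the shortest segment joining the rays.
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


if __name__ == "__main__":
    # Two cameras two units apart, both looking at the point (1, 0, 5).
    print(triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 0.0, 5.0),
                               (2.0, 0.0, 0.0), (-1.0, 0.0, 5.0)))
```

If the object moves between the two captures, the rays no longer meet at a common point, which is the ambiguity that the common-path assumption is meant to resolve.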

    Evolutionary dynamics of structural features

    Structural features have the potential to push back the time barrier beyond which we cannot test hypotheses about the relatedness of languages. However, we have to know the stability of structural features in order to be able to apply them for such purposes. In this thesis I describe the typological profile of the Transeurasian languages, which serve as a data sample for the analysis of stability, build a phylogenetic tree with these languages, measure the stability of structural features as phylogenetic signal and evolutionary rate, reconstruct ancestral states of structural features and apply an admixture model from population genetics to test the performance of phonological, morphological and syntactic features in assigning languages to their respective language families and to investigate the level of diffusion in these three feature sets. More than half of the structural features appear to have a high phylogenetic signal and evolve at a slow rate. I compare the stability across functional categories, parts of speech and language levels and conclude that argument marking (flagging and indexing), derivation and valency are the most stable functional categories, pronouns and nouns the most stable parts of speech, and phonology and morphology the most stable language levels. The admixture model as implemented in STRUCTURE is able to correctly identify the Turkic, Mongolic and Tungusic language families at the levels of morphology and syntax, whereas Japonic and Koreanic languages are assigned to the same ancestry. We see the least admixture at the level of morphology and the most admixture in syntactic features. One of the most important insights is that morphological features carry the most genealogical information, and these features could be used in the future to test relationships above the language family level.

    ANALYSIS AND VISUALIZATION OF FLOW FIELDS USING INFORMATION-THEORETIC TECHNIQUES AND GRAPH-BASED REPRESENTATIONS

    Three-dimensional flow visualization plays an essential role in many areas of science and engineering, such as aero- and hydrodynamical systems, which dominate various physical and natural phenomena. For popular methods such as streamline visualization to be effective, they should capture the underlying flow features while helping users observe and understand the flow field clearly. My research mainly focuses on the analysis and visualization of flow fields using various techniques, e.g. information-theoretic techniques and graph-based representations. Since streamline visualization is a popular technique in flow field visualization, how to select good streamlines to capture flow patterns and how to pick good viewpoints to observe flow fields become critical questions. We treat streamline selection and viewpoint selection as symmetric problems and solve them simultaneously using the dual information channel [81]. To the best of my knowledge, this is the first attempt in flow visualization to combine these two selection problems in a unified approach. This work selects streamlines in a view-independent manner, so the selected streamlines do not change across viewpoints. Another work of mine [56] uses an information-theoretic approach to evaluate the importance of each streamline under various sample viewpoints and presents a solution for view-dependent streamline selection that guarantees coherent streamline updates when the view changes gradually. When projecting 3D streamlines to 2D images for viewing, occlusion and clutter become inevitable. To address this challenge, we design FlowGraph [57, 58], a novel compound graph representation that organizes field line clusters and spatiotemporal regions hierarchically for occlusion-free and controllable visual exploration. We enable observation and exploration of the relationships among field line clusters, spatiotemporal regions and their interconnection in the transformed space.
Most viewpoint selection methods consider only external viewpoints outside the flow field. Such viewpoints do not give a clear view when the flow field is cluttered near its boundary. Therefore, we propose a new way to explore flow fields by selecting several internal viewpoints around the flow features inside the flow field and then generating a B-spline curve path traversing these viewpoints to provide users with close-up views of the flow field for detailed observation of hidden or occluded internal flow features [54]. This work is also extended to deal with unsteady flow fields. Besides flow field visualization, some other topics relevant to visualization also attract my attention. In iGraph [31], we leverage a distributed system along with a tiled display wall to provide users with high-resolution visual analytics of big image and text collections in real time. Developing pedagogical visualization tools forms my other research focus. Since most cryptographic algorithms use sophisticated mathematics, it is difficult for beginners to understand both what an algorithm does and how it does it. Therefore, we develop a set of visualization tools to provide users with an intuitive way to learn and understand these algorithms.
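
The information channel between viewpoints and streamlines mentioned above can be illustrated with a small sketch. It assumes a conditional distribution p(s|v), for instance the share of the rendered image that streamline s occupies from viewpoint v, and computes a per-viewpoint mutual-information score of the kind used in the viewpoint-selection literature. The function name, inputs, and the choice of this particular score are assumptions for illustration; the abstract does not give the definitions used in [81].

```python
import math


def viewpoint_information(p_s_given_v, p_v):
    """Score each viewpoint v by I(v; S) = sum_s p(s|v) * log2(p(s|v) / p(s)).

    p_s_given_v: one row per viewpoint, each row a distribution over streamlines.
    p_v: prior probability of each viewpoint (assumed strictly positive).
    """
    n_streamlines = len(p_s_given_v[0])
    # Marginal p(s) = sum_v p(v) * p(s|v).
    p_s = [
        sum(pv * row[s] for pv, row in zip(p_v, p_s_given_v))
        for s in range(n_streamlines)
    ]
    scores = []
    for row in p_s_given_v:
        # Terms with p(s|v) = 0 contribute nothing to the sum.
        scores.append(sum(p * math.log2(p / q) for p, q in zip(row, p_s) if p > 0))
    return scores


if __name__ == "__main__":
    # Two viewpoints that each see a different streamline exclusively.
    print(viewpoint_information([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]))
```

Swapping the roles of viewpoints and streamlines (using p(v|s) and p(s)) gives the mirrored score for streamlines, which is one way to read the "symmetric problems" framing.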

    ENABLING TECHNIQUES FOR EXPRESSIVE FLOW FIELD VISUALIZATION AND EXPLORATION

    Flow visualization plays an important role in many scientific and engineering disciplines such as climate modeling, turbulent combustion, and automobile design. The most common method for flow visualization is to display integral flow lines such as streamlines computed from particle tracing. Effective streamline visualization should capture flow patterns and display them with appropriate density, so that critical flow information can be visually acquired. In this dissertation, we present several approaches that facilitate expressive flow field visualization and exploration. First, we design a unified information-theoretic framework to model streamline selection and viewpoint selection as symmetric problems. Two interrelated information channels are constructed between a pool of candidate streamlines and a set of sample viewpoints. Based on these information channels, we define streamline information and viewpoint information to select the best streamlines and viewpoints, respectively. Second, we present a focus+context framework to magnify small features and reduce occlusion around them while compacting the context region in a full view. This framework partitions the volume into blocks and deforms them to guide streamline repositioning. The desired deformation is formulated as energy terms and achieved by minimizing the energy function. Third, measuring the similarity of integral curves is fundamental to many tasks such as feature detection, pattern querying, streamline clustering and hierarchical exploration. We introduce FlowString, which extracts shape-invariant features from streamlines to form an alphabet of characters and encodes each streamline as a string. The similarity of two streamline segments then becomes a specially designed edit distance between two strings. Leveraging the suffix tree, FlowString provides a string-based method for exploratory streamline analysis and visualization.
A universal alphabet is learned from multiple data sets to capture basic flow patterns that exist in a variety of flow fields. This allows easy comparison and efficient query across data sets. Fourth, for exploration of vascular data sets, which contain a series of vector fields together with multiple scalar fields, we design a web-based approach for users to investigate the relationship among different properties guided by histograms. The vessel structure is mapped from the 3D volume space to a 2D graph, which allows more efficient interaction and effective visualization on websites. A segmentation scheme is proposed to divide the vessel structure based on a user-specified property to further explore the distribution of that property over space.
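
Once streamlines are encoded as strings, as described for FlowString above, comparing two segments reduces to a string edit distance. The sketch below uses the plain Levenshtein distance with unit costs as a stand-in; the abstract says FlowString uses a specially designed edit distance, whose costs are not given here, so this is only an illustration of the general idea. The example strings are hypothetical encodings.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit-cost edits)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if char_a == char_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical encodings of two streamline segments over a shape alphabet.
    print(edit_distance("aabcc", "abccd"))
```

A domain-specific variant would typically replace the unit substitution cost with a cost that reflects how similar two shape characters are.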

    Minimizing Computational Resources for Deep Machine Learning: A Compression and Neural Architecture Search Perspective for Image Classification and Object Detection

    Computational resources represent a significant bottleneck across all current deep learning computer vision approaches. Image and video data storage requirements for training deep neural networks have led to the widespread use of image and video compression, the use of which naturally impacts the performance of neural network architectures during both training and inference. The prevalence of deep neural networks deployed on edge devices necessitates efficient network architecture design, while training neural networks requires significant time and computational resources, despite the acceleration of both hardware and software developments within the field of artificial intelligence (AI). This thesis addresses these challenges in order to minimize computational resource requirements across the entire end-to-end deep learning pipeline. We determine the extent to which data compression impacts neural network architecture performance, and how much of this performance can be recovered by retraining neural networks with compressed data. The thesis then focuses on making neural architecture search (NAS) more accessible to deploy, in order to facilitate automatic network architecture generation for image classification suited to resource-constrained environments. A combined hard example mining and curriculum learning strategy is developed to minimize the image data processed during a given training epoch within the NAS search phase, without diminishing performance. We demonstrate the capability of the proposed framework across gradient-based, reinforcement learning, and evolutionary NAS approaches, and present a simple but effective method to extend the approach to the prediction-based NAS paradigm. The hard example mining approach within the proposed NAS framework depends upon the effectiveness of an autoencoder to regulate the latent space such that similar images have similar feature embeddings.
This thesis conducts a thorough investigation to satisfy this constraint within the context of image classification. Based upon the success of the overall proposed NAS framework, we subsequently extend the approach towards object detection. Although the resultant multi-label domain presents a more difficult challenge for hard example mining, we propose an extension to the autoencoder to capture the additional object location information encoded within the training labels. The generation of an implicit attention layer within the autoencoder network sufficiently improves its capability to ensure that similar images have similar embeddings, thus successfully transferring the proposed NAS approach to object detection. Finally, the thesis demonstrates the resilience to compression of the general two-stage NAS approach upon which our proposed NAS framework is based.
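
The idea of processing fewer images per epoch through hard example mining can be sketched in its simplest form: keep only the examples with the highest current loss. This is a generic loss-based variant chosen for brevity; the thesis describes an approach that relies on autoencoder embeddings, which is not reproduced here. The function name and the `keep_fraction` parameter are assumptions for illustration.

```python
import math


def select_hard_examples(losses, keep_fraction):
    """Return indices of the hardest examples, ranked by per-example loss.

    losses: one loss value per training example from the previous pass.
    keep_fraction: share of the data set to keep for the next epoch (0 < f <= 1).
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    keep = math.ceil(len(losses) * keep_fraction)
    # Rank example indices from highest to lowest loss.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    # Return the kept indices in data set order for stable batching.
    return sorted(ranked[:keep])


if __name__ == "__main__":
    print(select_hard_examples([0.1, 2.0, 0.5, 1.5], keep_fraction=0.5))
```

A curriculum could be layered on top by varying `keep_fraction` over epochs, although the schedule used in the thesis is not stated in the abstract.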

    Computational methods in Connectomics

    Connectomic analysis of mouse barrel cortex and fly optic lobe
