
    A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

    Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks. Training of the so-called FlowNet was enabled by a large synthetically generated dataset. The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation. To this end, we propose three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks. Our datasets are the first large-scale datasets to enable training and evaluating scene flow methods. Besides the datasets, we present a convolutional network for real-time disparity estimation that provides state-of-the-art results. By combining a flow and disparity estimation network and training it jointly, we demonstrate the first scene flow estimation with a convolutional network. Comment: Includes supplementary material.
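
    To make the link between optical flow, disparity, and scene flow concrete, here is a minimal sketch of the standard stereo geometry that turns a per-pixel flow vector, a disparity, and a disparity change into a 3D motion vector. It is not the network from the paper. The function names and the calibration parameters (`focal`, `baseline`, `cx`, `cy`) are assumptions introduced for illustration, and a rectified pinhole stereo rig is assumed.

```python
def backproject(u, v, disparity, focal, baseline, cx, cy):
    """Map pixel (u, v) with stereo disparity to a 3D point in the camera frame.

    Assumes a rectified pinhole stereo pair: depth Z = focal * baseline / disparity.
    """
    z = focal * baseline / disparity
    x = (u - cx) * z / focal
    y = (v - cy) * z / focal
    return (x, y, z)


def scene_flow(u, v, flow, disparity, disparity_change,
               focal, baseline, cx, cy):
    """3D displacement of the point seen at pixel (u, v) between two frames.

    flow             -- (du, dv) optical flow at the pixel, in pixels
    disparity        -- disparity at (u, v) in the first frame
    disparity_change -- change in disparity of that point by the second frame
    """
    du, dv = flow
    p0 = backproject(u, v, disparity, focal, baseline, cx, cy)
    p1 = backproject(u + du, v + dv, disparity + disparity_change,
                     focal, baseline, cx, cy)
    # Scene flow is the difference of the two back-projected 3D points.
    return tuple(b - a for a, b in zip(p0, p1))
```

    A point whose disparity shrinks between frames moves away from the camera, which appears as a positive Z component in the returned vector.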

    High-Density Diffuse Optical Tomography During Passive Movie Viewing: A Platform for Naturalistic Functional Brain Mapping

    Human neuroimaging techniques enable researchers and clinicians to non-invasively study brain function across the lifespan in both healthy and clinical populations. However, functional brain imaging methods such as functional magnetic resonance imaging (fMRI) are expensive, resource-intensive, and require dedicated facilities, making these powerful imaging tools generally unavailable for assessing brain function in settings demanding open, unconstrained, and portable neuroimaging assessments. Tools such as functional near-infrared spectroscopy (fNIRS) afford greater portability and wearability, but at the expense of cortical field-of-view and spatial resolution. High-Density Diffuse Optical Tomography (HD-DOT) is an optical neuroimaging modality that directly addresses the image quality limitations associated with traditional fNIRS techniques through densely overlapping optical measurements. This thesis aims to establish the feasibility of using HD-DOT in a novel application demanding exceptional portability and flexibility: mapping disrupted cortical activity in chronically malnourished children. I first motivate the need for dense optical measurements of brain tissue to achieve fMRI-comparable localization of brain function (Chapter 2). Then, I present imaging work completed in Cali, Colombia, where a cohort of chronically malnourished children was imaged using a custom HD-DOT instrument to establish the feasibility of performing field-based neuroimaging in this population (Chapter 3). Finally, to meet the need for age-appropriate imaging paradigms in this population, I develop passive movie viewing paradigms for use in optical neuroimaging, a flexible and rich stimulation paradigm that is suitable for both adults and children (Chapter 4).

    Developing a comprehensive framework for multimodal feature extraction

    Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions, ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly designed with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package's architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability.

    Multiframe Scene Flow with Piecewise Rigid Motion

    We introduce a novel multiframe scene flow approach that jointly optimizes the consistency of the patch appearances and their local rigid motions from RGB-D image sequences. In contrast to the competing methods, we take advantage of an oversegmentation of the reference frame and robust optimization techniques. We formulate scene flow recovery as a global non-linear least squares problem which is iteratively solved by a damped Gauss-Newton approach. As a result, we obtain a qualitatively new level of accuracy in RGB-D based scene flow estimation which can potentially run in real-time. Our method can handle challenging cases with rigid, piecewise rigid, articulated and moderate non-rigid motion, and does not rely on prior knowledge about the types of motions and deformations. Extensive experiments on synthetic and real data show that our method outperforms the state of the art. Comment: International Conference on 3D Vision (3DV), Qingdao, China, October 201
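
    As a rough illustration of what a damped Gauss-Newton iteration for non-linear least squares looks like, the sketch below fits a toy two-parameter model y = a * exp(b * x). It is not the scene flow energy from the paper. The model, the function names, and the Levenberg-Marquardt-style damping schedule (halve on success, multiply by ten on failure) are assumptions chosen for illustration, since the abstract does not specify the damping scheme.

```python
import math


def residuals(params, xs, ys):
    """Residuals of the toy model a * exp(b * x) against the observations."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    """Partial derivatives of each residual with respect to (a, b)."""
    a, b = params
    return [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]


def cost(params, xs, ys):
    """Sum of squared residuals."""
    return sum(r * r for r in residuals(params, xs, ys))


def damped_gauss_newton(xs, ys, params, damping=1e-3, iterations=100):
    """Minimise the cost by solving (J^T J + damping * I) delta = -J^T r."""
    for _ in range(iterations):
        r = residuals(params, xs, ys)
        jac = jacobian(params, xs)
        # Damped normal equations for the 2x2 case.
        h11 = sum(j[0] * j[0] for j in jac) + damping
        h12 = sum(j[0] * j[1] for j in jac)
        h22 = sum(j[1] * j[1] for j in jac) + damping
        g1 = sum(j[0] * ri for j, ri in zip(jac, r))
        g2 = sum(j[1] * ri for j, ri in zip(jac, r))
        det = h11 * h22 - h12 * h12
        d1 = (-g1 * h22 + g2 * h12) / det
        d2 = (-g2 * h11 + g1 * h12) / det
        candidate = (params[0] + d1, params[1] + d2)
        if cost(candidate, xs, ys) < cost(params, xs, ys):
            # Step reduced the cost: accept it and trust the model more.
            params = candidate
            damping *= 0.5
        else:
            # Step made things worse: reject it and damp more heavily.
            damping *= 10.0
    return params


# Usage: recover a = 2.0, b = 0.5 from noise-free samples.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_fit, b_fit = damped_gauss_newton(xs, ys, params=(1.0, 0.0))
```

    With small damping the update approaches a plain Gauss-Newton step, while large damping shortens it toward a scaled gradient-descent step, which is what keeps the iteration stable far from the optimum.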
