20 research outputs found

    ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

    Object detection in wide area motion imagery (WAMI) has drawn the attention of the computer vision research community for a number of years. WAMI poses a number of unique challenges, including extremely small object sizes, both sparse and densely-packed objects, and extremely large search spaces (large video frames). Nearly all state-of-the-art methods in WAMI object detection report that appearance-based classifiers fail on this challenging data and instead rely almost entirely on motion information in the form of background subtraction or frame-differencing. In this work, we experimentally verify the failure of appearance-based classifiers in WAMI, such as Faster R-CNN and a heatmap-based fully convolutional neural network (CNN), and propose a novel two-stage spatio-temporal CNN which effectively and efficiently combines both appearance and motion information to significantly surpass the state-of-the-art in WAMI object detection. To reduce the large search space, the first stage (ClusterNet) takes in a set of extremely large video frames, combines the motion and appearance information within the convolutional architecture, and proposes regions of objects of interest (ROOBI). These ROOBI can contain anything from a single object to clusters of several hundred objects, due to the large video frame size and varying object density in WAMI. The second stage (FoveaNet) then estimates the centroid location of all objects in that given ROOBI simultaneously via heatmap estimation. The proposed method exceeds state-of-the-art results on the WPAFB 2009 dataset by 5-16% for moving objects and nearly 50% for stopped objects, as well as being the first proposed method in wide area motion imagery to detect completely stationary objects.
    Comment: Main paper is 8 pages. Supplemental section contains a walk-through of our method (using a qualitative example) and qualitative results for the WPAFB 2009 dataset.

    Algorithms and Applications of Novel Capsule Networks

    Convolutional neural networks, despite their profound impact in countless domains, suffer from significant shortcomings. Linearly-combined scalar feature representations and max pooling operations lead to spatial ambiguities and a lack of robustness to pose variations. Capsule networks can potentially alleviate these issues by storing and routing the pose information of extracted features through their architectures, seeking agreement between the lower-level predictions of higher-level poses at each layer. In this dissertation, we make several key contributions to advance the algorithms of capsule networks in segmentation and classification applications. We create the first capsule-based segmentation network in the literature, SegCaps, by introducing a novel locally-constrained dynamic routing algorithm, transformation matrix sharing, the concept of a deconvolutional capsule, an extension of the reconstruction regularization to segmentation, and a new encoder-decoder capsule architecture. Following this, we design a capsule-based diagnosis network, D-Caps, which builds off SegCaps and introduces a novel capsule-average pooling technique to handle larger medical imaging data. Finally, we design an explainable capsule network, X-Caps, which encodes high-level visual object attributes within its capsules by utilizing a multi-task framework and a novel routing sigmoid function which independently routes information from child capsules to parents. Predictions come with human-level explanations, via object attributes, and a confidence score, obtained by training our network directly on the distribution of expert labels, modeling inter-observer agreement and punishing over/under confidence during training. This body of work constitutes significant algorithmic advances to the application of capsule networks, especially in real-world biomedical imaging data.

    Deformable Capsules for Object Detection

    In this study, we introduce a new family of capsule networks, deformable capsules (DeformCaps), to address a very important problem in computer vision: object detection. We propose two new algorithms associated with our DeformCaps: a novel capsule structure (SplitCaps) and a novel dynamic routing algorithm (SE-Routing), which balance computational efficiency with the need to model a large number of objects and classes, a balance that has never been achieved with capsule networks before. We demonstrate that the proposed methods allow capsules to scale up efficiently to large-scale computer vision tasks for the first time, and create the first-ever capsule network for object detection in the literature. Our proposed architecture is a one-stage detection framework and obtains results on MS COCO which are on par with state-of-the-art one-stage CNN-based methods, while producing fewer false positive detections and generalizing to unusual poses/viewpoints of objects.

    Capsules for Biomedical Image Segmentation

    Our work expands the use of capsule networks to the task of object segmentation for the first time in the literature. This is made possible via the introduction of locally-constrained routing and transformation matrix sharing, which reduces the parameter/memory burden and allows for the segmentation of objects at large resolutions. To compensate for the loss of global information in constraining the routing, we propose the concept of "deconvolutional" capsules to create a deep encoder-decoder style network, called SegCaps. We extend the masked reconstruction regularization to the task of segmentation and perform thorough ablation experiments on each component of our method. The proposed convolutional-deconvolutional capsule network, SegCaps, shows state-of-the-art results while using a fraction of the parameters of popular segmentation networks. To validate our proposed method, we perform experiments segmenting pathological lungs from clinical and pre-clinical thoracic computed tomography (CT) scans and segmenting muscle and adipose (fat) tissue from magnetic resonance imaging (MRI) scans of human subjects' thighs. Notably, our experiments in lung segmentation represent the largest-scale study in pathological lung segmentation in the literature, where we conduct experiments across five extremely challenging datasets, containing both clinical and pre-clinical subjects, and nearly 2000 computed-tomography scans. Our newly developed segmentation platform outperforms other methods across all datasets while utilizing less than 5% of the parameters in the popular U-Net for biomedical image segmentation. Further, we demonstrate capsules' ability to generalize to unseen rotations/reflections on natural images.
    Comment: Extension of the non-archival Capsules of Object Segmentation with experiments on both clinical and pre-clinical pathological lung segmentation from CT scans and muscular and adipose tissue segmentation from MR images. Accepted for publication in Medical Image Analysis. DOI: https://doi.org/10.1016/j.media.2020.101889. arXiv admin note: text overlap with arXiv:1804.0424

    BASE (Barberton Archean Surface Environments) – drilling Paleoarchean coastal strata of the Barberton Greenstone Belt

    The BASE (Barberton Archean Surface Environments) scientific drilling project aimed at recovering an unweathered continuous core from the Paleoarchean Moodies Group (ca. 3.2 Ga), central Barberton Greenstone Belt (BGB), South Africa. These strata comprise some of the oldest well-preserved sedimentary strata on Earth, deposited within only a few million years in alluvial, fluvial, coastal-deltaic, tidal, and prodeltaic settings. They represent a very-high-resolution record of Paleoarchean surface conditions and processes. Moodies Group strata consist of polymict conglomerates, widespread quartzose, lithic and arkosic sandstones, siltstones, shales, and rare banded-iron formations (BIFs) and jaspilites, interbedded with tuffs and several thin lavas. This report describes objectives, drilling, and data sets; it supplements the operational report. Eight inclined boreholes between 280 and 495 m length, drilled from November 2021 through July 2022, obtained a total of 2903 m of curated core of variable quality through steeply to subvertically dipping, in part overturned stratigraphic sections. All drilling objectives were reached. Boreholes encountered a variety of conglomerates, diverse and abundant, mostly tuffaceous sandstones, rhythmically laminated shale-siltstone and banded-iron formations, and several horizons of early-diagenetic silicified sulfate concretions. Oxidative weathering reached far deeper than expected. Fracturing was more intense, and BIFs and jaspilites were thicker than anticipated. Two ca. 1 km long mine adits and a water tunnel, traversing four thick stratigraphic sections within the upper Moodies Group in the central BGB, were also sampled. All boreholes were logged by downhole wireline geophysical instruments. The core was processed (oriented, slabbed, photographed, described, and archived) in a large, publicly accessible hall in downtown Barberton. 
A geological exhibition provided background explanations for visitors and related the drilling objectives to the recently established Barberton Makhonjwa Mountains World Heritage Site. A substantial education, outreach, and publicity program addressed the information needs of the local population and of local and regional stakeholders

    Explanatory Remarks on the Operational Dataset about Drilling in the Moodies Group of the Barberton Greenstone Belt (BASE -Barberton Archean Surface Environments)

    Peer reviewed. All datasets provided in the operational dataset (Heubeck et al., 2024) of the ICDP project BASE (ICDP 5069) consist of metadata, data and/or images. Here, a summary of explanations of the tables, data and images exported from the database of the project (mDIS BASE) is given and is complemented by additional information on data from measurements done in the laboratory prior to the sampling party. Finally, the sampling data from the first two sampling parties are added. Some basic definitions of identifiers used in ICDP, depth corrections and measurements are also introduced.

    Operational Report about drilling in the Moodies Group of the Barberton Greenstone Belt (BASE -Barberton Archean Surface Environments)

    Peer reviewed. The BASE (Barberton Archean Surface Environments) scientific drilling project focused on recovering unweathered continuous core through strata of the Paleoarchean Moodies Group (ca. 3.2 Ga), central Barberton Greenstone Belt (BGB), South Africa. These strata comprise some of the oldest well-preserved sedimentary strata on Earth, deposited within only a few million years in alluvial, fluvial, coastal-deltaic, tidal, and prodeltaic settings, and represent a very-high-resolution record of Paleoarchean surface conditions and processes. Moodies Group strata consist of polymict conglomerates, widespread quartzose, lithic and arkosic sandstones, siltstones, shales, and rare BIFs and jaspilites, interbedded with tuffs and several thin lavas. This report describes operations from preparations to the sampling workshop and complements the related scientific report. Eight inclined boreholes between 280 and 495 m length, drilled from November 2021 through July 2022, obtained a total of 2903 m of curated core of variable quality through steeply to subvertically dipping, in part overturned stratigraphic sections. All drilling objectives were reached. Boreholes encountered a variety of conglomerates, diverse and abundant, mostly tuffaceous sandstones, rhythmically laminated shale-siltstone and banded-iron formations, and several horizons of early-diagenetic sulfate concretions. Oxidative weathering reached far deeper than expected; fracturing was more intense, and BIFs and jaspilites were thicker than anticipated. Two km-long mine adits and a water tunnel, traversing four thick stratigraphic sections within the upper Moodies Group in the central BGB, were also sampled. All boreholes were logged by geophysical instruments. Core was processed (oriented, slabbed, photographed, described, and archived) in a large, publicly accessible hall in downtown Barberton.
An exhibition provided background explanations for visitors and related the drilling objectives to the recently established Barberton-Makhonjwa Mountains World Heritage Site. A substantial education, outreach and publicity program addressed the information needs of the local population and of local and regional stakeholders.
