1,151 research outputs found

    Multiband Probabilistic Cataloging: A Joint Fitting Approach to Point Source Detection and Deblending

    Probabilistic cataloging (PCAT) outperforms traditional cataloging methods on single-band optical data in crowded fields. We extend our work to multiple bands, achieving greater sensitivity (~0.4 mag) and greater speed (500×) compared to previous single-band results. We demonstrate the effectiveness of multiband PCAT on mock data, in terms of both recovering accurate posteriors in the catalog space and directly deblending sources. When applied to Sloan Digital Sky Survey (SDSS) observations of M2, taking Hubble Space Telescope data as truth, our joint fit on r- and i-band data goes ~0.4 mag deeper than single-band probabilistic cataloging and has a false discovery rate less than 20% for F606W ≤ 20. Compared to DAOPHOT, the two-band SDSS catalog fit goes nearly 1.5 mag deeper using the same data and maintains a lower false discovery rate down to F606W ~ 20.5. Given recent improvements in computational speed, multiband PCAT shows promise in application to large-scale surveys and is a plausible framework for joint analysis of multi-instrument observational data. https://github.com/RichardFeder/multiband_pcat
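    The core idea of the joint fit, sharing source positions across bands while fitting per-band fluxes, can be illustrated with a toy likelihood. A minimal sketch, assuming a Gaussian PSF, Gaussian noise, and hypothetical function names (the actual method samples catalogs with transdimensional MCMC):

    ```python
    import numpy as np

    def render_band(xs, ys, fluxes, psf_sigma, shape):
        """Render point sources onto an image with a Gaussian PSF."""
        yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
        img = np.zeros(shape)
        for x, y, f in zip(xs, ys, fluxes):
            img += f * np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * psf_sigma ** 2))
        return img / (2 * np.pi * psf_sigma ** 2)

    def joint_log_likelihood(catalog, data_r, data_i, noise_r, noise_i):
        """Gaussian log-likelihood summed over both bands: source positions
        are shared across bands, fluxes are fit per band."""
        xs, ys, f_r, f_i = catalog
        model_r = render_band(xs, ys, f_r, 1.3, data_r.shape)
        model_i = render_band(xs, ys, f_i, 1.3, data_i.shape)
        ll_r = -0.5 * np.sum(((data_r - model_r) / noise_r) ** 2)
        ll_i = -0.5 * np.sum(((data_i - model_i) / noise_i) ** 2)
        return ll_r + ll_i
    ```

    A transdimensional sampler would then propose births, deaths, and moves of sources and accept or reject them against this joint likelihood.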

    Feature Representation for Online Signature Verification

    Biometric systems have been used in a wide range of applications and have improved person authentication. Signature verification is one of the most common biometric methods, with techniques that employ various specifications of a signature. Recently, deep learning has achieved great success in many fields, such as image, sound, and text processing. In this paper, a deep learning method is used for feature extraction and feature selection.
    Comment: 10 pages, 10 figures, submitted to IEEE Transactions on Information Forensics and Security
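    As an illustration of the general approach (not the paper's specific architecture), a small 1D CNN can embed an online signature's pen trajectory into a fixed-size feature vector, and verification can compare embedding distances; all layer sizes below are hypothetical:

    ```python
    import torch
    import torch.nn as nn

    class SignatureEncoder(nn.Module):
        """Toy 1D-CNN feature extractor for (x, y, pressure) time series."""
        def __init__(self, in_channels=3, feat_dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.Conv1d(32, feat_dim, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),   # pool over time -> fixed-size vector
            )

        def forward(self, x):              # x: (batch, channels, timesteps)
            return self.net(x).squeeze(-1) # (batch, feat_dim)

    enc = SignatureEncoder()
    ref, probe = torch.randn(1, 3, 200), torch.randn(1, 3, 200)
    dist = torch.norm(enc(ref) - enc(probe))  # small distance -> likely genuine
    ```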

    Efficient Min-cost Flow Tracking with Bounded Memory and Computation

    This thesis is a contribution to solving multi-target tracking in an optimal fashion for demanding real-time computer vision applications. We introduce a challenging benchmark, recorded with our autonomous driving platform AnnieWAY. Three main challenges of tracking are addressed: solving the data association (min-cost flow) problem faster than standard solvers, extending this approach to an online setting, and making it real-time capable through a tight approximation of the optimal solution.
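    The data-association step can be sketched as a small min-cost flow instance: detections in consecutive frames become nodes, association costs become edge weights, and unit flows trace tracks. A toy example with made-up costs (real trackers add entry/exit and occlusion arcs and solve far larger graphs):

    ```python
    import networkx as nx

    G = nx.DiGraph()
    # Two detections at frame t (a1, a2), two at frame t+1 (b1, b2);
    # edge cost ~ matching dissimilarity, scaled to integers for the solver.
    for u, v, cost in [("a1", "b1", 1), ("a1", "b2", 9),
                       ("a2", "b1", 8), ("a2", "b2", 2)]:
        G.add_edge(u, v, capacity=1, weight=cost)
    for a in ("a1", "a2"):
        G.add_edge("S", a, capacity=1, weight=0)
    for b in ("b1", "b2"):
        G.add_edge(b, "T", capacity=1, weight=0)
    G.nodes["S"]["demand"] = -2   # push two units of flow: two tracks
    G.nodes["T"]["demand"] = 2

    flow = nx.min_cost_flow(G)    # optimal association: a1->b1, a2->b2
    print([(u, v) for u in flow for v in flow[u] if flow[u][v] > 0])
    ```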

    COCO-Counterfactuals: Automatically Constructed Counterfactual Examples for Image-Text Pairs

    Counterfactual examples have proven to be valuable in the field of natural language processing (NLP) for both evaluating and improving the robustness of language models to spurious correlations in datasets. Despite their demonstrated utility for NLP, multimodal counterfactual examples have been relatively unexplored due to the difficulty of creating paired image-text data with minimal counterfactual changes. To address this challenge, we introduce a scalable framework for automatic generation of counterfactual examples using text-to-image diffusion models. We use our framework to create COCO-Counterfactuals, a multimodal counterfactual dataset of paired image and text captions based on the MS-COCO dataset. We validate the quality of COCO-Counterfactuals through human evaluations and show that existing multimodal models are challenged by our counterfactual image-text pairs. Additionally, we demonstrate the usefulness of COCO-Counterfactuals for improving out-of-domain generalization of multimodal vision-language models via training data augmentation.
    Comment: Accepted to NeurIPS 2023 Datasets and Benchmarks Track
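    The general recipe, generating an image for a caption and for a minimally edited counterfactual caption with a shared random seed so that only the edited concept changes, might look like the sketch below. The model ID and prompts are placeholders; the paper's actual generation pipeline may differ:

    ```python
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    caption = "a black cat sitting on a wooden bench"
    counterfactual = "a black dog sitting on a wooden bench"  # minimal edit

    gen = torch.Generator("cuda").manual_seed(0)  # shared seed aligns the scenes
    img_orig = pipe(caption, generator=gen).images[0]
    gen = torch.Generator("cuda").manual_seed(0)
    img_cf = pipe(counterfactual, generator=gen).images[0]
    ```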

    Detection and height estimation of buildings from SAR and optical images using conditional random fields

    [no abstract]

    Knowledge and Reasoning for Image Understanding

    Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques that are used to locate (“where”), characterize, and recognize (“what”) objects, regions, and their attributes in an image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities that are required to answer questions about an image. Answering questions about images requires primarily three components: image understanding, question (natural language) understanding, and reasoning based on knowledge. Any question asking beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning. Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of previous work that utilized background knowledge and reasoning in understanding images. This survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision- and reasoning-based methods to solve several applications and show that these approaches benefit, in terms of accuracy and interpretability, from the explicit use of knowledge and reasoning. We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that can combine vision, knowledge, and reasoning modules and achieve large performance boosts over state-of-the-art methods.
    Doctoral Dissertation, Computer Science, 201
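    As a toy illustration of probabilistic rule-based inference over visual facts (made-up predicates and weights, not the thesis's actual reasoning engine):

    ```python
    # Detected visual facts with confidences, e.g. from an object detector.
    facts = {"detected(dog)": 0.9, "detected(leash)": 0.7}

    # Weighted rule: detected(dog) AND detected(leash) -> activity(walking).
    def infer(facts, rule_weight=0.8):
        p = facts.get("detected(dog)", 0.0) * facts.get("detected(leash)", 0.0)
        return {"activity(walking)": p * rule_weight}

    print(infer(facts))  # {'activity(walking)': 0.504}
    ```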

    Multi-Surface Simplex Spine Segmentation for Spine Surgery Simulation and Planning

    This research proposes to develop a knowledge-based multi-surface simplex deformable model for segmentation of healthy as well as pathological lumbar spine data. It aims to provide a more accurate and robust segmentation scheme for identification of intervertebral disc pathologies to assist with spine surgery planning. A robust technique that combines multi-surface and shape statistics-aware variants of the deformable simplex model is presented. Statistical shape variation within the dataset has been captured by application of principal component analysis and incorporated during the segmentation process to refine results. In cases where the shape statistics hinder detection of the pathological region, user assistance is allowed to disable the prior shape influence during deformation. Results have been validated against user-assisted expert segmentation.
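    The shape-statistics step can be sketched as PCA over training shapes followed by projection of a deforming contour onto the leading modes, with mode coefficients clipped so the result stays within the learned variation. Toy 2D data below stands in for the 3D simplex meshes:

    ```python
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    train_shapes = rng.normal(size=(20, 60))   # 20 shapes, 30 2-D points each

    pca = PCA(n_components=5).fit(train_shapes)

    def constrain(shape, n_std=3.0):
        """Project a shape onto the PCA modes and clip the coefficients,
        keeping the deformation within the learned shape variation."""
        b = pca.transform(shape[None])[0]
        lim = n_std * np.sqrt(pca.explained_variance_)
        return pca.inverse_transform(np.clip(b, -lim, lim)[None])[0]

    regularized = constrain(rng.normal(size=60))
    ```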

    An investigation into the geomorphology of the Hebron Fault, Namibia, using a satellite-derived, high-resolution digital elevation model (DEM)

    The Hebron fault scarp in southern Namibia is 45 km in length, with an average height of 5.5 m and a maximum height of 8.9 m. Namibia is a Stable Continental Region (SCR), a slowly deforming area within a continental plate. The country also has little recorded seismicity, with the largest earthquake in the International Seismological Centre (ISC) catalogue being MW 5.4. If the Hebron fault scarp was formed in a single event, this would represent an MW 7.3 earthquake. SCRs do occasionally experience large earthquakes; however, the recurrence intervals between these events are much longer than in rapidly deforming areas. Consequently, studying palaeo-earthquakes allows the record of seismicity to be extended and the characteristics of SCR events to be better understood. These studies may help refine the Mmax estimates required for seismic hazard assessment. Previous work on Hebron has been limited to field descriptions and theodolite surveys of scarp heights. Furthermore, there have been several interpretations of the fault mechanism and the number of rupture events. This study produces a high-resolution Digital Elevation Model (DEM) via stereophotogrammetry using pan-sharpened WorldView-3 satellite imagery (0.31 m resolution). The DEM was used for several geomorphological analyses, including measuring the scarp height at 160 locations along its length, measuring river channel displacements, and identifying knickpoints along river profiles. Results indicate that the scarp formed from a normal, dip-slip fault that ruptured in a single event. This scenario would imply a high slip-to-length ratio. A comparison with other SCR fault scarps in the literature shows that Hebron's slip-to-length ratio falls within the range of values found on other SCR faults. This study also discusses the implications of the results for seismic hazard assessment in the region. Due to the poor seismic record, probabilistic seismic hazard analysis (PSHA) will calculate a low seismic risk for Namibia. As large earthquakes can occur in SCRs, deterministic seismic hazard analysis (DSHA) can be used to inform policy makers of worst-case scenarios.
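    The quoted single-event magnitude can be sanity-checked with the standard moment-magnitude relation Mw = (2/3) log10(M0) - 6.07, where M0 = mu * A * D. The rupture width and crustal rigidity below are assumed values, not results from the study:

    ```python
    import math

    mu = 3.2e10        # crustal rigidity, Pa (assumed)
    length = 45e3      # scarp length, m (from the study)
    width = 15e3       # down-dip rupture width, m (assumed)
    slip = 5.5         # average slip, m (average scarp height)

    M0 = mu * length * width * slip          # seismic moment, N*m
    Mw = (2.0 / 3.0) * math.log10(M0) - 6.07
    print(f"Mw ~ {Mw:.1f}")                  # ~7.3, matching the quoted estimate
    ```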

    Using child-friendly movie stimuli to study the development of face, place, and object regions from age 3 to 12 years

    Scanning young children while they watch short, engaging, commercially produced movies has emerged as a promising approach for increasing data retention and quality. Movie stimuli also evoke a richer variety of cognitive processes than traditional experiments, allowing the study of multiple aspects of brain development simultaneously. However, because these stimuli are uncontrolled, it is unclear how effectively distinct profiles of brain activity can be distinguished from the resulting data. Here we develop an approach for identifying multiple distinct subject-specific Regions of Interest (ssROIs) using fMRI data collected during movie viewing. We focused on the test case of higher-level visual regions selective for faces, scenes, and objects. Adults (N = 13) were scanned while viewing a 5.6-min child-friendly movie, as well as a traditional localizer experiment with blocks of faces, scenes, and objects. We found that just 2.7 min of movie data could identify subject-specific face, scene, and object regions. While successful, movie-defined ssROIs still showed weaker domain selectivity than traditional ssROIs. Having validated our approach in adults, we then used the same methods on movie data collected from 3- to 12-year-old children (N = 122). Movie response timecourses in 3-year-old children's face, scene, and object regions were already significantly and specifically predicted by timecourses from the corresponding regions in adults. We also found evidence of continued developmental change, particularly in the face-selective posterior superior temporal sulcus. Taken together, our results reveal both early maturity and functional change in face, scene, and object regions, and more broadly highlight the promise of short, child-friendly movies for developmental cognitive neuroscience.
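    The cross-subject prediction test can be sketched as a simple timecourse correlation: the group-average adult timecourse from one region predicts a child's timecourse from the matched region (specificity would compare this against non-matched regions). Random toy data stands in for the fMRI signals:

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    adult_tc = rng.normal(size=(13, 160))    # 13 adults x 160 movie timepoints
    child_tc = adult_tc.mean(0) + rng.normal(scale=0.5, size=160)

    predictor = adult_tc.mean(axis=0)        # group-average adult timecourse
    r = np.corrcoef(predictor, child_tc)[0, 1]
    print(f"r = {r:.2f}")
    ```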
