
    Generalizing Deep Models for Overhead Image Segmentation Through Getis-Ord Gi* Pooling

    That most deep learning models are purely data-driven is both a strength and a weakness. Given sufficient training data, the optimal model for a particular problem can be learned. However, sufficient data is usually not available, so the model is either learned from scratch from a limited amount of training data or pre-trained on a different problem and then fine-tuned. Both of these situations are potentially suboptimal and limit the generalizability of the model. Inspired by this, we investigate methods to inform or guide deep learning models for geospatial image analysis to increase their performance when a limited amount of training data is available or when they are applied to scenarios other than those on which they were trained. In particular, we exploit the fact that there are certain fundamental rules as to how things are distributed on the surface of the Earth and that these rules do not vary substantially between locations. Based on this, we develop a novel feature pooling method for convolutional neural networks using Getis-Ord Gi* analysis from geostatistics. Experimental results show that our proposed pooling function has significantly better generalization performance than a standard data-driven approach when applied to overhead image segmentation.
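For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of the standard Getis-Ord Gi* z-score computed over a 2D feature map. It is not the authors' pooling layer: the function name `gi_star`, the square binary-weight neighborhood, and the `radius` parameter are all assumptions made for illustration.

```python
import numpy as np


def gi_star(x, radius=1):
    """Getis-Ord Gi* z-score per cell of a 2D array (hypothetical helper).

    Assumes binary weights: w_ij = 1 inside a (2*radius+1)^2 window
    centred on cell i (the cell itself included), 0 elsewhere.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    s = x.std()  # population standard deviation, as in the Gi* definition
    if np.isclose(s, 0.0):
        # A constant map has no hot or cold spots; avoid dividing by zero.
        return np.zeros_like(x)

    h, w = x.shape
    k = 2 * radius + 1
    x_pad = np.pad(x, radius)
    ones_pad = np.pad(np.ones_like(x), radius)

    local_sum = np.zeros_like(x)  # sum_j w_ij * x_j
    w_sum = np.zeros_like(x)      # sum_j w_ij (smaller at the borders)
    for dy in range(k):
        for dx in range(k):
            local_sum += x_pad[dy:dy + h, dx:dx + w]
            w_sum += ones_pad[dy:dy + h, dx:dx + w]

    # With binary weights, sum_j w_ij^2 equals sum_j w_ij.
    denom = s * np.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
    return (local_sum - mean * w_sum) / denom


feature_map = np.zeros((6, 6))
feature_map[2:4, 2:4] = 10.0  # a small cluster of high activations
scores = gi_star(feature_map)
```

Positive scores mark cells surrounded by unusually high values and negative scores mark cells surrounded by unusually low values. How such scores are used inside a pooling layer is described in the paper itself and is not reproduced here.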

    Feature Tracking Cardiac Magnetic Resonance via Deep Learning and Spline Optimization

    Feature tracking Cardiac Magnetic Resonance (CMR) has recently emerged as an area of interest for quantification of regional cardiac function from balanced steady-state free precession (SSFP) cine sequences. However, currently available techniques lack full automation, limiting reproducibility. We propose a fully automated technique whereby a CMR image sequence is first segmented with a deep, fully convolutional neural network (CNN) architecture, and quadratic basis splines are fitted simultaneously across all cardiac frames using least squares optimization. Experiments are performed using data from 42 patients with hypertrophic cardiomyopathy (HCM) and 21 healthy control subjects. In terms of segmentation, we compared state-of-the-art CNN frameworks, U-Net and dilated convolution architectures, with and without temporal context, using three-fold cross-validation. Performance relative to expert manual segmentation was similar across all networks: pixel accuracy was ~97%, intersection-over-union (IoU) across all classes was ~87%, and IoU across foreground classes only was ~85%. Endocardial left ventricular circumferential strain calculated from the proposed pipeline was significantly different in control and disease subjects (-25.3% vs -29.1%, p = 0.006), in agreement with the current clinical literature. Comment: Accepted to Functional Imaging and Modeling of the Heart (FIMH) 201
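The three segmentation metrics quoted in this abstract can be illustrated with a short sketch. It assumes that "IoU across all classes" means the unweighted mean of per-class IoU and that class 0 is the background; the function name `segmentation_scores` and its parameters are hypothetical and not taken from the paper.

```python
import numpy as np


def segmentation_scores(pred, truth, num_classes, background=0):
    """Return (pixel accuracy, mean IoU over all classes, mean IoU over foreground)."""
    pred = np.asarray(pred)
    truth = np.asarray(truth)
    pixel_acc = float((pred == truth).mean())

    ious = {}
    for c in range(num_classes):
        p, t = pred == c, truth == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps: skip rather than count as 0 or 1
        ious[c] = np.logical_and(p, t).sum() / union

    miou_all = float(np.mean(list(ious.values())))
    fg = [v for c, v in ious.items() if c != background]
    miou_fg = float(np.mean(fg)) if fg else float("nan")
    return pixel_acc, miou_all, miou_fg


truth = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1]])
pred = np.array([[0, 0, 1, 1],
                 [0, 1, 1, 1]])  # one background pixel mislabelled as class 1
acc, miou_all, miou_fg = segmentation_scores(pred, truth, num_classes=2)
```

Because the background class is usually large and easy to label, excluding it tends to lower the score, which is consistent with the foreground-only figure in the abstract being the smaller of the two.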

    Segmentation-guided privacy preservation in visual surveillance monitoring

    Bachelor's thesis in Computer Engineering, Faculty of Mathematics, Universitat de Barcelona, Year: 2022, Advisors: Sergio Escalera Guerrero, Zenjie Li, and Kamal Nasrollahi. Video surveillance has become a necessity to ensure safety and security. Today, with the advancement of technology, video surveillance has become more accessible and widely available. Furthermore, it can be useful in an enormous number of applications and situations. For instance, it can help ensure public safety by preventing vandalism, robbery, and shoplifting. The same applies to more intimate situations, like home monitoring to detect unusual behavior of residents, or to similar settings like hospitals and assisted living facilities. Thus, cameras are installed in public places like malls, metro stations, and on roads for traffic control, as well as in sensitive settings like hospitals, embassies, and private homes. Video surveillance has always been associated with the loss of privacy. Therefore, we developed a real-time visualization of privacy-protected video surveillance data that applies a segmentation mask to protect privacy while still making it possible to identify existing risk behaviors. This replaces existing privacy safeguards such as blanking, masking, pixelation, blurring, and scrambling, because we want to protect visual personal data such as appearance, physical information, clothing, skin, eye, and hair color, and facial gestures. The main aim of this work is to analyze and compare the most successful deep-learning-based state-of-the-art approaches for semantic segmentation. In this study, we perform an efficiency-accuracy comparison to determine which segmentation methods yield accurate segmentation results while running at the speed required for real-life application scenarios. Furthermore, we also provide a modified dataset made from a combination of three existing datasets, COCO_stuff164K, PASCAL VOC 2012, and ADE20K, to make our comparison fair and to generate privacy-protecting human segmentation masks.
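The idea of protecting privacy with a segmentation mask can be sketched as follows, assuming a per-pixel boolean person mask is already available from some segmentation model. The function name `apply_privacy_mask`, the flat fill color, and the array layout (height, width, channels) are illustrative assumptions, not details from the thesis.

```python
import numpy as np


def apply_privacy_mask(frame, person_mask, color=(0, 255, 0)):
    """Replace every pixel flagged as 'person' with a flat color.

    frame:       (H, W, 3) image array
    person_mask: (H, W) boolean array, True where a person was segmented
    """
    frame = np.asarray(frame)
    mask = np.asarray(person_mask, dtype=bool)
    if mask.shape != frame.shape[:2]:
        raise ValueError("mask and frame must share height and width")
    out = frame.copy()  # leave the original frame untouched
    out[mask] = np.asarray(color, dtype=frame.dtype)
    return out


frame = np.full((2, 3, 3), 100, dtype=np.uint8)
person_mask = np.array([[True, False, False],
                        [False, False, True]])
protected = apply_privacy_mask(frame, person_mask)
```

A flat silhouette keeps body outline and position visible, so risky behavior can still be recognized, while removing appearance details such as clothing, skin, and facial features.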

    Decomposing and Coupling Saliency Map for Lesion Segmentation in Ultrasound Images

    The complex scenario of ultrasound images, in which adjacent tissues (i.e., background) share similar intensity with, and even contain richer texture patterns than, the lesion region (i.e., foreground), poses a unique challenge for accurate lesion segmentation. This work presents a decomposition-coupling network, called DC-Net, to deal with this challenge in a (foreground-background) saliency map disentanglement-fusion manner. The DC-Net consists of decomposition and coupling subnets: the former preliminarily disentangles the original image into foreground and background saliency maps, and the latter then performs accurate segmentation with the assistance of saliency prior fusion. The coupling subnet involves three fusion strategies: 1) regional feature aggregation (via a differentiable context pooling operator in the encoder) to adaptively preserve local contextual details with a larger receptive field during dimension reduction; 2) relation-aware representation fusion (via a cross-correlation fusion module in the decoder) to efficiently fuse low-level visual characteristics and high-level semantic features during resolution restoration; 3) dependency-aware prior incorporation (via a coupler) to reinforce the foreground-salient representation with complementary information derived from the background representation. Furthermore, a harmonic loss function is introduced to encourage the network to focus more attention on low-confidence and hard samples. The proposed method is evaluated on two ultrasound lesion segmentation tasks and demonstrates a remarkable performance improvement over existing state-of-the-art methods. Comment: 18 pages, 18 figures