10 research outputs found

    FireRisk: A Remote Sensing Dataset for Fire Risk Assessment with Benchmarks Using Supervised and Self-supervised Learning

    Full text link
    In recent decades, wildfires, as widespread and extremely destructive natural disasters, have caused tremendous property losses and fatalities, as well as extensive damage to forest ecosystems. Many fire risk assessment projects have been proposed to prevent wildfires, but GIS-based methods are inherently challenging to scale to different geographic areas due to variations in data collection and local conditions. Inspired by the abundance of publicly available remote sensing projects and the burgeoning development of deep learning in computer vision, our research focuses on assessing fire risk using remote sensing imagery. In this work, we propose a novel remote sensing dataset, FireRisk, consisting of 7 fire risk classes with a total of 91,872 labelled images for fire risk assessment. This dataset is labelled with the fire risk classes supplied by the Wildfire Hazard Potential (WHP) raster dataset, and its remote sensing images are collected from the National Agriculture Imagery Program (NAIP), a high-resolution remote sensing imagery program. On FireRisk, we present benchmark performance for supervised and self-supervised representations, with Masked Autoencoders (MAE) pre-trained on ImageNet-1k achieving the highest classification accuracy, 65.29%. FireRisk provides a new direction for fire risk assessment, and we make it publicly available at https://github.com/CharmonyShen/FireRisk. Comment: 10 pages, 6 figures, 1 table, 1 equation
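    As a rough illustration of the supervised benchmark described above, the sketch below fine-tunes an MAE-pretrained ViT for 7-class fire-risk classification. It assumes timm ships MAE-pretrained ViT-B/16 weights under the `vit_base_patch16_224.mae` tag; the `FireRisk/train` directory layout and hyperparameters are hypothetical, not the paper's actual training recipe.

```python
# Minimal fine-tuning sketch for 7-class fire-risk classification.
# Assumes timm exposes MAE-pretrained ViT-B/16 weights under the ".mae" tag;
# the dataset path and hyperparameters are illustrative, not from the paper.
import timm
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

model = timm.create_model("vit_base_patch16_224.mae", pretrained=True, num_classes=7)

tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_set = datasets.ImageFolder("FireRisk/train", transform=tf)  # hypothetical layout
loader = DataLoader(train_set, batch_size=64, shuffle=True)

opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```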

    DALLE-URBAN: Capturing the urban design expertise of large text to image transformers

    Full text link
    Automatically converting text descriptions into images using transformer architectures has recently received considerable attention. Such advances have implications for many applied design disciplines across fashion, art, architecture, urban planning, landscape design, and the future tools available to those disciplines. However, a detailed analysis capturing the capabilities of such models, specifically with a focus on the built environment, has not been performed to date. In this work, we investigate in detail the capabilities and biases of such text-to-image methods as they apply to the built environment. We use a systematic grammar to generate queries related to the built environment and evaluate the resulting generated images. We generate 1020 different images and find that text-to-image transformers are robust at generating realistic images across different domains for this use case. Generated imagery can be found on GitHub: https://github.com/sachith500/DALLEURBAN Comment: Accepted to DICTA 2022; released 11000+ environmental scene images generated by Stable Diffusion and 1000+ images generated by DALLE-2
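    The "systematic grammar" used for query generation is not spelled out in the abstract; the toy sketch below shows the general idea of producing built-environment prompts from a small combinatorial grammar. The vocabulary is invented for illustration.

```python
# A toy "systematic grammar" for built-environment prompts, in the spirit of
# the paper's query generation; this vocabulary is illustrative, not the
# authors' actual grammar.
import itertools

styles = ["modern", "brutalist", "sustainable"]
subjects = ["apartment building", "public park", "transit station"]
contexts = ["in a dense city centre", "in a suburban street", "by a waterfront"]

prompts = [f"a {s} {subj} {ctx}"
           for s, subj, ctx in itertools.product(styles, subjects, contexts)]

print(len(prompts))   # 27 combinations from a 3x3x3 grammar
print(prompts[0])     # "a modern apartment building in a dense city centre"
```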

    GRS-CRC-201707-team4-documents

    No full text
    2017 Graduate Research Students' Day - Collaborative Research Challenge - Team 4

    Semantic Segmentation using Vision Transformers: A survey

    Full text link
    Semantic segmentation has a broad range of applications in a variety of domains, including land coverage analysis, autonomous driving, and medical image analysis. Convolutional neural networks (CNNs) and Vision Transformers (ViTs) provide the main architectural families for semantic segmentation. Even though ViTs have proven successful in image classification, they cannot be directly applied to dense prediction tasks such as image segmentation and object detection, since the ViT is not a general-purpose backbone due to its patch partitioning scheme. In this survey, we discuss several ViT architectures that can be used for semantic segmentation and how their evolution addresses this challenge. The rise of ViTs and their strong performance have motivated the community to gradually replace traditional convolutional neural networks in various computer vision tasks. This survey reviews and compares the performance of ViT architectures designed for semantic segmentation on benchmark datasets, helping the community understand existing ViT-based segmentation implementations and discover more efficient methodologies. Comment: 35 pages, 13 figures, 2 tables
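    To make the patch-partitioning limitation concrete, the sketch below reshapes a ViT's patch tokens back into a coarse 14x14 feature map and naively upsamples it to a dense prediction, the basic move behind SETR-style decoders. The head and class count are illustrative, and it assumes a recent timm where `forward_features` returns the full token sequence.

```python
# Turning ViT patch tokens into a dense prediction: a naive decoder sketch.
import timm
import torch
import torch.nn as nn
import torch.nn.functional as F

vit = timm.create_model("vit_base_patch16_224", pretrained=False)
img = torch.randn(1, 3, 224, 224)
tokens = vit.forward_features(img)          # (1, 197, 768): class token + 14x14 patch tokens

patch_tokens = tokens[:, 1:, :]             # drop the class token
feat = patch_tokens.transpose(1, 2).reshape(1, 768, 14, 14)

# Naive decoder: 1x1 conv to class logits, then 16x bilinear upsampling.
head = nn.Conv2d(768, 21, kernel_size=1)    # 21 classes, e.g. PASCAL VOC
logits = F.interpolate(head(feat), size=(224, 224), mode="bilinear")
print(logits.shape)                         # torch.Size([1, 21, 224, 224])
```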

    SigRep: Towards Robust Wearable Emotion Recognition with Contrastive Representation Learning

    No full text
    Extracting emotions from physiological signals has become popular over the past decade. Recent advancements in wearable smart devices have enabled capturing physiological signals continuously and unobtrusively. However, signal readings from different smart wearables are lossy due to user activities, making it difficult to develop robust models for emotion recognition. The limited availability of data labels is another inherent challenge for developing machine learning techniques for emotion classification. This paper presents a novel self-supervised approach inspired by contrastive learning to address these challenges. In particular, our proposed approach learns representations of individual physiological signals, which can be used for downstream classification tasks. Our evaluation on four publicly available datasets shows that the proposed method surpasses state-of-the-art techniques for emotion classification. In addition, we show that our method is more robust to losses in the input signals.
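    SigRep's exact objective is not given in the abstract; as a stand-in, the sketch below implements the standard NT-Xent contrastive loss used in SimCLR-style self-supervised learning, applied to embeddings of two augmented views of the same physiological signals.

```python
# NT-Xent contrastive loss sketch; SigRep's actual objective and encoder are
# not specified in the abstract, so treat this as an illustration only.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (N, D) embeddings of two augmented views of the same signals."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D)
    sim = z @ z.t() / temperature                        # pairwise similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # no self-pairs
    # Each view's positive is its counterpart in the other half of the batch.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Two "views" of a batch of, e.g., PPG windows encoded to 128-d vectors.
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
print(nt_xent(z1, z2))
```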

    Association between network characteristics and bicycle ridership across a large metropolitan region

    No full text
    Background: Numerous studies have explored associations between bicycle network characteristics and bicycle ridership. However, the majority of these studies have been conducted in inner metropolitan regions, and as such, there is limited knowledge of how various characteristics of bicycle networks relate to bicycle trips within and across entire metropolitan regions, and of how the size and composition of study regions affect the association between bicycle network characteristics and bicycle ridership. Methods: We conducted a retrospective analysis of household travel survey data and bicycle infrastructure in the Greater Melbourne region, Australia. Seven network metrics were calculated, and Bayesian spatial models were used to explore the association between these network characteristics and bicycle ridership (measured as counts of the number of trips, and as the proportion of all trips that were made by bike). Results: We demonstrated that bicycle ridership was associated with several network characteristics, and that these associations varied according to the outcome (the count or the proportion of trips made by bike) and the size and characteristics of the study region. Conclusions: These findings challenge the utility of approaches based on spatially modelling network characteristics and bicycle ridership when informing the monitoring and evaluation of bicycle networks. There is a need to progress the science of measuring safe and connected bicycle networks for people of all ages and abilities.
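    The seven network metrics are not enumerated in the abstract; the sketch below computes a few illustrative stand-ins (density, average degree, connectivity) on a toy bicycle network with networkx.

```python
# Illustrative network metrics on a toy bicycle network; the paper's seven
# metrics are not named in the abstract, so these measures are stand-ins.
import networkx as nx

G = nx.Graph()  # nodes = intersections, edges = bicycle-infrastructure links
G.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 0), (1, 3), (3, 4)])

metrics = {
    "edges": G.number_of_edges(),
    "density": nx.density(G),
    "avg_degree": sum(d for _, d in G.degree()) / G.number_of_nodes(),
    "connected": nx.is_connected(G),
}
print(metrics)
```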

    Undercover Deepfakes: Detecting Fake Segments in Videos

    Full text link
    The recent renaissance in generative models, driven primarily by the advent of diffusion models and iterative improvements in GAN methods, has enabled many creative applications. However, each advancement is also accompanied by a rise in the potential for misuse. In the arena of deepfake generation, this is a key societal issue. In particular, the ability to modify segments of videos using such generative techniques creates a new paradigm of deepfakes which are mostly real videos altered slightly to distort the truth. Current deepfake detection methods in the academic literature are not evaluated on this paradigm. In this paper, we present a deepfake detection method that addresses this issue by performing both frame- and video-level deepfake prediction. To facilitate testing our method, we create a new benchmark dataset where videos have both real and fake frame sequences. Our method utilizes the Vision Transformer, Scaling and Shifting pretraining, and a Timeseries Transformer to temporally segment videos, helping facilitate the interpretation of possible deepfakes. Extensive experiments on a variety of deepfake generation methods show excellent results on temporal segmentation as well as classical video-level prediction. In particular, the paradigm we introduce will form a powerful tool for the moderation of deepfakes, where human oversight can be better targeted to the parts of videos suspected of being deepfakes. All experiments can be reproduced at: https://github.com/sanjaysaha1311/temporal-deepfake-segmentation
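    The frame-plus-video-level framing implies a final step of grouping per-frame scores into suspect segments; the sketch below shows that step in isolation, with an illustrative threshold and probabilities (the paper's ViT/Timeseries Transformer pipeline would supply the real scores).

```python
# Grouping per-frame fake probabilities into contiguous suspect segments.
# Threshold and probabilities are illustrative, not from the paper.
import itertools

def fake_segments(frame_probs, threshold=0.5):
    """Return (start, end) frame index ranges predicted as fake."""
    segments, idx = [], 0
    for is_fake, run in itertools.groupby(p >= threshold for p in frame_probs):
        length = len(list(run))
        if is_fake:
            segments.append((idx, idx + length - 1))
        idx += length
    return segments

probs = [0.1, 0.2, 0.9, 0.95, 0.8, 0.3, 0.1, 0.7, 0.9]
print(fake_segments(probs))  # [(2, 4), (7, 8)]
```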

    Classification of Fracture Risk in Fallers Using Dual‐Energy X‐Ray Absorptiometry (DXA) Images and Deep Learning‐Based Feature Extraction

    No full text
    Abstract Dual-energy X-ray absorptiometry (DXA) scans are one of the most frequently used imaging techniques for calculating bone mineral density, yet calculating fracture risk using DXA image features is rarely performed. The objective of this study was to combine deep neural networks, together with DXA images and patient clinical information, to evaluate fracture risk in a cohort of adults with at least one known fall and age-matched healthy controls. DXA images of the entire body, as well as isolated images of the hip, forearm, and spine (1488 total), were obtained from 478 fallers and 48 non-faller controls. A modeling pipeline was developed for fracture risk prediction using the DXA images and clinical data. First, self-supervised pretraining of feature extractors was performed using a small vision transformer (ViT-S) and convolutional neural network models (VGG-16 and ResNet-50). After pretraining, the feature extractors were paired with a multilayer perceptron model, which was used for fracture risk classification. Classification was achieved with an average area under the receiver-operating characteristic curve (AUROC) score of 74.3%. This study demonstrates ViT-S as a promising neural network technique for fracture risk classification using DXA scans. The findings have future application as a fracture risk screening tool for older adults at risk of falls. © 2023 The Authors. JBMR Plus published by Wiley Periodicals LLC on behalf of the American Society for Bone and Mineral Research.
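    A minimal sketch of the pipeline's second stage as described above: a frozen, pretrained image encoder feeding a small MLP for binary fracture-risk classification, scored with AUROC. ResNet-50 and its 2048-d feature size follow the abstract; the data, head sizes, and training details are illustrative.

```python
# Frozen pretrained encoder + MLP classifier, scored with AUROC.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score
from torchvision.models import resnet50

encoder = resnet50(weights=None)     # weights would come from self-supervised pretraining
encoder.fc = nn.Identity()           # expose 2048-d features
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False          # keep the feature extractor frozen

clf = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 1))

x = torch.randn(16, 3, 224, 224)     # stand-in batch of DXA images
y = torch.tensor([0.0, 1.0] * 8)     # stand-in fracture labels

with torch.no_grad():
    feats = encoder(x)
logits = clf(feats).squeeze(1)
loss = F.binary_cross_entropy_with_logits(logits, y)
print(loss.item(), roc_auc_score(y.numpy(), logits.detach().numpy()))
```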

    A comparison of content from across contemporary Australian population health surveys

    No full text
    Objective: Associations between place and population health are of interest to researchers and policymakers. The objective of this paper is to explore, summarise and compare content across contemporary Australian geo-referenced population health survey data sets. Methods: A search for recent (2015 or later) population health surveys from within Australia containing geographic information from participants was conducted. Survey response frames were analysed and categorised based on demographic, risk factor and disease-related characteristics. Analysis using interactive Sankey diagrams shows the extent of content overlap and differences between population health surveys in Australia. Results: Thirteen Australian geo-referenced population health survey data sets were identified. Information captured across surveys was inconsistent, as was the spatial granularity of respondent information. Health and demographic features most frequently captured were symptoms, signs and clinical findings from the International Statistical Classification of Diseases and Related Health Problems version 11, employment, housing, income, self-rated health and risk factors, including alcohol consumption, diet, medical treatments, physical activity and weight-related questions. Sankey diagrams were deployed online for use by public health researchers. Conclusions: Identifying the relationship between place and health in Australia is made more difficult by inconsistencies in the information collected across surveys deployed in different regions of Australia.
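    As an illustration of the interactive Sankey approach, the sketch below draws a minimal survey-to-category overlap diagram with plotly; the surveys, categories, and counts are invented.

```python
# Minimal Sankey diagram of survey-to-category content overlap; all labels
# and counts below are invented for illustration.
import plotly.graph_objects as go

labels = ["Survey A", "Survey B", "Demographics", "Risk factors", "Self-rated health"]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(
        source=[0, 0, 1, 1, 1],   # survey index
        target=[2, 3, 2, 3, 4],   # category index
        value=[5, 3, 4, 6, 2],    # number of shared items
    ),
))
fig.show()
```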

    MFR 2021: Masked Face Recognition Competition

    No full text
    This paper presents a summary of the Masked Face Recognition Competition (MFR) held within the 2021 International Joint Conference on Biometrics (IJCB 2021). The competition attracted a total of 10 participating teams with valid submissions. The affiliations of these teams are diverse, associated with academia and industry across nine different countries. These teams successfully submitted 18 valid solutions. The competition was designed to motivate solutions aiming at enhancing the face recognition accuracy of masked faces. Moreover, the competition considered the deployability of the proposed solutions by taking the compactness of the face recognition models into account. A private dataset representing a collaborative, multi-session, real-mask capture scenario was used to evaluate the submitted solutions. In comparison to one of the top-performing academic face recognition solutions, 10 out of the 18 submitted solutions scored higher masked face verification accuracy.
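    For context on the verification metric, the sketch below shows the basic face-verification decision: thresholded cosine similarity between embedding pairs. The embeddings, threshold, and pair labels are illustrative; the competition's private dataset and protocol are not reproduced.

```python
# Cosine-similarity face verification sketch; threshold and embeddings are
# illustrative, not from the competition protocol.
import numpy as np

def verify(emb_a, emb_b, threshold=0.35):
    """Declare a match if the cosine similarity of two embeddings exceeds the threshold."""
    cos = emb_a @ emb_b / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
    return cos > threshold

# Stand-in evaluation: 100 random non-match pairs of 512-d embeddings.
rng = np.random.default_rng(0)
pairs = [(rng.normal(size=512), rng.normal(size=512), False) for _ in range(100)]
correct = sum(verify(a, b) == same for a, b, same in pairs)
print(f"verification accuracy: {correct / len(pairs):.2%}")
```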