10 research outputs found
FireRisk: A Remote Sensing Dataset for Fire Risk Assessment with Benchmarks Using Supervised and Self-supervised Learning
In recent decades, wildfires, as widespread and extremely destructive natural
disasters, have caused tremendous property losses and fatalities, as well as
extensive damage to forest ecosystems. Many fire risk assessment projects have
been proposed to prevent wildfires, but GIS-based methods are inherently
challenging to scale to different geographic areas due to variations in data
collection and local conditions. Inspired by the abundance of publicly
available remote sensing projects and the burgeoning development of deep
learning in computer vision, our research focuses on assessing fire risk using
remote sensing imagery.
In this work, we propose a novel remote sensing dataset, FireRisk, consisting
of 7 fire risk classes with a total of 91872 labelled images for fire risk
assessment. This remote sensing dataset is labelled with the fire risk classes
supplied by the Wildfire Hazard Potential (WHP) raster dataset, and remote
sensing images are collected using the National Agriculture Imagery Program
(NAIP), a high-resolution remote sensing imagery program. On FireRisk, we
present benchmark performance for supervised and self-supervised
representations, with Masked Autoencoders (MAE) pre-trained on ImageNet1k
achieving the highest classification accuracy, 65.29%.
This remote sensing dataset, FireRisk, provides a new direction for fire risk
assessment, and we make it publicly available on
https://github.com/CharmonyShen/FireRisk.
Comment: 10 pages, 6 figures, 1 table, 1 equation
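The benchmark above reports a single top-1 accuracy over 7 fire risk classes. A minimal sketch of how overall and per-class accuracy might be tallied for such a benchmark (the class names below are illustrative, not necessarily the dataset's actual labels):

```python
# Sketch: overall and per-class accuracy for a 7-class fire-risk benchmark.
# Class names are hypothetical placeholders, not FireRisk's official labels.
CLASSES = ["very_low", "low", "moderate", "high", "very_high",
           "non_burnable", "water"]

def accuracy(y_true, y_pred):
    """Top-1 accuracy over paired label lists."""
    assert len(y_true) == len(y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_accuracy(y_true, y_pred):
    """Accuracy restricted to each ground-truth class (recall per class)."""
    out = {}
    for c in CLASSES:
        idx = [i for i, t in enumerate(y_true) if t == c]
        if idx:
            out[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return out
```

Per-class figures matter here because fire-risk classes are unlikely to be balanced, so a single accuracy number can hide weak minority-class performance.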
DALLE-URBAN: Capturing the urban design expertise of large text to image transformers
Automatically converting text descriptions into images using transformer
architectures has recently received considerable attention. Such advances have
implications for many applied design disciplines across fashion, art,
architecture, urban planning, landscape design and the future tools available
to such disciplines. However, a detailed analysis capturing the capabilities of
such models, specifically with a focus on the built environment, has not been
performed to date. In this work, we investigate in detail the capabilities
and biases of such text-to-image methods as they apply to the built
environment. We
use a systematic grammar to generate queries related to the built environment
and evaluate resulting generated images. We generate 1020 different images and
find that text to image transformers are robust at generating realistic images
across different domains for this use-case. Generated imagery can be found at
the GitHub repository: https://github.com/sachith500/DALLEURBAN
Comment: Accepted to DICTA 2022; released 11,000+ environmental scene images
generated by Stable Diffusion and 1,000+ images generated by DALLE-
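The "systematic grammar" used to generate built-environment queries can be sketched as a cross-product over prompt slots. The slot names and values below are assumptions for illustration; the paper's actual grammar terms are not given in the abstract:

```python
# Sketch: a systematic grammar for built-environment prompts.
# Slot values are hypothetical; the study's real grammar is not shown here.
from itertools import product

GRAMMAR = {
    "style":   ["modern", "brutalist", "art deco"],
    "subject": ["apartment building", "public park", "transit station"],
    "setting": ["in a dense city center", "in a coastal suburb"],
}

def generate_queries(grammar):
    """Expand every combination of slot values into a prompt string."""
    keys = list(grammar)
    return [
        "a photo of a {} {} {}".format(*values)
        for values in product(*(grammar[k] for k in keys))
    ]

queries = generate_queries(GRAMMAR)
```

Enumerating the full cross-product (here 3 x 3 x 2 = 18 prompts) is what makes the evaluation systematic rather than ad hoc: every style appears with every subject and setting.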
GRS-CRC-201707-team4-documents
2017 Graduate Research Students' Day - Collaborative Research Challenge - Team 4
Semantic Segmentation using Vision Transformers: A survey
Semantic segmentation has a broad range of applications in a variety of
domains including land coverage analysis, autonomous driving, and medical image
analysis. Convolutional neural networks (CNNs) and Vision Transformers
(ViTs) are the principal architecture families used for semantic
segmentation. Even though ViTs have proven successful in image
classification, they cannot be directly applied to dense prediction tasks
such as image segmentation and object detection, since a plain ViT is not a
general-purpose backbone owing to its patch partitioning scheme. In this
survey, we discuss several ViT architectures that can be used for semantic
segmentation and how their evolution has addressed this challenge. The rise
of ViTs and their strong performance have motivated the community to
gradually replace traditional convolutional neural networks in various
computer vision tasks. This survey aims to review and compare the
performance of ViT architectures designed for semantic segmentation on
benchmark datasets. This should help the community build knowledge of
existing implementations for semantic segmentation and discover more
efficient ViT-based methodologies.
Comment: 35 pages, 13 figures, 2 tables
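The patch partitioning scheme the survey identifies as the obstacle to dense prediction can be sketched concretely: a plain ViT tokenizes an image into a fixed grid of non-overlapping tiles, so its native output resolution is (H/P) x (W/P) tokens rather than per-pixel predictions. A minimal illustration on a 2-D grid:

```python
# Sketch: non-overlapping patch partitioning as in a plain ViT.
# For an H x W image and patch size P, the token sequence has
# (H // P) * (W // P) entries -- a coarse grid, which is why dense
# (per-pixel) prediction needs extra decoder machinery on top.

def partition_patches(image, patch_size):
    """Split a 2-D grid (list of lists) into patch_size x patch_size tiles,
    scanned left-to-right, top-to-bottom."""
    h, w = len(image), len(image[0])
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patches.append(
                [row[left:left + patch_size]
                 for row in image[top:top + patch_size]]
            )
    return patches
```

An 8 x 8 input with patch size 4 yields only 4 tokens; hierarchical ViT variants recover finer resolution by re-merging or progressively shrinking patches.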
SigRep: Towards Robust Wearable Emotion Recognition with Contrastive Representation Learning
Extracting emotions from physiological signals has become popular over the past decade. Recent advancements in wearable smart devices have enabled capturing physiological signals continuously and unobtrusively. However, signal readings from different smart wearables are lossy due to user activities, making it difficult to develop robust models for emotion recognition. Also, the limited availability of data labels is an inherent challenge for developing machine learning techniques for emotion classification. This paper presents a novel self-supervised approach inspired by contrastive learning to address the above challenges. In particular, our proposed approach develops a method to learn representations of individual physiological signals, which can be used for downstream classification tasks. Our evaluation with four publicly available datasets shows that the proposed method surpasses the emotion recognition performance of state-of-the-art techniques for emotion classification. In addition, we show that our method is more robust to losses in the input signal.
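The contrastive representation objective described above can be sketched with an NT-Xent-style loss (the SimCLR formulation). The abstract does not specify SigRep's exact loss or augmentations, so this is an illustrative stand-in: two augmented views of the same signal form a positive pair, and all other samples in the batch act as negatives.

```python
# Sketch: NT-Xent contrastive loss over a batch of paired embeddings.
# embeddings[2i] and embeddings[2i + 1] are the two views of sample i.
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent_loss(embeddings, temperature=0.5):
    """Average -log of the softmax probability of each anchor's positive
    partner against all other batch entries."""
    n = len(embeddings)
    loss = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1   # index of i's positive partner
        denom = sum(math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                    for k in range(n) if k != i)
        pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
        loss += -math.log(pos / denom)
    return loss / n
```

Minimizing this pulls the two views of each signal together and pushes apart views of different signals, which is what makes the learned representation useful for downstream emotion classification without labels.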
Association between network characteristics and bicycle ridership across a large metropolitan region
Background: Numerous studies have explored associations between bicycle network characteristics and bicycle ridership. However, the majority of these studies have been conducted in inner metropolitan regions and, as such, there is limited knowledge on how various characteristics of bicycle networks relate to bicycle trips within and across entire metropolitan regions, and how the size and composition of study regions impact the association between bicycle network characteristics and bicycle ridership.
Methods: We conducted a retrospective analysis of household travel survey data and bicycle infrastructure in the Greater Melbourne region, Australia. Seven network metrics were calculated and Bayesian spatial models were used to explore the association between these network characteristics and bicycle ridership (measured as counts of the number of trips, and the proportion of all trips that were made by bike).
Results: We demonstrated that bicycle ridership was associated with several network characteristics, and that these characteristics varied according to the outcome (count of the number of trips made by bike or the proportion of trips made by bike) and the size and characteristics of the study region.
Conclusions: These findings challenge the utility of approaches based on spatially modelling network characteristics and bicycle ridership when informing the monitoring and evaluation of bicycle networks. There is a need to progress the science of measuring safe and connected bicycle networks for people of all ages and abilities.
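The abstract does not name the seven network metrics, but two classic planar-network indices often used to characterise transport networks give a flavour of what such metrics compute. These are illustrative examples, not necessarily the study's actual measures:

```python
# Sketch: two standard planar-network indices (Kansky-style) sometimes used
# to characterise bicycle networks. Illustrative only -- the paper's seven
# metrics are not listed in the abstract.

def beta_index(nodes, edges):
    """Links per node, E / N. Higher values indicate a denser network."""
    return len(edges) / len(nodes)

def gamma_index(nodes, edges):
    """Observed links over the maximum possible in a planar graph,
    E / (3 * (N - 2)). Ranges from 0 to 1 for N >= 3."""
    return len(edges) / (3 * (len(nodes) - 2))
```

Metrics like these would be computed per spatial unit and then entered as covariates in the Bayesian spatial models of trip counts and mode share.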
Undercover Deepfakes: Detecting Fake Segments in Videos
The recent renaissance in generative models, driven primarily by the advent
of diffusion models and iterative improvement in GAN methods, has enabled many
creative applications. However, each advancement is also accompanied by a rise
in the potential for misuse. In the arena of deepfake generation this is a key
societal issue. In particular, the ability to modify segments of videos using
such generative techniques creates a new paradigm of deepfakes which are mostly
real videos altered slightly to distort the truth. Current deepfake detection
methods in the academic literature are not evaluated on this paradigm. In this
paper, we present a deepfake detection method able to address this issue by
performing both frame and video level deepfake prediction. To facilitate
testing our method we create a new benchmark dataset where videos have both
real and fake frame sequences. Our method utilizes the Vision Transformer,
Scaling and Shifting pretraining and Timeseries Transformer to temporally
segment videos to help facilitate the interpretation of possible deepfakes.
Extensive experiments on a variety of deepfake generation methods show
excellent results on temporal segmentation and classical video level
predictions as well. In particular, the paradigm we introduce will form a
powerful tool for the moderation of deepfakes, where human oversight can be
better targeted to the parts of videos suspected of being deepfakes. All
experiments can be reproduced at:
https://github.com/sanjaysaha1311/temporal-deepfake-segmentation
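The frame-and-video-level prediction scheme above implies a post-processing step: collapsing per-frame fake/real decisions into contiguous temporal segments for human review. A minimal sketch of that step (the function name and binary-prediction input format are assumptions, not the paper's API):

```python
# Sketch: collapse per-frame deepfake predictions (1 = fake, 0 = real)
# into (start, end) frame-index ranges of contiguous fake segments.
# end is inclusive.

def fake_segments(frame_preds):
    """Return the list of contiguous runs of fake frames as index ranges."""
    segments, start = [], None
    for i, p in enumerate(frame_preds):
        if p == 1 and start is None:
            start = i                       # a fake run begins
        elif p == 0 and start is not None:
            segments.append((start, i - 1)) # the run just ended
            start = None
    if start is not None:                   # run extends to the last frame
        segments.append((start, len(frame_preds) - 1))
    return segments
```

Output like `[(120, 180)]` is what lets a moderator jump straight to the suspected span instead of reviewing the whole video.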
Classification of Fracture Risk in Fallers Using Dual-Energy X-Ray Absorptiometry (DXA) Images and Deep Learning-Based Feature Extraction
Abstract Dual-energy X-ray absorptiometry (DXA) scans are one of the most frequently used imaging techniques for calculating bone mineral density, yet calculating fracture risk using DXA image features is rarely performed. The objective of this study was to combine deep neural networks, together with DXA images and patient clinical information, to evaluate fracture risk in a cohort of adults with at least one known fall and age-matched healthy controls. DXA images of the entire body, as well as isolated images of the hip, forearm, and spine (1488 total), were obtained from 478 fallers and 48 non-faller controls. A modeling pipeline was developed for fracture risk prediction using the DXA images and clinical data. First, self-supervised pretraining of feature extractors was performed using a small vision transformer (ViT-S) and convolutional neural network models (VGG-16 and ResNet-50). After pretraining, the feature extractors were paired with a multilayer perceptron model, which was used for fracture risk classification. Classification was achieved with an average area under the receiver-operating characteristic curve (AUROC) score of 74.3%. This study demonstrates ViT-S as a promising neural network technique for fracture risk classification using DXA scans. The findings have future application as a fracture risk screening tool for older adults at risk of falls. © 2023 The Authors. JBMR Plus published by Wiley Periodicals LLC on behalf of the American Society for Bone and Mineral Research.
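The AUROC metric reported above (74.3%) has a simple rank-based interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative. A minimal sketch via the Mann-Whitney formulation:

```python
# Sketch: AUROC from the Mann-Whitney U statistic -- the fraction of
# (positive, negative) score pairs where the positive outscores the
# negative, counting ties as half.

def auroc(labels, scores):
    """labels: 1 = positive, 0 = negative; scores: classifier outputs."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This pairwise form is exactly equivalent to the area under the ROC curve, and it makes clear why AUROC is insensitive to class imbalance, which matters in a cohort of 478 fallers versus 48 controls.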
A comparison of content from across contemporary Australian population health surveys
Objective: Associations between place and population health are of interest to researchers and policymakers. The objective of this paper is to explore, summarise and compare content across contemporary Australian geo-referenced population health survey data sets.
Methods: A search for recent (2015 or later) population health surveys from within Australia containing geographic information from participants was conducted. Survey response frames were analysed and categorised based on demographic, risk factor and disease-related characteristics. Analysis using interactive Sankey diagrams shows the extent of content overlap and differences between population health surveys in Australia.
Results: Thirteen Australian geo-referenced population health survey data sets were identified. Information captured across surveys was inconsistent, as was the spatial granularity of respondent information. The health and demographic features most frequently captured were symptoms, signs and clinical findings from the International Statistical Classification of Diseases and Related Health Problems version 11, employment, housing, income, self-rated health and risk factors, including alcohol consumption, diet, medical treatments, physical activity and weight-related questions. The Sankey diagrams were deployed online for use by public health researchers.
Conclusions: Identifying the relationship between place and health in Australia is made more difficult by inconsistencies in the information collected across surveys deployed in different regions of Australia.
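The content-overlap comparison feeding the Sankey diagrams can be sketched as a pairwise set-overlap computation over each survey's item set. The survey names and items below are hypothetical; the study's actual categorisation scheme is richer:

```python
# Sketch: pairwise content overlap (Jaccard) between surveys' item sets,
# the kind of quantity a Sankey/overlap visualisation would be built from.
# Survey names and items are illustrative placeholders.

def jaccard(a, b):
    """Jaccard index: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def pairwise_overlap(surveys):
    """Overlap for every unordered pair of named surveys."""
    names = list(surveys)
    return {
        (x, y): jaccard(surveys[x], surveys[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }
```

Low pairwise overlap across the thirteen identified data sets is precisely the inconsistency the Conclusions flag as an obstacle to place-based health research.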
MFR 2021: Masked Face Recognition Competition
This paper presents a summary of the Masked Face Recognition Competitions (MFR) held within the 2021 International Joint Conference on Biometrics (IJCB 2021). The competition attracted a total of 10 participating teams with valid submissions. The affiliations of these teams are diverse and associated with academia and industry in nine different countries. These teams successfully submitted 18 valid solutions. The competition is designed to motivate solutions aiming at enhancing the face recognition accuracy of masked faces. Moreover, the competition considered the deployability of the proposed solutions by taking the compactness of the face recognition models into account. A private dataset representing a collaborative, multi-session, real masked capture scenario is used to evaluate the submitted solutions. In comparison to one of the top-performing academic face recognition solutions, 10 of the 18 submitted solutions scored higher masked face verification accuracy.
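The "masked face verification accuracy" compared above is, in the usual 1:1 verification setup, the fraction of genuine/impostor pairs decided correctly at a chosen similarity threshold. A minimal sketch of that evaluation (the input format is an assumption for illustration):

```python
# Sketch: 1:1 face verification accuracy at a fixed similarity threshold.
# pairs: (similarity_score, is_same_identity) tuples; format is hypothetical.

def verification_accuracy(pairs, threshold):
    """A pair is accepted as a match when similarity >= threshold;
    returns the fraction of pairs decided correctly."""
    correct = sum((sim >= threshold) == same for sim, same in pairs)
    return correct / len(pairs)
```

In practice the threshold is fixed on a calibration set (often at a target false match rate) before being applied to the masked-face test pairs.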