1,251 research outputs found
HurriCast: An Automatic Framework Using Machine Learning and Statistical Modeling for Hurricane Forecasting
Hurricanes present major challenges in the U.S. due to their devastating
impacts. Mitigating these risks is important, and the insurance industry is
central in this effort, using intricate statistical models for risk assessment.
However, these models often neglect key temporal and spatial hurricane patterns
and are limited by data scarcity. This study introduces a refined approach
combining the ARIMA model and K-MEANS to better capture hurricane trends, and
an Autoencoder for enhanced hurricane simulations. Our experiments show that
this hybrid methodology effectively simulates historical hurricane behaviors
while providing detailed projections of potential future trajectories and
intensities. Moreover, by leveraging a comprehensive yet selective dataset, our
simulations enrich the current understanding of hurricane patterns and offer
actionable insights for risk management strategies.
Comment: This paper includes 7 pages and 8 figures. We submitted it to the
SC23 workshop. This is only a preprint.
Data Valuation and Detections in Federated Learning
Federated Learning (FL) enables collaborative model training while preserving
the privacy of raw data. A challenge in this framework is the fair and
efficient valuation of data, which is crucial for incentivizing clients to
contribute high-quality data in the FL task. In scenarios involving numerous
data clients within FL, it is often the case that only a subset of clients and
datasets are pertinent to a specific learning task, while others might have
either a negative or negligible impact on the model training process. This
paper introduces a novel privacy-preserving method for evaluating client
contributions and selecting relevant datasets without a pre-specified training
algorithm in an FL task. Our proposed approach, FedBary, utilizes Wasserstein
distance within the federated context, offering a new solution for data
valuation in the FL framework. This method ensures transparent data valuation
and efficient computation of the Wasserstein barycenter and reduces the
dependence on validation datasets. Through extensive empirical experiments and
theoretical analyses, we demonstrate the potential of this data valuation
method as a promising avenue for FL research.
Comment: Fixed some experimental errors and typos
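As a rough illustration of the distance the FedBary abstract refers to, the sketch below computes the empirical 1-Wasserstein distance between two equal-sized one-dimensional samples and uses it to rank clients against a reference sample. This is not the FedBary method itself: it involves no federated or privacy-preserving computation and no barycenter, and the names `wasserstein_1d`, `reference`, and `clients` are hypothetical, with made-up data.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-sized 1-D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan pairs the sorted samples
    # in order, so the distance is the mean absolute gap between them.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical data: a reference sample and two client samples.
reference = [0.0, 1.0, 2.0, 3.0]
clients = {
    "client_a": [0.1, 0.9, 2.1, 3.0],  # close to the reference
    "client_b": [5.0, 6.0, 7.0, 8.0],  # shifted far from the reference
}

# A smaller distance means the client's data looks more like the reference.
distances = {name: wasserstein_1d(data, reference) for name, data in clients.items()}
ranking = sorted(distances, key=distances.get)
```

Under this toy setup, `ranking` lists the clients from most to least similar to the reference, which is one simple way a distance could feed into a relevance score.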
Towards Visually Explaining Variational Autoencoders
Recent advances in Convolutional Neural Network (CNN) model interpretability
have led to impressive progress in visualizing and understanding model
predictions. In particular, gradient-based visual attention methods have driven
much recent effort in using visual attention maps as a means for visual
explanations. A key problem, however, is that these methods are designed for
classification and categorization tasks, and their extension to explaining
generative models, e.g., variational autoencoders (VAEs), is not trivial. In this
work, we take a step towards bridging this crucial gap, proposing the first
technique to visually explain VAEs by means of gradient-based attention. We
present methods to generate visual attention from the learned latent space, and
also demonstrate that such attention explanations serve more than just explaining
VAE predictions. We show how these attention maps can be used to localize
anomalies in images, demonstrating state-of-the-art performance on the MVTec-AD
dataset. We also show how they can be infused into model training, helping
bootstrap the VAE into learning improved latent space disentanglement,
demonstrated on the dSprites dataset.
GFlowCausal: Generative Flow Networks for Causal Discovery
Causal discovery aims to uncover causal structure among a set of variables.
Score-based approaches mainly focus on searching for the best Directed Acyclic
Graph (DAG) based on a predefined score function. However, most of them are not
applicable on a large scale due to the limited searchability. Inspired by the
active learning in generative flow networks, we propose a novel approach to
learning a DAG from observational data called GFlowCausal. It converts the
graph search problem to a generation problem, in which direct edges are added
gradually. GFlowCausal aims to learn the best policy to generate high-reward
DAGs by sequential actions with probabilities proportional to predefined
rewards. We propose a plug-and-play module based on transitive closure to
ensure efficient sampling. Theoretical analysis shows that this module could
guarantee acyclicity properties effectively and the consistency between final
states and fully-connected graphs. We conduct extensive experiments on both
synthetic and real datasets, and the results show that the proposed approach is
superior and also performs well in a large-scale setting.
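The idea of adding directed edges one at a time while a transitive closure rules out cycles can be sketched as follows. This is a minimal illustration of that general technique, assuming a dense boolean reachability matrix; it is not the GFlowCausal module, and the class `DagBuilder` and its methods are hypothetical names.

```python
class DagBuilder:
    """Grows a DAG edge by edge, using a transitive closure to block cycles."""

    def __init__(self, n):
        self.n = n
        self.adj = [[False] * n for _ in range(n)]
        # reach[a][b] is True when a directed path a -> ... -> b exists.
        self.reach = [[False] * n for _ in range(n)]

    def allowed(self, u, v):
        # Adding u -> v closes a cycle exactly when v already reaches u.
        return u != v and not self.adj[u][v] and not self.reach[v][u]

    def add_edge(self, u, v):
        if not self.allowed(u, v):
            raise ValueError(f"edge {u} -> {v} is not allowed")
        self.adj[u][v] = True
        # Every node that reaches u (or is u) now reaches v and all of
        # v's descendants, so update the closure for those pairs.
        sources = [a for a in range(self.n) if a == u or self.reach[a][u]]
        targets = [b for b in range(self.n) if b == v or self.reach[v][b]]
        for a in sources:
            for b in targets:
                self.reach[a][b] = True


# Usage: build the chain 0 -> 1 -> 2, then query which edges remain legal.
g = DagBuilder(3)
g.add_edge(0, 1)
g.add_edge(1, 2)
back_edge_ok = g.allowed(2, 0)   # would close the cycle 0 -> 1 -> 2 -> 0
shortcut_ok = g.allowed(0, 2)    # keeps the graph acyclic
```

Keeping the closure up to date makes each legality check a single lookup, so a sampler can mask invalid actions without re-running a cycle search after every step.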