Defining heatwave thresholds using an inductive machine learning approach
Establishing appropriate heatwave thresholds is important for reducing adverse human health consequences, as it enables a more effective heatwave warning system and response plan. This paper defined such thresholds by focusing on the non-linear relationship between heatwave outcomes and meteorological variables as part of an inductive approach. Daily data on emergency department visitors diagnosed with heat illnesses, along with 19 meteorological variables, were obtained for the years 2011 to 2016 from relevant government agencies. A Multivariate Adaptive Regression Splines (MARS) analysis was performed to explore points (referred to as "knots") where the behaviour of the variables changed rapidly. For all emergency department visitors, two thresholds (a maximum daily temperature >= 32.58 degrees C for 2 consecutive days and a heat index >= 79.64) were selected based on the dramatic rise in morbidity at these points. However, some visitors, including children and outdoor workers diagnosed in the early summer season, were sensitive to heatwaves at lower thresholds. The average daytime temperature (from noon to 6 PM) was determined to be an alternative threshold for heatwaves. The findings have implications for exploring complex heatwave-morbidity relationships and for developing appropriate intervention strategies to prevent and mitigate the health impacts of heatwaves.
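For illustration, here is a minimal sketch of the knot-finding idea behind MARS: fit a single hinge basis function max(0, x - t) over candidate knots t and keep the knot that best explains the response. The full MARS procedure adds many such basis functions greedily; the data below are hypothetical stand-ins for the temperature-morbidity relationship, not the study's data.

```python
import numpy as np

def best_knot(x, y, n_candidates=50):
    """Search for the hinge 'knot' where the response slope changes,
    by least-squares fitting y ~ [1, x, max(0, x - t)] over candidate t."""
    candidates = np.quantile(x, np.linspace(0.05, 0.95, n_candidates))
    best_t, best_sse = None, np.inf
    for t in candidates:
        X = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - t)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = np.sum((y - X @ beta) ** 2)
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t

# Hypothetical daily data: max temperature vs. heat-illness ED visits.
rng = np.random.default_rng(0)
temp = rng.uniform(20, 40, 1000)
visits = np.where(temp > 32.6, (temp - 32.6) * 3, 0) + rng.normal(0, 1, 1000)
print(f"estimated knot: {best_knot(temp, visits):.2f} degrees C")
```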
TLDR: Text Based Last-layer Retraining for Debiasing Image Classifiers
A classifier may depend on incidental features stemming from a strong correlation between the feature and the classification target in the training dataset. Last Layer Retraining (LLR) with group-balanced datasets has recently been shown to be effective in mitigating such spurious correlations. However, acquiring group-balanced datasets is costly, which hinders the applicability of LLR. In this work, we propose to perform LLR for a general image classifier using text datasets built with large language models. We demonstrate that text can serve as a proxy for its corresponding image beyond an image-text joint embedding space such as CLIP. Based on this, we use generated texts to train the final layer in the embedding space of an arbitrary image classifier. In addition, we propose a method for filtering the generated words to remove noisy, imprecise ones, which reduces the effort of inspecting each word. We dub this procedure TLDR (Text-based Last-layer retraining for Debiasing image classifieRs) and show that it matches the performance of LLR methods that retrain on group-balanced image datasets. Furthermore, TLDR outperforms other baselines that train the last linear layer without a group-annotated dataset. Comment: 19 pages, Under Review
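A minimal sketch of the core idea, assuming embeddings from a CLIP-style joint space: retrain only the classifier head on text embeddings, then apply it directly to image embeddings. The random arrays below are hypothetical stand-ins for real encoder outputs and illustrate only the pipeline shape, not the paper's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-ins: in practice these would come from a CLIP-style
# encoder pair, e.g. text_emb = clip.encode_text(generated_captions).
rng = np.random.default_rng(0)
d = 512
text_emb = rng.normal(size=(400, d))    # embeddings of LLM-generated texts
text_labels = rng.integers(0, 2, 400)   # class labels, group-balanced by construction
image_emb = rng.normal(size=(100, d))   # embeddings of test images

# Retrain only the final linear layer, using text as training data...
head = LogisticRegression(max_iter=1000).fit(text_emb, text_labels)

# ...then apply it to image embeddings: text acts as a proxy for images
# because both modalities live in the same joint embedding space.
preds = head.predict(image_emb)
```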
Classical-to-quantum convolutional neural network transfer learning
Machine learning using quantum convolutional neural networks (QCNNs) has demonstrated success in both quantum and classical data classification. In previous studies, QCNNs attained higher classification accuracy than their classical counterparts under the same training conditions in the few-parameter regime. However, the general performance of large-scale quantum models is difficult to examine because of the limited size of quantum circuits that can be reliably implemented in the near future. We propose transfer learning as an effective strategy for utilizing small QCNNs in the noisy intermediate-scale quantum era to the full extent. In the classical-to-quantum transfer learning framework, a QCNN can solve complex classification problems without requiring a large-scale quantum circuit by utilizing a pre-trained classical convolutional neural network (CNN). We perform numerical simulations of QCNN models with various sets of quantum convolution and pooling operations for MNIST data classification under transfer learning, in which a classical CNN is trained on Fashion-MNIST data. The results show that classical-to-quantum transfer learning performs considerably better than purely classical transfer learning models under similar training conditions. Comment: 16 pages, 7 figures
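A toy sketch of the classical-to-quantum setup, written with PennyLane (an assumption; the paper's exact ansätze and framework may differ): features from a pre-trained classical CNN are angle-encoded into qubits, followed by simplified convolution and pooling blocks.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qcnn(features, weights):
    # Encode classical features (hypothetically, the truncated output of
    # a CNN pre-trained on Fashion-MNIST) as rotation angles.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    # "Convolution": parameterized blocks on neighboring qubit pairs.
    for i in range(0, n_qubits - 1, 2):
        qml.RY(weights[i], wires=i)
        qml.RY(weights[i + 1], wires=i + 1)
        qml.CNOT(wires=[i, i + 1])
    # "Pooling": entangle, then read out only a subset of qubits.
    qml.CNOT(wires=[1, 0])
    qml.CNOT(wires=[3, 2])
    return qml.expval(qml.PauliZ(0))

features = np.array([0.1, 0.5, -0.3, 0.8])   # stand-in CNN features
weights = np.random.uniform(0, np.pi, n_qubits)
print(qcnn(features, weights))
```

In a full pipeline, the stand-in feature vector would be replaced by the truncated output of the Fashion-MNIST-trained CNN, and the circuit weights would be optimized against MNIST labels.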
Exploring Environmental Inequity in South Korea: An Analysis of the Distribution of Toxic Release Inventory (TRI) Facilities and Toxic Releases
Recently, location data for the Toxic Release Inventory (TRI) in South Korea were released to the public. This study investigated the spatial patterns of TRI facilities and releases of toxic substances across all 230 local governments in South Korea to determine whether spatial clusters relevant to the siting of noxious facilities occur. In addition, we employed spatial regression modeling to determine whether the number of TRI facilities and the volume of toxic releases in a given community were correlated with the community's socioeconomic, racial, political, and land use characteristics. We found that TRI facilities and their toxic releases were disproportionately distributed, with clustered spatial patterning. Spatial regression modeling indicated that jurisdictions with smaller percentages of minorities, stronger political activity, less industrial land use, and more commercial land use had smaller volumes of toxic releases as well as fewer TRI facilities. However, the economic status of a community did not affect the siting of hazardous facilities. These results indicate that the siting of TRI facilities in Korea is affected more by sociopolitical factors than by economic status. Racial issues are thus crucial considerations for environmental justice as the population of Korea becomes more racially and ethnically diverse.
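As an illustration of the kind of cluster statistic such an analysis involves, global Moran's I quantifies whether values cluster spatially; the toy contiguity weights and facility counts below are hypothetical, not the study's data.

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I for values y and spatial weight matrix W
    (W[i, j] > 0 when regions i and j are neighbors)."""
    z = y - y.mean()
    s0 = W.sum()
    return (len(y) / s0) * (z @ W @ z) / (z @ z)

# Hypothetical toy example: 4 regions on a line, contiguity weights.
y = np.array([10.0, 8.0, 1.0, 0.5])          # e.g. TRI facility counts
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(f"Moran's I: {morans_i(y, W):.3f}")    # > 0 suggests clustering
```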
A deep learning approach for the comparison of handwritten documents using latent feature vectors
Forensic questioned document examiners still largely rely on visual assessments and expert judgment to determine the provenance of a handwritten document. Here, we propose a novel approach to objectively compare two handwritten documents using a deep learning algorithm. First, we implement a bootstrapping technique to segment document data into smaller units, as a means to enhance the efficiency of the deep learning process. Next, we use a transfer learning algorithm to systematically extract document features. The unique characteristics of the document data are then represented as latent vectors. Finally, the similarity between two handwritten documents is quantified via the cosine similarity between their latent vectors. We illustrate the use of the proposed method by applying it to a variety of collections of handwritten documents with different attributes, and show that in most cases we can accurately classify pairs of documents into same-author or different-author categories. This article is published as J. Kim, S. Park, and A. Carriquiry, A deep learning approach for the comparison of handwritten documents using latent feature vectors, Stat. Anal. Data Min.: ASA Data Sci. J. 17 (2024), e11660. https://doi.org/10.1002/sam.11660. © 2024 The Authors. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
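A minimal sketch of the final comparison step, with hypothetical 128-dimensional latent vectors standing in for features extracted from bootstrapped document segments:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two latent feature vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical latent vectors, e.g. produced by a transfer-learned CNN
# from bootstrapped segments of two handwritten documents.
doc_a = np.random.default_rng(1).normal(size=128)
doc_b = doc_a + np.random.default_rng(2).normal(scale=0.3, size=128)

score = cosine_similarity(doc_a, doc_b)
print(f"similarity: {score:.3f}")  # a high score suggests the same writer
```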
Hierarchical Joint Graph Learning and Multivariate Time Series Forecasting
Multivariate time series are prevalent in many scientific and industrial domains. Modeling multivariate signals is challenging due to their long-range temporal dependencies and intricate interactions, both direct and indirect. To confront these complexities, we introduce a method of representing multivariate signals as nodes in a graph, with edges indicating the interdependencies between them. Specifically, we leverage graph neural networks (GNNs) and attention mechanisms to efficiently learn the underlying relationships within the time series data. Moreover, we employ hierarchical signal decompositions running over the graphs to capture multiple spatial dependencies. The effectiveness of our proposed model is evaluated across various real-world benchmark datasets designed for long-term forecasting tasks. The results consistently showcase the superiority of our model, which achieves an average 23% reduction in mean squared error (MSE) compared to existing models. Comment: Temporal Graph Learning Workshop @ NeurIPS 2023, New Orleans, United States
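A toy sketch (not the paper's architecture) of the central idea: treat each series as a node and let attention scores act as a learned soft adjacency matrix for message passing. All sizes and layer choices here are assumptions.

```python
import torch
import torch.nn as nn

class SeriesGraphAttention(nn.Module):
    """Toy sketch: each series is a graph node; pairwise attention weights
    serve as a soft, data-driven adjacency matrix."""
    def __init__(self, n_series, window, d_model=32):
        super().__init__()
        self.embed = nn.Linear(window, d_model)   # per-node temporal encoder
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, 1)          # one-step-ahead forecast

    def forward(self, x):                         # x: (n_series, window)
        h = self.embed(x)                         # node embeddings
        scores = self.query(h) @ self.key(h).T    # pairwise interdependency
        adj = torch.softmax(scores / h.size(-1) ** 0.5, dim=-1)
        h = adj @ h                               # message passing over graph
        return self.out(h).squeeze(-1)            # forecast per series

model = SeriesGraphAttention(n_series=7, window=96)
x = torch.randn(7, 96)                            # hypothetical history
print(model(x).shape)                             # torch.Size([7])
```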
A study on the consumer's perception of front-of-pack nutrition labeling
The goal of this research is to investigate the present situation of front-of-pack labeling in Korea and consumers' perception of this new labeling system, based on a consumer survey. We counted the number of processed foods with front-of-pack labeling at one retailer in Yongin-si, and we surveyed 1,019 participants nationwide aged 20 to 49 about their knowledge of nutrition labeling, their knowledge of front-of-pack labeling, and their opinions about the labeling system. The data were analyzed using the SAS statistics program. The results were as follows: 13.4% of processed foods had front-of-pack labeling, and 16.8% of consumers always checked nutrition labeling, while 32.7% seldom checked it. In addition, 44.3% of consumers think that front-of-pack labeling is necessary, and 58.3% think it is important to show the percentage of daily value as part of front-of-pack labeling. However, 32% of consumers think the prospects for front-of-pack labeling are slim. Meanwhile, 58.3% of consumers think it is important to color-code the label according to its contents. The preferred number of nutrients on the front of the pack was four or five. Recognition of the current nutrition labeling appears to influence willingness to use the future front-of-pack labeling. In light of these findings, the policy for front-of-pack labeling should be updated and improved continually, since front-of-pack labeling helps consumers understand nutrition facts.
Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors
Test-time adaptation (TTA) fine-tunes pre-trained deep neural networks on unseen test data. The primary challenge of TTA is limited access to the entire test dataset during online updates, causing error accumulation. To mitigate it, TTA methods have used the entropy of the model's output as a confidence metric intended to determine which samples have a lower likelihood of causing error. Through experimental studies, however, we observed that entropy is unreliable as a confidence metric for TTA under biased scenarios, and we theoretically revealed that this stems from neglecting the influence of latent disentangled factors of the data on predictions. Building upon these findings, we introduce a novel TTA method named Destroy Your Object (DeYO), which leverages a newly proposed confidence metric named Pseudo-Label Probability Difference (PLPD). PLPD quantifies the influence of an object's shape on the prediction by measuring the difference between predictions before and after applying an object-destructive transformation. DeYO consists of sample selection and sample weighting, which employ entropy and PLPD concurrently. For robust adaptation, DeYO prioritizes samples that rely predominantly on shape information when making predictions. Our extensive experiments demonstrate the consistent superiority of DeYO over baseline methods across various scenarios, including biased and wild ones. The project page is publicly available at https://whitesnowdrop.github.io/DeYO/. Comment: ICLR 2024 Spotlight; 26 pages, 9 figures, 20 tables
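A hedged sketch of how a PLPD-style score could be computed, assuming patch shuffling as the object-destructive transformation (one plausible choice; the paper's exact transformation and selection rules may differ):

```python
import torch
import torch.nn.functional as F

def plpd(model, x, patch=56):
    """Sketch of Pseudo-Label Probability Difference: how much the
    probability of the pseudo-label drops after an object-destructive
    transformation (here, shuffling square image patches)."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
        pseudo = probs.argmax(dim=-1)

        # Destroy global object shape by permuting patches.
        b, c, h, w = x.shape
        patches = x.unfold(2, patch, patch).unfold(3, patch, patch)
        patches = patches.reshape(b, c, -1, patch, patch)
        perm = torch.randperm(patches.size(2))
        grid = h // patch
        shuffled = patches[:, :, perm].reshape(b, c, grid, grid, patch, patch)
        x_destroyed = shuffled.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

        probs_d = F.softmax(model(x_destroyed), dim=-1)
        idx = torch.arange(b)
        return probs[idx, pseudo] - probs_d[idx, pseudo]
```

Samples with a large drop (high PLPD) rely on object shape rather than spurious cues, so a DeYO-style scheme would prioritize them alongside low-entropy samples.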
Can We Utilize Pre-trained Language Models within Causal Discovery Algorithms?
Scaling laws have brought Pre-trained Language Models (PLMs) into the field of causal reasoning. Causal reasoning with a PLM relies solely on text-based descriptions, in contrast to causal discovery, which aims to determine causal relationships between variables from data. Recent research has explored a method that mimics causal discovery by aggregating the outcomes of repeated causal reasoning, achieved through specifically designed prompts. It highlights the usefulness of PLMs for discovering cause and effect, which is often limited by a lack of data, especially when dealing with multiple variables. However, PLMs do not analyze data and are highly dependent on prompt design, which poses a crucial limitation on using them directly for causal discovery. Accordingly, PLM-based causal reasoning depends deeply on the prompt design and carries the risk of overconfidence and false predictions in determining causal relationships. In this paper, we empirically demonstrate these limitations of PLM-based causal reasoning through experiments on physics-inspired synthetic data. We then propose a new framework that integrates prior knowledge obtained from a PLM with a causal discovery algorithm. This is accomplished by initializing the adjacency matrix for causal discovery with the PLM-derived prior knowledge and regularizing toward it. Our proposed framework not only demonstrates improved performance through the integration of PLMs and causal discovery but also suggests how to leverage PLM-extracted prior knowledge with existing causal discovery algorithms.
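A minimal sketch of the integration idea under stated assumptions: a NOTEARS-style continuous causal discovery objective whose adjacency matrix is initialized at, and regularized toward, a PLM-derived prior. The data, prior, and penalty weights below are all hypothetical.

```python
import torch

# Hypothetical data: n samples of d variables.
torch.manual_seed(0)
n, d = 500, 4
X = torch.randn(n, d)

# Hypothetical prior adjacency extracted from PLM causal reasoning:
# entry [i, j] = 1 means the PLM judged "variable i causes variable j".
A_prior = torch.tensor([[0., 1., 0., 0.],
                        [0., 0., 1., 0.],
                        [0., 0., 0., 1.],
                        [0., 0., 0., 0.]])

# Initialize the learned adjacency at the PLM prior.
A = A_prior.clone().requires_grad_(True)
opt = torch.optim.Adam([A], lr=0.01)

for _ in range(500):
    opt.zero_grad()
    fit = ((X - X @ A) ** 2).mean()                    # linear SEM residual
    acyc = torch.trace(torch.matrix_exp(A * A)) - d    # NOTEARS-style acyclicity penalty
    prior_reg = ((A - A_prior) ** 2).sum()             # stay near the PLM prior
    loss = fit + 10.0 * acyc + 0.1 * prior_reg         # hypothetical weights
    loss.backward()
    opt.step()

print(A.detach())  # estimated weighted adjacency matrix
```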