Defining heatwave thresholds using an inductive machine learning approach
Establishing appropriate heatwave thresholds is important for reducing adverse human health consequences, as it enables a more effective heatwave warning system and response plan. This paper defined such thresholds by focusing on the non-linear relationship between heatwave outcomes and meteorological variables as part of an inductive approach. Daily data on emergency department visitors diagnosed with heat illnesses, along with 19 meteorological variables, were obtained for the years 2011 to 2016 from relevant government agencies. A Multivariate Adaptive Regression Splines (MARS) analysis was performed to explore points (referred to as "knots") where the behaviour of the variables changed rapidly. For all emergency department visitors, two thresholds (a maximum daily temperature >= 32.58 degrees C for 2 consecutive days and a heat index >= 79.64) were selected based on the dramatic rise in morbidity at these points. However, some visitors, including children and outdoor workers diagnosed in the early summer season, were sensitive to heatwaves at lower thresholds. The average daytime temperature (from noon to 6 PM) was determined to be an alternative threshold for heatwaves. The findings have implications for exploring complex heatwave-morbidity relationships and for developing appropriate intervention strategies to prevent and mitigate the health impacts of heatwaves.
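For illustration, here is a minimal sketch of the knot-finding idea behind MARS: fit a single hinge basis function max(0, x - t) over candidate knots t and keep the knot that best explains the response. The full MARS procedure adds many such basis functions greedily; the data below are hypothetical stand-ins for the temperature-morbidity relationship, not the study's data.

```python
import numpy as np

def best_knot(x, y, n_candidates=50):
    """Search for the hinge 'knot' where the response slope changes,
    by least-squares fitting y ~ [1, x, max(0, x - t)] over candidate t."""
    candidates = np.quantile(x, np.linspace(0.05, 0.95, n_candidates))
    best_t, best_sse = None, np.inf
    for t in candidates:
        X = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - t)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = np.sum((y - X @ beta) ** 2)
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t

# Hypothetical daily data: max temperature vs. heat-illness ED visits.
rng = np.random.default_rng(0)
temp = rng.uniform(20, 40, 1000)
visits = np.where(temp > 32.6, (temp - 32.6) * 3, 0) + rng.normal(0, 1, 1000)
print(f"estimated knot: {best_knot(temp, visits):.2f} degrees C")
```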
TLDR: Text Based Last-layer Retraining for Debiasing Image Classifiers
A classifier may depend on incidental features stemming from a strong correlation between the feature and the classification target in the training dataset. Last Layer Retraining (LLR) with group-balanced datasets has recently been shown to be effective in mitigating such spurious correlations. However, acquiring group-balanced datasets is costly, which hinders the applicability of LLR. In this work, we propose to perform LLR for a general image classifier using text datasets built with large language models. We demonstrate that text can serve as a proxy for its corresponding image beyond an image-text joint embedding space such as CLIP. Based on this, we use generated texts to train the final layer in the embedding space of an arbitrary image classifier. In addition, we propose a method for filtering the generated words to remove noisy, imprecise ones, which reduces the effort of inspecting each word. We dub this procedure TLDR (Text-based Last-layer retraining for Debiasing image classifieRs) and show that it matches the performance of LLR methods that retrain on group-balanced image datasets. Furthermore, TLDR outperforms other baselines that train the last linear layer without a group-annotated dataset. Comment: 19 pages, Under Review
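A minimal sketch of the core idea, assuming embeddings from a CLIP-style joint space: retrain only the classifier head on text embeddings, then apply it directly to image embeddings. The random arrays below are hypothetical stand-ins for real encoder outputs and illustrate only the pipeline shape, not the paper's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-ins: in practice these would come from a CLIP-style
# encoder pair, e.g. text_emb = clip.encode_text(generated_captions).
rng = np.random.default_rng(0)
d = 512
text_emb = rng.normal(size=(400, d))    # embeddings of LLM-generated texts
text_labels = rng.integers(0, 2, 400)   # class labels, group-balanced by construction
image_emb = rng.normal(size=(100, d))   # embeddings of test images

# Retrain only the final linear layer, using text as training data...
head = LogisticRegression(max_iter=1000).fit(text_emb, text_labels)

# ...then apply it to image embeddings: text acts as a proxy for images
# because both modalities live in the same joint embedding space.
preds = head.predict(image_emb)
```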
Classical-to-quantum convolutional neural network transfer learning
Machine learning using quantum convolutional neural networks (QCNNs) has demonstrated success in both quantum and classical data classification. In previous studies, QCNNs attained higher classification accuracy than their classical counterparts under the same training conditions in the few-parameter regime. However, the general performance of large-scale quantum models is difficult to examine because of the limited size of quantum circuits that can be reliably implemented in the near future. We propose transfer learning as an effective strategy for utilizing small QCNNs in the noisy intermediate-scale quantum era to the full extent. In the classical-to-quantum transfer learning framework, a QCNN can solve complex classification problems without requiring a large-scale quantum circuit by utilizing a pre-trained classical convolutional neural network (CNN). We perform numerical simulations of QCNN models with various sets of quantum convolution and pooling operations for MNIST data classification under transfer learning, in which a classical CNN is trained on Fashion-MNIST data. The results show that classical-to-quantum transfer learning performs considerably better than purely classical transfer learning models under similar training conditions. Comment: 16 pages, 7 figures
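A toy sketch of the classical-to-quantum setup, written with PennyLane (an assumption; the paper's exact ansätze and framework may differ): features from a pre-trained classical CNN are angle-encoded into qubits, followed by simplified convolution and pooling blocks.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qcnn(features, weights):
    # Encode classical features (hypothetically, the truncated output of
    # a CNN pre-trained on Fashion-MNIST) as rotation angles.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    # "Convolution": parameterized blocks on neighboring qubit pairs.
    for i in range(0, n_qubits - 1, 2):
        qml.RY(weights[i], wires=i)
        qml.RY(weights[i + 1], wires=i + 1)
        qml.CNOT(wires=[i, i + 1])
    # "Pooling": entangle, then read out only a subset of qubits.
    qml.CNOT(wires=[1, 0])
    qml.CNOT(wires=[3, 2])
    return qml.expval(qml.PauliZ(0))

features = np.array([0.1, 0.5, -0.3, 0.8])   # stand-in CNN features
weights = np.random.uniform(0, np.pi, n_qubits)
print(qcnn(features, weights))
```

In a full pipeline, the stand-in feature vector would be replaced by the truncated output of the Fashion-MNIST-trained CNN, and the circuit weights would be optimized against MNIST labels.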
Exploring Environmental Inequity in South Korea: An Analysis of the Distribution of Toxic Release Inventory (TRI) Facilities and Toxic Releases
Recently, location data for the Toxic Release Inventory (TRI) in South Korea were released to the public. This study investigated the spatial patterns of TRI facilities and releases of toxic substances across all 230 local governments in South Korea to determine whether spatial clusters relevant to the siting of noxious facilities occur. In addition, we employed spatial regression modeling to determine whether the number of TRI facilities and the volume of toxic releases in a given community were correlated with the community's socioeconomic, racial, political, and land use characteristics. We found that TRI facilities and their toxic releases were disproportionately distributed, with clustered spatial patterning. Spatial regression modeling indicated that jurisdictions with smaller percentages of minorities, stronger political activity, less industrial land use, and more commercial land use had smaller volumes of toxic releases as well as fewer TRI facilities. However, the economic status of a community did not affect the siting of hazardous facilities. These results indicate that the siting of TRI facilities in Korea is affected more by sociopolitical factors than by economic status. Racial issues are thus crucial considerations for environmental justice as the population of Korea becomes more racially and ethnically diverse.
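As an illustration of the kind of cluster statistic such an analysis involves, global Moran's I quantifies whether values cluster spatially; the toy contiguity weights and facility counts below are hypothetical, not the study's data.

```python
import numpy as np

def morans_i(y, W):
    """Global Moran's I for values y and spatial weight matrix W
    (W[i, j] > 0 when regions i and j are neighbors)."""
    z = y - y.mean()
    s0 = W.sum()
    return (len(y) / s0) * (z @ W @ z) / (z @ z)

# Hypothetical toy example: 4 regions on a line, contiguity weights.
y = np.array([10.0, 8.0, 1.0, 0.5])          # e.g. TRI facility counts
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(f"Moran's I: {morans_i(y, W):.3f}")    # > 0 suggests clustering
```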
A deep learning approach for the comparison of handwritten documents using latent feature vectors
Forensic questioned document examiners still largely rely on visual assessments and expert judgment to determine the provenance of a handwritten document. Here, we propose a novel approach to objectively compare two handwritten documents using a deep learning algorithm. First, we implement a bootstrapping technique to segment document data into smaller units, as a means to enhance the efficiency of the deep learning process. Next, we use a transfer learning algorithm to systematically extract document features. The unique characteristics of the document data are then represented as latent vectors. Finally, the similarity between two handwritten documents is quantified via the cosine similarity between their latent vectors. We illustrate the use of the proposed method by applying it to a variety of collections of handwritten documents with different attributes, and show that in most cases we can accurately classify pairs of documents into same-author or different-author categories. This article is published as J. Kim, S. Park, and A. Carriquiry, A deep learning approach for the comparison of handwritten documents using latent feature vectors, Stat. Anal. Data Min.: ASA Data Sci. J. 17 (2024), e11660. https://doi.org/10.1002/sam.11660. © 2024 The Authors. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
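A minimal sketch of the final comparison step, with hypothetical 128-dimensional latent vectors standing in for features extracted from bootstrapped document segments:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two latent feature vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical latent vectors, e.g. produced by a transfer-learned CNN
# from bootstrapped segments of two handwritten documents.
doc_a = np.random.default_rng(1).normal(size=128)
doc_b = doc_a + np.random.default_rng(2).normal(scale=0.3, size=128)

score = cosine_similarity(doc_a, doc_b)
print(f"similarity: {score:.3f}")  # a high score suggests the same writer
```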
Hierarchical Joint Graph Learning and Multivariate Time Series Forecasting
Multivariate time series are prevalent in many scientific and industrial domains. Modeling multivariate signals is challenging due to their long-range temporal dependencies and intricate interactions, both direct and indirect. To confront these complexities, we introduce a method of representing multivariate signals as nodes in a graph, with edges indicating the interdependencies between them. Specifically, we leverage graph neural networks (GNNs) and attention mechanisms to efficiently learn the underlying relationships within the time series data. Moreover, we employ hierarchical signal decompositions running over the graphs to capture multiple spatial dependencies. The effectiveness of our proposed model is evaluated across various real-world benchmark datasets designed for long-term forecasting tasks. The results consistently showcase the superiority of our model, which achieves an average 23% reduction in mean squared error (MSE) compared to existing models. Comment: Temporal Graph Learning Workshop @ NeurIPS 2023, New Orleans, United States
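A toy sketch (not the paper's architecture) of the central idea: treat each series as a node and let attention scores act as a learned soft adjacency matrix for message passing. All sizes and layer choices here are assumptions.

```python
import torch
import torch.nn as nn

class SeriesGraphAttention(nn.Module):
    """Toy sketch: each series is a graph node; pairwise attention weights
    serve as a soft, data-driven adjacency matrix."""
    def __init__(self, n_series, window, d_model=32):
        super().__init__()
        self.embed = nn.Linear(window, d_model)   # per-node temporal encoder
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, 1)          # one-step-ahead forecast

    def forward(self, x):                         # x: (n_series, window)
        h = self.embed(x)                         # node embeddings
        scores = self.query(h) @ self.key(h).T    # pairwise interdependency
        adj = torch.softmax(scores / h.size(-1) ** 0.5, dim=-1)
        h = adj @ h                               # message passing over graph
        return self.out(h).squeeze(-1)            # forecast per series

model = SeriesGraphAttention(n_series=7, window=96)
x = torch.randn(7, 96)                            # hypothetical history
print(model(x).shape)                             # torch.Size([7])
```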
A study on the consumer's perception of front-of-pack nutrition labeling
The goal of this research is to investigate the present situation of front-of-pack labeling in Korea and consumers' perception of this new labeling system, based on a consumer survey. We counted the number of processed foods with front-of-pack labeling at one retailer in Yongin-si, and we surveyed 1,019 participants nationwide aged 20 to 49 about their knowledge of nutrition labeling, their knowledge of front-of-pack labeling, and their opinions about the labeling system. The data were analyzed using the SAS statistics program. The results were as follows: 13.4% of processed foods had front-of-pack labeling, and 16.8% of consumers always checked nutrition labeling, while 32.7% seldom checked it. In addition, 44.3% of consumers think that front-of-pack labeling is necessary, and 58.3% think it is important to show the percentage of daily value as part of front-of-pack labeling. However, 32% of consumers think the prospects for front-of-pack labeling are slim. Meanwhile, 58.3% of consumers think it is important to color-code the label according to its contents. The preferred number of nutrients on the front of the pack was four or five. Recognition of the current nutrition labeling appears to influence willingness to use the future front-of-pack labeling. In light of these findings, the policy for front-of-pack labeling should be updated and improved continually, since front-of-pack labeling helps consumers understand nutrition facts.
Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors
Test-time adaptation (TTA) fine-tunes pre-trained deep neural networks on unseen test data. The primary challenge of TTA is limited access to the entire test dataset during online updates, causing error accumulation. To mitigate it, TTA methods have used the entropy of the model's output as a confidence metric intended to determine which samples have a lower likelihood of causing error. Through experimental studies, however, we observed that entropy is unreliable as a confidence metric for TTA under biased scenarios, and we theoretically revealed that this stems from neglecting the influence of latent disentangled factors of the data on predictions. Building upon these findings, we introduce a novel TTA method named Destroy Your Object (DeYO), which leverages a newly proposed confidence metric named Pseudo-Label Probability Difference (PLPD). PLPD quantifies the influence of an object's shape on the prediction by measuring the difference between predictions before and after applying an object-destructive transformation. DeYO consists of sample selection and sample weighting, which employ entropy and PLPD concurrently. For robust adaptation, DeYO prioritizes samples that rely predominantly on shape information when making predictions. Our extensive experiments demonstrate the consistent superiority of DeYO over baseline methods across various scenarios, including biased and wild ones. The project page is publicly available at https://whitesnowdrop.github.io/DeYO/. Comment: ICLR 2024 Spotlight; 26 pages, 9 figures, 20 tables
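A hedged sketch of how a PLPD-style score could be computed, assuming patch shuffling as the object-destructive transformation (one plausible choice; the paper's exact transformation and selection rules may differ):

```python
import torch
import torch.nn.functional as F

def plpd(model, x, patch=56):
    """Sketch of Pseudo-Label Probability Difference: how much the
    probability of the pseudo-label drops after an object-destructive
    transformation (here, shuffling square image patches)."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
        pseudo = probs.argmax(dim=-1)

        # Destroy global object shape by permuting patches.
        b, c, h, w = x.shape
        patches = x.unfold(2, patch, patch).unfold(3, patch, patch)
        patches = patches.reshape(b, c, -1, patch, patch)
        perm = torch.randperm(patches.size(2))
        grid = h // patch
        shuffled = patches[:, :, perm].reshape(b, c, grid, grid, patch, patch)
        x_destroyed = shuffled.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

        probs_d = F.softmax(model(x_destroyed), dim=-1)
        idx = torch.arange(b)
        return probs[idx, pseudo] - probs_d[idx, pseudo]
```

Samples with a large drop (high PLPD) rely on object shape rather than spurious cues, so a DeYO-style scheme would prioritize them alongside low-entropy samples.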
Can We Utilize Pre-trained Language Models within Causal Discovery Algorithms?
Scaling laws have brought Pre-trained Language Models (PLMs) into the field of causal reasoning. Causal reasoning with a PLM relies solely on text-based descriptions, in contrast to causal discovery, which aims to determine causal relationships between variables from data. Recent research has explored a method that mimics causal discovery by aggregating the outcomes of repeated causal reasoning, achieved through specifically designed prompts. It highlights the usefulness of PLMs for discovering cause and effect, which is often limited by a lack of data, especially when dealing with multiple variables. However, PLMs do not analyze data and are highly dependent on prompt design, which poses a crucial limitation on using them directly for causal discovery. Accordingly, PLM-based causal reasoning depends deeply on the prompt design and carries the risk of overconfidence and false predictions in determining causal relationships. In this paper, we empirically demonstrate these limitations of PLM-based causal reasoning through experiments on physics-inspired synthetic data. We then propose a new framework that integrates prior knowledge obtained from a PLM with a causal discovery algorithm. This is accomplished by initializing the adjacency matrix for causal discovery with the PLM-derived prior knowledge and regularizing toward it. Our proposed framework not only demonstrates improved performance through the integration of PLMs and causal discovery but also suggests how to leverage PLM-extracted prior knowledge with existing causal discovery algorithms.
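A minimal sketch of the integration idea under stated assumptions: a NOTEARS-style continuous causal discovery objective whose adjacency matrix is initialized at, and regularized toward, a PLM-derived prior. The data, prior, and penalty weights below are all hypothetical.

```python
import torch

# Hypothetical data: n samples of d variables.
torch.manual_seed(0)
n, d = 500, 4
X = torch.randn(n, d)

# Hypothetical prior adjacency extracted from PLM causal reasoning:
# entry [i, j] = 1 means the PLM judged "variable i causes variable j".
A_prior = torch.tensor([[0., 1., 0., 0.],
                        [0., 0., 1., 0.],
                        [0., 0., 0., 1.],
                        [0., 0., 0., 0.]])

# Initialize the learned adjacency at the PLM prior.
A = A_prior.clone().requires_grad_(True)
opt = torch.optim.Adam([A], lr=0.01)

for _ in range(500):
    opt.zero_grad()
    fit = ((X - X @ A) ** 2).mean()                    # linear SEM residual
    acyc = torch.trace(torch.matrix_exp(A * A)) - d    # NOTEARS-style acyclicity penalty
    prior_reg = ((A - A_prior) ** 2).sum()             # stay near the PLM prior
    loss = fit + 10.0 * acyc + 0.1 * prior_reg         # hypothetical weights
    loss.backward()
    opt.step()

print(A.detach())  # estimated weighted adjacency matrix
```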