151 research outputs found

CheXFusion: Effective Fusion of Multi-View Features using Transformers for Long-Tailed Chest X-Ray Classification

Author: Kim Dongkyun
Publication date: 07/08/2023

Medical image classification poses unique challenges due to the long-tailed distribution of diseases, the co-occurrence of diagnostic findings, and the multiple views available for each study or patient. This paper introduces our solution to the ICCV CVAMD 2023 Shared Task on CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays. Our approach introduces CheXFusion, a transformer-based fusion module incorporating multi-view images. The fusion module, guided by self-attention and cross-attention mechanisms, efficiently aggregates multi-view features while considering label co-occurrence. Furthermore, we explore data balancing and self-training methods to optimize the model's performance. Our solution achieves state-of-the-art results with 0.372 mAP in the MIMIC-CXR test set, securing 1st place in the competition. Our success in the task underscores the significance of considering multi-view settings, class imbalance, and label co-occurrence in medical image classification. Public code is available at https://github.com/dongkyuk/CXR-LT-public-solutio

arXiv.org e-Print Archive

Advances in Remote Sensing to Understand Extreme Hydrological Events

Author: Choi Minha
Kim Dongkyun
Kim Jongho
Kim Ungtae
Publication venue: EngagedScholarship@CSU
Publication date: 12/11/2019

Cleveland-Marshall College of Law

Attribute Based Interpretable Evaluation Metrics for Generative Models

Author: Kim Dongkyun
Kwon Mingi
Uh Youngjung
Publication date: 26/10/2023

When the training dataset comprises a 1:1 proportion of dogs to cats, a generative model that produces 1:1 dogs and cats better resembles the training species distribution than another model with 3:1 dogs and cats. Can we capture this phenomenon using existing metrics? Unfortunately, we cannot, because these metrics do not provide any interpretability beyond "diversity". In this context, we propose a new evaluation protocol that measures the divergence of a set of generated images from the training set regarding the distribution of attribute strengths as follows. Single-attribute Divergence (SaD) measures the divergence regarding PDFs of a single attribute. Paired-attribute Divergence (PaD) measures the divergence regarding joint PDFs of a pair of attributes. They reveal which attributes the models struggle with. For measuring the attribute strengths of an image, we propose Heterogeneous CLIPScore (HCS), which measures the cosine similarity between image and text vectors with heterogeneous initial points. With SaD and PaD, we reveal the following about existing generative models. ProjectedGAN generates implausible attribute relationships, such as a baby with a beard, even though it has competitive scores on existing metrics. Diffusion models struggle to capture diverse colors in the datasets. Larger sampling timesteps of the latent diffusion model generate more minor objects, including earrings and necklaces. Stable Diffusion v1.5 better captures the attributes than v2.1. Our metrics lay a foundation for explainable evaluations of generative models.
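
The idea of a single-attribute divergence can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram density estimate, the use of KL divergence, the bin count, the strength range of [-1, 1], and all function names are assumptions made here for illustration. The abstract only states that SaD compares the PDFs of one attribute between the training set and the generated set.

```python
import math

def histogram(values, bins=10, lo=-1.0, hi=1.0):
    """Normalised histogram of attribute strengths (hypothetical density estimate).

    A tiny constant is added to every bin so that no probability is zero.
    """
    counts = [1e-6] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def single_attribute_divergence(train, generated):
    """Mean per-attribute divergence plus the per-attribute breakdown.

    `train` and `generated` map an attribute name to a list of strengths,
    one per image (assumed here to be precomputed, e.g. by a CLIP-style scorer).
    """
    per_attr = {}
    for name in train:
        p = histogram(train[name])
        q = histogram(generated[name])
        per_attr[name] = kl_divergence(p, q)
    return sum(per_attr.values()) / len(per_attr), per_attr

# Toy version of the dogs-versus-cats example: strength > 0 means "dog-like".
train = {"dog": [0.7] * 50 + [-0.7] * 50}
balanced = {"dog": [0.7] * 50 + [-0.7] * 50}
skewed = {"dog": [0.7] * 75 + [-0.7] * 25}

sad_balanced, _ = single_attribute_divergence(train, balanced)
sad_skewed, breakdown = single_attribute_divergence(train, skewed)
```

In this toy setting the balanced generator matches the training histogram exactly, while the 3:1 generator yields a strictly larger divergence, and the per-attribute breakdown points at the attribute responsible.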

arXiv.org e-Print Archive

Editorial: Current water challenges require holistic and global solutions

Author: Dongkyun Kim
Dragan Savic
Hyun-Han Kwon
Orazio Giustolisi
Publication date: 01/05/2018

The world population is exploding and is estimated to reach 9.8 billion within the next 10 years (Gerland et al. 2014). Desire for more convenient lifestyles is not likely to be satisfied (United Nations 2009). Such lifestyles entail the unsustainable exploitation of water resources and the environment (Vitousek et al. 1997). Advanced technology and transportation systems have enabled the transfer of goods across the world and, eventually, also the water that is used to produce them. This means that luxurious lifestyles on one side of the planet can cause water and food scarcity on the other side (Hoekstra & Mekonnen 2012). We are also witnessing drastic global climate change: sea levels are rising, and droughts and floods have become more intense. These have exacerbated the global water and food crises (Vorosmarty et al. 2000; Hanjra & Qureshi 2010). Our generation's water challenge is no longer a local or isolated issue. It must be recognized, understood, and analyzed from a holistic and global perspective (Wagener et al. 2010). As such, the growing complexity of global water challenges requires better collection and analysis of ever increasing data with equipping

Crossref

Open Access Repository

A Poisson Cluster Stochastic Rainfall Generator That Accounts for the Interannual Variability of Rainfall Statistics: Validation at Various Geographic Locations across the United States

Author: Dongkyun Kim
Jongho Kim
Yong-Sik Cho
Publication venue: Hindawi Limited
Publication date: 01/01/2014

A novel approach for a Poisson cluster stochastic rainfall generator was validated in its ability to reproduce important rainfall and watershed response characteristics at 104 locations in the United States. The suggested novel approach, The Hybrid Model (THM), as compared to the traditional Poisson cluster rainfall modeling approaches, has an additional capability to account for the interannual variability of rainfall statistics. THM and a traditional approach of Poisson cluster rainfall model (modified Bartlett-Lewis rectangular pulse model) were compared in their ability to reproduce the characteristics of extreme rainfall and watershed response variables such as runoff and peak flow. The results of the comparison indicate that THM generally outperforms the traditional approach in reproducing the distributions of peak rainfall, peak flow, and runoff volume. In addition, THM significantly outperformed the traditional approach in reproducing extreme rainfall by 2.3% to 66% and extreme flow values by 32% to 71%

Crossref

Directory of Open Access Journals

Big Data Analytics in the Internet-Of-Things And Cyber-Physical Systems

Author: De Souza Jose-Neuman
Kim Dongkyun
Lloret Jaime
Lv Zhihan
Song Houbing
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 17/02/2019

Lv, Z.; Song, H.; Lloret, J.; Kim, D.; De Souza, J. (2019). Big Data Analytics in the Internet-Of-Things And Cyber-Physical Systems. IEEE Access. 7:18070-18075. https://doi.org/10.1109/ACCESS.2019.2895441

RiuNet

CODIE: Controlled Data and Interest Evaluation in Vehicular Named Data Networks

Author: Ahmed Syed Hassan
Bouk Safdar Hussain
Kim Dongkyun
Lloret Jaime
Song Hou-bing
Yaqub Muhammad Azfar
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2016

Recently, named data networking (NDN) has been proposed as a promising architecture for future Internet technologies. NDN is an extension to the content-centric network (CCN) and is expected to support various applications in vehicular communications (vehicular NDN, VNDN). VNDN basically relies on naming the content rather than using end-to-end device names. In VNDN, a vehicle known as a "consumer" broadcasts an "Interest" packet for the required "content," regardless of end-to-end connectivity with servers or other vehicles. In response, a vehicle with the content, named a "provider," replies to the Interest packet with a "Data" packet. However, the simple VNDN architecture faces several challenges such as consumer/provider mobility and Interest/Data packet forwarding. In VNDN, for the most part, the Data packet is sent along the reverse path of the related Interest packet. However, there is no extensive simulated reference available in the literature to support this argument. In this paper, therefore, we first analyze the propagation behavior of Interest and Data packets in the vehicular ad hoc network (VANET) environment through extensive simulations. Second, we propose the "CODIE" scheme to control the Data flooding/broadcast storm in the naive VNDN. The main idea is to allow the consumer vehicle to start a hop counter in the Interest packet. When any potential provider receives this Interest, a data dissemination limit (DDL) value stores the number of hops that a Data packet needs to travel back. Simulation results show that CODIE forwards fewer copies of data packets processed (CDPP) while achieving a similar interest satisfaction rate (ISR), as compared with the naive VNDN. In addition, we found that CODIE also minimizes the overall interest satisfaction delay (ISD).

This work was supported by the Ministry of Science, ICT and Future Planning, South Korea, under Grant IITP-2015-H8601-15-1002 of the Convergence Information Technology Research Center supervised by the Institute for Information and Communications Technology Promotion. The review of this paper was coordinated by Editors of CVS. (Corresponding author: Dongkyun Kim.)

Ahmed, SH.; Bouk, SH.; Yaqub, MA.; Kim, D.; Song, H.; Lloret, J. (2016). CODIE: Controlled Data and Interest Evaluation in Vehicular Named Data Networks. IEEE Transactions on Vehicular Technology. 65(6):3954-3963. https://doi.org/10.1109/TVT.2016.2558650
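
The hop-counter and data dissemination limit (DDL) mechanism described in the CODIE abstract can be sketched roughly as below. This is an illustrative reading of the abstract only, not the published protocol: the packet fields, the function names, and the exact points at which the counter is incremented and the DDL decremented are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interest:
    name: str            # name of the requested content
    hop_count: int = 0   # counter started by the consumer (assumed to begin at 0)

@dataclass(frozen=True)
class Data:
    name: str
    ddl: int             # data dissemination limit: hops the packet may still travel

def receive_interest(interest: Interest) -> Interest:
    """A vehicle receiving an Interest counts one more hop (assumed behaviour)."""
    return Interest(interest.name, interest.hop_count + 1)

def make_data(interest: Interest) -> Data:
    """A provider answers with a Data packet whose DDL equals the hops travelled."""
    return Data(interest.name, ddl=interest.hop_count)

def transmit_data(data: Data):
    """One Data transmission consumes one unit of DDL.

    Returns the packet as seen by the next hop, or None when the limit is
    exhausted and the packet must not be forwarded any further.
    """
    if data.ddl <= 0:
        return None
    return Data(data.name, data.ddl - 1)

# Consumer -> relay A -> relay B -> provider: three receptions, so three hops.
interest = Interest("/traffic/map")
for _ in range(3):
    interest = receive_interest(interest)

data = make_data(interest)
hops_travelled = 0
packet = data
while (next_packet := transmit_data(packet)) is not None:
    packet = next_packet
    hops_travelled += 1
```

Under these assumptions the Data packet travels exactly as many hops as the Interest did and is then dropped, which is the bounding effect the abstract credits with reducing redundant Data copies compared with naive flooding.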

Crossref

RiuNet

A bounding algorithm for the broadcast storm problem in mobile ad hoc networks

Author: Chai-Keong Toh
Dongkyun Kim
Juan-Carlos Cano
Pietro Manzoni
Publication date: 01/01/2003

Many protocols used in Mobile Ad Hoc Networks rely on the broadcasting capability, especially when performing a route discovery process. However, an efficient broadcasting protocol should be devised to reduce the unnecessary redundant rebroadcasting at some nodes (redundancy) as well as to increase the coverage area as much as possible (reachability). A few approaches have been developed in the literature. We propose a bounding algorithm which is shown to be an efficient candidate to accommodate the two goals, that is, to increase reachability while limiting redundancy.

CiteSeerX

PROV-IO+: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems

Author: Byna Suren
Chen Yong
Dai Dong
Dong Bin
Han Runzhou
Hassoun Joseph
Kim Dongkyun
Tang Houjun
Thorsley David
Wolf Matthew
Zheng Mai
Publication date: 01/08/2023

Data provenance, or data lineage, describes the life cycle of data. In scientific workflows on HPC systems, scientists often seek diverse provenance (e.g., origins of data products, usage patterns of datasets). Unfortunately, existing provenance solutions cannot address the challenges due to their incompatible provenance models and/or system implementations. In this paper, we analyze four representative scientific workflows in collaboration with the domain scientists to identify concrete provenance needs. Based on the first-hand analysis, we propose a provenance framework called PROV-IO+, which includes an I/O-centric provenance model for describing scientific data and the associated I/O operations and environments precisely. Moreover, we build a prototype of PROV-IO+ to enable end-to-end provenance support on real HPC systems with little manual effort. The PROV-IO+ framework can support both containerized and non-containerized workflows on different HPC platforms with flexibility in selecting various classes of provenance. Our experiments with realistic workflows show that PROV-IO+ can address the provenance needs of the domain scientists effectively with reasonable performance (e.g., less than 3.5% tracking overhead for most experiments). Moreover, PROV-IO+ outperforms a state-of-the-art system (i.e., ProvLake) in our experiments

arXiv.org e-Print Archive