DRAFT: Dense Retrieval Augmented Few-shot Topic classifier Framework

Author: Kim Keonwoo
Lee Younggun
Publication date: 05/12/2023

With the growing volume of diverse information, the demand for classifying arbitrary topics has become increasingly critical. To address this challenge, we introduce DRAFT, a simple framework designed to train a classifier for few-shot topic classification. DRAFT uses a few examples of a specific topic as queries to construct a Customized dataset with a dense retriever model. A multi-query retrieval (MQR) algorithm, which effectively handles multiple queries related to a specific topic, is applied to construct the Customized dataset. Subsequently, we fine-tune a classifier using the Customized dataset to identify the topic. To demonstrate the efficacy of our proposed approach, we conduct evaluations on both widely used classification benchmark datasets and manually constructed datasets with 291 diverse topics, which simulate the diverse content encountered in real-world applications. DRAFT shows competitive or superior performance compared to baselines that use in-context learning, such as GPT-3 175B and InstructGPT 175B, on few-shot topic classification tasks despite having 177 times fewer parameters, demonstrating its effectiveness.

arXiv.org e-Print Archive

MEMTO: Memory-guided Transformer for Multivariate Time Series Anomaly Detection

Author: Cho Sungzoon
Kim Keonwoo
Oh Jeonglyul
Song Junho
Publication date: 05/12/2023

Detecting anomalies in real-world multivariate time series data is challenging due to complex temporal dependencies and inter-variable correlations. Recently, reconstruction-based deep models have been widely used to solve the problem. However, these methods still suffer from an over-generalization issue and fail to deliver consistently high performance. To address this issue, we propose MEMTO, a memory-guided Transformer using a reconstruction-based approach. It is designed to incorporate a novel memory module that can learn the degree to which each memory item should be updated in response to the input data. To stabilize the training procedure, we use a two-phase training paradigm that involves using K-means clustering to initialize memory items. Additionally, we introduce a bi-dimensional deviation-based detection criterion that calculates anomaly scores considering both input space and latent space. We evaluate our proposed method on five real-world datasets from diverse domains, and it achieves an average anomaly detection F1-score of 95.74%, significantly outperforming the previous state-of-the-art methods. We also conduct extensive experiments to empirically validate the effectiveness of our proposed model's key components.

arXiv.org e-Print Archive

Additional file 2 of Mut2Vec: distributed representation of cancerous mutations

Author: Heewon Lee (1392817)
Jaewoo Kang (239124)
Keonwoo Kim (5126513)
Sunkyu Kim (1529461)

It contains the most enriched clusters with IntOGen driver mutations obtained by six clustering methods (K-Means, Agglomerative hierarchical clustering, BIRCH, Spectral clustering, Affinity Propagation, and Gaussian Mixture) and five options for the number of clusters (50, 100, 200, 300, and 500), except for Affinity Propagation. (PDF 108 kb)

The Francis Crick Institute

Tunable translation-level CRISPR interference by dCas13 and engineered gRNA in bacteria

Author: Giho Kim
Ho Joon Kim
Hyeon Jin Kim
Jina Yang
Keonwoo Kim
Sang Woo Seo
Publication venue: Nature Portfolio
Publication date: 01/06/2024

Although CRISPR-dCas13, the RNA-guided RNA-binding protein, was recently exploited as a translation-level gene expression modulator, it has remained difficult to precisely control the expression level due to the lack of detailed characterization. Here, we develop a synthetic tunable translation-level CRISPR interference (Tl-CRISPRi) system based on engineered guide RNAs that enable precise and predictable down-regulation of mRNA translation. First, we optimize the Tl-CRISPRi system for specific and multiplexed repression of genes at the translation level. We also show that the Tl-CRISPRi system is more suitable for independently regulating each gene in a polycistronic operon than the transcription-level CRISPRi (Tx-CRISPRi) system. We further engineer the handle structure of the guide RNA for tunable and predictable repression of various genes in Escherichia coli and Vibrio natriegens. This tunable Tl-CRISPRi system is applied to increase the production of 3-hydroxypropionic acid (3-HP) by 14.2-fold via redirecting the metabolic flux, indicating the usefulness of this system for flux optimization in microbial cell factories based on the RNA-targeting machinery.

Directory of Open Access Journals

Additional file 1 of Mut2Vec: distributed representation of cancerous mutations

Author: Heewon Lee (1392817)
Jaewoo Kang (239124)
Keonwoo Kim (5126513)
Sunkyu Kim (1529461)

It contains the visualization results with mutation vectors trained with an autoencoder and a denoising autoencoder. (PDF 427 kb)

The Francis Crick Institute

Added Value of Structured Reporting for US of the Pediatric Appendix: Additional CT Examinations and Negative Appendectomy

Author: Hyuk Jung Kim
Hyun Jin Kim
Ji Young Choi
Keonwoo Choi
Suk Ki Jang
Publication venue: The Korean Society of Radiology
Publication date: 01/05/2023

Purpose: This study aimed to determine the incremental value of using a structured report (SR) for US examinations of the pediatric appendix. Materials and Methods: Between January 2009 and June 2016, 1150 pediatric patients with suspected appendicitis who underwent US examinations of the appendix were included retrospectively. In November 2012, we developed a five-point scale SR for appendix US examinations. The patients were divided into two groups according to the form of the US report: free-text or SR. The primary clinical outcomes were compared between the two groups, including the rate of CT imaging following US examinations, the negative appendectomy rate (NAR), and the appendiceal perforation rate (PR). Results: In total, 550 patients were included in the free-text group and 600 patients in the SR group. The rate of additional CT examinations decreased by 5.3% in the SR group (8.2%, p = 0.003), and the NAR decreased by 8.4% in the SR group (7.8%, p = 0.028). There was no statistical difference in the appendiceal PR (37.6% vs. 48.0%, p = 0.078). Conclusion: The use of an SR to evaluate US examinations for suspected pediatric appendicitis results in lower CT use and fewer negative appendectomies without an increase in appendiceal PR.

Directory of Open Access Journals

WALDIO: Eliminating the Filesystem Journaling in Resolving the Journaling of Journal Anomaly

Author: Kim Wook-Hee
Lee Keonwoo
Lee Wongun
Nam Beomseok
Son Hankeun
Won Youjip
Publication venue: USENIX
Publication date: 08/07/2015

This work is dedicated to resolving the Journaling of Journal Anomaly in the Android IO stack. We orchestrate SQLite and the EXT4 filesystem so that SQLite's file-backed journaling activity can dispense with the expensive filesystem intervention, the journaling, without compromising file integrity under unexpected filesystem failure. In storing the logs, we exploit direct IO to suppress filesystem interference. This work consists of three key ingredients: (i) Preallocation with Explicit Journaling, (ii) Header Embedding, and (iii) Group Synchronization. Preallocation with Explicit Journaling eliminates filesystem journaling while properly protecting the file metadata against an unexpected system crash. We redesign the SQLite B-tree structure with Header Embedding to make it direct IO compatible and block IO friendly. With Group Synchronization, we minimize the synchronization overhead of direct IO and make the SQLite operation NAND Flash friendly. Combining the three technical ingredients, we develop a new journal mode in SQLite, WALDIO. We implement it on a commercially available smartphone. WALDIO mode achieves 5.1x the performance (inserts/sec) of WAL mode, which is the fastest journaling mode in SQLite. It yields 2.7x the performance (inserts/sec) of LS-MVBT, the fastest SQLite journaling mode known to the public. WALDIO mode achieves 7.4x the performance (inserts/sec) of WAL mode when it is relieved from the overhead of explicitly synchronizing individual log-commit operations. WALDIO mode reduces the IO volume to 1/6 of that of WAL mode.

ScholarWorks@UNIST