
Video Probabilistic Diffusion Models in Projected Latent Space

Authors: Kim Subin, Shin Jinwoo, Sohn Kihyuk, Yu Sihyun
Publication date: 15/02/2023

Despite the remarkable progress in deep generative models, synthesizing high-resolution and temporally coherent videos remains a challenge due to their high dimensionality and complex temporal dynamics along with large spatial variations. Recent works on diffusion models have shown their potential to solve this challenge, yet they suffer from severe computational and memory inefficiency that limits their scalability. To handle this issue, we propose a novel generative model for videos, coined projected latent video diffusion models (PVDM), a probabilistic diffusion model that learns a video distribution in a low-dimensional latent space and thus can be efficiently trained with high-resolution videos under limited resources. Specifically, PVDM is composed of two components: (a) an autoencoder that projects a given video as 2D-shaped latent vectors that factorize the complex cubic structure of video pixels and (b) a diffusion model architecture specialized for our new factorized latent space and the training/sampling procedure to synthesize videos of arbitrary length with a single model. Experiments on popular video generation datasets demonstrate the superiority of PVDM compared with previous video synthesis methods; e.g., PVDM obtains an FVD score of 639.7 on the UCF-101 long video (128 frames) generation benchmark, improving on the prior state-of-the-art score of 1773.4.

Comment: Project page: https://sihyun.me/PVD
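
To illustrate why factorizing a cubic latent into 2D-shaped latents can save memory, the following sketch simply counts latent elements. It assumes one 2D plane per pair of axes (height-width, time-height, time-width); the abstract does not specify the exact layout, and the function names and the example sizes are hypothetical.

```python
def cubic_latent_size(frames, height, width):
    """Number of elements in a full 3D (cubic) latent grid."""
    return frames * height * width


def planar_latent_size(frames, height, width):
    """Number of elements in three 2D planes, one per pair of axes.

    Assumption: the layout (H x W, T x H, T x W) is illustrative only
    and is not taken from the abstract.
    """
    return height * width + frames * height + frames * width


# Hypothetical latent grid: 16 frames at 32 x 32 spatial resolution.
cubic = cubic_latent_size(16, 32, 32)
planar = planar_latent_size(16, 32, 32)
print(f"cubic: {cubic}, planar: {planar}, ratio: {cubic / planar:.1f}x")
```

The cubic count grows with the product of the three dimensions, while the planar count grows only with pairwise products, so the gap widens as clips get longer or larger.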

arXiv.org e-Print Archive

Ultrafast dynamics of fractional particles in $\alpha$-RuCl$_3$

Authors: Kee Hae-Young, Kim Subin, Kim Young-June, Yang Luyi, Zhang Haochen
Publication date: 13/08/2019

In a Kitaev spin liquid, electron spins can break into fractional particles known as Majorana fermions and $Z_2$ fluxes. Recent experiments have indicated the existence of such fractional particles in a two-dimensional Kitaev material candidate, $\alpha$-RuCl$_3$. These exotic particles can be used in topological quantum computations when braided within their lifetimes. However, the lifetimes of these particles, critical for applications in topological quantum computing, have not been reported. Here we study ultrafast dynamics of photoinduced excitations in single crystals of $\alpha$-RuCl$_3$ using pump-probe transient grating spectroscopy. We observe intriguing photoexcited nonequilibrium states in the Kitaev paramagnetic regime between $T_N$ ~7 K and $T_H$ ~100 K, where $T_N$ is the Néel temperature and $T_H$ is set by the Kitaev interaction. Two distinct lifetimes are detected: a longer lifetime of ~50 ps, independent of temperature, and a shorter lifetime of 1-20 ps, with a strong temperature dependence, $T^{-1.40}$. We analyze the transient grating signals using coupled differential equations and propose that the long and short lifetimes are associated with fractional particles in the Kitaev paramagnetic regime, $Z_2$ fluxes and Majorana fermions, respectively.
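
As a small worked example of the reported $T^{-1.40}$ scaling, the sketch below computes how the shorter lifetime would change between two temperatures if the power law held exactly. Only the exponent comes from the abstract; the function name and the example temperatures are hypothetical, and no absolute lifetimes are assumed.

```python
def lifetime_ratio(temp_from_k, temp_to_k, exponent=-1.40):
    """Ratio tau(temp_to) / tau(temp_from) for tau proportional to T**exponent.

    Assumption: the power law holds exactly between the two temperatures.
    """
    if temp_from_k <= 0 or temp_to_k <= 0:
        raise ValueError("temperatures must be positive (kelvin)")
    return (temp_to_k / temp_from_k) ** exponent


# Doubling the temperature (hypothetical values inside the 7-100 K window).
ratio = lifetime_ratio(10.0, 20.0)
print(f"lifetime changes by a factor of {ratio:.3f} when T doubles")
```

Because the exponent is negative and larger than 1 in magnitude, doubling the temperature shortens the lifetime by more than half.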

arXiv.org e-Print Archive

Machine Learning Based PCB/Package Stack-up Optimization For Signal Integrity

Authors: Bae Bumhee, Huang Jiahuan, Huang Wenchang, Hwang Chulsoon, Kim Minseok, Kim Subin
Publication venue: Scholars' Mine
Publication date: 01/01/2023

PCB/package stack-up design optimization is time-consuming and requires a great deal of experience. Although some iterative optimization algorithms have been applied to automate stack-up design, evaluating the results of each iteration is still time-intensive. This paper proposes a combined Bayesian optimization-artificial neural network (BO-ANN) algorithm, which uses a trained ANN-based surrogate model in place of a 2D cross-section analysis tool for fast PCB/package stack-up design optimization. With the acceleration provided by the ANN, the proposed BO-ANN algorithm can finish 100 iterations in 40 seconds while achieving the target characteristic impedance. To better generalize the BO-ANN algorithm, an effective dielectric calculation strategy is applied to multiple-dielectric stack-up optimization. The BO-ANN algorithm can output optimized stack-up designs with dielectric layers chosen from a pre-defined library, and the obtained designs are verified by a 2D solver.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

ASAP: Accurate semantic segmentation for real time performance

Authors: Kim Eon, Kim Junghwan, Lee Subin, Moon Byeongjun, Park Jaehyun, Yu Dabeen, Yu Yeonseung
Publication date: 03/10/2022

Feature fusion modules from the encoder and self-attention modules have been adopted in semantic segmentation. However, these modules are computationally costly and have operational limitations in real-time environments. In addition, segmentation performance is limited in autonomous driving environments with a lot of contextual information perpendicular to the road surface, such as people, buildings, and general objects. In this paper, we propose an efficient feature fusion method, Feature Fusion with Different Norms (FFDN), that utilizes the rich global context of multi-level scales, together with a vertical pooling module placed before self-attention that preserves most contextual information while reducing the complexity of global context encoding in the vertical direction. By doing this, we can handle the properties of representation in global space and reduce additional computational cost. In addition, we analyze low performance in challenging cases, including small and vertically featured objects. We achieve a mean intersection-over-union (mIoU) of 73.1 and 191 frames per second (FPS), which are comparable to state-of-the-art results on the Cityscapes test dataset.

Comment: 5 pages, 4 figures
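
For readers unfamiliar with the mIoU metric quoted above, here is a minimal sketch of how it is commonly computed from flat lists of predicted and ground-truth class labels. The function name and toy labels are hypothetical; benchmark implementations such as the Cityscapes evaluation scripts accumulate counts over the whole dataset and handle ignored labels, which this sketch does not.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with four pixels and two classes.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU = {score:.4f}")
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12; papers usually report this value as a percentage.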

arXiv.org e-Print Archive

SS-IL: Separated Softmax for Incremental Learning

Authors: Ahn Hongjoon, Bang Hyeonsu, Kim Hyojun, Kwak Jihwan, Lim Subin, Moon Taesup
Publication date: 01/12/2020

We consider the class incremental learning (CIL) problem, in which a learning agent continuously learns new classes from incrementally arriving training data batches and aims to predict well on all the classes learned so far. The main challenge of the problem is catastrophic forgetting, and for exemplar-memory based CIL methods, it is generally known that the forgetting is commonly caused by the prediction score bias that is injected due to the data imbalance between the new classes and the old classes (in the exemplar memory). While several methods have been proposed to correct such score bias by some additional post-processing, e.g., score re-scaling or balanced fine-tuning, no systematic analysis of the root cause of such bias has been done. To that end, we show through analysis that computing the softmax probabilities by combining the output scores for all old and new classes could be the main source of the bias and propose a new CIL method, Separated Softmax for Incremental Learning (SS-IL). Our SS-IL consists of a separated softmax (SS) output layer and ratio-preserving (RP) mini-batches combined with task-wise knowledge distillation (TKD), and through extensive experimental results, we show that SS-IL achieves very strong state-of-the-art accuracy on several large-scale benchmarks. We also show that SS-IL makes much more balanced predictions, without any additional post-processing steps as is done in other baselines.
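
The contrast between a joint softmax and a separated softmax can be sketched as follows. This is only an illustration of normalizing old-class and new-class scores independently; it is not the SS-IL training procedure (which also involves RP mini-batches and TKD), and the function names, the `num_old` parameter, and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def separated_softmax(logits, num_old):
    """Normalize old-class and new-class logits independently.

    `num_old` is the number of leading entries that belong to
    previously learned classes (hypothetical convention).
    """
    return softmax(logits[:num_old]), softmax(logits[num_old:])


# Toy logits: two old classes followed by two new classes whose scores
# are inflated, mimicking the bias caused by data imbalance.
logits = [1.0, 0.5, 4.0, 3.5]
joint = softmax(logits)
old_probs, new_probs = separated_softmax(logits, num_old=2)

print("joint probability mass on old classes:", round(sum(joint[:2]), 3))
print("separated old-class probabilities:", [round(p, 3) for p in old_probs])
```

Under the joint softmax the inflated new-class scores absorb almost all probability mass, whereas the separated version keeps the old-class distribution unaffected by the new-class scores.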

arXiv.org e-Print Archive

Collaborative Score Distillation for Consistent Visual Synthesis

Authors: Choi June Suk, Jeong Jongheon, Kim Subin, Lee Kyungmin, Shin Jinwoo, Sohn Kihyuk
Publication date: 04/07/2023

Generative priors of large-scale text-to-image diffusion models enable a wide range of new generation and editing applications on diverse visual modalities. However, when adapting these priors to complex visual modalities, often represented as multiple images (e.g., video), achieving consistency across a set of images is challenging. In this paper, we address this challenge with a novel method, Collaborative Score Distillation (CSD). CSD is based on the Stein Variational Gradient Descent (SVGD). Specifically, we propose to consider multiple samples as "particles" in the SVGD update and combine their score functions to distill generative priors over a set of images synchronously. Thus, CSD facilitates seamless integration of information across 2D images, leading to a consistent visual synthesis across multiple samples. We show the effectiveness of CSD in a variety of tasks, encompassing the visual editing of panorama images, videos, and 3D scenes. Our results underline the competency of CSD as a versatile method for enhancing inter-sample consistency, thereby broadening the applicability of text-to-image diffusion models.

Comment: Project page with visuals: https://subin-kim-cv.github.io/CSD
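
The SVGD update that CSD builds on can be sketched on a one-dimensional toy problem. This is plain SVGD with an RBF kernel and a standard normal target, not CSD itself: in CSD the score functions come from a text-to-image diffusion model and the particles are images. The function names, bandwidth, step size, and starting points below are all hypothetical.

```python
import math


def svgd_step(particles, score, bandwidth=1.0, step_size=0.1):
    """One SVGD update for scalar particles with an RBF kernel."""
    n = len(particles)
    updated = []
    for x_i in particles:
        phi = 0.0
        for x_j in particles:
            k = math.exp(-((x_j - x_i) ** 2) / (2.0 * bandwidth ** 2))
            grad_k = -(x_j - x_i) / bandwidth ** 2 * k  # derivative of k w.r.t. x_j
            # Kernel-weighted score (attraction) plus kernel gradient (repulsion).
            phi += k * score(x_j) + grad_k
        updated.append(x_i + step_size * phi / n)
    return updated


def gaussian_score(x):
    """Score (gradient of the log density) of a standard normal target."""
    return -x


# Particles start far from the target and are moved jointly.
particles = [3.0, 3.5, 4.0, 4.5, 5.0]
for _ in range(2000):
    particles = svgd_step(particles, gaussian_score)

print([round(p, 2) for p in particles])
```

Each particle's update mixes in the scores of all other particles through the kernel, which is the coupling that CSD exploits for consistency, while the kernel-gradient term keeps the particles from collapsing onto a single point.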

arXiv.org e-Print Archive