9 research outputs found

    Leveraging Task Structures for Improved Identifiability in Neural Network Representations

    Full text link
    This work extends the theory of identifiability in supervised learning by considering the consequences of having access to a distribution of tasks. In such cases, we show that identifiability is achievable even in the case of regression, extending prior work restricted to linear identifiability in the single-task classification case. Furthermore, we show that the existence of a task distribution that defines a conditional prior over latent factors reduces the equivalence class for identifiability to permutations and scaling, a much stronger and more useful result than linear identifiability. When we further assume a causal structure over these tasks, our approach enables simple maximum marginal likelihood optimization together with downstream applicability to causal representation learning. Empirically, we validate that our model outperforms more general unsupervised models in recovering canonical representations for both synthetic and real-world molecular data. Comment: 18 pages, 4 figures, 5 tables, 1 algorithm
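
    A minimal sketch of the strengthened equivalence class described above, written in notation assumed here rather than taken from the paper: linear identifiability only guarantees that two encoders f and g consistent with the data agree up to an invertible linear map, whereas identifiability up to permutations and scaling tightens that map to a permutation composed with a diagonal scaling.

        % Illustrative LaTeX; symbols f, g, A, P, D are assumptions, not the paper's notation.
        g(x) = A\,f(x), \quad A \in \mathrm{GL}(d) \qquad \text{(linear identifiability)}
        g(x) = P\,D\,f(x), \quad P \text{ a permutation matrix},\; D \text{ diagonal, invertible} \qquad \text{(permutation and scaling)}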

    Adversarial Infidelity Learning for Model Interpretation

    Full text link
    Model interpretation is essential in data mining and knowledge discovery. It can help understand the intrinsic working mechanism of a model and check whether the model has undesired characteristics. A popular way of performing model interpretation is Instance-wise Feature Selection (IFS), which provides an importance score for each feature of a data sample to explain how the model generates its specific output. In this paper, we propose a Model-agnostic Effective Efficient Direct (MEED) IFS framework for model interpretation, mitigating concerns about sanity, combinatorial shortcuts, model identifiability, and information transmission. We also focus on the following setting: using the selected features to directly predict the output of the given model, which serves as a primary evaluation metric for model-interpretation methods. Apart from the features, we use the output of the given model as an additional input so that the explainer is learned from more accurate information. To learn the explainer, besides a fidelity objective, we propose an Adversarial Infidelity Learning (AIL) mechanism that boosts explanation learning by screening out relatively unimportant features. Through theoretical and experimental analysis, we show that our AIL mechanism can help learn the desired conditional distribution between selected features and targets. Moreover, we extend our framework by integrating efficient interpretation methods as proper priors to provide a warm start. Comprehensive empirical evaluations with quantitative metrics and human evaluation demonstrate the effectiveness and superiority of our proposed method. Our code is publicly available online at https://github.com/langlrsw/MEED. Comment: 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '20), August 23--27, 2020, Virtual Event, USA
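
    A hedged sketch of the adversarial-infidelity idea: an explainer scores features from both the input and the given model's output, an approximator is trained for fidelity on the selected features, and an adversary trained on the screened-out features supplies the infidelity signal the explainer pushes against. Module names, losses, and the soft mask below are illustrative assumptions, not the authors' code (see the linked MEED repository for the real implementation).

        # Hedged sketch, not the authors' implementation; sparsity constraints omitted.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Explainer(nn.Module):
            """Scores each feature; the black-box model's output is an extra input."""
            def __init__(self, d):
                super().__init__()
                self.scorer = nn.Sequential(nn.Linear(d + 1, d), nn.Sigmoid())

            def forward(self, x, y_model):
                return self.scorer(torch.cat([x, y_model], dim=-1))  # soft mask in [0, 1]

        d = 16
        explainer = Explainer(d)
        approximator = nn.Linear(d, 1)  # predicts y_model from selected features
        adversary = nn.Linear(d, 1)     # predicts y_model from screened-out features

        x = torch.randn(8, d)
        y_model = torch.randn(8, 1)     # stand-in for the given model's output

        mask = explainer(x, y_model)
        fidelity = F.mse_loss(approximator(mask * x), y_model)
        infidelity = F.mse_loss(adversary((1 - mask) * x), y_model)

        # Explainer and approximator minimize fidelity while pushing the adversary's
        # error on unselected features up; the adversary minimizes that same error.
        explainer_loss = fidelity - infidelity
        adversary_loss = infidelity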

    Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization

    Full text link
    Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a few key desiderata that enable better parameter-efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. We apply this parameterization to OFT, creating a novel parameter-efficient finetuning method, called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in vision and language. Comment: Technical Report (33 pages, 18 figures)
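
    As a rough illustration of why butterfly structures save parameters, here is a generic FFT-style construction of an orthogonal matrix from Givens rotations: a dense d x d orthogonal matrix needs O(d^2) entries, while a product of log2(d) sparse butterfly factors needs only O(d log d) rotation angles. This is a sketch of the general idea under that assumption, not the exact BOFT parameterization, and the function names are made up for illustration.

        # Generic FFT-style butterfly sketch (illustrative, not the exact BOFT scheme).
        # Assumes d is a power of two.
        import numpy as np

        def butterfly_factor(d, stride, angles):
            """Orthogonal factor mixing index pairs (i, i + stride) with Givens rotations."""
            B = np.eye(d)
            for block in range(0, d, 2 * stride):
                for i in range(block, block + stride):
                    j = i + stride
                    c, s = np.cos(angles[i]), np.sin(angles[i])
                    B[i, i], B[i, j] = c, -s
                    B[j, i], B[j, j] = s, c
            return B

        def butterfly_orthogonal(d, rng):
            """Product of log2(d) sparse factors: O(d log d) angles instead of O(d^2) entries."""
            Q = np.eye(d)
            stride = 1
            while stride < d:
                Q = butterfly_factor(d, stride, rng.uniform(0, 2 * np.pi, size=d)) @ Q
                stride *= 2
            return Q

        Q = butterfly_orthogonal(8, np.random.default_rng(0))
        print(np.allclose(Q @ Q.T, np.eye(8)))  # True: the product stays orthogonal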

    Interpreting Deep Learning-Based Networking Systems

    Full text link
    While many deep learning (DL)-based networking systems have demonstrated superior performance, the underlying Deep Neural Networks (DNNs) remain black boxes and stay uninterpretable for network operators. The lack of interpretability makes DL-based networking systems prohibitive to deploy in practice. In this paper, we propose Metis, a framework that provides interpretability for two general categories of networking problems spanning local and global control. Accordingly, Metis introduces two different interpretation methods based on decision trees and hypergraphs: it converts DNN policies into interpretable rule-based controllers and highlights critical components based on analysis over a hypergraph. We evaluate Metis over several state-of-the-art DL-based networking systems and show that Metis provides human-readable interpretations with nearly no degradation in performance. We further present four concrete use cases of Metis, showcasing how Metis helps network operators design, debug, deploy, and make ad-hoc adjustments to DL-based networking systems. Comment: To appear at ACM SIGCOMM 2020
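
    The decision-tree side of converting a DNN policy into a rule-based controller can be illustrated with a generic teacher-student imitation recipe. The sketch below is only that generic recipe, not the Metis algorithm itself, and the synthetic policy and feature names (queue_len, rtt) are placeholders invented for illustration.

        # Generic imitation sketch, not Metis; policy and feature names are synthetic.
        import numpy as np
        from sklearn.tree import DecisionTreeClassifier, export_text

        def dnn_policy(states):
            """Stand-in for a trained DNN policy."""
            return (states[:, 0] + 0.5 * states[:, 1] > 1.0).astype(int)

        rng = np.random.default_rng(0)
        states = rng.uniform(0, 2, size=(5000, 2))   # hypothetical observed network states
        actions = dnn_policy(states)                 # actions the DNN would take

        # Fit a small tree on the DNN's own decisions, yielding a rule-based controller.
        tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)
        print("imitation accuracy:", tree.score(states, actions))
        print(export_text(tree, feature_names=["queue_len", "rtt"]))  # human-readable rules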

    Towards More Robust Interpretation via Local Gradient Alignment

    No full text
    Neural network interpretation methods, particularly feature attribution methods, are known to be fragile with respect to adversarial input perturbations. To address this, several methods that enhance the local smoothness of the gradient during training have been proposed for attaining robust feature attributions. However, prior work has not considered the normalization of attributions, which is essential for their visualization, and this has been an obstacle to understanding and improving the robustness of feature attribution methods. In this paper, we provide new insights by taking such normalization into account. First, we show that for every non-negative homogeneous neural network, a naive l2-robust criterion for gradients is not normalization invariant: two functions with the same normalized gradient can be assigned different criterion values. Second, we formulate a normalization-invariant cosine distance-based criterion and derive its upper bound, which gives insight into why simply minimizing the Hessian norm at the input, as done in previous work, is not sufficient for attaining robust feature attributions. Finally, we propose to combine the l2 and cosine distance-based criteria as regularization terms, leveraging the advantages of both in aligning the local gradient. Experimentally, we show that models trained with our method produce much more robust interpretations on CIFAR-10 and ImageNet-100 than recent baselines, without significantly hurting accuracy. To the best of our knowledge, this is the first work to verify the robustness of interpretation on a larger-scale dataset beyond CIFAR-10, thanks to the computational efficiency of our method.
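
    A hedged sketch of combining the two criteria as training-time regularizers: the l2 term penalizes the difference between input gradients at a point and at a nearby perturbed point, while the cosine term penalizes only the angle between them and is therefore normalization invariant. The function name, perturbation scheme, and weights below are illustrative assumptions, not the authors' exact objective.

        # Illustrative regularizer sketch (PyTorch), not the paper's exact training objective.
        import torch
        import torch.nn.functional as F

        def gradient_alignment_penalty(model, x, y, eps=0.01, lam_l2=1.0, lam_cos=1.0):
            """Penalize both the l2 gap and the angle between input gradients
            at x and at a nearby perturbed point."""
            x = x.clone().requires_grad_(True)
            x_adv = (x + eps * torch.randn_like(x)).detach().requires_grad_(True)

            def input_grad(inp):
                loss = F.cross_entropy(model(inp), y)
                return torch.autograd.grad(loss, inp, create_graph=True)[0].flatten(1)

            g, g_adv = input_grad(x), input_grad(x_adv)
            l2_term = (g - g_adv).pow(2).sum(dim=1).mean()
            cos_term = (1 - F.cosine_similarity(g, g_adv, dim=1)).mean()  # normalization invariant
            return lam_l2 * l2_term + lam_cos * cos_term

        # Usage: total_loss = task_loss + gradient_alignment_penalty(model, x, y)
        model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
        x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
        print(gradient_alignment_penalty(model, x, y).item())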