    Bandit Parameter Estimation

    Contextual bandits are useful algorithms for recommendation tasks in many applications, such as Netflix and Amazon Echo. Many algorithms have been studied and show good results in terms of high total reward or low regret. However, when a user wants a recommendation in a new task, these algorithms do not use the information learned in previous tasks. We propose a new topic, Bandit Parameter Estimation, to address this inefficiency. In the same setting as the contextual bandit, we treat the unknown parameter as the user's latent profile and propose several algorithms to estimate it as quickly as possible. We conducted experiments in two cases using a synthetic dataset to verify the proposed algorithms. The results show that our algorithms estimate the parameter faster than other contextual bandit algorithms. ⓒ 2017 DGIST
    Contents: Ⅰ. Introduction -- 1.1 Overview -- 1.2 Background -- 1.2.1 Multi-Armed bandit -- 1.2.2 K-armed (Linear) Contextual bandit -- 1.3 Related work -- 1.3.1 algorithm -- 1.3.2 UCB -- 1.3.3 LinUCB -- Ⅱ. Materials -- 2.1 Problem setting for Bandit Parameter Estimation -- 2.2 The uncertainty ellipsoid of _(*) -- 2.2.1 (. ) -- 2.2.2 ((Σ_(t))) -- 2.2.3 ((Σ_(t)^(-1))) -- 2.2.4 Max(Det(Σ_(t))) -- Ⅲ. Method -- 3.1 Generating synthetic data -- 3.2 The experiment process -- Ⅳ. Experimental result -- 4.1 The experiment case 1 : Various k, fixed d -- 4.2 The experiment case 2 : Various d, fixed k -- Ⅴ. Discussion -- 5.1 Conclusion -- 5.2 Future work -- Reference -- Summary (Korean)
    Summary (translated from Korean): Many applications now provide personalized recommendations. The algorithms mainly used for this take the form of the Contextual Bandit, which has already been studied extensively and shows good results. However, although these algorithms quickly produce suitable recommendations for a particular user within a single task, they cannot use the information learned in earlier tasks when recommendations are needed for a new task, and must learn again independently for each task, which is inefficient. Motivated by this, we posed a new problem, Bandit Parameter Estimation, for learning the user's profile in the same environment as the Contextual Bandit. For fast learning, we proposed several algorithms that shrink the uncertainty ellipsoid, and we confirmed on a dataset built for the experiments that the proposed algorithms perform parameter estimation faster than existing contextual bandit algorithms. As future work, we suggest validating these algorithms on real data and studying whether contextual bandit algorithms that additionally use the learned user profile provide good recommendations faster than they do without the profile.
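The outline above lists Max(Det(Σ_(t))) among the criteria for shrinking the uncertainty ellipsoid. The following is a minimal sketch of that idea, not the thesis's own implementation. It assumes that Σ_t denotes a ridge-regularized design matrix A = I + Σ x xᵀ, that rewards are linear in the context with Gaussian noise, and that contexts are drawn from a standard normal distribution. All function names and default parameters are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in M]


def select_arm(A_inv, contexts):
    """Index of the context x with the largest x^T A^{-1} x.

    By the matrix determinant lemma, det(A + x x^T) = det(A) * (1 + x^T A^{-1} x),
    so this arm maximizes the determinant of the updated design matrix.
    """
    scores = [dot(x, matvec(A_inv, x)) for x in contexts]
    return max(range(len(contexts)), key=lambda i: scores[i])


def rank_one_update(A_inv, x):
    """Sherman-Morrison update: inverse of (A + x x^T) given a symmetric A^{-1}."""
    u = matvec(A_inv, x)
    denom = 1.0 + dot(x, u)
    n = len(x)
    return [[A_inv[i][j] - u[i] * u[j] / denom for j in range(n)] for i in range(n)]


def estimate_profile(theta_star, k=10, steps=300, noise=0.1, seed=0):
    """Simulate the selection loop on synthetic data and return the ridge estimate."""
    rng = random.Random(seed)
    d = len(theta_star)
    # Start from A = I (assumed ridge term), so A^{-1} = I as well.
    A_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for _ in range(steps):
        # k synthetic contexts per round (assumed standard normal entries).
        contexts = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
        x = contexts[select_arm(A_inv, contexts)]
        reward = dot(x, theta_star) + rng.gauss(0.0, noise)  # assumed linear reward
        A_inv = rank_one_update(A_inv, x)
        b = [bi + reward * xi for bi, xi in zip(b, x)]
    return matvec(A_inv, b)  # theta_hat = A^{-1} b
```

Calling `estimate_profile([0.5, -1.0, 2.0])` returns an estimate of the three-dimensional profile. The selection rule ignores reward entirely, which is the contrast the abstract draws with regret-minimizing contextual bandit algorithms.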

    DoPAMINE: Double-sided Masked CNN for Pixel Adaptive Multiplicative Noise Despeckling

    We propose DoPAMINE, a new neural-network-based multiplicative noise despeckling algorithm. Our algorithm is inspired by Neural AIDE (N-AIDE), a recently proposed neural adaptive image denoiser. While the original N-AIDE was designed for the additive noise case, we show that the same framework, i.e., adaptively learning a network for pixel-wise affine denoisers by minimizing an unbiased estimate of MSE, can be applied to the multiplicative noise case as well. Moreover, we derive a double-sided masked CNN architecture that can control the variance of the activation values in each layer and converge quickly to high denoising performance during supervised training. In the experimental results, we show that DoPAMINE achieves high adaptivity by fine-tuning the network parameters on the given noisy image, and that it achieves significantly better despeckling results than SAR-DRN, a state-of-the-art CNN-based algorithm. Comment: AAAI 2019 Camera Ready Version

    Sy-CON: Symmetric Contrastive Loss for Continual Self-Supervised Representation Learning

    We introduce a novel and general loss function, called Symmetric Contrastive (Sy-CON) loss, for effective continual self-supervised learning (CSSL). We first argue that the conventional loss form of continual learning, which consists of a single task-specific loss (for plasticity) and a regularizer (for stability), may not be ideal for contrastive-loss-based CSSL that focuses on representation learning. Our reasoning is that, in contrastive-learning-based methods, the task-specific loss would suffer from decreasing diversity of negative samples, and the regularizer may hinder learning new distinctive representations. To that end, we propose Sy-CON, which consists of two losses (one for plasticity and the other for stability) with symmetric dependence on the current and past models' negative sample embeddings. We argue that our model can naturally find a good trade-off between plasticity and stability without any explicit hyperparameter tuning. We validate the effectiveness of our approach through extensive experiments, demonstrating that a MoCo-based implementation of the Sy-CON loss achieves superior performance compared to other state-of-the-art CSSL methods. Comment: Preprint

    Knowledge Unlearning for Mitigating Privacy Risks in Language Models

    Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both requiring re-training the underlying LM. We propose knowledge unlearning as an alternative method to reduce privacy risks for LMs post hoc. We show that simply applying the unlikelihood training objective to target token sequences is effective at forgetting them with little to no degradation of general language modeling performances; it sometimes even substantially improves the underlying LM with just a few iterations. We also find that sequential unlearning is better than trying to unlearn all the data at once and that unlearning is highly dependent on which kind of data (domain) is forgotten. By showing comparisons with a previous data preprocessing method known to mitigate privacy risks for LMs, we show that unlearning can give a stronger empirical privacy guarantee in scenarios where the data vulnerable to extraction attacks are known a priori while being orders of magnitude more computationally efficient. We release the code and dataset needed to replicate our results at https://github.com/joeljang/knowledge-unlearning
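The abstract names an unlikelihood training objective applied to target token sequences but does not give its formula. As a rough illustration, the sketch below uses one common form of unlikelihood, the sum of -log(1 - p_t) over the target tokens. That form is an assumption here, not a statement of the paper's exact loss. The function name and the list-of-probabilities input are hypothetical; in practice the probabilities would come from the language model.

```python
import math


def unlikelihood_loss(target_token_probs):
    """Sum of -log(1 - p_t) over the tokens of a sequence to be forgotten.

    target_token_probs[t] is the model's probability p(x_t | x_<t) for the
    t-th target token (hypothetical input; a real model would supply it).
    """
    eps = 1e-12  # guards against log(0) when a probability equals 1
    return -sum(math.log(max(1.0 - p, eps)) for p in target_token_probs)
```

Minimizing this quantity pushes the probability of each target token toward zero, so a sequence the model still predicts confidently yields a larger loss than one it has already forgotten.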

    Identification of the antibacterial action mechanism of diterpenoids through transcriptome profiling

    Effective antibacterial substances of Aralia continentalis have anti-biofilm and bactericidal activity against the oral pathogen Streptococcus mutans. In this study, three compounds extracted from A. continentalis were identified by NMR as acanthoic acid, continentalic acid, and kaurenoic acid, and we further investigated how these diterpenoids affect the physiology of S. mutans. When S. mutans was exposed to individual or mixed fractions of the diterpenoids, severe growth defects and unique morphology were observed. In the presence of diterpenoids, the proportion of unsaturated fatty acids in the cell membrane increased relative to that of saturated fatty acids. Genome-wide gene expression profiles obtained by RNA-seq were compared to reveal the mode of action of the diterpenoids. In the presence of the individual diterpenoids, the expression of 176 genes was commonly enhanced in S. mutans, whereas the expression of 232 genes was considerably reduced. The diterpenoid treatment modulated the expression of genes or operon(s) involved in cell membrane synthesis, cell division, and carbohydrate metabolism of S. mutans. Collectively, these findings provide novel insights into the antibacterial effect of diterpenoids for controlling S. mutans infection, which causes human dental caries.

    Phase Variation of Biofilm Formation in Staphylococcus aureus by IS256 Insertion and Its Impact on the Capacity Adhering to Polyurethane Surface

    While the ica genes of Staphylococcus epidermidis are known to undergo phase variation by insertion of IS256, the phenomenon has not been evaluated in Staphylococcus aureus. Six biofilm-positive strains were tested for the presence of biofilm-negative phase-variant strains by the Congo red agar test. For potential phase-variant strains, pulsed-field gel electrophoresis was performed to exclude the possibility of contamination. To investigate the mechanism of the biofilm-negative phase variation, PCR was performed for each ica gene. Changes in ica genes detected by PCR were confirmed by Southern hybridization, and their nucleotides were analyzed by DNA sequencing. The influence of ica genes and biofilm formation on the capacity for adherence to biomedical material was evaluated by comparing the ability to adhere to a polyurethane surface between a biofilm-negative phase-variant strain and its parent strain. One biofilm-negative phase-variant S. aureus strain was detected among the 6 strains tested. The icaC gene of the phase-variant strain was found to be inactivated by insertion of an additional gene segment, IS256. The biofilm-negative phase-variant strain showed lower adhering capacity to polyurethane than its parent strain. This study shows that phase variation of the ica genes by insertion of IS256 also occurs in S. aureus, and that this biofilm-negative phase variation reduces the adhering capacity of the bacteria.