Search CORE

2,137 research outputs found

Adversarial Dictionary Learning Algorithm for Anomaly Detection

Author: 백종혁
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/08/2020

Thesis (Master's) -- Seoul National University Graduate School: Department of Mechanical Engineering, College of Engineering, August 2020. 박종우.

In this thesis, we propose a semi-supervised dictionary learning algorithm that learns representations of only non-outlier data. The presence of outliers in a dataset is a major drawback for dictionary learning, resulting in less than desirable performance in real-world applications. Our adversarial dictionary learning (ADL) algorithm exploits a supervision dataset composed of known outliers. The algorithm penalizes the dictionary for expressing the known outliers well. This penalty makes dictionary learning robust to the outliers present in the dataset. The proposed method can handle highly corrupted datasets that conventional robust dictionary learning algorithms cannot deal with effectively. We empirically show the usefulness of our algorithm with extensive experiments on anomaly detection, using both synthetic univariate time-series data and multivariate point data.

Contents:
1 Introduction
  1.1 Related Works
  1.2 Contributions of This Thesis
  1.3 Organization
2 Sparse Representation and Dictionary Learning
  2.1 Sparse Representation
    2.1.1 Problem Definition of Sparse Representation
    2.1.2 Sparse representation with l0-norm regularization
    2.1.3 Sparse representation with l1-norm regularization
    2.1.4 Sparse representation with lp-norm regularization (0 < p < 1)
  2.2 Dictionary Learning
    2.2.1 Problem Definition of Dictionary Learning
    2.2.2 Dictionary Learning Methods
3 Adversarial Dictionary Learning
  3.1 Problem Formulation
  3.2 Adversarial Loss
  3.3 Optimization Algorithm
4 Experiments
  4.1 Data Description
    4.1.1 Univariate Time-series Data
    4.1.2 Multivariate Point Data
  4.2 Evaluation Process
    4.2.1 A Baseline of Anomaly Detection
    4.2.2 ROC Curve and AUC
  4.3 Experiment Setting
  4.4 Results
5 Conclusion
Bibliography
Abstract in Korean

Maste

SNU Open Repository and Archive

Image Anomaly Detection and Localization with Position and Neighborhood Information

Author: Bae Jaehyeok
Kim Seyun
Lee Jae-Han
Publication date: 28/11/2022

Anomaly detection and localization are essential in many areas, where collecting enough anomalous samples for training is almost impossible. To overcome this difficulty, many existing methods use a pre-trained network to encode input images and non-parametric modeling to estimate the encoded feature distribution. In the modeling process, however, they overlook that position and neighborhood information affect the distribution of normal features. To use this information, in this paper, the normal distribution is estimated with conditional probability given neighborhood features, which is modeled with a multi-layer perceptron network. At the same time, positional information can be used by building a histogram of representative features at each position. While existing methods simply resize the anomaly map to the resolution of an input image, the proposed method uses an additional refine network that is trained on synthetic anomaly images to perform better interpolation considering the shape and edge of the input image. On the MVTec AD benchmark, a popular industrial dataset, the experimental results show 99.52% and 98.91% AUROC scores in anomaly detection and localization, which is state-of-the-art performance.

arXiv.org e-Print Archive

Kosmos-2: Grounding Multimodal Large Language Models to the World

Author: Dong Li
Hao Yaru
Huang Shaohan
Ma Shuming
Peng Zhiliang
Wang Wenhui
Wei Furu
Publication date: 27/06/2023

We introduce Kosmos-2, a Multimodal Large Language Model (MLLM), enabling new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual world. Specifically, we represent referring expressions as links in Markdown, i.e., "[text span](bounding boxes)", where object descriptions are sequences of location tokens. Together with multimodal corpora, we construct large-scale data of grounded image-text pairs (called GrIT) to train the model. In addition to the existing capabilities of MLLMs (e.g., perceiving general modalities, following instructions, and performing in-context learning), Kosmos-2 integrates the grounding capability into downstream applications. We evaluate Kosmos-2 on a wide range of tasks, including (i) multimodal grounding, such as referring expression comprehension and phrase grounding, (ii) multimodal referring, such as referring expression generation, (iii) perception-language tasks, and (iv) language understanding and generation. This work lays out the foundation for the development of Embodiment AI and sheds light on the big convergence of language, multimodal perception, action, and world modeling, which is a key step toward artificial general intelligence. Data, demo, and pretrained models are available at https://aka.ms/kosmos-2. Comment: 20 pages
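The Markdown-link representation described in this abstract can be illustrated with a minimal sketch. It only shows how a "[text span](bounding boxes)" string might be split into text spans and location-token sequences. The `<loc_N>` token spelling, the 32 x 32 grid, and all function names are assumptions made for illustration; they are not taken from the Kosmos-2 release.

```python
import re

# ASSUMPTION: location tokens are written "<loc_N>", where N indexes one cell
# of a GRID x GRID grid laid over the image in row-major order. Both the token
# spelling and the grid size are hypothetical.
GRID = 32
LINK_RE = re.compile(r"\[([^\]]+)\]\(((?:<loc_\d+>)+)\)")
TOKEN_RE = re.compile(r"<loc_(\d+)>")


def token_to_center(index, grid=GRID):
    """Map a grid-cell index to the normalized (x, y) center of that cell."""
    row, col = divmod(index, grid)
    return ((col + 0.5) / grid, (row + 0.5) / grid)


def parse_grounded_text(text):
    """Return (text span, [location indices]) for every Markdown-style link."""
    spans = []
    for match in LINK_RE.finditer(text):
        indices = [int(n) for n in TOKEN_RE.findall(match.group(2))]
        spans.append((match.group(1), indices))
    return spans


def strip_grounding(text):
    """Drop the location tokens and keep only the plain caption text."""
    return LINK_RE.sub(r"\1", text)


# Hypothetical grounded caption used only to exercise the helpers above.
example = "[a snowman](<loc_44><loc_863>) warming himself by [a fire](<loc_5><loc_911>)"
```

Calling `parse_grounded_text(example)` yields one entry per linked span, and `strip_grounding(example)` recovers the ungrounded caption, which is one reason a link-style encoding is convenient: the grounding can be removed without disturbing the surrounding text.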

arXiv.org e-Print Archive

Extensible Modeling and Simulation Framework (XMSF) Opportunities for Web-Based Modeling and Simulation

Author: Blais Curt
Brutzman Don
Fouskarinis Steven
Kapolka Andrzej
McGregor Don
Morse Katherine L.
Pullen Mark
Zyda Michael
Publication date: 14/06/2002

Technical Opportunities Workshop Whitepaper, 14 June 2002

Purpose: As the Department of Defense (DoD) is engaged in both warfighting and institutional transformation for the new millennium, DoD Modeling & Simulation (M&S) also needs to identify and adopt transformational technologies which provide direct tactical relevance to warfighters. Because the only software systems that composably scale to worldwide scope utilize the World Wide Web, it is evident that an extensible Web-based framework shows great promise to scale up the capabilities of M&S systems to meet the needs of training, analysis, acquisition, and the operational warfighter. By embracing commercial web technologies as a shared-communications platform and a ubiquitous-delivery framework, DoD M&S can fully leverage mainstream practices for enterprise-wide software development.

Calhoun, Institutional Archive of the Naval Postgraduate School

Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

Author: Farhadi Ali
Gupta Abhinav
Laptev Ivan
Sigurdsson Gunnar A.
Varol Gül
Wang Xiaolong
Publication date: 26/07/2016

Computer vision has great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to be trained from real and diverse examples of our daily dynamic scenes. While most of such scenes are not particularly exciting, they typically do not appear on YouTube, in movies or TV broadcasts. So how do we collect sufficiently many diverse but boring samples representing our lives? We propose a novel Hollywood in Homes approach to collect such data. Instead of shooting videos in the lab, we ensure diversity by distributing and crowdsourcing the whole process of video creation from script writing to video recording and annotation. Following this procedure we collect a new dataset, Charades, with hundreds of people recording videos in their own homes, acting out casual everyday activities. The dataset is composed of 9,848 annotated videos with an average length of 30 seconds, showing activities of 267 people from three continents. Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects. In total, Charades provides 27,847 video descriptions, 66,500 temporally localized intervals for 157 action classes and 41,104 labels for 46 object classes. Using this rich data, we evaluate and provide baseline results for several tasks including action recognition and automatic description generation. We believe that the realism, diversity, and casual nature of this dataset will present unique challenges and new opportunities for the computer vision community.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Understanding the Economic Consequences of Shifting Trends in Population Health

Author: A. Gailey
D. Goldman
D. Lakdawalla
P.-C. Michaud
Y. Zheng

The public economic burden of shifting trends in population health remains uncertain. Sustained increases in obesity, diabetes, and other diseases could reduce life expectancy, with a concomitant decrease in the public sector's annuity burden, but these savings may be offset by worsening functional status, which increases health care spending, reduces labor supply, and increases public assistance. Using a microsimulation approach, we quantify the competing public-finance consequences of shifting trends in population health for medical care costs, labor supply, earnings, wealth, tax revenues, and government expenditures (including Social Security and income assistance). Together, the reduction in smoking and the rise in obesity have increased net public-sector liabilities by $430bn, or approximately 4% of the current debt burden. Larger effects are observed for specific public programs: annual spending is 10% higher in the Medicaid program, and 7% higher for Medicare.

Keywords: disability, health care costs, social security, microsimulation

Research Papers in Economics