
84,465 research outputs found

Intelligent Data Acquisition for Predictive Modeling in Manufacturing Systems

Author: 심재웅
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2021

Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Industrial Engineering, College of Engineering, 2021. 2. 조성준.

Predictive modeling is a type of supervised learning that finds the functional relationship between input variables and an output variable. It is used throughout manufacturing systems, for example to automate visual inspection, predict faulty products, and estimate the results of expensive inspections. Building a high-performance predictive model requires high-quality data. In manufacturing systems, however, it is practically impossible to acquire enough data of every kind that predictive modeling needs. Data acquisition in manufacturing systems faces three main difficulties. First, labeled data always comes at a cost: in many problems, labeling must be done by experienced engineers, which is expensive. Second, because of inspection cost, not every inspection can be performed on every product; time and monetary constraints make it impossible to obtain all the desired inspection results. Third, changes in the manufacturing environment shift the distribution of the generated data, so that enough consistent data cannot be obtained, and the model often has to be retrained with only a small amount of data. In this dissertation, we overcome these difficulties through active learning, active feature-value acquisition, and domain adaptation. First, we propose an active learning framework to address the high labeling cost of wafer map pattern classification, which makes it possible to achieve higher performance at a lower labeling cost. Cost efficiency is further improved by incorporating cluster-level annotation into active learning. To address the inspection cost in fault prediction, we propose an active inspection framework: selecting the products that undergo high-cost inspection with a novel uncertainty estimation method yields high performance at a low inspection cost. To address the recipe transition problem that frequently occurs in faulty wafer prediction in semiconductor manufacturing, domain adaptation methods are used: applying unsupervised domain adaptation and then semi-supervised domain adaptation minimizes the performance degradation caused by recipe transition. Experiments on real-world data demonstrate that the proposed methodologies can overcome the data acquisition problems in manufacturing systems and improve the performance of predictive models.

1. Introduction
2. Literature Review
  2.1 Review of Related Methodologies
    2.1.1 Active Learning
    2.1.2 Active Feature-value Acquisition
    2.1.3 Domain Adaptation
  2.2 Review of Predictive Modelings in Manufacturing
    2.2.1 Wafer Map Pattern Classification
    2.2.2 Fault Detection and Classification
3. Active Learning for Wafer Map Pattern Classification
  3.1 Problem Description
  3.2 Proposed Method
    3.2.1 System overview
    3.2.2 Prediction model
    3.2.3 Uncertainty estimation
    3.2.4 Query wafer selection
    3.2.5 Query wafer labeling
    3.2.6 Model update
  3.3 Experiments
    3.3.1 Data description
    3.3.2 Experimental design
    3.3.3 Results and discussion
4. Active Cluster Annotation for Wafer Map Pattern Classification
  4.1 Problem Description
  4.2 Proposed Method
    4.2.1 Clustering of unlabeled data
    4.2.2 CNN training with labeled data
    4.2.3 Cluster-level uncertainty estimation
    4.2.4 Query cluster selection
    4.2.5 Cluster-level annotation
  4.3 Experiments
    4.3.1 Data description
    4.3.2 Experimental setting
    4.3.3 Clustering results
    4.3.4 Classification performance
    4.3.5 Analysis for label noise
5. Active Inspection for Fault Prediction
  5.1 Problem Description
  5.2 Proposed Method
    5.2.1 Active inspection framework
    5.2.2 Acquisition based on Expected Prediction Change
  5.3 Experiments
    5.3.1 Data description
    5.3.2 Fault prediction models
    5.3.3 Experimental design
    5.3.4 Results and discussion
6. Adaptive Fault Detection for Recipe Transition
  6.1 Problem Description
  6.2 Proposed Method
    6.2.1 Overview
    6.2.2 Unsupervised adaptation phase
    6.2.3 Semi-supervised adaptation phase
  6.3 Experiments
    6.3.1 Data description
    6.3.2 Experimental setting
    6.3.3 Performance degradation caused by recipe transition
    6.3.4 Effect of unsupervised adaptation
    6.3.5 Effect of semi-supervised adaptation
7. Conclusion
  7.1 Contributions
  7.2 Future work
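The query-selection step that this abstract describes for active learning (estimate uncertainty on unlabeled wafer maps, then send the most uncertain ones for labeling) can be illustrated with a minimal sketch. It assumes predictive entropy as the uncertainty score, which is a generic stand-in and not the dissertation's own uncertainty estimation method; the function names `entropy` and `select_queries` and the probability values are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pred_probs, budget):
    """Return the indices of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(
        range(len(pred_probs)),
        key=lambda i: entropy(pred_probs[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical class probabilities for four unlabeled samples, three classes.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform: most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

queries = select_queries(pool, budget=2)
print(queries)
```

In a full loop, the selected samples would be labeled by an engineer, added to the training set, and the model retrained before the next round of selection.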

SNU Open Repository and Archive

A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

Author: Hassanzadeh Hamed
Keyvanpour MohammadReza
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 26/04/2011

The Semantic Web is an extension of the current web in which information is given well-defined meaning. Its aim is to improve the quality and intelligence of the current web by converting its contents into machine-understandable form. Semantic-level information is therefore one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. Semantic Annotation faces many obstacles, such as multilinguality, scalability, and the diversity and inconsistency of content across web pages. Because Semantic Annotation systems must operate across a wide range of domains and in dynamic environments, automating the annotation process is one of the significant challenges in this field. To address this problem, different machine learning approaches have been used, including supervised learning, unsupervised learning, and more recent ones such as semi-supervised learning and active learning. In this paper we present an inclusive layered classification of Semantic Annotation challenges and discuss the most important issues in this field. We also review and analyze machine learning applications for solving semantic annotation problems. To this end, the article studies and categorizes related research closely, with the goal of reaching a framework that maps machine learning techniques onto Semantic Annotation challenges and requirements.

arXiv.org e-Print Archive

Crossref

Challenges and solutions for Latin named entity recognition

Author: Ajaka Petra
Brown Christopher
de Marneffe Marie-Catherine
Elsner Micha
Erdmann Alex
Janse Mark
Joseph Brian D.
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

Although it spans thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of Latin texts is still relatively sparse compared to English. Data sparsity in Latin presents a number of challenges for traditional Named Entity Recognition techniques. Solving such challenges and enabling reliable Named Entity Recognition in Latin texts can facilitate many downstream applications, from machine translation to digital historiography, enabling Classicists, historians, and archaeologists, for instance, to track the relationships of historical persons, places, and groups on a large scale. This paper presents the first annotated corpus for evaluating Named Entity Recognition in Latin, as well as a fully supervised model that achieves over 90% F-score on a held-out test set, significantly outperforming a competitive baseline. We also present a novel active learning strategy that predicts how many and which sentences need to be annotated for named entities in order to attain a specified degree of accuracy when recognizing named entities automatically in a given text. This maximizes the productivity of annotators while simultaneously controlling quality.

Ghent University Academic Bibliography

Global stellar variability study in the field-of-view of the Kepler satellite

Author: Aerts
Auvergne
Blomme
Borucki
C. Aerts
Cuypers
Debosscher
Gilliland
Handler
J. Blomme
J. De Ridder
J. Debosscher
Kolenberg
Mislis
Prša
Rieke
Rodríguez
Sarro
Publication venue: 'EDP Sciences'
Publication date: 01/01/2011

We present the results of an automated variability analysis of the Kepler public data measured in the first quarter (Q1) of the mission. In total, about 150 000 light curves have been analysed to detect stellar variability, and to identify new members of known variability classes. We also focus on the detection of variables present in eclipsing binary systems, given the important constraints on stellar fundamental parameters they can provide. The methodology we use here is based on the automated variability classification pipeline which was previously developed for and applied successfully to the CoRoT exofield database and to the limited subset of a few thousand Kepler asteroseismology light curves. We use a Fourier decomposition of the light curves to describe their variability behaviour and use the resulting parameters to perform a supervised classification. Several improvements have been made, including a separate extractor method to detect the presence of eclipses when other variability is present in the light curves. We also included two new variability classes compared to previous work: variables showing signs of rotational modulation and of activity. Statistics are given on the number of variables and the number of good candidates per class. A comparison is made with results obtained for the CoRoT exoplanet data. We present some special discoveries, including variable stars in eclipsing binary systems. Many new candidate non-radial pulsators are found, mainly Delta Sct and Gamma Dor stars. We have studied those samples in more detail by using 2MASS colours. The full classification results are made available as an online catalogue.

Comment: 15 pages, 5 figures, Accepted for publication in Astronomy and Astrophysics on 09/02/201
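The Fourier decomposition mentioned in this abstract turns a light curve into a small set of harmonic parameters that a classifier can use as features. The sketch below shows the idea under strong simplifying assumptions that are not taken from the paper: the dominant frequency is already known, the samples are evenly spaced, and they cover a whole number of cycles, so the harmonic amplitudes can be obtained by direct projection. Real light curves are unevenly sampled and would typically need a least-squares fit instead. The function name `harmonic_amplitudes` and the synthetic data are hypothetical.

```python
import math


def harmonic_amplitudes(times, flux, freq, n_harmonics):
    """Amplitudes of the first `n_harmonics` harmonics of `freq`.

    Assumes evenly spaced samples covering a whole number of cycles,
    so sines and cosines of different harmonics are orthogonal.
    """
    n = len(times)
    amps = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * freq
        a = 2.0 / n * sum(y * math.cos(w * t) for t, y in zip(times, flux))
        b = 2.0 / n * sum(y * math.sin(w * t) for t, y in zip(times, flux))
        amps.append(math.hypot(a, b))
    return amps


# Synthetic light curve: mean flux 1.0, fundamental amplitude 0.5,
# second-harmonic amplitude 0.2, sampled over exactly four cycles.
n = 200
times = [4.0 * i / n for i in range(n)]
flux = [
    1.0 + 0.5 * math.sin(2.0 * math.pi * t) + 0.2 * math.cos(4.0 * math.pi * t)
    for t in times
]

amps = harmonic_amplitudes(times, flux, freq=1.0, n_harmonics=3)
print([round(a, 6) for a in amps])
```

The resulting amplitudes (and, in a fuller version, the phases and frequencies) would form the feature vector passed to a supervised classifier.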

arXiv.org e-Print Archive

Lirias

Crossref

EDP Sciences OAI-PMH repository (1.2.0)