Pemilihan Parameter Smoothing pada Probabilistic Neural Network dengan Menggunakan Particle Swarm Optimization untuk Pendeteksian Teks pada Citra

Author: Saputri E. E. (Endah)
Suhartono V. (Vincent)
Wahono R. S. (Romi)
Publication venue: IlmuKomputer.com
Publication date: 01/01/2015

Text is commonly found in many places, such as street names, shop names, banners, road signs, warnings, and so on. Text detection approaches fall into three groups: texture-based, edge-based, and Connected Component-based. The texture-based approach detects text well but requires a large amount of training data. A Probabilistic Neural Network (PNN) can address this problem. However, PNN has a problem of its own: choosing the value of the smoothing parameter, which is usually done by trial and error. Particle Swarm Optimization (PSO) is an optimization algorithm that can handle this problem in PNN. In this study, PNN is used within the texture-based approach to address that approach's main limitation, namely the large amount of training data it requires. In addition, PSO is used to determine the PNN smoothing parameter so that PNN-PSO achieves better accuracy than traditional PNN. Experimental results show that PNN detects text with 75.42% accuracy using only 300 training samples, and with 77.75% accuracy using 1500 training samples. PNN-PSO achieves 76.91% accuracy with 300 training samples and 77.89% with 1500 training samples. It can therefore be concluded that PNN detects text well even when little training data is used, and that it can overcome the limitation of the texture-based approach. PSO, in turn, can determine the value of the PNN smoothing parameter and yields better accuracy than traditional PNN, with an improvement of about 0.1% to 1.5%. Moreover, PSO can be used with PNN to determine the smoothing parameter automatically on different datasets.
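The idea in the abstract above, a PNN whose single smoothing parameter is tuned by PSO rather than by trial and error, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Gaussian kernel form, the PSO coefficients (0.7, 1.5, 1.5), the search range for sigma, and the toy two-class data are all assumptions chosen for the example.

```python
import math
import random


def pnn_predict(x, train, sigma):
    """Classify x with a Gaussian-kernel PNN; train is a list of (vector, label)."""
    sums, counts = {}, {}
    for vec, label in train:
        d2 = sum((a - b) ** 2 for a, b in zip(x, vec))
        sums[label] = sums.get(label, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    # Pick the class with the highest average kernel response.
    return max(sums, key=lambda c: sums[c] / counts[c])


def accuracy(train, valid, sigma):
    """Fraction of validation samples the PNN labels correctly for a given sigma."""
    hits = sum(pnn_predict(x, train, sigma) == y for x, y in valid)
    return hits / len(valid)


def pso_sigma(train, valid, lo=0.01, hi=5.0, particles=8, iters=20, seed=0):
    """Search sigma in [lo, hi] with a basic 1-D PSO (assumed coefficients)."""
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]
    best_fit = [accuracy(train, valid, p) for p in pos]
    g = max(range(particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g], best_fit[g]
    for _ in range(iters):
        for i in range(particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (0.7 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # keep sigma in range
            fit = accuracy(train, valid, pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
            if fit > g_fit:
                g_pos, g_fit = pos[i], fit
    return g_pos, g_fit


def make_blobs(n, rng):
    """Hypothetical toy data: two overlapping 2-D Gaussian classes."""
    data = []
    for _ in range(n):
        data.append(((rng.gauss(0, 1), rng.gauss(0, 1)), "background"))
        data.append(((rng.gauss(2, 1), rng.gauss(2, 1)), "text"))
    return data


data_rng = random.Random(1)
train, valid = make_blobs(30, data_rng), make_blobs(30, data_rng)
sigma, fit = pso_sigma(train, valid)
print("selected sigma:", sigma, "validation accuracy:", fit)
```

The fitness here is plain validation accuracy, so the search costs one full pass over the validation set per particle per iteration; with real texture features the same loop would apply, only with longer vectors.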

시계열 데이터 패턴 분석을 위한 종단 심층 학습망 설계 방법론

Author: 황보선
Publication venue: 서울대학교 대학원
Publication date: 01/02/2019

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 2. 장병탁.

Pattern recognition within time series data became an important avenue of research in artificial intelligence following the paradigm shift of the fourth industrial revolution. A number of related studies have been conducted over the past few years, and research using deep learning techniques is becoming increasingly popular. Because time series data are nonstationary, nonlinear and noisy, it is essential to design an appropriate model that extracts their significant features for pattern recognition. This dissertation not only discusses pattern recognition on physiological time series signals using various hand-crafted feature engineering techniques, but also proposes an end-to-end deep learning design methodology that requires no feature engineering. Time series signals can be classified into those with periodic and those with non-periodic characteristics in the time domain. This thesis proposes two end-to-end deep learning design methodologies, one for pattern recognition of periodic signals and one for non-periodic signals.

The first proposed deep learning design methodology is Deep ECGNet. Deep ECGNet offers a design scheme for an end-to-end deep learning model that uses the periodic characteristics of Electrocardiogram (ECG) signals. ECG, recorded from the electrophysiologic patterns of heart muscle during heartbeat, is a promising candidate biomarker for estimating event-based stress level. Conventionally, the beat-to-beat alternations of ECG, known as heart rate variability (HRV), have been used to monitor mental stress status as well as the mortality of cardiac patients. The standard HRV parameters have the disadvantage of requiring a measurement period of at least 5 minutes. In this thesis, human stress states were estimated by a deep learning model from only 10-second intervals of ECG data, without any special hand-crafted feature engineering. The design methodology incorporates the periodic characteristics of the ECG signal into the model: the main parameters of the 1D CNNs and RNNs, which reflect the periodic characteristics of ECG, were updated according to the stress states. The experimental results showed that the proposed method performed better than the existing HRV parameter extraction methods and spectrogram methods.

The second proposed methodology is an automatic end-to-end deep learning design methodology using Bayesian optimization for non-periodic signals. The Electroencephalogram (EEG) is elicited from the central nervous system (CNS) and reflects genuine emotional states, even at the unconscious level. Because of the low signal-to-noise ratio (SNR) of EEG signals, spectral analysis in the frequency domain has conventionally been applied in EEG studies. As a general methodology, EEG signals are filtered into several frequency bands using Fourier or wavelet analyses, and these band features are then fed into a classifier, usually a shallow machine learning classifier. This thesis proposes an automatic end-to-end deep learning design method that uses optimization techniques and does not need this basic feature engineering. Bayesian optimization is a popular technique in machine learning for optimizing model hyperparameters, and it is well suited to optimization problems whose objective is an expensive black box function. In this thesis, we propose a method that optimizes both the hyperparameters and the structure of the whole model, using 1D CNNs and RNNs as the basic deep learning models together with Bayesian optimization. On this basis, the thesis proposes the Deep EEGNet model as a method for discriminating human emotional states from EEG signals. Experimental results showed that the proposed method performed better than the conventional method based on band power features.
In conclusion, this thesis has proposed several methodologies for time series pattern recognition problems, ranging from conventional methods based on feature engineering to end-to-end deep learning design methodologies that use only raw time series signals. Experimental results showed that the proposed methodologies can be effectively applied to pattern recognition problems using time series data.

Table of contents:
Chapter 1 Introduction
  1.1 Pattern Recognition in Time Series
  1.2 Major Problems in Conventional Approaches
  1.3 The Proposed Approach and its Contribution
  1.4 Thesis Organization
Chapter 2 Related Works
  2.1 Pattern Recognition in Time Series using Conventional Methods
    2.1.1 Time Domain Features
    2.1.2 Frequency Domain Features
    2.1.3 Signal Processing based on Multi-variate Empirical Mode Decomposition (MEMD)
    2.1.4 Statistical Time Series Model (ARIMA)
  2.2 Fundamental Deep Learning Algorithms
    2.2.1 Convolutional Neural Networks (CNNs)
    2.2.2 Recurrent Neural Networks (RNNs)
  2.3 Hyper Parameters and Structural Optimization Techniques
    2.3.1 Grid and Random Search Algorithms
    2.3.2 Bayesian Optimization
    2.3.3 Neural Architecture Search
  2.4 Research Trends related to Time Series Data
    2.4.1 Generative Model of Raw Audio Waveform
Chapter 3 Preliminary Researches: Pattern Recognition in Time Series using Various Feature Extraction Methods
  3.1 Conventional Methods using Time and Frequency Features: Motor Imagery Brain Response Classification
    3.1.1 Introduction
    3.1.2 Methods
    3.1.3 Ensemble Classification Method (Stacking & AdaBoost)
    3.1.4 Sensitivity Analysis
    3.1.5 Classification Results
  3.2 Statistical Feature Extraction Methods: ARIMA Model Based Feature Extraction Methodology
    3.2.1 Introduction
    3.2.2 ARIMA Model
    3.2.3 Signal Processing
    3.2.4 ARIMA Model Conformance Test
    3.2.5 Experimental Results
    3.2.6 Summary
  3.3 Application on Specific Time Series Data: Human Stress States Recognition using Ultra-Short-Term ECG Spectral Feature
    3.3.1 Introduction
    3.3.2 Experiments
    3.3.3 Classification Methods
    3.3.4 Experimental Results
    3.3.5 Summary
Chapter 4 Master Framework for Pattern Recognition in Time Series
  4.1 The Concept of the Proposed Framework for Pattern Recognition in Time Series
    4.1.1 Optimal Basic Deep Learning Models for the Proposed Framework
  4.2 Two Categories for Pattern Recognition in Time Series Data
    4.2.1 The Proposed Deep Learning Framework for Periodic Time Series Signals
    4.2.2 The Proposed Deep Learning Framework for Non-periodic Time Series Signals
  4.3 Expanded Models of the Proposed Master Framework for Pattern Recognition in Time Series
Chapter 5 Deep Learning Model Design Methodology for Periodic Signals using Prior Knowledge: Deep ECGNet
  5.1 Introduction
  5.2 Materials and Methods
    5.2.1 Subjects and Data Acquisition
    5.2.2 Conventional ECG Analysis Methods
    5.2.3 The Initial Setup of the Deep Learning Architecture
    5.2.4 The Deep ECGNet
  5.3 Experimental Results
  5.4 Summary
Chapter 6 Deep Learning Model Design Methodology for Non-periodic Time Series Signals using Optimization Techniques: Deep EEGNet
  6.1 Introduction
  6.2 Materials and Methods
    6.2.1 Subjects and Data Acquisition
    6.2.2 Conventional EEG Analysis Methods
    6.2.3 Basic Deep Learning Units and Optimization Technique
    6.2.4 Optimization for Deep EEGNet
    6.2.5 Deep EEGNet Architectures using the EEG Channel Grouping Scheme
  6.3 Experimental Results
  6.4 Summary
Chapter 7 Concluding Remarks
  7.1 Summary of Thesis and Contributions
  7.2 Limitations of the Proposed Methods
  7.3 Suggestions for Future Works
Bibliography
Abstract (in Korean)
Docto
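The conventional EEG baseline that this thesis record describes (splitting a signal into frequency bands by Fourier analysis and passing the band features to a classifier) can be illustrated with a short band-power sketch. It is an assumption-laden example, not the thesis code: the band edges for theta, alpha and beta, the 128 Hz sampling rate, the power scaling, and the synthetic 10 Hz test signal are all chosen here for illustration.

```python
import cmath
import math


def band_powers(signal, fs, bands):
    """Sum DFT power (|X_k|^2 / n) over the bins that fall in each band.

    signal: list of samples, fs: sampling rate in Hz,
    bands: dict mapping a band name to a (low_hz, high_hz) pair.
    """
    n = len(signal)
    powers = {name: 0.0 for name in bands}
    for k in range(1, n // 2):  # skip DC, stay below the Nyquist bin
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        p = abs(coeff) ** 2 / n
        for name, (lo, hi) in bands.items():
            if lo <= freq < hi:
                powers[name] += p
    return powers


# Assumed band edges in Hz; conventions differ between studies.
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

fs = 128  # assumed sampling rate
one_second = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
powers = band_powers(one_second, fs, BANDS)
print(powers)
```

The resulting dictionary is the kind of hand-crafted feature vector that the end-to-end approach in the thesis is designed to make unnecessary. The direct DFT is quadratic in the signal length, which is acceptable for a short window like this one; an FFT would be the usual choice for longer recordings.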

IRIM at TRECVID 2013: Semantic Indexing and Instance Search

Author: Ayache Stéphane
Ballas Nicolas
Benois-Pineau Jenny
Benoît Alexandre
Bichot Charles-Edmond
Chen Liming
Dellandrea Emmanuel
Derbas Nadia
Dong Han
Gao Boyang
Gosselin Philippe
Hamadi Abdelkader
Labbé Benjamin
Lambert Patrick
Le Borgne Hervé
Mansencal Boris
Merialdo Bernard
Quénot Georges
Redi Miriam
Safadi Bahjat
Strat Tiberius
Tang Yuxing
Vieux Rémi
Vuong Thi-Thu-Thuy
Zhu Chao
Publication venue: HAL CCSD
Publication date: 01/01/2013

International audience. The IRIM group is a consortium of French teams working on multimedia indexing and retrieval. This paper describes its participation in the TRECVID 2013 semantic indexing and instance search tasks. For the semantic indexing task, our approach uses a six-stage processing pipeline that computes scores for the likelihood that a video shot contains a target concept. These scores are then used to produce a ranked list of the images or shots most likely to contain the target concept. The pipeline is composed of the following steps: descriptor extraction, descriptor optimization, classification, fusion of descriptor variants, higher-level fusion, and re-ranking. We evaluated a number of different descriptors and tried different fusion strategies. The best IRIM run has a Mean Inferred Average Precision of 0.2796, which ranked us 4th out of 26 participants.

Hal - Université Grenoble Alpes

HAL AMU

Hal-Diderot
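The fusion and ranking stages named in the IRIM abstract can be illustrated with a small late-fusion sketch. The abstract does not say which fusion rule was used, so the min-max normalization and weighted averaging below are assumptions, and the descriptor names, shot identifiers and scores are made up for the example.

```python
def min_max(scores):
    """Rescale a {shot: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}


def fuse_and_rank(per_descriptor, weights=None):
    """Weighted average of normalized per-descriptor scores, best shot first."""
    names = list(per_descriptor)
    weights = weights or {n: 1.0 for n in names}
    total = sum(weights[n] for n in names)
    fused = {}
    for n in names:
        for shot, s in min_max(per_descriptor[n]).items():
            fused[shot] = fused.get(shot, 0.0) + weights[n] * s / total
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical classifier scores for one target concept from two descriptors.
scores = {
    "color": {"shot1": 0.2, "shot2": 0.9, "shot3": 0.5},
    "sift": {"shot1": 10.0, "shot2": 30.0, "shot3": 40.0},
}
ranking = fuse_and_rank(scores)
print(ranking)
```

Normalizing before averaging matters because the two descriptors score on different scales; without it the descriptor with the larger raw range would dominate the fused ranking.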

Contextual Bag-Of-Visual-Words and ECOC-Rank for Retrieval and Multi-class Object Recognition

Author: Mirza-Mohammadi Mehdi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2009

Master's final project at UPC, carried out in collaboration with the Dept. of Applied Mathematics and Analysis, Universitat de Barcelona.

Multi-class object categorization is an important line of research in the fields of Computer Vision and Pattern Recognition. An artificial intelligence system is able to interact with its environment if it can distinguish among a set of cases, instances, situations, objects, etc. The world is inherently multi-class, and thus the efficiency of a system can be determined by its accuracy in discriminating among a set of cases. A procedure recently applied in the literature is the Bag-Of-Visual-Words (BOVW). This methodology is based on natural language processing theory, where a set of sentences is defined based on word frequencies. Analogously, in the pattern recognition domain, an object is described based on the frequency with which its parts appear. However, a general drawback of this method is that the dictionary construction does not take into account geometrical information about object parts. In order to include the relations between parts in the BOVW model, we propose the Contextual BOVW (C-BOVW), where the dictionary construction is guided by a geometrically-based merging procedure. As a result, objects are described as sentences in which geometrical information is implicitly considered. In order to extend the proposed system to the multi-class case, we used the Error-Correcting Output Codes (ECOC) framework. State-of-the-art multi-class techniques are frequently defined as an ensemble of binary classifiers. In this sense, the ECOC framework, based on error-correcting principles, has proven to be a powerful tool, able to handle a large number of classes while correcting classification errors produced by the individual learners. In our case, the C-BOVW sentences are learnt by means of an ECOC configuration, obtaining high discriminative power. Moreover, we used the ECOC outputs obtained by the new methodology to rank classes.

In some situations, more than one label is required in order to work with multiple hypotheses and find similar cases, as in the well-known retrieval problems. In this sense, we also included contextual and semantic information to modify the ECOC outputs and defined an ECOC-rank methodology. By altering the ECOC output values using the adjacency of classes based on features and the relations between classes based on ontologies, we also report a significant improvement in class-retrieval problems.
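The ECOC decoding and class ranking mentioned in the abstract above can be sketched in a few lines. This shows only plain Hamming-distance decoding; the contextual and ontology-based adjustment of the outputs that defines the author's ECOC-rank method is not reproduced. The codebook, class names and classifier outputs are hypothetical.

```python
def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_rank(outputs, codebook):
    """Rank classes from closest to farthest codeword for the given outputs."""
    return sorted(codebook, key=lambda c: hamming(outputs, codebook[c]))


# Hypothetical 7-bit codewords; every pair of rows differs in 4 positions,
# so a single wrong binary classifier can still be corrected.
codebook = {
    "car":  [1, 1, 1, 1, 1, 1, 1],
    "bike": [0, 0, 0, 0, 1, 1, 1],
    "dog":  [0, 0, 1, 1, 0, 0, 1],
    "cat":  [0, 1, 0, 1, 0, 1, 0],
}

# Outputs of the seven binary classifiers for one image: the "dog" codeword
# with its last bit flipped, simulating one classifier error.
outputs = [0, 0, 1, 1, 0, 0, 0]
ranking = ecoc_rank(outputs, codebook)
print(ranking)
```

Taking the first entry of the ranking gives the usual single-label ECOC decision, while the full ordering is what a retrieval setting would use to return several candidate classes.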