6 research outputs found
What does fault tolerant Deep Learning need from MPI?
Deep Learning (DL) algorithms have become the de facto Machine Learning (ML)
approach for large-scale data analysis. DL algorithms are computationally
expensive - even distributed DL implementations that use MPI require days of
training (model learning) time on commonly studied datasets. Long-running DL
applications therefore become susceptible to faults, requiring a fault
tolerant system infrastructure in addition to fault tolerant DL algorithms.
This raises an important question: what is needed from MPI for designing
fault tolerant DL implementations? In this paper, we address this problem for
permanent faults. We motivate the need for a fault tolerant MPI specification
by an in-depth consideration of recent innovations in DL algorithms and their
properties, which drive the need for specific fault tolerance features. We
present an in-depth discussion of the suitability of different parallelism
types (model, data, and hybrid); the need (or lack thereof) for checkpointing
of any critical data structures; and, most importantly, considerations for
several fault tolerance proposals in MPI (user-level fault mitigation (ULFM),
Reinit) and their applicability to fault tolerant DL implementations. We
leverage a
distributed memory implementation of Caffe, currently available under the
Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches
by extending MaTEx-Caffe to use a ULFM-based implementation. Our evaluation
using the ImageNet dataset and the AlexNet and GoogLeNet neural network
topologies demonstrates the effectiveness of the proposed fault tolerant DL
implementation using Open MPI-based ULFM.
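The data-parallel case the paper analyzes can be sketched without any MPI at all. Below is a toy plain-Python simulation (all names invented for illustration; real ULFM code would instead revoke and shrink the failed MPI communicator with calls such as MPIX_Comm_shrink): each surviving worker computes a local gradient on its data shard, gradients are averaged as an all-reduce would, and after a permanent rank failure the survivors re-partition the dataset and continue training.

```python
# Toy simulation of fault tolerant data-parallel training (no real MPI).
# A permanent fault removes one "rank"; survivors shrink the worker group,
# re-partition the data, and keep averaging gradients, as ULFM-style
# recovery would allow in an actual MPI job.

def partition(dataset, n_workers):
    """Round-robin split of the dataset into one shard per worker."""
    return [dataset[i::n_workers] for i in range(n_workers)]

def local_gradient(shard, weight):
    """Gradient of mean squared error 0.5*(weight*x - y)^2 over one shard."""
    if not shard:
        return 0.0
    return sum((weight * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an MPI all-reduce followed by division by group size."""
    return sum(values) / len(values)

def train(dataset, n_workers, steps, fail_rank=None, fail_at=None, lr=0.01):
    weight = 0.0
    ranks = list(range(n_workers))
    shards = partition(dataset, len(ranks))
    for step in range(steps):
        if step == fail_at and fail_rank in ranks:
            ranks.remove(fail_rank)                  # "shrink" the communicator
            shards = partition(dataset, len(ranks))  # re-balance the data
        grads = [local_gradient(shards[i], weight) for i in range(len(ranks))]
        weight -= lr * allreduce_mean(grads)
    return weight

# Data drawn from y = 2x; training should approach weight ~ 2 either way.
data = [(x, 2.0 * x) for x in range(1, 9)]
w_ok = train(data, n_workers=4, steps=200)
w_ft = train(data, n_workers=4, steps=200, fail_rank=2, fail_at=50)
```

The point mirrors the paper's observation: pure data parallelism holds no irreplaceable per-rank state beyond the replicated model, so shrinking the group and re-partitioning suffices, whereas model-parallel layouts would additionally need the lost partition's parameters restored from a checkpoint.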
Artificial Neural Network Pruning to Extract Knowledge
Artificial Neural Networks (NN) are widely used for solving complex problems
from medical diagnostics to face recognition. Despite notable successes, the
main disadvantages of NN are also well known: the risk of overfitting, lack of
explainability (inability to extract algorithms from trained NN), and high
consumption of computing resources. Determining an appropriate NN structure
for each problem can help overcome these difficulties: a too-small NN cannot
be trained successfully, while a too-large NN gives unexplainable results and
has a high risk of overfitting. Reducing the precision of NN parameters
simplifies the implementation of these NN, saves computing resources, and makes
the NN skills more transparent. This paper lists the basic NN simplification
problems and controlled pruning procedures to solve these problems. All the
described pruning procedures can be implemented in one framework. The developed
procedures, in particular, find the optimal structure of NN for each task,
measure the influence of each input signal and NN parameter, and provide a
detailed verbal description of the algorithms and skills of NN. The described
methods are illustrated by a simple example: the generation of explicit
algorithms for predicting the results of the US presidential election.
Comment: IJCNN 202
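The abstract only names the pruning procedures; as a minimal hedged sketch of the underlying idea (a controlled, error-checked magnitude pruning on a toy linear model; the weights, tolerance, and function names below are invented, not the paper's framework):

```python
# Controlled pruning sketch: remove the least salient weights of a linear
# model y = sum(w_i * x_i) while a validation error budget is respected.
# The surviving terms double as an explicit, human-readable rule.

def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def mse(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def prune(weights, data, tol):
    """Zero weights smallest-magnitude first, keeping MSE <= tol."""
    weights = list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    for i in order:
        saved, weights[i] = weights[i], 0.0
        if mse(weights, data) > tol:
            weights[i] = saved   # pruning this weight hurts: restore it
    return weights

# A "trained" 4-input model in which inputs 1 and 3 barely matter.
w = [3.0, 0.02, -2.0, 0.01]
data = [((1, 1, 1, 1), predict(w, (1, 1, 1, 1))),
        ((2, 0, 1, 3), predict(w, (2, 0, 1, 3))),
        ((0, 2, 2, 1), predict(w, (0, 2, 2, 1)))]
pruned = prune(w, data, tol=0.01)   # the two tiny weights are dropped
```

After pruning, the model reads as the explicit rule y ~ 3*x0 - 2*x2, which is the knowledge-extraction payoff: the influence of each input is either quantified by its surviving weight or certified as negligible.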
Development of a Process-Based Crop Model with High Applicability for Hydroponic Sweet Peppers Using Deep Learning Methodology
Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Agriculture, Forestry and Bioresources, College of Agriculture and Life Sciences, August 2022. Advisor: Jung Eek Son.
Many agricultural challenges are entangled in a complex interaction between crops and the environment. As a simplifying tool, crop modeling is a process of abstracting and interpreting agricultural phenomena. Understanding based on this interpretation can play a role in supporting academic and social decisions in agriculture. Process-based crop models have solved such challenges for decades to enhance the productivity and quality of crop production; the remaining objectives have led to demand for crop models handling multidirectional analyses with multidimensional information. As a possible milestone toward this goal, deep learning algorithms have been introduced to complicated tasks in agriculture. However, these algorithms could not replace existing crop models because of research fragmentation and the low accessibility of the crop models. This study established a developmental protocol for a process-based crop model built with deep learning methodology. The Literature Review introduces deep learning and crop modeling and explains why this protocol is necessary despite the numerous deep learning applications in agriculture. Base studies were conducted with several greenhouse datasets in Chapters 1 and 2: transfer learning and the U-Net structure were utilized to construct an infrastructure for the deep learning application, and HyperOpt, a Bayesian optimization method, was tested for calibrating crop models so that existing crop models could be compared with the developed model. Finally, in Chapter 3, the process-based crop model with full deep neural networks, DeepCrop, was developed with an attention mechanism and multi-task decoders for hydroponic sweet peppers (Capsicum annuum var. annuum). The methodology for data integrity showed adequate accuracy, so it was applied to the data in all chapters. HyperOpt was able to calibrate food and feed crop models for sweet peppers.
Therefore, the compared models in the final chapter were optimized using HyperOpt. DeepCrop was trained to simulate several growth factors from environment data. The trained DeepCrop was evaluated with unseen data, and it showed the highest modeling efficiency (EF = 0.76) and the lowest normalized root mean squared error (NRMSE = 0.18) among the compared models. Given its high adaptability, DeepCrop can be used for studies of various scales and purposes. Since all methods adequately solved the given tasks and underlay the DeepCrop development, the established protocol can serve as a high-throughput route for enhancing the accessibility of crop models, helping to unify crop modeling studies.
LITERATURE REVIEW
ABSTRACT
BACKGROUND
REMARKABLE APPLICABILITY AND ACCESSIBILITY OF DEEP LEARNING
DEEP LEARNING APPLICATIONS FOR CROP PRODUCTION
THRESHOLDS TO APPLY DEEP LEARNING TO CROP MODELS
NECESSITY TO PRIORITIZE DEEP-LEARNING-BASED CROP MODELS
REQUIREMENTS OF THE DEEP-LEARNING-BASED CROP MODELS
OPENING REMARKS AND THESIS OBJECTIVES
LITERATURE CITED
Chapter 1
Chapter 1-1
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
LITERATURE CITED
Chapter 1-2
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
LITERATURE CITED
Chapter 2
ABSTRACT
NOMENCLATURE
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
LITERATURE CITED
Chapter 3
ABSTRACT
INTRODUCTION
MATERIALS AND METHODS
RESULTS
DISCUSSION
CONCLUSION
LITERATURE CITED
GENERAL DISCUSSION
GENERAL CONCLUSION
ABSTRACT IN KOREAN
APPENDIX
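The two headline metrics in the thesis abstract above have standard definitions: modeling efficiency (EF) is the Nash-Sutcliffe efficiency, and NRMSE is the root mean squared error normalized, commonly by the observed mean (the thesis may use a different normalization; the sketch below assumes the mean convention, and all names are illustrative):

```python
import math

def modeling_efficiency(obs, pred):
    """Nash-Sutcliffe EF: 1 is perfect; 0 means no better than the mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def nrmse(obs, pred):
    """RMSE divided by the observed mean (one common normalization)."""
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))
    return rmse / (sum(obs) / len(obs))

obs = [10.0, 12.0, 14.0, 16.0]
ef_perfect = modeling_efficiency(obs, obs)                   # 1.0 by construction
ef_off = modeling_efficiency(obs, [11.0, 12.0, 14.0, 15.0])  # 1 - 2/20 = 0.9
```

Against these definitions, the reported EF = 0.76 and NRMSE = 0.18 mean DeepCrop explains roughly three quarters of the observed variance, with a typical error near 18% of the observed mean.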