
    Trust Region Methods for Training Neural Networks

    Artificial feed-forward neural networks (ff-ANNs) serve as powerful machine learning models for supervised classification problems. They have been used to solve problems ranging from natural language processing to computer vision. ff-ANNs are typically trained using gradient-based approaches, which only require the computation of first-order derivatives. In this thesis we explore the benefits and drawbacks of training an ff-ANN with a method that requires the computation of second-order derivatives of the objective function. We also explore whether stochastic approximations can be used to decrease the computation time of such a method. A numerical investigation was performed into the behaviour of trust region methods, a type of second-order numerical optimization method, when used to train ff-ANNs on several datasets. Our study examines a classical trust region approach and evaluates the effect of adapting this method using stochastic variations. The exploration includes three approaches to reducing the computations required to perform the classical method: stochastic subsampling of training examples, stochastic subsampling of parameters, and using a gradient-based approach in combination with the classical trust region method. We found that stochastic subsampling methods can, in some cases, reduce the CPU time required to reach a reasonable solution when compared to the classical trust region method, but this was not consistent across all datasets. We also found that using the classical trust region method in combination with mini-batch gradient descent either matched (within 0.1 s) or decreased the CPU time required to reach a reasonable solution for all datasets. This was achieved by computing the trust region step only when training progress under the gradient-based approach had stalled.
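The classical trust-region iteration referred to above can be sketched in a few lines. This is a minimal one-dimensional illustration, not the thesis's implementation: the acceptance threshold and the 0.25/0.75 radius-update rules are textbook defaults, and the function names are invented for the example.

```python
def trust_region_minimize(f, grad, hess, x0, delta=1.0, max_iter=100, tol=1e-8):
    """Classical trust-region iteration in one dimension (a sketch).

    At each step the quadratic model m(p) = f(x) + g*p + 0.5*h*p^2 is
    minimized over the trust region |p| <= delta; the step is accepted or
    rejected, and the radius updated, based on the ratio of actual to
    predicted reduction.
    """
    x = x0
    for _ in range(max_iter):
        g, h = grad(x), hess(x)
        if abs(g) < tol:
            break
        if h > 0 and abs(g / h) <= delta:
            p = -g / h                        # Newton step fits inside the region
        else:
            p = -delta if g > 0 else delta    # otherwise step to the boundary
        pred = -(g * p + 0.5 * h * p * p)     # reduction predicted by the model
        rho = (f(x) - f(x + p)) / pred if pred > 0 else -1.0
        if rho > 0.1:                         # sufficient agreement: accept
            x = x + p
        if rho < 0.25:                        # poor model: shrink the region
            delta *= 0.5
        elif rho > 0.75 and abs(p) == delta:  # good model at the boundary: grow
            delta *= 2.0
    return x
```

The second-order information enters through `hess`; the stochastic variants studied in the thesis would replace the exact gradient and Hessian with subsampled estimates.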

    Mission Analysis Program for Solar Electric Propulsion (MAPSEP). Volume 1: Analytical manual for earth orbital MAPSEP

    An introduction to the MAPSEP organization and a detailed analytical description of all models and algorithms are given. These include trajectory and error covariance propagation methods, orbit determination processes, thrust modeling, and trajectory correction (guidance) schemes. Earth orbital MAPSEP can analyze almost any currently projected low thrust mission from low earth orbit to super synchronous altitudes. Furthermore, MAPSEP is sufficiently flexible to incorporate extended dynamic models, alternate mission strategies, and almost any other system requirement imposed by the user. As in the interplanetary version, earth orbital MAPSEP represents a trade-off between precision modeling and computational speed consistent with defining necessary system requirements. It can be used in feasibility studies as well as in flight operational support. Pertinent operational constraints are available both implicitly and explicitly. However, the reader should be warned that because of program complexity, MAPSEP is only as good as the user and will quickly succumb to faulty user inputs.

    Learning Tuple Probabilities in Probabilistic Databases

    Learning the parameters of complex probabilistic-relational models from labeled training data is a standard technique in machine learning, which has been intensively studied in the subfield of Statistical Relational Learning (SRL), but---so far---this is still an under-investigated topic in the context of Probabilistic Databases (PDBs). In this paper, we focus on learning the probability values of base tuples in a PDB from query answers, the latter of which are represented as labeled lineage formulas. Specifically, we consider labels in the form of pairs, each consisting of a Boolean lineage formula and a marginal probability that comes attached to the corresponding query answer. The resulting learning problem can be viewed as the inverse problem to confidence computations in PDBs: given a set of labeled query answers, learn the probability values of the base tuples, such that the marginal probabilities of the query answers again yield the assigned probability labels. We analyze the learning problem from a theoretical perspective, devise two optimization-based objectives, and provide an efficient algorithm (based on Stochastic Gradient Descent) for solving these objectives. Finally, we conclude this work with an experimental evaluation on three real-world datasets and one synthetic dataset, comparing against various techniques from SRL, reasoning in information extraction, and optimization.
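The inverse problem described above can be illustrated on a deliberately restricted case: conjunctive lineage formulas over independent base tuples, where the marginal is just a product of tuple probabilities. The function names, learning rate, and loss are assumptions for this sketch, not the paper's actual objectives or algorithm.

```python
import random

def marginal(formula, p):
    """Marginal of a conjunctive lineage formula t1 AND t2 AND ... over
    independent base tuples (a deliberate simplification for this sketch)."""
    out = 1.0
    for t in formula:
        out *= p[t]
    return out

def learn_tuple_probs(labels, n_tuples, lr=0.5, epochs=2000, seed=0):
    """Stochastic gradient descent on the squared error between computed
    marginals and the probability labels attached to query answers."""
    rng = random.Random(seed)
    p = [0.5] * n_tuples
    for _ in range(epochs):
        formula, target = labels[rng.randrange(len(labels))]
        err = marginal(formula, p) - target
        # d marginal / d p[t] is the product of the other tuples' probabilities
        grads = []
        for t in formula:
            others = 1.0
            for u in formula:
                if u != t:
                    others *= p[u]
            grads.append(2.0 * err * others)
        for t, g in zip(formula, grads):
            p[t] = min(1.0, max(0.0, p[t] - lr * g))  # keep valid probabilities
    return p
```

For example, labels `([0], 0.8)`, `([1], 0.5)`, and `([0, 1], 0.4)` are mutually consistent, and SGD recovers tuple probabilities near 0.8 and 0.5. General Boolean lineage formulas would require full confidence computation in the inner loop, which is exactly why the paper treats learning as the inverse of confidence computation.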

    On-orbit transfer trajectory methods using high fidelity dynamic models

    A high-fidelity trajectory propagator for use in targeting and reference trajectory generation is developed for aerospace applications in low Earth and translunar orbits. The dominant perturbing effects necessary to accurately model vehicle motion in these dynamic environments are incorporated into a numerical predictor-corrector scheme to converge on a realistic trajectory incorporating multi-body gravitation, high-order gravity, atmospheric drag, and solar radiation pressure. The predictor-corrector algorithm is shown to reliably produce accurate required velocities to meet constraints on the final position for the dominant perturbation effects modeled. Low-fidelity conic state propagation techniques such as Lambert's method and multiconic pseudostate theory are developed to provide a suitable initial guess. Feasibility of the method is demonstrated through sensitivity analysis to the initial guess for a bounding set of cases.
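The predictor-corrector targeting idea above can be sketched in miniature: propagate under the "full" dynamics, then apply Newton corrections to the initial velocity using a finite-difference sensitivity of final position. Here a 2-D point mass under gravity and quadratic drag stands in for the orbital perturbation models; all names and constants are illustrative.

```python
def propagate(v0, steps=100, dt=0.01):
    """Euler-propagate a 2-D point mass under gravity and quadratic drag
    (stand-ins for the high-fidelity perturbations) and return final position."""
    x, y, vx, vy = 0.0, 0.0, v0[0], v0[1]
    for _ in range(steps):
        speed = (vx * vx + vy * vy) ** 0.5
        ax = -0.1 * speed * vx
        ay = -9.81 - 0.1 * speed * vy
        x, y = x + vx * dt, y + vy * dt
        vx, vy = vx + ax * dt, vy + ay * dt
    return x, y

def target_velocity(target, v_guess, tol=1e-8, max_iter=50):
    """Newton predictor-corrector: adjust the initial velocity until the
    propagated final position meets the target, using a finite-difference
    Jacobian of final position with respect to initial velocity."""
    v = list(v_guess)
    for _ in range(max_iter):
        fx, fy = propagate(v)
        rx, ry = fx - target[0], fy - target[1]
        if rx * rx + ry * ry < tol:
            break
        h = 1e-6
        fxp, fyp = propagate([v[0] + h, v[1]])
        fxq, fyq = propagate([v[0], v[1] + h])
        j00, j10 = (fxp - fx) / h, (fyp - fy) / h
        j01, j11 = (fxq - fx) / h, (fyq - fy) / h
        det = j00 * j11 - j01 * j10
        v[0] -= ( j11 * rx - j01 * ry) / det   # 2x2 Newton update
        v[1] -= (-j10 * rx + j00 * ry) / det
    return v
```

In the propagator described above, a conic solution (e.g. Lambert's method) would supply `v_guess`; the sketch shows why a good initial guess matters, since Newton's iteration converges only from a reasonable neighborhood of the solution.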

    The Dependence of the Time-Asymptotic Structure of 3-D Vortex Breakdown on Boundary and Initial Conditions

    The three-dimensional, compressible Navier-Stokes equations are solved numerically to simulate vortex breakdown in tubes. Time integration is performed with an implicit Beam-Warming algorithm, which uses fourth-order compact operators to discretize spatial derivatives. Initial conditions are obtained by solving the steady, compressible, and axisymmetric form of the Navier-Stokes equations using Newton's method. Stability of the axisymmetric initial conditions is assessed through 3-D time integration. Unique axisymmetric solutions at a Reynolds number of 250 lose stability to 3-D disturbances at a critical value of vortex strength, resulting in 3-D, time-periodic flow. Axisymmetric solutions at a Reynolds number of 1000 contain regions of nonuniqueness. Within this region, 3-D time integration reveals only unique solutions, with nonunique, axisymmetric initial conditions converging to a unique solution that is steady and axisymmetric. Past the primary limit point, which approximately identifies critical flow, the solutions bifurcate into 3-D periodic flows.

    Prediction Interval Estimation Techniques for Empirical Modeling Strategies and their Applications to Signal Validation Tasks

    The basis of this work was to evaluate both parametric and non-parametric empirical modeling strategies applied to signal validation or on-line monitoring tasks. On-line monitoring methods assess signal channel performance to aid in making instrument calibration decisions, enabling the use of condition-based calibration schedules. The three non-linear empirical modeling strategies studied were: artificial neural networks (ANN), neural network partial least squares (NNPLS), and local polynomial regression (LPR). These three types are the most common nonlinear models for applications to signal validation tasks. Of the class of local polynomials (for LPR), two were studied in this work: zero-order (kernel regression) and first-order (local linear regression). The evaluation of the empirical modeling strategies includes the presentation and derivation of prediction intervals for each of the three model types studied, so that estimations could be made with an associated prediction interval. An estimate and its corresponding prediction interval contain the measurements with a specified certainty, usually 95%. The prediction interval estimates were compared to results obtained from bootstrapping via Monte Carlo resampling, to validate their expected accuracy. The estimation of prediction intervals applied to on-line monitoring systems is essential if widespread use of these empirically based systems is to be attained. In response to the topical report On-Line Monitoring of Instrument Channel Performance, published by the Electric Power Research Institute [Davis 1998], the NRC issued a safety evaluation report that identified the need to evaluate the associated uncertainty of empirical model estimations from all contributing sources. This need forms the basis for the research completed and reported in this dissertation.
The focus of this work, and the basis of its original contributions, was to provide an accurate prediction interval estimation method for each of the mentioned empirical modeling techniques, and to verify the results via bootstrap simulation studies. Properly determined prediction interval estimates were obtained that consistently captured the uncertainty of the given model, such that the level of certainty of the intervals closely matched the observed level of coverage of the prediction intervals over the measured values. In most cases the expected level of coverage of the measured values within the prediction intervals was 95%, such that the probability that an estimate and its associated prediction interval contain the corresponding measured observation was 95%. The results also indicate that instrument channel drifts are identifiable through the use of the developed prediction intervals by observing the drop in the level of coverage of the prediction intervals to relatively low values, e.g. 30%. While all empirical models exhibit optimal performance for a given set of specifications, the identification of this optimal set may be difficult to attain. The developed methods of prediction interval estimation were shown to perform as expected over a wide range of model specifications, including misspecification. Model misspecification occurs through different mechanisms dependent on the type of empirical model. The main mechanisms under which model misspecification occurs for each empirical model studied are: ANN, through architecture selection; NNPLS, through latent variable selection; LPR, through bandwidth selection. In addition, all of the above empirical models are susceptible to misspecification due to inadequate data and the presence of erroneous predictor variables in the set of predictors. A study was completed to verify that the presence of erroneous variables, i.e. variables unrelated to the desired response or random noise components, resulted in increases in the prediction interval magnitudes while maintaining the appropriate level of coverage for the response measurements. In addition to considering the resultant prediction intervals and coverage values, a comparative evaluation of the different empirical models was performed. The evaluation considers the average estimation errors and the stability of the models under repeated Monte Carlo resampling. The results indicate the large uncertainty of ANN models applied to collinear data, and the utility of the NNPLS model for the same purpose. The results from the LPR models remained consistent for data with and without collinearity, provided proper regularization was applied. The quantification of the uncertainty of an empirical model's estimations is a necessary task for promoting the use of on-line monitoring systems in the nuclear power industry. All of the methods studied herein were applied to a simulated data set for an initial evaluation of the methods, and to data from two different U.S. nuclear power plants for the purposes of signal validation for on-line monitoring tasks.
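The coverage notion used throughout the abstract can be made concrete with zero-order LPR (kernel regression). The interval form below, the bandwidth, and the noise-variance estimate are illustrative assumptions, not the dissertation's derivations: the half-width combines the variance of a new noisy observation with the variance of the weighted-average estimator.

```python
import math
import random

def kernel_weights(x0, xs, bw):
    """Normalized Gaussian kernel weights for a query point x0."""
    w = [math.exp(-0.5 * ((x0 - x) / bw) ** 2) for x in xs]
    s = sum(w)
    return [wi / s for wi in w]

def kernel_prediction_interval(x0, xs, ys, bw, sigma, z=1.96):
    """~95% prediction interval for Nadaraya-Watson kernel regression:
    yhat +/- z * sigma * sqrt(1 + sum(w_i^2))."""
    w = kernel_weights(x0, xs, bw)
    yhat = sum(wi * yi for wi, yi in zip(w, ys))
    half = z * sigma * math.sqrt(1 + sum(wi * wi for wi in w))
    return yhat - half, yhat + half

# Empirical coverage check on synthetic data (stand-in for plant signals)
rng = random.Random(42)
xs = [i / 20 for i in range(100)]
ys = [math.sin(2 * x) + rng.gauss(0, 0.2) for x in xs]

# Estimate the noise standard deviation from kernel-smoother residuals
resid = []
for x, y in zip(xs, ys):
    w = kernel_weights(x, xs, 0.2)
    resid.append(y - sum(wi * yi for wi, yi in zip(w, ys)))
sigma = math.sqrt(sum(r * r for r in resid) / len(resid))

# Fraction of measurements falling inside their prediction intervals
covered = 0
for x, y in zip(xs, ys):
    lo, hi = kernel_prediction_interval(x, xs, ys, 0.2, sigma)
    covered += lo <= y <= hi
coverage = covered / len(xs)
```

A drifting channel would show up exactly as the abstract describes: the measured values wander out of the intervals and `coverage` drops well below the nominal 95%.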

    The Third Air Force/NASA Symposium on Recent Advances in Multidisciplinary Analysis and Optimization

    The third Air Force/NASA Symposium on Recent Advances in Multidisciplinary Analysis and Optimization was held on 24-26 Sept. 1990. Sessions were on the following topics: dynamics and controls; multilevel optimization; sensitivity analysis; aerodynamic design software systems; optimization theory; analysis and design; shape optimization; vehicle components; structural optimization; aeroelasticity; artificial intelligence; multidisciplinary optimization; and composites.

    Evaluation of Undrained Shear Strength of Soil, Ultimate Pile Capacity and Pile Set-Up Parameter from Cone Penetration Test (CPT) Using Artificial Neural Network (ANN)

    Over the years, numerous design methods have been developed to evaluate the undrained shear strength, Su, the ultimate pile capacity, and the pile set-up parameter, A. In recent decades, emphasis has been given to the in-situ cone and piezocone penetration tests (CPT, PCPT) for estimating these parameters, since the CPT/PCPT has proven to be a fast, reliable, and cost-effective soil investigation method. However, in the absence of a clear understanding of the physical problem, some of the developed methods rely on correlation assumptions that can compromise their accuracy. In this study, Artificial Neural Networks (ANNs) were applied to CPT data and soil properties to produce a more accurate and consistent interpretation of Su, the ultimate pile capacity, and the ‘A’ parameter. To this end, a data set was prepared consisting of CPT/PCPT data and relevant soil properties from 70 sites in Louisiana for the evaluation of Su. For the ultimate pile capacity, a database of 80 pile load tests was prepared. Lastly, data were collected from 12 instrumented pile load tests for the interpretation of the ‘A’ parameter. The corresponding CPTs, along with the soil borings, were also collected. Presenting these data to the ANN, models were trained through trial and error using different feed-forward network techniques, e.g. the back-propagation method. Different ANN models were explored with cone sleeve friction, fs, tip resistance, qt, plasticity index, PI, effective overburden pressure, σ’vo, etc. as input data, and were compared to the conventional methods. It was found that the ANN model with qt, fs, and σ’vo as inputs performed satisfactorily and was better than the conventional empirical method for evaluating Su. ANN models with pile embedment length, pile width, qt, and fs as inputs outperformed the best-performing direct pile-CPT methods in the interpretation of ultimate pile capacity. Similarly, the ‘A’ parameter predicted by the ANN models (with PI, OCR, and Su as inputs) was in good agreement with the measured one. These findings, hence, support the applicability of ANNs for estimating the undrained shear strength, ultimate pile capacity, and ‘A’ parameter from CPT data and soil properties.
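The kind of feed-forward, back-propagation model described above can be sketched as a one-hidden-layer network trained on synthetic stand-ins for the three inputs (qt, fs, σ'vo normalized to [0, 1]); the architecture, data, and target function below are invented for illustration and are not the study's.

```python
import math
import random

class TinyANN:
    """One-hidden-layer feed-forward network trained with plain
    backpropagation -- a sketch of the kind of model used for Su,
    not the study's actual architecture or data."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]          # hidden weights + bias
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        """Return (hidden activations, scalar output) for input list x."""
        h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = sum(w * v for w, v in zip(self.w2, h + [1.0]))
        return h, y

    def train(self, data, lr=0.05, epochs=500):
        """Stochastic backpropagation on squared error."""
        for _ in range(epochs):
            for x, t in data:
                h, y = self.forward(x)
                err = y - t
                # Hidden-layer gradients, computed before w2 is updated
                grads = [err * self.w2[j] * (1 - h[j] * h[j])
                         for j in range(len(h))]
                for j, hj in enumerate(h + [1.0]):
                    self.w2[j] -= lr * err * hj
                for j, g in enumerate(grads):
                    for k, xk in enumerate(x + [1.0]):
                        self.w1[j][k] -= lr * g * xk
```

Trained on a synthetic linear combination of three normalized inputs, the network fits the target closely, which is the trial-and-error workflow the abstract describes, only with real CPT features and measured Su in place of the toy data.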