69 research outputs found

    Model combination by decomposition and aggregation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Nuclear Engineering, 2004. Includes bibliographical references (p. 265-282). This thesis focuses on a general problem in statistical modeling, namely model combination. It proposes a novel feature-based model combination method to improve model accuracy and reduce model uncertainty. In this method, a set of candidate models is first decomposed into a group of components or features, and the components are then selected and aggregated into a composite model based on data. Implementing this new method raises several central challenges: candidate model choice, component selection, data noise modeling, model uncertainty reduction, and model locality. New methods are put forward to address each of these problems. In choosing candidate models, criteria are proposed including accuracy, diversity, independence, and completeness; corresponding quantitative measures are designed to quantify these criteria, and an overall preference score is generated for each model in the pool. Principal component analysis (PCA) and independent component analysis (ICA) are applied to decompose candidate models into components, and multiple linear regression is employed to aggregate components into a composite model. In order to reduce model structure uncertainty, a new concept of fuzzy variable selection is introduced to carry out component selection, which combines the interpretability of classical variable selection with the stability of shrinkage estimators. To deal with parameter estimation uncertainty, the exponential power distribution is proposed to model unknown non-Gaussian noise, and a parametric weighted least-squares method is devised to estimate parameters in the context of non-Gaussian noise. These two methods work together to reduce model uncertainty, including both model structure uncertainty and parameter uncertainty. To handle model locality, i.e., the fact that candidate models do not work equally well over different regions, the adaptive fuzzy mixture of local ICA models is developed. It splits the input space into sub-regions, builds local ICA models within each sub-region, and then combines them into a mixture model. Many experiments are carried out to demonstrate the performance of this novel method. Our simulation study and comparison show that this new method meets our goals and outperforms existing methods in most situations. by Mingyang Xu. Ph.D.
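
    A minimal sketch of the decomposition-and-aggregation idea, not the thesis's exact algorithm: the predictions of several candidate models are decomposed into principal components and a subset of components is aggregated into a composite model by multiple linear regression. All data, dimensions, and the choice of three components are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Illustrative setup: n observations of a response y and predictions from
# m candidate models (here just noisy copies of y with varying accuracy).
n, m = 200, 5
y = rng.normal(size=n)
candidates = np.column_stack(
    [y + rng.normal(scale=0.5 + 0.2 * k, size=n) for k in range(m)]
)

# Step 1: decompose the candidate predictions into components (features).
pca = PCA(n_components=3)
components = pca.fit_transform(candidates)

# Step 2: aggregate the components into a composite model by multiple
# linear regression against the observed response.
composite = LinearRegression().fit(components, y)
print("composite model R^2:", round(composite.score(components, y), 3))
```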

    Real-Time Forecasting/Control of Water Resource Systems; Selected Papers from an IIASA Workshop, October 18-21, 1976

    When water resource systems are not under control, the consequences can be devastating. In the United States alone, flood damage costs approximately $1.5 billion annually. These losses can be avoided by building more reservoirs to hold the flood waters, but such construction is very expensive, especially because reservoirs have already been built on the best sites. A better and less expensive alternative is the development of more effective management methods for existing water resource systems, which commonly waste approximately 20 percent of their capacities through mismanagement. Statistical models first appeared in hydrology at the beginning of the 1970s. Hydrologists began to use the techniques of time series analysis and system identification in their models, which seemed to give better results than the earlier, deterministic simulation models. In addition, real-time control of water resources was being developed at the practical level, and on-line measurements of rainfall and runoff from a catchment were becoming available. The conceptual models then in use could not take advantage of measurements from the catchment, but on-line measurements now allow an operator to anticipate flood waters upstream or a water shortage downstream. This book contains selected papers from a workshop devoted to the consolidation of international research on statistically estimated models for real-time forecasting and control of water resource systems. The book is divided into three parts. The first part presents several methods of forecasting for water resource systems: distributed lag models, maximum likelihood identification, nonlinear catchment models, Kalman filtering, and self-tuning predictors. The papers in the second part present methods for controlling stream quality and stream flow, and the third part describes forecasting in the United States, the United Kingdom, and Poland.
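
    One of the forecasting methods listed above, Kalman filtering, can be illustrated with a scalar sketch. The AR(1) storage model, noise parameters, and simulated "runoff" series below are invented; they only show how on-line measurements would update a one-step-ahead forecast in real time.

```python
import numpy as np

# Assumed toy model: state x_t = a*x_{t-1} + w_t, observation z_t = x_t + v_t.
a, q, r = 0.9, 0.05, 0.2     # transition coefficient, process and measurement noise
x_hat, p = 0.0, 1.0          # initial state estimate and its variance

rng = np.random.default_rng(1)
observations = rng.normal(size=50)   # stand-in for measured runoff

for z in observations:
    # Predict: this is the real-time forecast an operator would act on.
    x_pred = a * x_hat
    p_pred = a * a * p + q
    # Update with the new on-line measurement.
    k = p_pred / (p_pred + r)
    x_hat = x_pred + k * (z - x_pred)
    p = (1.0 - k) * p_pred

print("final state estimate:", round(x_hat, 3))
```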

    Development of an Optimal Replenishment Policy for Human Capital Inventory

    A unique approach is developed for evaluating Human Capital (workforce) requirements. With this approach, new ways of measuring personnel availability are proposed to ensure that an organization remains ready to provide timely, relevant, and accurate products and services in support of its strategic objectives over its planning horizon. This methodology was developed as an alternative to existing approaches for determining appropriate hiring and attrition rates and for maintaining appropriate levels of personnel effectiveness to support existing and future missions. The contribution of this research is a prescribed method for the strategic analyst to incorporate a personnel and cost simulation model within the framework of Human Resources/Human Capital forecasting, which can be used to project personnel requirements and evaluate workforce sustainment, at least cost, through time. This allows personnel managers to evaluate multiple resource strategies, present and future, and to maintain near-“perfect” hiring and attrition policies in support of future Human Capital assets.
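
    A rough sketch of the kind of personnel and cost simulation described above: project headcount over a planning horizon under candidate hiring and attrition rates, and score each replenishment policy by hiring cost plus staffing shortfall cost. All costs, rates, and horizons are invented; this is not the dissertation's model.

```python
import numpy as np

rng = np.random.default_rng(2)

horizon, required, start = 60, 1000, 950    # months, target headcount, initial staff
hire_cost, shortfall_cost = 5.0, 20.0       # illustrative unit costs

def average_cost(hire_rate, attrition_rate, n_reps=200):
    """Monte Carlo estimate of the total cost of a hiring/attrition policy."""
    total = 0.0
    for _ in range(n_reps):
        staff, cost = start, 0.0
        for _ in range(horizon):
            losses = rng.binomial(staff, attrition_rate)   # random attrition
            hires = rng.poisson(hire_rate)                  # random replenishment
            staff = staff - losses + hires
            cost += hires * hire_cost + max(required - staff, 0) * shortfall_cost
        total += cost
    return total / n_reps

# Compare a few candidate replenishment policies.
for rate in (20, 30, 40):
    print("hire rate", rate, "->", round(average_cost(rate, attrition_rate=0.03), 1))
```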

    Stochastic optimization models for planning and operation of a multipurpose water reservoir

    We consider the capacity determination problem of a hydro reservoir. The reservoir is to be used primarily for hydropower generation; however, commitments on release targets for irrigation as well as mitigation of downstream flood hazards are secondary objectives. This thesis is concerned with studying the complex interaction among various system reliabilities (power, flood, irrigation, etc.) and with providing decision makers a planning tool for further investigation. The main tools are stochastic programming models that recognize the randomness in the streamflow. A chance-constrained programming model and a stochastic programming model with recourse are formulated and solved. The models developed incorporate a special target-priority policy according to given system reliabilities. Optimized values are then used in a simulation model to investigate the system behavior. Detailed computational results are provided and analyzed.
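
    A toy two-stage stochastic program with recourse, in the spirit of the models described above but not reproducing them: a release target is committed before the inflow is known, and a per-scenario shortage variable absorbs any deficit at a penalty. All inflow scenarios, probabilities, and costs are invented.

```python
import numpy as np
from scipy.optimize import linprog

inflows = np.array([40.0, 70.0, 110.0])   # streamflow scenarios
probs = np.array([0.3, 0.5, 0.2])         # scenario probabilities
benefit, penalty, capacity = 1.0, 4.0, 100.0

# Decision vector [x, y_1, y_2, y_3]: minimize -benefit*x + E[penalty * y_s],
# where x is the first-stage release target and y_s the recourse shortage.
c = np.concatenate(([-benefit], penalty * probs))

# Feasibility in each scenario: x - y_s <= inflow_s.
A_ub = np.hstack([np.ones((3, 1)), -np.eye(3)])
b_ub = inflows
bounds = [(0, capacity)] + [(0, None)] * 3

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("optimal release target:", round(res.x[0], 1))
print("shortage per scenario:", np.round(res.x[1:], 1))
```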

    Novel methods based on regression techniques to analyze multistate models and high-dimensional omics data.

    The dissertation is based on four distinct research projects that are loosely interconnected by the common link of a regression framework. Chapter 1 provides an introductory outline of the problems addressed in the projects, along with a detailed review of previous work on them and a brief discussion of our newly developed methodologies. Chapter 2 describes the first project, which is concerned with identifying hidden subject-specific sources of heterogeneity in gene expression profiling analyses and adjusting for them by a technique based on Partial Least Squares (PLS) regression, in order to ensure more accurate inference on the expression pattern of the genes across two different varieties of samples. Chapter 3 focuses on the development of an R package based on Project 1 and its performance evaluation with respect to other popular software for differential gene expression analyses. Chapter 4 covers the third project, which proposes a non-parametric regression method for estimating stage occupation probabilities at different time points in right-censored multistate model data, using an Inverse Probability of Censoring Weighted (IPCW; Datta and Satten, 2001) version of the backfitting principle (Hastie and Tibshirani, 1992). Chapter 5 describes the fourth project, which tests for equality of the residual distributions, after adjusting for available covariate information, from the right-censored waiting times of two groups of subjects, using an IPCW version of the Mann-Whitney U test.
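
    A simplified, illustrative stand-in for the IPCW idea used in the fourth project: censored observations receive weight zero, uncensored ones are reweighted by the inverse of a Kaplan-Meier estimate of the censoring survival function, and a weighted Mann-Whitney-type comparison is formed. Covariate adjustment, ties, and variance estimation are omitted, and the data are simulated; this is not the dissertation's test.

```python
import numpy as np

def censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G(t)."""
    order = np.argsort(times)
    t, e = times[order], events[order]
    at_risk = len(t) - np.arange(len(t))
    # A censoring "event" is the complement of the failure indicator.
    surv = np.cumprod(1.0 - (1.0 - e) / at_risk)
    def G(x):
        idx = np.searchsorted(t, x, side="right") - 1
        return np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)
    return G

def ipcw_mann_whitney(times, events, group):
    """Weighted proportion of group-1 times exceeding group-0 times."""
    G = censoring_survival(times, events)
    w = events / np.clip(G(times), 1e-8, None)   # IPCW weights, 0 if censored
    i0, i1 = np.where(group == 0)[0], np.where(group == 1)[0]
    num = den = 0.0
    for i in i1:
        for j in i0:
            ww = w[i] * w[j]
            num += ww * (times[i] > times[j])
            den += ww
    return num / den

rng = np.random.default_rng(4)
n = 60
group = rng.integers(0, 2, n)
true_t = rng.exponential(1.0 + 0.5 * group, n)   # group 1 tends to wait longer
cens = rng.exponential(2.0, n)
times = np.minimum(true_t, cens)
events = (true_t <= cens).astype(float)
print("IPCW Mann-Whitney estimate:", round(ipcw_mann_whitney(times, events, group), 3))
```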

    Chaotic price dynamics of agricultural commodities

    Traditionally, commodity prices have been analyzed and modeled in the context of linear generating processes. The purpose of this dissertation is to address the adequacy of this work through examination of the critical assumption of independence in the residual process of linearly specified models. As an alternative, a test procedure is developed and utilized to demonstrate the appropriateness of applying generalized autoregressive conditional heteroscedastic (GARCH) time series models to agricultural commodity prices. In addition, a distinction is made between testing for independence and testing for chaos in commodity prices. The price series of interest derive from the major international agricultural commodity markets, sampled monthly over the period 1960--1994. The results of the present analysis suggest that for the seasonally adjusted growth rates of bananas, beef, coffee, soybeans, wool, and wheat, ARCH-GARCH models account for some of the non-linear dependence in these commodity price series. As an alternative to the ARCH-GARCH models, several neural network models were estimated and in some cases outperformed the ARCH family of models in terms of forecast ability, further demonstrating the nonlinearity present in these time series. Although further examination is needed, all prices were found to be non-linearly dependent. Using different statistical measures for testing for deterministic chaos, it was determined that wheat prices may be an example of such behavior. Therefore, there may be something to be gained in terms of short-run forecast accuracy by using semi-parametric modeling approaches as applied to wheat prices.
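
    A short sketch of fitting a GARCH(1,1) model to monthly growth rates, using the third-party Python package `arch` (a tooling choice made here for illustration, not one mentioned in the dissertation). The price series below is simulated; the 1960-1994 commodity data are not reproduced.

```python
import numpy as np
from arch import arch_model   # third-party 'arch' package

rng = np.random.default_rng(5)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.04, 420)))   # ~35 years, monthly
growth = 100 * np.diff(np.log(prices))                        # percent growth rates

# Constant-mean GARCH(1,1) for the conditional variance of the growth rates.
model = arch_model(growth, vol="GARCH", p=1, q=1, mean="Constant")
result = model.fit(disp="off")
print(result.params)                      # mu, omega, alpha[1], beta[1]

forecast = result.forecast(horizon=1)     # one-step-ahead variance forecast
print(forecast.variance.iloc[-1])
```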

    The Optimal Implementation of On-Line Optimization for Chemical and Refinery Processes.

    On-line optimization is an effective approach for process operation and economic improvement and source reduction in chemical and refinery processes. On-line optimization involves three steps: data validation, parameter estimation, and economic optimization. This research evaluated statistical algorithms for gross error detection, data reconciliation, and parameter estimation, and developed an open-form steady-state process model for the Monsanto-designed sulfuric acid process of IMC Agrico Company. The plant model was used to demonstrate improved economics and reduced emissions from on-line optimization and to test the methodology of on-line optimization. Also, a modified compensation strategy was proposed to reduce the misrectification of data reconciliation algorithms, and it was compared with the measurement test method. In addition, two ways to conduct on-line optimization were studied. One required two separate optimization problems to update parameters, and the other combined data validation and parameter estimation into one optimization problem. Two-step estimation demonstrated better estimation accuracy than one-step estimation for the sulfuric acid process, while one-step estimation required less computation time. The measurement test method, Tjoa-Biegler's contaminated Gaussian distribution method, and the robust method were evaluated theoretically and numerically to compare their performance. Results from these evaluations were used to recommend the best way to conduct on-line optimization. The optimal procedure is to conduct combined gross error detection and data reconciliation to detect and rectify gross errors in plant data from the DCS using Tjoa-Biegler's method or the robust method. This step generates a set of measurements containing only random errors, which is used for simultaneous data reconciliation and parameter estimation using the least squares method (the normal distribution). Updated parameters are used in the plant model for economic optimization, which generates optimal set points for the DCS. Applying this procedure to the Monsanto sulfuric acid plant increased profit by 3% over current operating conditions and reduced emissions by 10%, which is consistent with other reported applications. This optimal procedure for conducting on-line optimization has also been incorporated into an interactive on-line optimization program, which uses a windowed interface developed with Visual Basic and GAMS to solve the nonlinear optimization problems. This program is to be made available through the EPA Technology Tool Program.
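
    A minimal sketch of the data reconciliation step described above: noisy flow measurements are adjusted by weighted least squares so that they satisfy a mass balance. The three-stream splitter, measurement values, and standard deviations are invented for illustration; the dissertation's plant model is far larger and includes gross error detection.

```python
import numpy as np
from scipy.optimize import minimize

measured = np.array([101.0, 62.0, 35.0])   # F1 (feed), F2, F3 (products)
sigma = np.array([2.0, 1.5, 1.5])          # measurement standard deviations

def objective(x):
    # Weighted squared deviation of reconciled values from measurements.
    return np.sum(((x - measured) / sigma) ** 2)

# Mass balance constraint: F1 = F2 + F3.
constraints = [{"type": "eq", "fun": lambda x: x[0] - x[1] - x[2]}]

result = minimize(objective, measured, constraints=constraints)
print("reconciled flows:", np.round(result.x, 2))
print("balance residual:", round(result.x[0] - result.x[1] - result.x[2], 6))
```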

    Determining the factors affecting investment decisions in the tanker industry: a case study on Bangladesh Shipping Corporation


    Prognostic-based Life Extension Methodology with Application to Power Generation Systems

    Practicable life extension of engineering systems would be a remarkable application of prognostics. This research proposes a framework for prognostic-based life extension and investigates the use of prognostic data to mobilize the potential residual life. The obstacles to performing life extension include lack of knowledge, lack of tools, lack of data, and lack of time. This research primarily considers using acoustic emission (AE) technology for quick-response diagnostics. Specifically, an important feature of AE data was statistically modeled to provide quick, robust, and intuitive diagnostic capability. The proposed model successfully detected the out-of-control situation when data from a faulty bearing were applied. This research also highlights the importance of self-healing materials. One main component of the proposed life extension framework is the trend analysis module, which analyzes the pattern of the time-ordered degradation measures. Trend analysis is helpful not only for early fault detection but also for tracking improvement in the degradation rate. This research considered trend analysis methods for prognostic parameters, degradation waveforms, and multivariate data. In this respect, graphical methods were found appropriate for trend detection in signal features. The Hilbert-Huang Transform was applied to analyze trends in waveforms. For multivariate data, it was found that PCA is able to indicate trends in the data if accompanied by proper data processing. In addition, two algorithms are introduced to address non-monotonic trends; both appear to have the potential to treat non-monotonicity in degradation data. Although considerable research has been devoted to developing prognostic algorithms, rather less attention has been paid to post-prognostic issues such as maintenance decision making. A multi-objective optimization model is presented for a power generation unit. This model demonstrates the ability of prognostic models to balance power generation against life extension. In this research, the competing objective functions were defined as maximizing profit and maximizing service life. The decision variables include the shaft speed and the duration of maintenance actions. The results of the optimization model showed clearly that maximizing the service life requires lower shaft speed and longer maintenance time.
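
    An illustrative sketch of flagging an out-of-control condition in a monitored degradation feature (for example, an acoustic-emission amplitude statistic) with an EWMA control chart. The feature values, the fault shift, and the chart parameters are invented; the dissertation's statistical model for AE features is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(6)
healthy = rng.normal(1.0, 0.1, 200)        # baseline feature values
faulty = rng.normal(1.4, 0.15, 50)         # shift after a simulated bearing fault
feature = np.concatenate([healthy, faulty])

lam = 0.2                                  # EWMA smoothing constant
mu0, sigma0 = healthy.mean(), healthy.std()
limit = mu0 + 3 * sigma0 * np.sqrt(lam / (2 - lam))   # upper control limit

ewma = mu0
for t, x in enumerate(feature):
    ewma = lam * x + (1 - lam) * ewma
    if ewma > limit:
        print(f"out-of-control signal at sample {t}")
        break
```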