812 research outputs found

    Study on identification of nonlinear systems using Quasi-ARX models

    Get PDF
    制度:新 ; 報告番号:甲3660号 ; 学位の種類:博士(工学) ; 授与年月日:2012/9/15 ; 早大学位記番号:新6026Waseda Universit

    Cooperative co-evolution for feature selection in big data with random feature grouping

    Get PDF
    © 2020, The Author(s). A massive amount of data is generated with the evolution of modern technologies. This high-throughput data generation results in Big Data, which consist of many features (attributes). However, irrelevant features may degrade the classification performance of machine learning (ML) algorithms. Feature selection (FS) is a technique used to select a subset of relevant features that represent the dataset. Evolutionary algorithms (EAs) are widely used search strategies in this domain. A variant of EAs, called cooperative co-evolution (CC), which uses a divide-and-conquer approach, is a good choice for optimization problems. The existing solutions have poor performance because of some limitations, such as not considering feature interactions, dealing with only an even number of features, and decomposing the dataset statically. In this paper, a novel random feature grouping (RFG) has been introduced with its three variants to dynamically decompose Big Data datasets and to ensure the probability of grouping interacting features into the same subcomponent. RFG can be used in CC-based FS processes, hence called Cooperative Co-Evolutionary-Based Feature Selection with Random Feature Grouping (CCFSRFG). Experiment analysis was performed using six widely used ML classifiers on seven different datasets from the UCI ML repository and Princeton University Genomics repository with and without FS. The experimental results indicate that in most cases [i.e., with naïve Bayes (NB), support vector machine (SVM), k-Nearest Neighbor (k-NN), J48, and random forest (RF)] the proposed CCFSRFG-1 outperforms an existing solution (a CC-based FS, called CCEAFS) and CCFSRFG-2, and also when using all features in terms of accuracy, sensitivity, and specificity

    UQ and AI: data fusion, inverse identification, and multiscale uncertainty propagation in aerospace components

    Get PDF
    A key requirement for engineering designs is that they offer good performance across a range of uncertain conditions while exhibiting an admissibly low probability of failure. In order to design components that offer good performance across a range of uncertain conditions, it is necessary to take account of the effect of the uncertainties associated with a candidate design. Uncertainty Quantification (UQ) methods are statistical methods that may be used to quantify the effect of the uncertainties inherent in a system on its performance. This thesis expands the envelope of UQ methods for the design of aerospace components, supporting the integration of UQ methods in product development by addressing four industrial challenges. Firstly, a method for propagating uncertainty through computational models in a hierachy of scales is described that is based on probabilistic equivalence and Non-Intrusive Polynomial Chaos (NIPC). This problem is relevant to the design of aerospace components as the computational models used to evaluate candidate designs are typically multiscale. This method was then extended to develop a formulation for inverse identification, where the probability distributions for the material properties of a coupon are deduced from measurements of its response. We demonstrate how probabilistic equivalence and the Maximum Entropy Principle (MEP) may be used to leverage data from simulations with scarce experimental data- with the intention of making this stage of product design less expensive and time consuming. The third contribution of this thesis is to develop two novel meta-modelling strategies to promote the wider exploration of the design space during the conceptual design phase. Design Space Exploration (DSE) in this phase is crucial as decisions made at the early, conceptual stages of an aircraft design can restrict the range of alternative designs available at later stages in the design process, despite limited quantitative knowledge of the interaction between requirements being available at this stage. A histogram interpolation algorithm is presented that allows the designer to interactively explore the design space with a model-free formulation, while a meta-model based on Knowledge Based Neural Networks (KBaNNs) is proposed in which the outputs of a high-level, inexpensive computer code are informed by the outputs of a neural network, in this way addressing the criticism of neural networks that they are purely data-driven and operate as black boxes. The final challenge addressed by this thesis is how to iteratively improve a meta-model by expanding the dataset used to train it. Given the reliance of UQ methods on meta-models this is an important challenge. This thesis proposes an adaptive learning algorithm for Support Vector Machine (SVM) metamodels, which are used to approximate an unknown function. In particular, we apply the adaptive learning algorithm to test cases in reliability analysis.Open Acces

    Hybrid Advanced Optimization Methods with Evolutionary Computation Techniques in Energy Forecasting

    Get PDF
    More accurate and precise energy demand forecasts are required when energy decisions are made in a competitive environment. Particularly in the Big Data era, forecasting models are always based on a complex function combination, and energy data are always complicated. Examples include seasonality, cyclicity, fluctuation, dynamic nonlinearity, and so on. These forecasting models have resulted in an over-reliance on the use of informal judgment and higher expenses when lacking the ability to determine data characteristics and patterns. The hybridization of optimization methods and superior evolutionary algorithms can provide important improvements via good parameter determinations in the optimization process, which is of great assistance to actions taken by energy decision-makers. This book aimed to attract researchers with an interest in the research areas described above. Specifically, it sought contributions to the development of any hybrid optimization methods (e.g., quadratic programming techniques, chaotic mapping, fuzzy inference theory, quantum computing, etc.) with advanced algorithms (e.g., genetic algorithms, ant colony optimization, particle swarm optimization algorithm, etc.) that have superior capabilities over the traditional optimization approaches to overcome some embedded drawbacks, and the application of these advanced hybrid approaches to significantly improve forecasting accuracy

    A Novel Hybrid Dimensionality Reduction Method using Support Vector Machines and Independent Component Analysis

    Get PDF
    Due to the increasing demand for high dimensional data analysis from various applications such as electrocardiogram signal analysis and gene expression analysis for cancer detection, dimensionality reduction becomes a viable process to extracts essential information from data such that the high-dimensional data can be represented in a more condensed form with much lower dimensionality to both improve classification accuracy and reduce computational complexity. Conventional dimensionality reduction methods can be categorized into stand-alone and hybrid approaches. The stand-alone method utilizes a single criterion from either supervised or unsupervised perspective. On the other hand, the hybrid method integrates both criteria. Compared with a variety of stand-alone dimensionality reduction methods, the hybrid approach is promising as it takes advantage of both the supervised criterion for better classification accuracy and the unsupervised criterion for better data representation, simultaneously. However, several issues always exist that challenge the efficiency of the hybrid approach, including (1) the difficulty in finding a subspace that seamlessly integrates both criteria in a single hybrid framework, (2) the robustness of the performance regarding noisy data, and (3) nonlinear data representation capability. This dissertation presents a new hybrid dimensionality reduction method to seek projection through optimization of both structural risk (supervised criterion) from Support Vector Machine (SVM) and data independence (unsupervised criterion) from Independent Component Analysis (ICA). The projection from SVM directly contributes to classification performance improvement in a supervised perspective whereas maximum independence among features by ICA construct projection indirectly achieving classification accuracy improvement due to better intrinsic data representation in an unsupervised perspective. For linear dimensionality reduction model, I introduce orthogonality to interrelate both projections from SVM and ICA while redundancy removal process eliminates a part of the projection vectors from SVM, leading to more effective dimensionality reduction. The orthogonality-based linear hybrid dimensionality reduction method is extended to uncorrelatedness-based algorithm with nonlinear data representation capability. In the proposed approach, SVM and ICA are integrated into a single framework by the uncorrelated subspace based on kernel implementation. Experimental results show that the proposed approaches give higher classification performance with better robustness in relatively lower dimensions than conventional methods for high-dimensional datasets

    An advance extended binomial GLMBoost ensemble method with synthetic minority over-sampling technique for handling imbalanced datasets

    Get PDF
    Classification is an important activity in a variety of domains. Class imbalance problem have reduced the performance of the traditional classification approaches. An imbalance problem arises when mismatched class distributions are discovered among the instances of class of classification datasets. An advance extended binomial GLMBoost (EBGLMBoost) coupled with synthetic minority over-sampling technique (SMOTE) technique is the proposed model in the study to manage imbalance issues. The SMOTE is used to solve the proposed model, ensuring that the target variable's distribution is balanced, whereas the GLMBoost ensemble techniques are built to deal with imbalanced datasets. For the entire experiment, twenty different datasets are used, and support vector machine (SVM), Nu-SVM, bagging, and AdaBoost classification algorithms are used to compare with the suggested method. The model's sensitivity, specificity, geometric mean (G-mean), precision, recall, and F-measure resulted in percentages for training and testing datasets are 99.37, 66.95, 80.81, 99.21, 99.37, 99.29 and 98.61, 54.78, 69.88, 98.77, 96.61, 98.68, respectively. With the help of the Wilcoxon test, it is determined that the proposed technique performed well on unbalanced data. Finally, the proposed solutions are capable of efficiently dealing with the problem of class imbalance
    corecore