870 research outputs found

    Optimization-Based Evolutionary Data Mining Techniques for Structural Health Monitoring

    In recent years, data mining has been employed as a comprehensive strategy for solving various Structural Health Monitoring (SHM) problems because of its computational capability. Optimization is one of the most important functions in data mining. In an engineering optimization problem, it is not easy to find an exact solution. In this regard, evolutionary techniques have been applied as part of the procedure for reaching such a solution. Therefore, various metaheuristic algorithms have been developed to solve a variety of engineering optimization problems in SHM. This study presents the most applicable and effective evolutionary techniques used in structural damage identification. To this end, a brief overview of metaheuristic techniques is given. Then the most applicable optimization-based algorithms in structural damage identification are presented, i.e., Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Imperialist Competitive Algorithm (ICA), and Ant Colony Optimization (ACO). Some related examples are also detailed to indicate the efficiency of these algorithms.
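As a rough illustration of how a metaheuristic such as PSO can be applied to damage identification, the sketch below minimizes a toy misfit between candidate and "true" element stiffness factors. It is not taken from the paper: the `misfit` objective, the three-element target `[1.0, 0.7, 1.0]`, and all parameter values (`inertia`, `c1`, `c2`, swarm size) are assumptions chosen only for this example.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best particle swarm optimization (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the particle to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def misfit(stiffness_factors):
    """Hypothetical objective: squared error against an assumed damage state
    (1.0 = undamaged element, 0.7 = element with 30% stiffness loss)."""
    target = [1.0, 0.7, 1.0]
    return sum((s - t) ** 2 for s, t in zip(stiffness_factors, target))


best, best_val = pso_minimize(misfit, dim=3, bounds=(0.0, 1.0))
```

In a real SHM setting the objective would typically compare measured and simulated structural responses rather than a known target; the toy version only shows the search loop.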

    A New Generation of Mixture-Model Cluster Analysis with Information Complexity and the Genetic EM Algorithm

    In this dissertation, we extend several relatively new developments in statistical model selection and data mining in order to improve one of the workhorse statistical tools - mixture modeling (Pearson, 1894). The traditional mixture model assumes data come from several populations of Gaussian distributions. Thus, what remains is to determine the number of distributions, their population parameters, and the mixing proportions. However, real data often do not fit the restrictions of normality very well. It is likely that data from a single population exhibiting either asymmetrical or nonnormal tail behavior could be erroneously modeled as two populations, resulting in suboptimal decisions. To avoid these pitfalls, we develop the mixture model under a broader distributional assumption by fitting a group of multivariate elliptically-contoured distributions (Anderson and Fang, 1990; Fang et al., 1990). Special cases include the multivariate Gaussian and power exponential distributions, as well as the multivariate generalization of the Student’s t. This gives us the flexibility to model nonnormal tail and peak behavior, though the symmetry restriction still exists. The literature has many examples of research generalizing the Gaussian mixture model to other distributions (Farrell and Mersereau, 2004; Hasselblad, 1966; John, 1970a), but our effort is more general. Further, we generalize the mixture model to be non-parametric, by developing two types of kernel mixture model. First, we generalize the mixture model to use the truly multivariate kernel density estimators (Wand and Jones, 1995). Additionally, we develop the power exponential product kernel mixture model, which allows the density to adjust to the shape of each dimension independently. Because kernel density estimators enforce no functional form, both of these methods can adapt to nonnormal asymmetric, kurtotic, and tail characteristics.
Over the past two decades or so, evolutionary algorithms have grown in popularity, as they have provided encouraging results in a variety of optimization problems. Several authors have applied the genetic algorithm - a subset of evolutionary algorithms - to mixture modeling, including Bhuyan et al. (1991), Krishna and Murty (1999), and Wicker (2006). These procedures have the benefit that they bypass computational issues that plague the traditional methods. We extend these initialization and optimization methods by combining them with our updated mixture models. Additionally, we “borrow” results from robust estimation theory (Ledoit and Wolf, 2003; Shurygin, 1983; Thomaz, 2004) in order to data-adaptively regularize population covariance matrices. Numerical instability of the covariance matrix can be a significant problem for mixture modeling, since estimation is typically done on a relatively small subset of the observations. We likewise extend various information criteria (Akaike, 1973; Bozdogan, 1994b; Schwarz, 1978) to the elliptically-contoured and kernel mixture models. Information criteria guide model selection and estimation based on various approximations to the Kullback-Leibler divergence. Following Bozdogan (1994a), we use these tools to sequentially select the best mixture model, select the best subset of variables, and detect influential observations - all without making any subjective decisions. Over the course of this research, we developed a full-featured Matlab toolbox (M3) which implements all the new developments in mixture modeling presented in this dissertation. We show results on both simulated and real world datasets. Keywords: mixture modeling, nonparametric estimation, subset selection, influence detection, evidence-based medical diagnostics, unsupervised classification, robust estimation
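To make the idea of choosing the number of mixture components with an information criterion concrete, here is a minimal sketch. It fits a plain one-dimensional Gaussian mixture with standard EM and compares the Schwarz criterion (BIC) for one and two components. It does not reproduce the dissertation's elliptically-contoured or kernel mixtures, its genetic EM algorithm, or the M3 toolbox; the function names, the quantile initialization, the variance floor, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iters=200, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture by plain EM; return parameters and log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles, shared overall variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate mixing proportions, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_floor, var_j)  # floor guards against collapse

    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )
    return weights, means, variances, loglik


def bic(loglik, k, n):
    """Schwarz criterion; a 1-D k-component mixture has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


# Synthetic bimodal data (assumed for the example only).
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(6.0, 1.0) for _ in range(150)]

fits = {k: fit_gmm_1d(data, k) for k in (1, 2)}
bics = {k: bic(fits[k][3], k, len(data)) for k in fits}
best_k = min(bics, key=bics.get)
```

Lower BIC indicates the preferred model, so on clearly bimodal data the two-component fit should win without any manual choice of `k`.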

    An Overview on Application of Machine Learning Techniques in Optical Networks

    Today's telecommunication networks have become sources of enormous amounts of widely heterogeneous data. This information can be retrieved from network traffic traces, network alarms, signal quality indicators, users' behavioral data, etc. Advanced mathematical tools are required to extract meaningful information from these network-generated data and to make decisions pertaining to the proper functioning of the networks. Among these mathematical tools, Machine Learning (ML) is regarded as one of the most promising methodological approaches to perform network-data analysis and enable automated network self-configuration and fault management. The adoption of ML techniques in the field of optical communication networks is motivated by the unprecedented growth of network complexity faced by optical networks in the last few years. This increase in complexity is due to the introduction of a huge number of adjustable and interdependent system parameters (e.g., routing configurations, modulation format, symbol rate, coding schemes, etc.) that are enabled by the usage of coherent transmission/reception technologies, advanced digital signal processing and compensation of nonlinear effects in optical fiber propagation. In this paper we provide an overview of the application of ML to optical communications and networking. We classify and survey relevant literature dealing with the topic, and we also provide an introductory tutorial on ML for researchers and practitioners interested in this field. Although a good number of research papers have recently appeared, the application of ML to optical networks is still in its infancy: to stimulate further work in this area, we conclude the paper by proposing possible new research directions.

    Renewable Energy Integration in Distribution System with Artificial Intelligence

    With the increasing attention to renewable energy development in distribution power systems, artificial intelligence (AI) can play an indispensable role. In this thesis, a series of artificial intelligence based methods are studied and implemented to further enhance the performance of power system operation and control. Due to the large volume of heterogeneous data provided by both the customer and the grid side, a big data visualization platform is built to bring out the hidden useful knowledge for smart grid (SG) operation, control and situation awareness. An open source cluster computing framework based on Apache Spark is used to discover the information hidden in the big data. The data is transmitted with an Open System Interconnection (OSI) model to the data visualization platform over a high-speed communication architecture. Google Earth and Global Geographic Information System (GIS) are used to design the visualization platform and display the results. Based on the data visualization platform above, the external manifestation of the data is studied. In the following work, I try to understand the internal hidden information of the data. A short-term load forecasting approach is designed based on support vector regression (SVR) to provide more accurate load forecasts for the network reconfiguration. The nonconvex three-phase balanced optimal power flow (OPF) problem is relaxed into a second-order cone program (SOCP). The alternating direction method of multipliers (ADMM) is used to compute the optimal power flow in a distributed manner. Considering the reality of distribution systems, a three-phase unbalanced distribution system is built, which consists of the hourly operation scheduling at substation level and the minute-level power flow operation at feeder level. The operation cost of the system with renewable generation is minimized at substation level.
The stochastic distribution model of renewable generation is simulated with a chance constraint, and the derived deterministic form is modeled with a Gaussian Mixture Model (GMM) with genetic algorithm-based expectation-maximization (GAEM). The system cost is further reduced with OPF in real-time (RT) scheduling. Semidefinite programming (SDP) is used to relax the nonconvexity of the three-phase unbalanced distribution system into a convex problem, which helps to achieve the global optimal result. Run in parallel, the ADMM obtains the results in a short time. Clouds have a big impact on solar energy forecasting. Firstly, a convolutional neural network based method is used to estimate the solar irradiance. Secondly, the regression results are collected to predict the renewable generation. After that, a novel approach is proposed to capture the global horizontal irradiance (GHI) conveniently and accurately. Considering the nonstationary property of the GHI on cloudy days, the GHI capturing is cast as an image regression problem. In traditional approaches, the image regression problem is treated as two parts, feature extraction and regression, which are optimized separately with no interconnection. Considering the nonlinear regression capability, a convolutional neural network (CNN) based image regression approach is proposed to provide an end-to-end solution for the cloudy day GHI capturing problem in this paper. For data cleaning, the Gaussian mixture model with Bayesian inference is employed to detect and eliminate the anomalous data in a nonparametric manner. The purified data are used as input data for the proposed image regression approach. The numerical results demonstrate the feasibility and effectiveness of the proposed approach.
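The distributed use of ADMM mentioned in this abstract can be illustrated with a much simpler stand-in problem. The sketch below is not the thesis's OPF formulation: it applies scaled-form consensus ADMM to a toy problem in which each agent `i` holds a private value `a_i` and all agents must agree on the `z` that minimizes the sum of `0.5 * (z - a_i)**2`, whose solution is the average of the `a_i`. The function name, `rho`, and the sample values are assumptions for illustration.

```python
def admm_consensus_mean(local_values, rho=1.0, n_iters=100):
    """Scaled-form consensus ADMM for min sum_i 0.5*(x_i - a_i)^2 s.t. x_i = z."""
    n = len(local_values)
    u = [0.0] * n  # scaled dual variables, one per agent
    z = 0.0        # shared consensus variable
    for _ in range(n_iters):
        # Local step (could run in parallel): each agent minimizes
        # 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2 in closed form.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(local_values, u)]
        # Coordination step: average the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z


# Hypothetical per-agent values (e.g. locally preferred set-points).
agent_values = [2.0, 4.0, 9.0, 1.0]
consensus = admm_consensus_mean(agent_values)
```

Only the local step touches an agent's private data, which is the property that makes the method attractive for distributed computation.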