2,245 research outputs found

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    The term big data characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) but a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features; many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
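
The divide-and-conquer idea behind cooperative co-evolution can be illustrated with a minimal sketch. It is not the method of the article above: the toy `fitness` function, the `relevant` set, and the fixed-size grouping are all assumptions made for illustration, and an exhaustive search over each small group stands in for the evolutionary sub-population that a real implementation would use.

```python
from itertools import product

def fitness(mask, relevant, penalty=0.1):
    """Toy objective (assumed): reward selected relevant features, penalize subset size."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    return hits - penalty * sum(mask)

def cooperative_coevolution(n_features, group_size, relevant, rounds=2):
    # Context vector: the current best full solution, shared by all groups.
    context = [0] * n_features
    # Decompose the feature indices into fixed-size sub-problems.
    groups = [list(range(s, min(s + group_size, n_features)))
              for s in range(0, n_features, group_size)]
    for _ in range(rounds):
        for group in groups:
            best_bits, best_score = None, float("-inf")
            # Exhaustive search stands in for an evolutionary sub-population.
            for bits in product((0, 1), repeat=len(group)):
                candidate = context[:]
                for idx, bit in zip(group, bits):
                    candidate[idx] = bit
                # Each sub-solution is scored inside the full context vector.
                score = fitness(candidate, relevant)
                if score > best_score:
                    best_bits, best_score = bits, score
            # Write the best sub-solution back into the shared context.
            for idx, bit in zip(group, best_bits):
                context[idx] = bit
    return context

selected = cooperative_coevolution(n_features=12, group_size=4, relevant={1, 5, 6, 10})
print([i for i, bit in enumerate(selected) if bit])  # prints [1, 5, 6, 10]
```

Because each group is evaluated only together with the shared context vector, the per-group searches are independent within a round, which is the property that makes the approach a natural fit for a map step in a MapReduce-style job.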

    Online Detection of Outliers and Structural Breaks using Sequential Monte Carlo Methods

    Outliers and structural breaks occur quite frequently in time series data. Whereas outliers often contain valuable information about the process under study, they are known to have a serious negative impact on statistical data analysis. The most obvious effects are model misspecification and biased parameter estimation, which result in wrong conclusions and inaccurate predictions. Structural time series consist of underlying features such as level, slope, cycles or seasonal components. Structural breaks are permanent disruptions of one or more of these components and might be a signal of serious changes in the observed process. Detecting outliers and estimating the location of structural breaks has become increasingly important, both as a theoretical research problem and as an essential part of applied data analysis. Applications include finance, industrial manufacturing, medical informatics, and severe weather prediction. Given that these data arrive frequently and sequentially in time, fast, reliable and accurate detection techniques are required. We propose a model from the class of state-space models of the form $y_{t} = f(X_{t}, \psi, v_{t})$ and $X_{t} = g(X_{t-1}, \psi, w_{t})$, where $\{X_{t}\}_{t \geq 0}$ is a hidden Markov state process. The inference of $\{X_{t}\}_{t \geq 0}$ depends on the observation process $\{y_{t}\}_{t \geq 1}$ and the parameter vector $\psi$, whose elements are usually unknown. The innovations $v_{t}$ and $w_{t}$ are conditionally Gaussian given the precision parameter $\lambda$ and auxiliary state $\omega$. We employ sequential Monte Carlo techniques to approximate the joint target distribution $p(X_{0:t}, \psi \mid y_{1:t})$. The posterior estimates for the auxiliary states $\omega$ will be used to identify outliers and structural breaks.
The results show that the algorithm is comparable to traditional, computationally expensive MCMC and superior to standard techniques such as the Exponentially Weighted Moving Average (EWMA), Shewhart, and cumulative sum (CUSUM) control charts.
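
For readers unfamiliar with the baselines named in this abstract, a two-sided tabular CUSUM chart can be sketched as follows. This illustrates only the comparison technique, not the proposed sequential Monte Carlo method; the function name `cusum_alarms`, the `slack` and `threshold` parameters, and the synthetic series are assumptions chosen for the example.

```python
def cusum_alarms(series, target, slack, threshold):
    """Two-sided tabular CUSUM: return indices where either sum exceeds threshold."""
    high, low = 0.0, 0.0
    alarms = []
    for t, y in enumerate(series):
        # Accumulate deviations above and below the target, less a slack allowance.
        high = max(0.0, high + (y - target - slack))
        low = max(0.0, low + (target - y - slack))
        if high > threshold or low > threshold:
            alarms.append(t)
            high, low = 0.0, 0.0  # restart the chart after signalling
    return alarms

# Synthetic series (assumed): a level shift from 0 to 2 at index 5.
series = [0.0] * 5 + [2.0] * 5
alarms = cusum_alarms(series, target=0.0, slack=0.5, threshold=4.0)
print(alarms)  # prints [7]
```

The alarm arrives two steps after the shift because the upper sum needs three increments of 1.5 to pass the threshold of 4.0. This detection delay is the kind of trade-off against false alarms that a chart of this type has to tune.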

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing is done in real time or takes hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    NASA Space Engineering Research Center Symposium on VLSI Design

    The NASA Space Engineering Research Center (SERC) is proud to offer, at its second symposium on VLSI design, presentations by an outstanding set of individuals from national laboratories and the electronics industry. These featured speakers share insights into next generation advances that will serve as a basis for future VLSI design. Questions of reliability in the space environment along with new directions in CAD and design are addressed by the featured speakers

    Statistical Methods for Semiconductor Manufacturing

    In this thesis, techniques for non-parametric modeling, machine learning, filtering and prediction, and run-to-run control for semiconductor manufacturing are described. In particular, algorithms have been developed for two major application areas: Virtual Metrology (VM) systems and Predictive Maintenance (PdM) systems. Both technologies have proliferated in recent years in semiconductor manufacturing plants, called fabs, in order to increase productivity and decrease costs. VM systems aim at predicting quantities on the wafer, the basic product of the semiconductor industry, that may or may not be physically measurable. These quantities are usually costly to measure in economic or temporal terms; the prediction is based on process variables and/or logistic information on the production that, by contrast, are always available and can be used for modeling at no further cost. PdM systems, on the other hand, aim at predicting when a maintenance action has to be performed. This approach to maintenance management, based like VM on statistical methods and on the availability of process/logistic data, contrasts with two classical approaches: Run-to-Failure (R2F), where no interventions are performed on the machine/process until a new breakdown or specification violation occurs in production; and Preventive Maintenance (PvM), where maintenance is scheduled in advance based on time intervals or production iterations. Neither approach is optimal: they do not guarantee that breakdowns and wafer waste will not happen and, in the case of PvM, they may lead to unnecessary maintenance without fully exploiting the lifetime of the machine or process.
The main goal of this thesis is to prove, through several applications and feasibility studies, that the use of statistical modeling algorithms and control systems can improve the efficiency, yield and profits of a manufacturing environment such as the semiconductor one, where large amounts of data are recorded and can be employed to build mathematical models. We present several original contributions, in the form of both applications and methods. The introduction of this thesis gives an overview of the semiconductor fabrication process: the most common practices in Advanced Process Control (APC) systems and the major issues for engineers and statisticians working in this area are presented. Furthermore, we illustrate the methods and mathematical models used in the applications. We then discuss in detail the following applications:
- A VM system for estimating the thickness deposited on the wafer by the Chemical Vapor Deposition (CVD) process, which exploits Fault Detection and Classification (FDC) data. In this tool, a new clustering algorithm based on Information Theory (IT) elements is proposed. In addition, the Least Angle Regression (LARS) algorithm is applied for the first time to VM problems.
- A new VM module for a multi-step (CVD, Etching and Lithography) line, in which Multi-Task Learning techniques are employed.
- A new Machine Learning algorithm based on Kernel Methods for estimating scalar outputs from time series inputs.
- Run-to-Run control algorithms that exploit both physical measurements and statistical ones (coming from a VM system); this tool is based on IT elements.
- A PdM module based on filtering and prediction techniques (Kalman Filter, Monte Carlo methods) for predicting maintenance interventions in the Epitaxy process.
- A PdM system based on Elastic Nets for maintenance prediction in an Ion Implantation tool.
Several of these works were developed in collaboration with major European semiconductor companies in the framework of the EU FP7 project IMPROVE (Implementing Manufacturing science solutions to increase equiPment pROductiVity and fab pErformance); these collaborations are specified throughout the thesis, highlighting the practical aspects of implementing the proposed technologies in a real industrial environment.

    Degradation modeling and degradation-aware control of power electronic systems

    The power electronics market was valued at USD 23.25 billion in 2019 and is projected to reach USD 36.64 billion by 2027. Power electronic systems (PES) have been extensively used in a wide range of critical applications, including automotive, renewable energy, and industrial variable-frequency drives. Thus, the reliability and robustness of PESs are immensely important for the smooth operation of mission-critical applications. Power semiconductor switches are among the most vulnerable components in a PES, and their vulnerability affects the reliability and robustness of the whole system. Switch-health monitoring and prognosis are therefore critical for avoiding unexpected shutdowns and preventing catastrophic failures. The importance of prognosis studies increases dramatically with the growing popularity of next-generation power semiconductor switches, the wide bandgap switches. These switches show immense promise in high-power, high-frequency operation due to their higher breakdown voltage and lower switching loss, but their wide adoption is limited by inadequate reliability studies. A thorough prognosis study comprising switch degradation modeling, remaining useful life (RUL) estimation, and degradation-aware controller development is important to enhance the robustness of PESs, especially with wide bandgap switches. In this dissertation, three studies are conducted to achieve these objectives: 1) Insulated Gate Bipolar Transistor (IGBT) degradation modeling and RUL estimation, 2) cascode Gallium Nitride (GaN) Field-Effect Transistor (FET) degradation modeling and RUL estimation, and 3) degradation-aware controller design for a PES, the solid-state transformer (SST). The first two studies address the significant variation in RUL estimation and propose degradation identification methods for the IGBT and the cascode GaN FET.
In the third study, a system-level integration of the switch degradation model is implemented in the SST. The insight into the switch's degradation pattern from the first two studies is used to develop a degradation-aware controller for the SST. State-of-the-art controllers do not consider switch degradation, which results in premature system failure. The proposed low-complexity, degradation-aware and adaptive SST controller ensures optimal degradation-aware power transfer and robust operation over the lifetime.

    Feature Selection For The Fuzzy Artmap Neural Network Using A Hybrid Genetic Algorithm And Tabu Search

    The performance of Neural-Network (NN)-based classifiers is strongly dependent on the data set used for learning.