
    Profile control charts based on nonparametric L-1 regression methods

    Classical statistical process control often relies on univariate characteristics. In many contemporary applications, however, product quality must be characterized by a functional relation between a response variable and its explanatory variables. Monitoring such functional profiles has been a rapidly growing field due to increasing demands. This paper develops a novel nonparametric L-1 location-scale model to screen the shapes of profiles. The model is built on three basic elements: location shifts, local shape distortions, and overall shape deviations, which are quantified by three individual metrics. The proposed approach is applied to the previously analyzed vertical density profile data, leading to some interesting insights. Comment: Published at http://dx.doi.org/10.1214/11-AOAS501 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
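    The following sketch (a minimal illustration, not the paper's actual estimators) shows how a profile could be reduced to three such metrics: a location shift, a worst-case local distortion, and an L1-type overall shape deviation. The nominal profile, the grid, and the formulas here are assumptions chosen only to mirror the decomposition described above.

        # Illustrative sketch, NOT the paper's exact metrics: three simple
        # statistics mirroring the decomposition into a location shift, a
        # local shape distortion, and an overall shape deviation.
        import numpy as np

        def profile_metrics(y_obs, y_nom):
            resid = y_obs - y_nom
            location_shift = np.median(resid)             # overall location shift
            centered = resid - location_shift
            local_distortion = np.max(np.abs(centered))   # worst pointwise distortion
            overall_deviation = np.mean(np.abs(centered)) # L1-type shape deviation
            return location_shift, local_distortion, overall_deviation

        x = np.linspace(0, 1, 200)
        y_nom = 50 + 10 * np.exp(-30 * (x - 0.5) ** 2)    # assumed nominal profile
        rng = np.random.default_rng(5)
        y_obs = y_nom + 0.8 + rng.normal(0, 0.3, x.size)  # shifted, noisy profile
        print(profile_metrics(y_obs, y_nom))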

    A data-driven approach for Network Intrusion Detection and Monitoring based on Kernel Null Space

    In this study, we propose a new approach for detecting network intrusions in real time, based on a statistical process control technique and the kernel null space method. The training samples of a class are mapped to a single point using the Kernel Null Foley-Sammon Transform. Novelty scores are then computed for test samples, and a threshold on these scores drives the real-time detection of anomalies. The efficiency of the proposed method is illustrated on the KDD99 data set. The experimental results show that the new method outperforms the OCSVM and the original kernel null space method by 1.53% and 3.86%, respectively, in terms of accuracy.
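    The overall pipeline can be pictured with the minimal sketch below. It is a simplified stand-in, not the Kernel Null Foley-Sammon Transform itself: the novelty score here is the squared feature-space distance to the kernel mean map of the training class, and the control limit is a conventional mean-plus-three-sigma SPC threshold. All data and parameter values are synthetic placeholders.

        # Simplified stand-in for KNFST-style novelty detection: score each
        # sample by its squared feature-space distance to the kernel mean map
        # of the normal class, then flag scores above an SPC control limit.
        import numpy as np

        def rbf_kernel(A, B, gamma=0.1):
            sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
            return np.exp(-gamma * sq)

        def novelty_scores(X_train, X_test, gamma=0.1):
            K_tt = rbf_kernel(X_test, X_test, gamma)
            K_tr = rbf_kernel(X_test, X_train, gamma)
            K_rr = rbf_kernel(X_train, X_train, gamma)
            return np.diag(K_tt) - 2 * K_tr.mean(axis=1) + K_rr.mean()

        rng = np.random.default_rng(0)
        normal = rng.normal(size=(200, 10))           # stand-in for normal traffic
        held_out = rng.normal(size=(50, 10))
        attacks = rng.normal(loc=2.0, size=(20, 10))  # stand-in for intrusions
        ref = novelty_scores(normal, held_out)
        limit = ref.mean() + 3 * ref.std()            # SPC-style control limit
        flags = novelty_scores(normal, attacks) > limit
        print(f"flagged {flags.sum()} of {len(attacks)} anomalous samples")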

    Quantifying Forecast Uncertainty in the Energy Domain

    This dissertation focuses on quantifying forecast uncertainty in the energy domain, especially for the electricity and natural gas industries. Accurate forecasts help the energy industry minimize production costs. However, inaccurate weather forecasts, unusual human behavior, sudden changes in economic conditions, and the unpredictable availability of renewable sources (wind and solar) all introduce uncertainty into the energy demand-supply chain. In the current smart grid era, total electricity demand from non-renewable sources is influenced by the uncertainty of the renewable sources. Quantifying forecast uncertainty has therefore become important for improving the quality of forecasts and of decision making. In the natural gas industry, gas controllers must guide the hourly natural gas flow so that it remains within daily maximum and minimum flow limits to avoid penalties. Because of the inherent uncertainty in natural gas forecasts, setting such limits a day or more in advance is difficult. Probabilistic forecasts (cumulative distribution functions), which quantify forecast uncertainty, are a useful tool to guide gas controllers in making these decisions. Three methods (parametric, semi-parametric, and non-parametric) are presented in this dissertation to generate 168-hour-horizon probabilistic forecasts for two real utilities (electricity and natural gas) in the US. Probabilistic forecasting is used as a tool to solve a real-life problem in the natural gas industry. A benchmark was created from the existing solution, which assumes the forecast error is normally distributed; the two new probabilistic forecasting methods implemented in this work drop the normality assumption. No single evaluation technique for probabilistic forecasts is widely accepted, which is one reason they are underused. Existing scoring rules are complicated, dataset dependent, and place less emphasis on reliability (how well the empirical distribution matches the observed distribution) than on sharpness (the smallest distance between any two quantiles of a CDF). A graphical way to evaluate probabilistic forecasts, along with two new scoring rules, is offered in this work. The non-parametric and semi-parametric methods outperformed the benchmark on unusual days (days that are difficult to forecast) as well as on ordinary days.
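    As a concrete illustration of the non-parametric idea, the sketch below builds hourly quantile forecasts by shifting a point forecast with the empirical quantiles of past forecast errors, and evaluates them with the standard pinball (quantile) loss. The data are synthetic placeholders, and this is only one plausible reading of a non-parametric method, not the dissertation's exact procedure.

        # Hedged sketch: non-parametric probabilistic forecast from empirical
        # error quantiles, scored with the standard pinball (quantile) loss.
        import numpy as np

        def empirical_quantile_forecast(point_fcst, past_errors, taus):
            eq = np.quantile(past_errors, taus)       # empirical error quantiles
            return point_fcst[:, None] + eq[None, :]  # (horizon, n_quantiles)

        def pinball_loss(y, q_fcst, taus):
            diff = y[:, None] - q_fcst
            return np.mean(np.maximum(taus * diff, (taus - 1) * diff))

        rng = np.random.default_rng(1)
        taus = np.arange(0.05, 1.0, 0.05)
        past_errors = rng.normal(0, 5, size=1000)            # historical errors
        point_fcst = 100 + 20 * np.sin(np.arange(168) / 24)  # 168-hour point forecast
        actual = point_fcst + rng.normal(0, 5, size=168)
        q_fcst = empirical_quantile_forecast(point_fcst, past_errors, taus)
        print("pinball loss:", pinball_loss(actual, q_fcst, taus))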

    Life-cycle cost analysis and probabilistic cost estimating in engineering design using an air duct design case study

    Although the issue of uncertainty in cost model parameters is recognized as an important aspect of life-cycle cost analysis, it is often ignored or poorly treated in cost estimating. A simulation approach is proposed that employs kernel estimation techniques and their asymptotic properties to develop the probability distribution functions (PDFs) of cost estimates. This eliminates the guesswork inherent in current simulation-based cost estimating procedures, reduces the amount of data sampled, and makes it easier to specify the desired accuracy of the estimated distribution. Building energy costs can be reduced considerably if air duct systems are designed for the least life-cycle cost. The IPS-Method, a simple approach to HVAC air duct design, is suggested, and the Diameter and Enhanced Friction Charts are developed. These charts implicitly incorporate the life-cycle cost (LCC) and improve on the existing Friction Chart for the selection of duct sizes; illustrative examples demonstrate their ease of use and effectiveness. For more complex designs, a Segregated Genetic Algorithm (SGA) is recommended. A sample problem with variable time-of-day operating conditions and utility rates is used to illustrate its capabilities, and the results are compared to those obtained using weighted average flow rates and utility rates to show the possible life-cycle cost savings. Although these savings may be only between 0.4% and 8.3% for some simple designs, much larger savings may occur with more complex designs and operating constraints. The SGA is combined with probabilistic cost estimating to optimize HVAC air duct systems with uncertainties in the model parameters. The designs based on the SGA method tended to be less sensitive to typical variations in the component physical parameters and are therefore expected to result in lower balancing and operating costs.
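    The simulation idea can be sketched as follows: propagate uncertain cost-model parameters through a life-cycle cost calculation by Monte Carlo, then estimate the PDF of the result with a kernel density estimate. The cost model, parameter distributions, and rates below are assumptions for illustration, not values from the dissertation.

        # Monte Carlo life-cycle cost (LCC) with a kernel density estimate of
        # the resulting cost distribution; all inputs are illustrative.
        import numpy as np
        from scipy.stats import gaussian_kde

        rng = np.random.default_rng(2)
        n = 10_000
        initial_cost = rng.normal(50_000, 5_000, n)   # assumed installed cost ($)
        annual_energy = rng.normal(8_000, 1_200, n)   # assumed yearly energy cost ($)
        discount_rate, years = 0.05, 20

        # Present-worth factor for a uniform annual series.
        pwf = (1 - (1 + discount_rate) ** -years) / discount_rate
        lcc = initial_cost + annual_energy * pwf

        kde = gaussian_kde(lcc)                       # smoothed PDF of the LCC
        grid = np.linspace(lcc.min(), lcc.max(), 200)
        pdf = kde(grid)
        print(f"mean LCC: ${lcc.mean():,.0f}, 90th pct: ${np.quantile(lcc, 0.9):,.0f}")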

    Template estimation for samples of curves and functional calibration estimation via the method of maximum entropy on the mean

    One of the main difficulties in functional data analysis is the extraction of a meaningful common pattern that summarizes the information conveyed by all functions in the sample. Chapter 2 considers the problem of finding a meaningful template function that represents this pattern, assuming that the functional data lie on, or sufficiently close to, an intrinsically low-dimensional smooth manifold with an unknown underlying geometric structure, embedded in a high-dimensional space. Under this setting, an approximation of the geodesic distance is developed based on a robust version of the Isomap algorithm. This approximation is used to compute the corresponding empirical Fréchet median function, which provides a robust intrinsic estimator of the template. Chapter 3 investigates the asymptotic properties of the quantile normalization method of Bolstad et al. (2003), one of the most popular methods for aligning density curves in microarray data analysis. The properties are proved by considering the method as a particular case of the structural mean curve alignment procedure of Dupuy, Loubes and Maza (2011). However, the method fails in some cases involving mixtures, and a new methodology to cope with this issue is proposed via the algorithm developed in Chapter 2. Finally, Chapter 4 studies calibration estimation for the finite population mean of a survey variable in a functional data framework. The functional calibration sampling weights of the estimator are obtained by matching the calibration estimation problem with the maximum entropy on the mean (MEM) principle. In particular, calibration estimation is viewed as an infinite-dimensional linear inverse problem following the structure of the MEM approach. A precise theoretical setting is given, and the estimation of functional calibration weights assuming, as prior measures, the centered Gaussian and compound Poisson random measures is carried out.
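    For context, quantile normalization in the sense of Bolstad et al. (2003), whose asymptotics Chapter 3 studies, forces every column (array) to share the same empirical distribution, namely the mean of the column-wise order statistics. A minimal sketch, ignoring ties:

        # Quantile normalization (Bolstad et al. 2003): replace each column's
        # values with the mean order statistics, reassigned by rank.
        import numpy as np

        def quantile_normalize(X):
            order = np.argsort(X, axis=0)          # per-column sort order
            ref = np.sort(X, axis=0).mean(axis=1)  # mean of order statistics
            out = np.empty_like(X, dtype=float)
            for j in range(X.shape[1]):
                out[order[:, j], j] = ref          # reference values back by rank
            return out

        rng = np.random.default_rng(3)
        X = rng.lognormal(mean=[0.0, 0.5, 1.0], sigma=1.0, size=(1000, 3))
        Xn = quantile_normalize(X)
        print(np.round(Xn.mean(axis=0), 3))        # columns now share a distribution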

    Statistical Process Monitoring of Isolated and Persistent Defects in Complex Geometrical Shapes

    Traditional Statistical Process Control methodologies face several challenges when monitoring defects in complex geometries, such as those of products obtained via Additive Manufacturing techniques. Many approaches cannot be applied in these settings because of the high dimensionality of the data and the lack of parametric and distributional assumptions on the object shapes. Motivated by a case study involving the monitoring of egg-shaped trabecular structures, we investigate two recently proposed methodologies for detecting deviations from the nominal in-control (IC) model caused by excess or lack of material. Our study focuses on the detection of isolated large changes in the geometric structure as well as persistent small deviations. We compare the approach of Scimone et al. (2022) with that of Zhao and del Castillo (2021) for monitoring defects in a small Phase I sample of 3D-printed objects. While the former control chart is able to detect large defects, the latter allows the detection of nonconforming objects with persistent small defects. Furthermore, we address the fundamental issue of selecting the number of eigenvalues to be monitored in Zhao and del Castillo's method by proposing a dimensionality reduction technique based on kernel principal components. This approach is shown to provide good detection capability even when a large number of eigenvalues is considered. By leveraging the sensitivity of the two monitoring schemes to different magnitudes of nonconformity, we also propose a novel joint monitoring scheme capable of identifying both types of defects in the considered case study. Computer code in R and Matlab that implements these methods and replicates the results is available as part of the supplementary material. Comment: 39 pages, 5 figures, 3 tables.
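    One plausible reading of the eigenvalue-selection step is sketched below: run kernel PCA on features extracted from the scanned objects and keep the smallest number of kernel principal components explaining a fixed share of variance. The feature matrix, kernel, and 90% threshold are assumptions, not the paper's exact choices.

        # Hedged sketch: choose how many kernel principal components to
        # monitor via a cumulative explained-variance rule.
        import numpy as np
        from sklearn.decomposition import KernelPCA

        rng = np.random.default_rng(4)
        X = rng.normal(size=(60, 500))     # 60 Phase I objects, 500 shape features

        kpca = KernelPCA(kernel="rbf", gamma=1e-3, n_components=59)
        scores = kpca.fit_transform(X)

        eig = kpca.eigenvalues_            # 'lambdas_' in older scikit-learn
        cum = np.cumsum(eig) / eig.sum()
        k = int(np.searchsorted(cum, 0.90)) + 1
        monitored = scores[:, :k]          # components passed to the control chart
        print(f"monitoring {k} kernel principal components")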