4,230 research outputs found

    Uncertainty modelling in power spectrum estimation of environmental processes

    Get PDF
    For efficient reliability analysis of buildings and structures, robust load models are required in stochastic dynamics, which can be estimated in particular from environmental processes such as earthquakes or wind loads. To determine the response behaviour of a dynamic system under such loads, the power spectral density (PSD) function is a widely used tool for identifying the frequency components and corresponding amplitudes of environmental processes. Since the real data records required for this purpose are often subject to aleatory and epistemic uncertainties, and the PSD estimation process itself can induce further uncertainties, a rigorous quantification of these is essential; otherwise a highly inaccurate load model could be generated which may yield misleading simulation results. A system behaviour that is actually catastrophic can thus be shifted into an acceptable range, classifying the system as safe even though it is exposed to a high risk of damage or collapse. To address these issues, alternative load models are proposed using probabilistic and non-deterministic models that are able to account efficiently for these uncertainties and to model the loadings accordingly. Various methods are used in the generation of these load models, selected in particular according to the characteristics of the data and the number of available records. If multiple data records are available, reliable statistical information can be extracted from a set of similar PSD functions that differ, for instance, only slightly in shape and peak frequency. Based on these statistics, a PSD function model is derived utilising subjective probabilities to capture the epistemic uncertainties and represent this information effectively. The spectral densities are characterised as random variables instead of discrete values, and thus the PSD function itself represents a non-stationary random process comprising a range of possible valid PSD functions for a given data set. If only a limited number of data records is available, such reliable statistical information cannot be derived. Therefore, an interval-based approach is proposed that determines only an upper and lower bound and does not rely on any distribution within these bounds. A set of discrete-valued PSD functions is transformed into an interval-valued PSD function by optimising the weights of pre-derived basis functions from a Radial Basis Function Network such that they form an upper and lower bound encompassing the data set. In this way, a range of possible values and system responses is identified rather than discrete values, which allows the epistemic uncertainties to be quantified. When generating such a load model from real data records, the problem can arise that the individual records exhibit a high spectral variance in the frequency domain and therefore differ too much from each other, although they appear similar in the time domain. A load model derived from these data may not cover the entire spectral range and is therefore not representative. The data are therefore grouped according to their similarity using the Bhattacharyya distance and the k-means algorithm, which may yield two or more load models from the entire data set. These can be applied separately to the structure under investigation, leading to more accurate simulation results.
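    As a rough illustration of the grouping step described above, the sketch below estimates PSDs for an ensemble of records with Welch's method and clusters them with a k-means-style loop that uses the Bhattacharyya distance for assignment. The records, the cluster count and the use of arithmetic-mean centroids are illustrative assumptions, not the thesis' actual procedure.

```python
# Minimal sketch: estimate PSDs from an ensemble of records with Welch's method,
# then group them by spectral similarity using the Bhattacharyya distance inside
# a simple k-means-style loop. Data and cluster count are illustrative only.
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(0)

def bhattacharyya(p, q):
    """Bhattacharyya distance between two discrete 'distributions' (normalised PSDs)."""
    return -np.log(np.sum(np.sqrt(p * q)) + 1e-300)

def estimate_psds(records, fs):
    """Welch PSD for every record, normalised to unit area so they compare like densities."""
    psds = []
    for x in records:
        f, pxx = welch(x, fs=fs, nperseg=256)
        psds.append(pxx / np.trapz(pxx, f))
    return f, np.array(psds)

def kmeans_bhattacharyya(psds, k, n_iter=50):
    """k-means-style grouping: Bhattacharyya assignment, arithmetic-mean centroids."""
    centroids = psds[rng.choice(len(psds), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.array([[bhattacharyya(p, c) for c in centroids] for p in psds])
        labels = d.argmin(axis=1)
        centroids = np.array([psds[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                              for j in range(k)])
    return labels

# Illustrative ensemble: two families of narrow-band processes with different peak frequencies.
fs, n = 100.0, 4096
t = np.arange(n) / fs
records = [np.sin(2*np.pi*(2.0 + 0.1*rng.standard_normal())*t) + 0.3*rng.standard_normal(n) for _ in range(10)]
records += [np.sin(2*np.pi*(8.0 + 0.1*rng.standard_normal())*t) + 0.3*rng.standard_normal(n) for _ in range(10)]

f, psds = estimate_psds(records, fs)
labels = kmeans_bhattacharyya(psds, k=2)
print(labels)
```
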
    This approach can also be used to estimate the spectral similarity of individual data sets in the frequency domain, which is particularly relevant for the load models mentioned above. If the uncertainties are modelled directly in the time signal, it can be a challenging task to transform them efficiently into the frequency domain. Such a signal may consist only of reliable bounds within which the actual signal lies. A method is presented that can automatically propagate this interval uncertainty through the discrete Fourier transform, obtaining the exact bounds on the Fourier amplitude and an estimate of the PSD function. The method allows such an interval signal to be propagated without making assumptions about the dependence and distribution of the error over the time steps. These novel representations of load models are able to quantify epistemic uncertainties inherent in real data records and induced by the PSD estimation process. The strengths and advantages of these approaches in practice are demonstrated by means of several numerical examples from the field of stochastic dynamics.
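    To make the interval propagation step concrete, the following sketch pushes a sample-wise interval signal through the DFT with interval arithmetic. Because the DFT is linear in the samples, the bounds on the real and imaginary parts are exact; the amplitude bounds derived from them here are only conservative enclosures, whereas the method described above obtains the exact amplitude bounds.

```python
# Minimal sketch: propagate an interval-valued time signal through the DFT with
# interval arithmetic. The real/imaginary-part bounds are exact because the DFT is
# linear in the samples; the amplitude bounds derived from the bounding box below
# are merely conservative enclosures, not the exact bounds of the method the
# abstract refers to.
import numpy as np

def interval_dft_amplitude(lo, hi):
    """Bounds on |X_k| for a signal known only to lie sample-wise in [lo, hi]."""
    n = len(lo)
    amp_lo, amp_hi = np.empty(n), np.empty(n)
    for k in range(n):
        c = np.cos(2 * np.pi * k * np.arange(n) / n)
        s = -np.sin(2 * np.pi * k * np.arange(n) / n)
        # Linear form: pick the endpoint that minimises/maximises each term.
        re_lo = np.sum(np.where(c >= 0, lo, hi) * c)
        re_hi = np.sum(np.where(c >= 0, hi, lo) * c)
        im_lo = np.sum(np.where(s >= 0, lo, hi) * s)
        im_hi = np.sum(np.where(s >= 0, hi, lo) * s)
        # Conservative amplitude bounds from the bounding box of (Re, Im).
        re_min_abs = 0.0 if re_lo <= 0.0 <= re_hi else min(abs(re_lo), abs(re_hi))
        im_min_abs = 0.0 if im_lo <= 0.0 <= im_hi else min(abs(im_lo), abs(im_hi))
        amp_lo[k] = np.hypot(re_min_abs, im_min_abs)
        amp_hi[k] = np.hypot(max(abs(re_lo), abs(re_hi)), max(abs(im_lo), abs(im_hi)))
    return amp_lo, amp_hi

# Illustrative interval signal: a sine wave with +/-0.1 measurement bounds.
t = np.linspace(0.0, 1.0, 128, endpoint=False)
x = np.sin(2 * np.pi * 5 * t)
amp_lo, amp_hi = interval_dft_amplitude(x - 0.1, x + 0.1)
mid_amp = np.abs(np.fft.fft(x))
assert np.all(amp_lo <= mid_amp + 1e-9) and np.all(mid_amp <= amp_hi + 1e-9)
```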

    A Fuzzy-Random Extension of the Lee-Carter Mortality Prediction Model

    Get PDF
    The Lee-Carter model is a useful dynamic stochastic model to represent the evolution of central mortality rates over time. This model only considers the uncertainty about the coefficient related to the mortality trend over time, but not that of the age-dependent coefficients. This paper proposes a fuzzy-random extension of the Lee-Carter model that allows quantifying the uncertainty of both kinds of parameters. As is commonplace in the actuarial literature, the variability of the time-dependent index is modeled as an ARIMA time series. Likewise, the uncertainty of the age-dependent coefficients is also quantified, but by using triangular fuzzy numbers. This last hypothesis requires developing and solving a fuzzy regression model. Once the fuzzy-random extension has been introduced, it is also shown how to obtain some variables linked with central mortality rates, such as death probabilities or life expectancies, by using fuzzy number arithmetic. The applicability of our developments is shown with data on the Spanish male population over the period 1970-2012. Finally, we make a comparative assessment of our method against alternative Lee-Carter model estimates on 16 Western European populations
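    For orientation, the sketch below fits the classical Lee-Carter backbone (a_x, b_x, k_t via SVD, ARIMA forecast of k_t) that the proposed fuzzy-random extension builds on, and indicates a triangular fuzzy number only through a small alpha-cut helper. The synthetic mortality matrix and the fuzzy spreads are assumptions; the paper's fuzzy regression is not reproduced.

```python
# Minimal sketch of the classical Lee-Carter backbone the paper extends:
# fit a_x, b_x, k_t by SVD and model k_t as an ARIMA process. The triangular
# fuzzy numbers for the age coefficients are only indicated by a tiny helper;
# their spreads would come from the paper's fuzzy regression, which is not
# reproduced here. The mortality data below are synthetic placeholders.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def fit_lee_carter(log_m):
    """log_m: (ages x years) matrix of log central mortality rates."""
    a_x = log_m.mean(axis=1)
    U, s, Vt = np.linalg.svd(log_m - a_x[:, None], full_matrices=False)
    b_x = U[:, 0] / U[:, 0].sum()          # normalise so sum(b_x) = 1
    k_t = s[0] * Vt[0, :] * U[:, 0].sum()
    k_t -= k_t.mean()                      # usual identifiability constraint
    return a_x, b_x, k_t

def forecast_kt(k_t, horizon=10):
    """Random walk with drift, the common ARIMA(0,1,0)+drift choice for k_t."""
    model = ARIMA(k_t, order=(0, 1, 0), trend="t")
    return model.fit().forecast(steps=horizon)

def triangular_alpha_cut(centre, left, right, alpha):
    """Alpha-cut [lower, upper] of a triangular fuzzy number (centre, spreads)."""
    return centre - (1 - alpha) * left, centre + (1 - alpha) * right

# Synthetic example: 10 age groups, 40 years of declining mortality.
rng = np.random.default_rng(1)
ages, years = 10, 40
trend = -0.02 * np.arange(years)
log_m = (-5 + 0.3 * np.arange(ages))[:, None] + trend[None, :] + 0.02 * rng.standard_normal((ages, years))

a_x, b_x, k_t = fit_lee_carter(log_m)
k_future = forecast_kt(k_t)
# Hypothetical fuzzy spreads for the first age coefficient, only to show the alpha-cut mechanics.
print(k_future[:3], triangular_alpha_cut(b_x[0], left=0.01, right=0.01, alpha=0.8))
```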

    Enhance interval width of crime forecasting with ARIMA model-fuzzy alpha cut

    Get PDF
    With qualified data or information, better decisions can be made. The interval width of a forecast is one of the data values that assists the decision-making process in regard to crime prevention. However, in time series forecasting, especially with the ARIMA model, the amount of available historical data can affect the forecasting result, including the interval width of the forecast value. This study proposes a combination technique in order to obtain a better interval width for crime forecasting. The proposed combination of the ARIMA model and the Fuzzy Alpha Cut (FAC) is presented, using three alpha values: 0.3, 0.5, and 0.7. The experimental results show that the use of ARIMA-FAC with alpha=0.5 is appropriate. The overall results show that the interval width of crime forecasting with ARIMA-FAC is better than that obtained with the 95% CI of the ARIMA model
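    A minimal sketch of the ARIMA-FAC idea follows: an ARIMA point forecast is wrapped in a symmetric triangular fuzzy number whose alpha-cut serves as the forecast interval, compared against the model's 95% confidence interval. Taking the fuzzy spread as the in-sample residual standard deviation is an assumption made only for illustration, not the paper's construction; the series is synthetic.

```python
# Minimal sketch of the ARIMA-FAC idea: take an ARIMA point forecast, wrap it in a
# symmetric triangular fuzzy number and report the alpha-cut as the forecast interval.
# The fuzzy spread used here (in-sample residual standard deviation) is an assumption.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def arima_fac_interval(y, order=(1, 1, 1), alpha=0.5, steps=1):
    """ARIMA point forecast wrapped in the alpha-cut of a triangular fuzzy number."""
    fit = ARIMA(y, order=order).fit()
    point = fit.forecast(steps=steps)
    spread = np.std(fit.resid[1:])          # assumed fuzzy spread (not the paper's choice)
    return point, point - (1 - alpha) * spread, point + (1 - alpha) * spread

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(0.5, 1.0, size=120))           # synthetic monthly crime counts
point, fac_lo, fac_hi = arima_fac_interval(y, alpha=0.5)

# Reference interval: the usual 95% confidence interval of the same ARIMA model.
ci = ARIMA(y, order=(1, 1, 1)).fit().get_forecast(1).conf_int(alpha=0.05)
print("ARIMA-FAC width:", (fac_hi - fac_lo)[0], " 95% CI width:", ci[0, 1] - ci[0, 0])
```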

    Determination of fuzzy relations for economic fuzzy time series models by neural networks

    Get PDF
    Based on the works /11, 22, 27/, a fuzzy time series model is proposed and applied to predict a chaotic financial process. The general methodological framework of classical and fuzzy modelling of economic time series is considered. A complete fuzzy time series modelling approach is proposed which includes: determining and developing fuzzy time series models, developing and calculating fuzzy relations among the observations, and calculating and interpreting the outputs. To generate fuzzy rules from data, a neural network with SCL-based product-space clustering is used
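    The sketch below shows the surrounding fuzzy-time-series machinery (fuzzification, first-order fuzzy relations, defuzzified forecast) on synthetic data; the paper instead learns the fuzzy relations with an SCL-based product-space clustering neural network, which is not reproduced here.

```python
# Minimal sketch of fuzzy-time-series forecasting. The fuzzy relations are read
# directly from the data in the simpler Chen-style fashion, only to make the
# fuzzification/defuzzification steps concrete; the paper derives them with an
# SCL-based product-space clustering neural network instead.
import numpy as np

def fuzzy_time_series_forecast(y, n_sets=7):
    lo, hi = y.min(), y.max()
    edges = np.linspace(lo, hi, n_sets + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    # Fuzzify: label each observation with the index of the interval it falls in.
    labels = np.clip(np.digitize(y, edges[1:-1]), 0, n_sets - 1)
    # First-order fuzzy logical relationship groups A_i -> {A_j, ...}.
    groups = {}
    for a, b in zip(labels[:-1], labels[1:]):
        groups.setdefault(a, set()).add(b)
    # Forecast for the step after the last observation: average of consequent midpoints.
    consequents = groups.get(labels[-1], {labels[-1]})
    return mids[list(consequents)].mean()

rng = np.random.default_rng(3)
y = 100 + np.cumsum(rng.normal(0, 2, size=60))    # synthetic 'chaotic' financial series
print(fuzzy_time_series_forecast(y))
```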

    CAESAR models for developmental toxicity

    Get PDF
    Background: The new REACH legislation requires assessment of a large number of chemicals in the European market for several endpoints. Developmental toxicity is one of the most difficult endpoints to assess, on account of the complexity, length and costs of experiments. Following the encouragement of QSAR (in silico) methods provided in REACH itself, the CAESAR project has developed several models. Results: Two QSAR models for developmental toxicity have been developed, using different statistical/mathematical methods. Both models performed well. The first makes a classification based on a random forest algorithm, while the second is based on an adaptive fuzzy partition algorithm. The first model has been implemented and inserted into the CAESAR on-line application, which is Java-based software that allows everyone to freely use the models. Conclusions: The CAESAR QSAR models have been developed with the aim of minimizing false negatives in order to make them more usable for REACH. The CAESAR on-line application ensures that both industry and regulators can easily access and use the developmental toxicity model (as well as the models for the other four endpoints).
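    For context, the sketch below shows the general shape of such a classifier: a random forest over molecular descriptors with an asymmetric class weight so that false negatives (missed toxicants) are penalised more heavily. The descriptor matrix, labels and weights are synthetic placeholders, not CAESAR's curated data, descriptor set or model.

```python
# Minimal sketch of a random-forest toxicity classifier biased against false negatives.
# Descriptors and labels are synthetic placeholders, not the CAESAR data set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 20))                                  # placeholder molecular descriptors
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)  # 1 = toxicant

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(
    n_estimators=300,
    class_weight={0: 1, 1: 4},   # asymmetric cost: missing a toxicant is the expensive error
    random_state=0,
).fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
print(f"false negatives: {fn}, false positives: {fp}")
```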

    Comparison of Clustering Methods for Time Course Genomic Data: Applications to Aging Effects

    Full text link
    Time course microarray data provide insight about dynamic biological processes. While several clustering methods have been proposed for the analysis of these data structures, comparison and selection of appropriate clustering methods are seldom discussed. We compared 3 probabilistic clustering methods and 3 distance-based clustering methods for time course microarray data. Among probabilistic methods, we considered: smoothing spline clustering (SSC), also known as model-based functional data analysis (MFDA); functional clustering models for sparsely sampled data (FCM); and model-based clustering (MCLUST). Among distance-based methods, we considered: weighted gene co-expression network analysis (WGCNA), clustering with dynamic time warping distance (DTW), and clustering with autocorrelation-based distance (ACF). We studied these algorithms in both simulated settings and case study data. Our investigations showed that FCM performed very well when gene curves were short and sparse. DTW and WGCNA performed well when gene curves were medium or long (>= 10 observations). SSC performed very well when there were clusters of gene curves similar to one another. Overall, ACF performed poorly in these applications. In terms of computation time, FCM, SSC and DTW were considerably slower than MCLUST and WGCNA. WGCNA outperformed MCLUST by generating more accurate and biologically meaningful clustering results. WGCNA and MCLUST are the best methods among the 6 methods compared when performance and computation time are both taken into account. WGCNA outperforms MCLUST, but MCLUST provides model-based inference and an uncertainty measure of clustering results
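    As an illustration of one of the compared distance-based pipelines, the sketch below computes pairwise dynamic time warping distances between simulated expression curves and feeds them into hierarchical clustering; the study's actual data and the remaining methods (WGCNA, MCLUST, FCM, ACF, SSC) are not reproduced.

```python
# Minimal sketch of a DTW-distance clustering pipeline for time-course curves.
# Curves are simulated; the cluster count and linkage choice are illustrative.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def dtw(a, b):
    """Classic O(len(a)*len(b)) dynamic-time-warping distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

rng = np.random.default_rng(5)
t = np.linspace(0, 1, 12)                                  # 12 time points per gene
curves = np.vstack([np.sin(2 * np.pi * t + rng.uniform(0, 0.3)) + 0.2 * rng.standard_normal(12) for _ in range(15)]
                   + [np.exp(-3 * t) + 0.2 * rng.standard_normal(12) for _ in range(15)])

# Condensed pairwise DTW distance vector, then average-linkage clustering into 2 groups.
pairs = [(i, j) for i in range(len(curves)) for j in range(i + 1, len(curves))]
dists = np.array([dtw(curves[i], curves[j]) for i, j in pairs])
labels = fcluster(linkage(dists, method="average"), t=2, criterion="maxclust")
print(labels)
```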

    PROBABILISTIC APPROACHES TO STABILITY AND DEFORMATION PROBLEMS IN BRACED EXCAVATION

    Get PDF
    This dissertation is aimed at applying probabilistic approaches to evaluating the basal-heave stability and the excavation-induced wall and ground movements for serviceability assessment of excavations in clay. The focus herein is on the influence of spatial variability of soil parameters and of small sample sizes on the results of the probabilistic analysis, and on Bayesian updating of soil parameters using field observations in braced excavations. Simplified approaches for reliability analysis of basal heave in a braced excavation in clay considering the effect of spatial variability in random fields are presented. The proposed approaches employ the variance reduction technique (or, more precisely, the equivalent variance method) to consider the effect of spatial variability so that the analysis for the probability of basal-heave failure can be performed using the well-established first-order reliability method (FORM). Case studies show that the simplified approaches yield results that are nearly identical to those obtained from conventional random field modeling (RFM). The proposed approaches are shown to be effective and efficient for the probabilistic analysis of basal heave in a braced excavation considering spatial variability. The variance reduction technique is then used in the probabilistic serviceability assessment in a case study. To characterize the uncertainty in sample statistics and its influence on the results of probabilistic analysis, a simple procedure involving bootstrapping is presented. The procedure is applied to assessing the probability of serviceability failure in a braced excavation. The analysis for the probability of failure, referred to herein as the probability of exceeding a specified limiting deformation, necessitates an evaluation of the means and standard deviations of critical soil parameters. In geotechnical practice, these means and standard deviations are often estimated from a very limited data set, which can lead to uncertainty in the derived sample statistics. In this study, bootstrapping is used to characterize the uncertainty or variation of sample statistics and its effect on the failure probability. Through the bootstrapping analysis, the probability of exceedance can be presented as a confidence interval instead of a single, fixed probability. The information gained should enable engineers to make a more rational assessment of the risk of serviceability failure in a braced excavation. The case study demonstrates the potential of the bootstrap method in coping with the problem of having to evaluate failure probability with uncertain sample statistics. Finally, a Bayesian framework using field observations for back analysis and updating of soil parameters in a multi-stage braced excavation is developed. Because of the uncertainties in the initial estimates of soil parameters, in the analysis model, and in other factors such as construction quality, the updated soil parameters are presented in the form of posterior distributions. In this dissertation, these posterior distributions are derived using the Markov chain Monte Carlo (MCMC) sampling method implemented with the Metropolis-Hastings algorithm. In the proposed framework, Bayesian updating is first realized with one type of response observation (maximum wall deflection or maximum ground surface settlement), and then this Bayesian framework is extended to allow for simultaneous use of two types of response observations in the updating. 
The proposed framework is illustrated with a quality excavation case and shown to be effective regardless of the prior knowledge of soil parameters and the type of response observations adopted. The probabilistic approaches presented in this dissertation, ranging from probability-based design of basal heave, to probabilistic analysis of serviceability failure in a braced excavation considering spatial variability of soil parameters, to bootstrapping for characterizing the uncertainty of sample statistics and its effect, and to MCMC-based Bayesian updating of soil parameters during construction, illustrate the potential of probability and statistics as tools for enabling more rational solutions in geotechnical fields. The case studies presented in this dissertation demonstrate the usefulness of these tools
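    To illustrate the bootstrapping step, the sketch below re-samples a small, hypothetical set of soil-strength measurements, recomputes the sample statistics each time and reports the spread of the resulting failure probability as a confidence interval; the measurements, the normal model and the limit value are stand-ins for the dissertation's limit-state analysis.

```python
# Minimal sketch of bootstrapping uncertain sample statistics into a confidence
# interval on a failure probability. Measurements, model and limit are hypothetical.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
su = np.array([28.0, 31.5, 25.2, 33.0, 29.8, 27.1, 30.4, 26.9])   # kPa, small hypothetical data set
su_limit = 22.0                                                    # "failure" if strength falls below this

def failure_probability(sample):
    """P(su < su_limit) under a normal model fitted to the (re-)sampled data."""
    return norm.cdf(su_limit, loc=sample.mean(), scale=max(sample.std(ddof=1), 1e-9))

boot = np.array([failure_probability(rng.choice(su, size=len(su), replace=True))
                 for _ in range(5000)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"point estimate: {failure_probability(su):.4f}, 95% bootstrap CI: [{lo:.4f}, {hi:.4f}]")
```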

    Forecasting in Mathematics

    Get PDF
    Mathematical probability and statistics are an attractive, thriving, and respectable part of mathematics. Some mathematicians and philosophers of science say they are the gateway to mathematics’ deepest mysteries. Moreover, mathematical statistics denotes the accumulated body of mathematical work concerned with collecting and using, as efficiently as possible, numerical data subject to random or deterministic variations. Today, probability and mathematical statistics are among the fundamental notions of modern science and the philosophy of nature. This book illustrates the use of mathematics to solve specific problems in engineering, statistics, and science in general