Quantitative Association Rules Applied to Climatological Time Series Forecasting
This work presents the discovery of association rules, based on evolutionary techniques, in order to obtain relationships among correlated time series. For this purpose, a genetic algorithm is proposed that determines the intervals forming the rules without discretizing the attributes, while allowing the regions covered by the rules to overlap. The algorithm has been tested on real-world climatological time series such as temperature, wind and ozone, and the results are reported and compared with those of the well-known Apriori algorithm.
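As a rough illustration of what an interval-based (quantitative) rule looks like, the sketch below scores one rule by the usual support and confidence measures. The readings, variable names and interval bounds are made up for illustration and are not taken from the paper; the genetic search that would choose the intervals is not shown.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]   # closed interval [low, high]
Record = Dict[str, float]        # one time step: variable name -> value


def covers(record: Record, intervals: Dict[str, Interval]) -> bool:
    """True when every listed variable falls inside its interval."""
    return all(low <= record[name] <= high
               for name, (low, high) in intervals.items())


def support_confidence(records: List[Record],
                       antecedent: Dict[str, Interval],
                       consequent: Dict[str, Interval]) -> Tuple[float, float]:
    """Support = fraction of records matching both sides of the rule;
    confidence = that count divided by the records matching the antecedent."""
    n_ante = sum(covers(r, antecedent) for r in records)
    n_both = sum(covers(r, antecedent) and covers(r, consequent)
                 for r in records)
    support = n_both / len(records)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical readings, for illustration only.
records = [
    {"temperature": 31.0, "wind": 2.0, "ozone": 95.0},
    {"temperature": 29.5, "wind": 3.5, "ozone": 88.0},
    {"temperature": 22.0, "wind": 8.0, "ozone": 40.0},
    {"temperature": 30.0, "wind": 7.0, "ozone": 55.0},
]
# Rule: temperature in [28, 35] and wind in [0, 4]  =>  ozone in [80, 120]
antecedent = {"temperature": (28.0, 35.0), "wind": (0.0, 4.0)}
consequent = {"ozone": (80.0, 120.0)}
support, confidence = support_confidence(records, antecedent, consequent)
```

Because the bounds are free real numbers rather than bins fixed in advance, two rules may cover overlapping regions of the same variable.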
Rules Discovery of High Ozone in Klang Areas using Data Mining Approach
Ground-level ozone (O3) is a common pollutant with a negative influence on human health. Rising O3 levels, driven by rapid development, have become a major concern worldwide, so an accurate O3 forecasting model is needed. Before such a model can be built, however, the interesting patterns in the data should be identified. Association rule mining is a data mining technique suited to discovering frequent patterns in a dataset, which can then be useful in the research domain. This paper therefore presents knowledge discovery based on association rules and clustering applied to a climatological O3 dataset. In this study, the data were analysed to find the behaviour of each precursor. K-means clustering was then used to find a suitable range for each chosen variable independently, and Apriori-based association rule mining was applied to present the behaviours in a meaningful and understandable format. The climatological O3 time series was collected from the Department of Environment for the Klang station from 1997 to 2012. The proposed method was applied only to high O3 concentration data from those years to find the association patterns. The analysis discovered 17 strong rules, revealing the patterns and behaviours of the selected variables during high O3 concentration. The rules can help the government decide how to control air quality.
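The range-finding step described above might look roughly like the following sketch: a one-dimensional K-means run on a single variable, followed by reading off the minimum and maximum of each cluster. The temperature values, the choice of k = 3 and the deterministic initialisation are assumptions for illustration, not details from the paper; an Apriori-style miner would then work on the resulting range labels.

```python
from typing import List, Tuple


def nearest(value: float, centres: List[float]) -> int:
    """Index of the centre closest to the value."""
    return min(range(len(centres)), key=lambda j: abs(value - centres[j]))


def kmeans_1d(values: List[float], k: int, iterations: int = 50) -> List[float]:
    """Lloyd's algorithm on one variable; returns the sorted cluster centres."""
    ordered = sorted(values)
    # Deterministic start (an assumption): spread centres through the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centres)].append(v)
        # An empty cluster keeps its previous centre.
        updated = [sum(g) / len(g) if g else centres[j]
                   for j, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def cluster_ranges(values: List[float],
                   centres: List[float]) -> List[Tuple[float, float]]:
    """(min, max) of the values assigned to each centre.
    A centre with no assigned values keeps the placeholder (inf, -inf)."""
    ranges = [[float("inf"), float("-inf")] for _ in centres]
    for v in values:
        j = nearest(v, centres)
        ranges[j][0] = min(ranges[j][0], v)
        ranges[j][1] = max(ranges[j][1], v)
    return [(low, high) for low, high in ranges]


# Hypothetical temperature readings, for illustration only.
temperature = [24.0, 25.0, 24.5, 30.0, 31.0, 30.5, 35.0, 36.0]
centres = kmeans_1d(temperature, k=3)
ranges = cluster_ranges(temperature, centres)
```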
How complex climate networks complement eigen techniques for the statistical analysis of climatological data
Eigen techniques such as empirical orthogonal function (EOF) or coupled
pattern (CP) / maximum covariance analysis have been frequently used for
detecting patterns in multivariate climatological data sets. Recently,
statistical methods originating from the theory of complex networks have been
employed for the very same purpose of spatio-temporal analysis. This climate
network (CN) analysis is usually based on the same set of similarity matrices
as is used in classical EOF or CP analysis, e.g., the correlation matrix of a
single climatological field or the cross-correlation matrix between two
distinct climatological fields. In this study, formal relationships as well as
conceptual differences between both eigen and network approaches are derived
and illustrated using exemplary global precipitation, evaporation and surface
air temperature data sets. These results show that CN analysis can
complement classical eigen techniques and provides additional information on
the higher-order structure of statistical interrelationships in climatological
data. Hence, CNs are a valuable supplement to the statistical toolbox of the
climatologist, particularly for making sense out of very large data sets such
as those generated by satellite observations and climate model intercomparison
exercises.
Comment: 18 pages, 11 figures
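To make the shared starting point concrete, the sketch below builds one correlation matrix from a few toy series and derives two quantities from it: the leading eigenpair (the leading EOF of the standardised data, found here by power iteration) and the node degree of a thresholded climate network. The series, the 0.5 threshold and the use of degree as the network measure are assumptions for illustration; the formal relationships derived in the paper are not reproduced.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def correlation_matrix(series: List[List[float]]) -> Matrix:
    """Pearson correlation between every pair of equally long series."""
    n = len(series[0])
    unit = []
    for s in series:
        mean = sum(s) / n
        dev = [x - mean for x in s]
        norm = math.sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def leading_eof(corr: Matrix, iterations: int = 500) -> Tuple[float, List[float]]:
    """Dominant eigenvalue and eigenvector by power iteration.
    The slightly uneven start vector is a heuristic to avoid starting
    orthogonal to the dominant eigenvector."""
    vec = [1.0 + 0.1 * i for i in range(len(corr))]
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in corr]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    eigenvalue = sum(v * sum(c * w for c, w in zip(row, vec))
                     for v, row in zip(vec, corr))
    return eigenvalue, vec


def degrees(corr: Matrix, threshold: float) -> List[int]:
    """Links join nodes whose absolute correlation reaches the threshold."""
    return [sum(1 for j, c in enumerate(row) if j != i and abs(c) >= threshold)
            for i, row in enumerate(corr)]


# Toy "grid point" series, for illustration only.
series = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
corr = correlation_matrix(series)
eigenvalue, eof = leading_eof(corr)
node_degrees = degrees(corr, threshold=0.5)
```

The eigenvector summarises the dominant pattern of co-variability, while the degree sequence records, node by node, how many strong pairwise links survive the threshold.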
Reliability, Sufficiency, and the Decomposition of Proper Scores
Scoring rules are an important tool for evaluating the performance of
probabilistic forecasting schemes. In the binary case, scoring rules (which are
strictly proper) allow for a decomposition into terms related to the resolution
and to the reliability of the forecast. This fact is particularly well known
for the Brier Score. In this paper, this result is extended to forecasts for
finite--valued targets. Both resolution and reliability are shown to have a
positive effect on the score. It is demonstrated that resolution and
reliability are directly related to forecast attributes which are desirable on
grounds independent of the notion of scores. This finding can be considered an
epistemological justification of measuring forecast quality by proper scores. A
link is provided to the original work of DeGroot et al (1982), extending their
concepts of sufficiency and refinement. The relation to the conjectured
sharpness principle of Gneiting et al (2005a) is elucidated.
Comment: v1: 9 pages; submitted to International Journal of Forecasting v2: 12
pages; Significant change of contents; stronger focus on decomposition;
Extensive comments on and extensions of earlier work, in particular
sufficiency
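For the well-known binary case mentioned in the abstract, the Brier score decomposition can be checked numerically. The sketch below uses the classical form, Brier score = reliability - resolution + uncertainty, with forecasts grouped by their issued value; the forecasts and outcomes are made up for illustration, and the extension to finite-valued targets developed in the paper is not shown. In this convention a lower score is better, the reliability term is a miscalibration penalty, and a larger resolution term lowers the score.

```python
from collections import defaultdict
from typing import List, Tuple


def brier_decomposition(forecasts: List[float],
                        outcomes: List[int]) -> Tuple[float, float, float, float]:
    """Return (brier, reliability, resolution, uncertainty) for binary outcomes.
    Forecasts are grouped by their exact issued probability."""
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)
    # Reliability: weighted squared gap between forecast and observed frequency.
    reliability = sum(len(obs) * (f - sum(obs) / len(obs)) ** 2
                      for f, obs in groups.items()) / n
    # Resolution: weighted squared gap between observed frequency and base rate.
    resolution = sum(len(obs) * (sum(obs) / len(obs) - base_rate) ** 2
                     for obs in groups.values()) / n
    uncertainty = base_rate * (1.0 - base_rate)
    brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n
    return brier, reliability, resolution, uncertainty


# Hypothetical forecasts and outcomes, for illustration only.
forecasts = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
brier, reliability, resolution, uncertainty = brier_decomposition(forecasts, outcomes)
```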
Mining quantitative association rules based on evolutionary computation and its application to atmospheric pollution
This research presents the mining of quantitative association rules based on evolutionary computation techniques.
First, a real-coded genetic algorithm that extends the well-known binary-coded CHC algorithm is proposed to determine
the intervals that define the rules without needing to discretize the attributes. The proposed algorithm is evaluated on synthetic
datasets under different levels of noise in order to test its performance, and the reported results are then compared with those of
a recently published multi-objective differential evolution algorithm. Furthermore, rules from real-world time series such as
temperature, humidity, wind speed and direction, ozone, nitrogen monoxide and sulfur dioxide have been discovered
with the objective of finding all existing relations between atmospheric pollution and climatological conditions.
Ministerio de Ciencia y Tecnología TIN2007-68084-C-00
Junta de Andalucía P07-TIC-0261
An evolutionary algorithm to discover quantitative association rules in multidimensional time series
An evolutionary approach for finding existing
relationships among several variables of a multidimensional
time series is presented in this work. The proposed model to
discover these relationships is based on quantitative association
rules. This algorithm, called QARGA (Quantitative
Association Rules by Genetic Algorithm), uses a particular
codification of the individuals that allows solving two basic
problems. First, it does not perform a previous attribute
discretization and, second, it is not necessary to set which
variables belong to the antecedent or consequent. Therefore,
it may discover all underlying dependencies among
different variables. To evaluate the proposed algorithm,
three experiments have been carried out. As an initial step,
several public datasets have been analyzed with the purpose
of comparing it with other existing evolutionary approaches.
Also, the algorithm has been applied to synthetic time series
(where the relationships are known) to analyze its potential
for discovering rules in time series. Finally, a real-world
multidimensional time series composed of several climatological
variables has been considered. All the results show
a remarkable performance of QARGA.
Ministerio de Ciencia y Tecnología TIN2007-68084-C02-02
Junta de Andalucia P07-TIC-0261
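One way to picture a codification with the two properties described above is an individual that stores, for each variable, an interval plus a role marker. The sketch below is a hypothetical encoding in that spirit, not the actual QARGA representation: the `Gene` type, the role codes and the example values are all assumptions, and the genetic operators and fitness function are omitted.

```python
from typing import Dict, NamedTuple, Tuple

# Role codes are an assumption made for this sketch.
UNUSED, ANTECEDENT, CONSEQUENT = 0, 1, 2

Interval = Tuple[float, float]


class Gene(NamedTuple):
    low: float    # lower bound of the interval (real-valued, no discretization)
    high: float   # upper bound of the interval
    role: int     # UNUSED, ANTECEDENT or CONSEQUENT


def decode(individual: Dict[str, Gene]) -> Tuple[Dict[str, Interval],
                                                Dict[str, Interval]]:
    """Split an individual into the antecedent and consequent of its rule."""
    antecedent = {name: (g.low, g.high)
                  for name, g in individual.items() if g.role == ANTECEDENT}
    consequent = {name: (g.low, g.high)
                  for name, g in individual.items() if g.role == CONSEQUENT}
    return antecedent, consequent


def is_rule(individual: Dict[str, Gene]) -> bool:
    """A usable rule needs at least one variable on each side."""
    antecedent, consequent = decode(individual)
    return bool(antecedent) and bool(consequent)


# Hypothetical individual, for illustration only.
individual = {
    "temperature": Gene(28.0, 35.0, ANTECEDENT),
    "humidity": Gene(40.0, 90.0, UNUSED),
    "ozone": Gene(80.0, 120.0, CONSEQUENT),
}
antecedent, consequent = decode(individual)
```

Since the role is part of the gene, a search operator could move a variable between antecedent and consequent, or drop it from the rule, without any side being fixed beforehand.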