2,757 research outputs found
Vertical wind profile characterization and identification of patterns based on a shape clustering algorithm
Wind power plants are becoming a generally accepted resource in the generation mix of many utilities. At the same time, the size and the power rating of individual wind turbines have increased considerably. Under these circumstances, the sector increasingly demands an accurate characterization of vertical wind speed profiles to properly estimate the incoming wind speed at the rotor swept area and, consequently, to assess the potential of a wind power plant site. The present paper describes a shape-based clustering characterization and visualization of real vertical wind speed data. The proposed solution identifies the most likely vertical wind speed patterns for a specific location based on real wind speed measurements. Moreover, this clustering approach also provides characterization and classification of such vertical wind profiles. The solution is well suited to the large amounts of data collected by remote sensing equipment, where wind speed values at different heights within the rotor swept area are available for subsequent analysis. The methodology is based on z-normalization, a shape-based distance metric and the Ward hierarchical clustering method. Real vertical wind speed profile data from a Spanish wind power plant, collected with commercial Windcube equipment over several months, are used to assess the proposed characterization and clustering process, involving more than 100,000 wind speed values. All analyses have been implemented using open-source R software. From the results, at least four different vertical wind speed patterns are needed to properly characterize over 90% of the wind speed data collected throughout the day.
Therefore, alternative analytical function criteria should be subsequently proposed for vertical wind speed characterization purposes. The authors are grateful for the financial support from the Spanish Ministry of the Economy and Competitiveness and the European Union (ENE2016-78214-C2-2-R) and the Spanish Education, Culture and Sport Ministry (FPU16/042
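The z-normalization step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' R implementation: the function name `z_normalize` and the sample wind speeds are hypothetical, and the shape-based distance and Ward clustering steps are omitted. It only shows why normalization lets profiles be compared by shape rather than magnitude.

```python
from statistics import mean, pstdev

def z_normalize(profile):
    """Rescale one vertical profile to zero mean and unit standard deviation."""
    mu = mean(profile)
    sigma = pstdev(profile)
    if sigma == 0:
        # A flat profile carries no shape information; map it to all zeros.
        return [0.0 for _ in profile]
    return [(v - mu) / sigma for v in profile]

# Hypothetical wind speeds (m/s) at increasing heights, invented for illustration.
calm = [4.0, 5.0, 6.0, 7.0]
windy = [8.0, 10.0, 12.0, 14.0]

calm_z = z_normalize(calm)
windy_z = z_normalize(windy)
```

Both invented profiles increase linearly with height, so after normalization they coincide even though one is twice as strong as the other. A shape-based clustering would therefore place them in the same group.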
A General Spatio-Temporal Clustering-Based Non-local Formulation for Multiscale Modeling of Compartmentalized Reservoirs
Representing the reservoir as a network of discrete compartments with
neighbor and non-neighbor connections is a fast, yet accurate method for
analyzing oil and gas reservoirs. Automatic and rapid detection of coarse-scale
compartments with distinct static and dynamic properties is an integral part of
such high-level reservoir analysis. In this work, we present a hybrid framework
specific to reservoir analysis for the automatic detection of clusters in space
using spatial and temporal field data. The framework couples a physics-based
non-local modeling approach with data-driven clustering techniques to provide
fast and accurate multiscale modeling of compartmentalized reservoirs. This
research also adds to the literature by presenting a comprehensive study of
spatio-temporal clustering for reservoir applications that accounts for the
clustering complexities, the intrinsically sparse and noisy nature of the data,
and the interpretability of the outcome.
Keywords: Artificial Intelligence; Machine Learning; Spatio-Temporal
Clustering; Physics-Based Data-Driven Formulation; Multiscale Modelin
Clustering Time Series from Mixture Polynomial Models with Discretised Data
Clustering time series is an active research area with applications in many fields. One common feature of time series is the likely presence of outliers. These uncharacteristic data can significantly affect the quality of the clusters formed. This paper evaluates a method of overcoming the detrimental effects of outliers. We describe some of the alternative approaches to clustering time series, then specify a particular class of model for experimentation with k-means clustering and a correlation-based distance metric. For data derived from this class of model, we demonstrate that discretising the data into a binary series of values above and below the median improves the clustering when the data contain outliers. More specifically, we show that, firstly, discretisation does not significantly affect the accuracy of the clusters when there are no outliers and, secondly, it significantly increases the accuracy in the presence of outliers, even when the probability of an outlier is very low
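The median discretisation described above can be sketched in a few lines. This is an illustrative sketch only: the function name `binarize_by_median` and the sample series are hypothetical, and the k-means step with a correlation-based distance is not shown.

```python
from statistics import median

def binarize_by_median(series):
    """Map each value to 1 if it lies above the series median, else 0."""
    m = median(series)
    return [1 if v > m else 0 for v in series]

# Invented data: the second series is the first with one value corrupted.
clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
with_outlier = [1.0, 2.0, 3.0, 400.0, 5.0, 6.0]

clean_bits = binarize_by_median(clean)
outlier_bits = binarize_by_median(with_outlier)
```

Because the median barely moves when a single value is corrupted, the two binary series are identical here, whereas a distance computed on the raw values would be dominated by the outlier.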
Comparison of Clustering Methods for Time Course Genomic Data: Applications to Aging Effects
Time course microarray data provide insight about dynamic biological
processes. While several clustering methods have been proposed for the analysis
of these data structures, comparison and selection of appropriate clustering
methods are seldom discussed. We compared probabilistic clustering
methods and distance-based clustering methods for time course microarray
data. Among probabilistic methods, we considered: smoothing spline clustering
(SSC), also known as model-based functional data analysis (MFDA), functional
clustering models for sparsely sampled data (FCM) and model-based clustering
(MCLUST). Among distance-based methods, we considered: weighted gene
co-expression network analysis (WGCNA), clustering with dynamic time warping
distance (DTW) and clustering with autocorrelation-based distance (ACF). We
studied these algorithms in both simulated settings and case study data. Our
investigations showed that FCM performed very well when gene curves were short
and sparse. DTW and WGCNA performed well when gene curves were medium or long
( observations). SSC performed very well when there were clusters of gene
curves similar to one another. Overall, ACF performed poorly in these
applications. In terms of computation time, FCM, SSC and DTW were considerably
slower than MCLUST and WGCNA. WGCNA outperformed MCLUST by generating more
accurate and biologically meaningful clustering results. WGCNA and MCLUST are the
best methods among the 6 methods compared, when performance and computation
time are both taken into account. WGCNA outperforms MCLUST, but MCLUST provides
model-based inference and uncertainty measures for the clustering results
Exact Mean Computation in Dynamic Time Warping Spaces
Dynamic time warping constitutes a major tool for analyzing time series. In
particular, computing a mean series of a given sample of series in dynamic time
warping spaces (by minimizing the Fréchet function) is a challenging
computational problem, so far solved by several heuristic and inexact
strategies. We spot some inaccuracies in the literature on exact mean
computation in dynamic time warping spaces. Our contributions comprise an exact
dynamic program computing a mean (useful for benchmarking and evaluating known
heuristics). Based on this dynamic program, we empirically study properties
like uniqueness and length of a mean. Moreover, experimental evaluations reveal
substantial deficits of state-of-the-art heuristics in terms of their output
quality. We also give an exact polynomial-time algorithm for the special case
of binary time series
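For orientation, the pairwise dynamic time warping cost that underlies the mean problem above can be computed with the standard textbook dynamic program. The sketch below is not the exact mean algorithm from the paper. The function name `dtw_distance` is hypothetical, and the squared pointwise difference is assumed as the local cost.

```python
def dtw_distance(x, y):
    """Return the minimal cumulative squared cost of warping x onto y."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] holds the best cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A repeated sample can be absorbed by the warping at no cost.
same_shape = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
shifted = dtw_distance([0.0, 0.0], [1.0, 1.0])
```

A mean in this setting is a series that minimizes the sum of such costs over all sample series, which is what makes the problem much harder than a single pairwise alignment.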