31 research outputs found
Robust Methods for Soft Clustering of Multidimensional Time Series
Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021. [Abstract] Three robust algorithms for clustering multidimensional time series from the perspective of the underlying processes are proposed. The methods are robust extensions of a fuzzy C-means model based on estimates of the quantile cross-spectral density. Robustness to the presence of anomalous elements is achieved by using the so-called metric, noise and trimmed approaches. Analyses from a wide simulation study indicate that the algorithms are highly effective in coping with the presence of outlying series, clearly outperforming alternative procedures. The usefulness of the suggested methods is also highlighted by means of a specific application. This research has been supported by MINECO (MTM2017-82724-R and PID2020-113578RB-100), the Xunta de Galicia (ED431C-2020-14), and CITIC (ED431G 2019/01).
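As a rough illustration of the noise approach mentioned above, the sketch below implements plain fuzzy C-means augmented with a virtual noise cluster on generic feature vectors (a stand-in for per-series quantile cross-spectral estimates). The function name, the deterministic initialization, and the parameter choices are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def noise_fuzzy_cmeans(X, k, delta=2.0, m=2.0, n_iter=50):
    """Fuzzy C-means with a virtual 'noise' cluster at fixed distance delta.

    Outlying points end up with high noise membership, so they barely pull
    the real cluster centers. X: (n, p) feature matrix. Returns (U, centers)
    where U[i, j] is the membership of point i in real cluster j; the
    residual 1 - U[i].sum() is the noise membership.
    """
    X = np.asarray(X, float)
    n = X.shape[0]
    centers = X[:k].copy()                       # naive deterministic init
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
        # append the constant squared distance to the noise cluster
        d2_all = np.concatenate([d2, np.full((n, 1), delta ** 2)], axis=1)
        inv = d2_all ** (-1.0 / (m - 1.0))       # standard FCM membership rule
        U = (inv / inv.sum(axis=1, keepdims=True))[:, :k]
        w = U ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return U, centers
```

With two tight clusters and one distant outlier, the outlier's membership mass flows almost entirely to the noise cluster, leaving the real centers uncontaminated.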
Clustering time series: an application to COVID-19 data
In this paper we present an attempt at clustering time series, focusing on Italian data about COVID-19.
From a methodological point of view, we first review the most important methods in the literature for time series clustering. As in cross-sectional clustering, time series clustering starts from the choice of a suitable algorithm to produce clusters. Several algorithms have been developed for time series clustering, and the choice of the most appropriate one depends on both the aim of the analysis and the type of data at hand.
We apply some of these methods to the daily time series of intensive care and deaths for COVID-19, stretching from 23/02/2020 to 15/02/2022 and from 23/02/2020 to 29/03/2022, respectively. These data refer to the 19 Italian regions and the two autonomous provinces of Trento and Bolzano.
Comparison of Clustering Methods for Time Course Genomic Data: Applications to Aging Effects
Time course microarray data provide insight about dynamic biological processes. While several clustering methods have been proposed for the analysis of these data structures, the comparison and selection of appropriate clustering methods are seldom discussed. We compared probabilistic clustering methods and distance-based clustering methods for time course microarray data. Among probabilistic methods, we considered: smoothing spline clustering (SSC), also known as model-based functional data analysis (MFDA); functional clustering models for sparsely sampled data (FCM); and model-based clustering (MCLUST). Among distance-based methods, we considered: weighted gene co-expression network analysis (WGCNA), clustering with dynamic time warping distance (DTW) and clustering with autocorrelation-based distance (ACF). We studied these algorithms in both simulated settings and case study data. Our investigations showed that FCM performed very well when gene curves were short and sparse. DTW and WGCNA performed well when gene curves were medium or long. SSC performed very well when there were clusters of gene curves similar to one another. Overall, ACF performed poorly in these applications. In terms of computation time, FCM, SSC and DTW were considerably slower than MCLUST and WGCNA. WGCNA outperformed MCLUST by generating more accurate and biologically meaningful clustering results. WGCNA and MCLUST are the best methods among the six compared when performance and computation time are both taken into account. WGCNA outperforms MCLUST, but MCLUST provides model-based inference and uncertainty measures for clustering results.
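For reference, the dynamic time warping distance behind the DTW method compared above can be computed with the classic dynamic-programming recursion. This is a minimal 1-D sketch with absolute-difference local cost, not the specific implementation used in the study.

```python
import numpy as np

def dtw_distance(x, y):
    """Dynamic time warping distance between two 1-D series.

    Fills an (n+1) x (m+1) cost table where D[i, j] is the cheapest
    alignment of x[:i] with y[:j]; O(n * m) time and memory.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of: match, insertion, deletion
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Unlike the Euclidean distance, DTW can absorb local time shifts: a series and its lagged copy can still align at zero cost.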
Clustering stationary and non-stationary time series based on autocorrelation distance of hierarchical and k-means algorithms
Observing high-dimensional time series can be time-consuming; time series clustering is one approach to identification and classification. This study aimed to compare the accuracy of two algorithms, hierarchical clustering and K-means clustering, using the ACF distance for clustering stationary and non-stationary time series data. This research uses both simulated and real datasets. The simulation generates 7 stationary and 7 non-stationary data models. The real dataset consists of daily temperature data for 34 cities in Indonesia. As a result, the K-means algorithm has the highest accuracy for both data models.
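A minimal sketch of the ACF-based dissimilarity such algorithms rely on: estimate the sample autocorrelations of each series up to a chosen lag and take Euclidean distances between the ACF vectors. The lag choice and implementation details here are illustrative, not taken from the study; the resulting matrix can then be fed to a hierarchical or K-means-style procedure.

```python
import numpy as np

def acf(x, nlags=10):
    """Sample autocorrelations of a 1-D series at lags 1..nlags."""
    x = np.asarray(x, float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, nlags + 1)])

def acf_distance_matrix(series, nlags=10):
    """Pairwise Euclidean distances between the ACF vectors of the series."""
    A = np.array([acf(s, nlags) for s in series])
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))
```

Two series with the same dependence structure (e.g. two phase-shifted sinusoids, or two white-noise paths) end up close under this distance even when they are far apart pointwise.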
Copula-based fuzzy clustering of spatial time series
This paper contributes to the existing literature on the analysis of spatial time series by presenting a new clustering algorithm called COFUST, i.e. COpula-based FUzzy clustering algorithm for Spatial Time series. The underlying idea of this algorithm is to perform a fuzzy Partitioning Around Medoids (PAM) clustering using a copula-based approach to interpret comovements of time series. This generalisation both extends the usual clustering methods for time series based on Pearson's correlation and captures the uncertainty that arises when assigning units to clusters. Furthermore, its flexibility permits the spatial information to be included directly in the algorithm. Our approach is presented and discussed using both simulated and real data, highlighting its main advantages.
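Copula-based comovement can be illustrated with rank correlations, which depend only on the copula of a pair of series and not on their marginal distributions. The sketch below uses Spearman's rho as a simple stand-in for a copula-based dissimilarity; this is an assumption for illustration, not the COFUST measure itself.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Because it uses only ranks, it is invariant to monotone transforms
    of either series -- the property that makes it copula-based.
    """
    rx = np.argsort(np.argsort(x)) - (len(x) - 1) / 2.0   # centered ranks
    ry = np.argsort(np.argsort(y)) - (len(y) - 1) / 2.0
    return float(np.dot(rx, ry) / np.sqrt(np.dot(rx, rx) * np.dot(ry, ry)))

def comovement_dissimilarity(x, y):
    """Map rho in [-1, 1] to a dissimilarity in [0, 1]."""
    return (1.0 - spearman_rho(x, y)) / 2.0
```

A fuzzy PAM procedure would then operate on the matrix of these pairwise dissimilarities.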
Fuzzy clustering with entropy regularization for interval-valued data with an application to scientific journal citations
In recent years, research on statistical methods to analyze complex data structures has increased. In particular, a lot of attention has been focused on interval-valued data. In a classical cluster analysis framework, an interesting line of research has focused on the clustering of interval-valued data based on fuzzy approaches. Following the partitioning-around-medoids fuzzy research line, a new fuzzy clustering model for interval-valued data is suggested. In particular, we propose a new model based on the use of entropy as a regularization function in the fuzzy clustering criterion. The model uses a robust weighted dissimilarity measure to smooth noisy data and to weigh the center and radius components of the interval-valued data. To show the good performance of the proposed clustering model, we provide a simulation study and an application to the clustering of scientific journals in research evaluation.
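Entropy regularization replaces the usual fuzziness exponent with an entropy penalty weighted by a parameter lambda, which turns the membership update into a softmax of negative squared distances. Below is a minimal sketch for plain point data, not the paper's medoid-based, interval-valued model; the function name and initialization are illustrative.

```python
import numpy as np

def entropy_fuzzy_kmeans(X, k, lam=1.0, n_iter=50):
    """Fuzzy k-means with entropy regularization.

    Memberships follow u[i, j] proportional to exp(-d2[i, j] / lam),
    so lam controls fuzziness instead of the classic exponent m.
    """
    X = np.asarray(X, float)
    centers = X[:k].copy()                    # naive deterministic init
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        logits = -d2 / lam
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)             # rows sum to 1
        centers = (U.T @ X) / U.sum(axis=0)[:, None]
    return U, centers
```

Large lam spreads membership across clusters (high entropy); small lam recovers nearly crisp k-means assignments.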
Spatio-temporal clustering: Neighbourhoods based on median seasonal entropy
In this research, a new uncertainty clustering method has been developed and applied to spatial time series with seasonality. The new unsupervised grouping method is based on Neighbourhoods and Median Seasonal Entropy. This classification method aims to discover similar behaviours within a group of time series and to find a dissimilarity measure with respect to a reference series r. The Neighbourhood's Internal Verification Coefficient criterion makes it possible to measure intra-group similarity. This clustering criterion is flexible for spatial information. Our empirical approach allows us to measure accommodation decisions for tourists who visit Spain and decide to stay either in hotels or in tourist apartments. The results show the existence of dynamic seasonal patterns of behaviour. These insights support the decisions of economic agents. This research is associated with the "Social Indicators-SEJ157" group of the Faculty of Economic and Business Sciences at the University of Málaga, which has funded the professional editing service in English. Funding for open access charge: Universidad de Málaga/CBUA.
Entropy-based fuzzy clustering of interval-valued time series
This paper proposes a fuzzy C-medoids-based clustering method with entropy regularization to address the problem of grouping complex data such as interval-valued time series. The dual nature of these data, which are both time-varying and interval-valued, needs to be considered and embedded in clustering techniques. In this work, a new dissimilarity measure based on Dynamic Time Warping is proposed. The performance of the new clustering procedure is evaluated through a simulation study and an application to financial time series.
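The fuzzy C-medoids step can be sketched independently of the dissimilarity used: given any precomputed matrix of pairwise dissimilarities (for instance DTW distances between interval-valued series), memberships and medoids are updated in alternation. The initialization and update rules below are one common variant, assumed for illustration rather than taken from the paper.

```python
import numpy as np

def fuzzy_c_medoids(D, k, m=2.0, n_iter=30):
    """Fuzzy C-medoids on a precomputed symmetric dissimilarity matrix D.

    Returns memberships U (n, k) and the list of medoid indices.
    """
    medoids = [0]                    # greedy farthest-point initialization
    while len(medoids) < k:
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    for _ in range(n_iter):
        d = D[:, medoids] + 1e-12
        inv = d ** (-2.0 / (m - 1.0))            # FCM-style membership rule
        U = inv / inv.sum(axis=1, keepdims=True)
        w = U ** m
        # each medoid becomes the object minimizing its weighted total cost
        new = [int(np.argmin(D @ w[:, j])) for j in range(k)]
        if new == medoids:
            break                                # medoids stable: converged
        medoids = new
    return U, medoids
```

Because the update only ever reads D, the same routine works for Euclidean, DTW, or any other dissimilarity, which is what makes C-medoids convenient for non-vector data such as time series.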
Clustering networked funded European research activities through rank-size laws
This paper addresses a well-established public evaluation problem: the analysis of funded research projects. We specifically deal with the collection of research actions funded by the European Union under the 7th Framework Programme for Research and Technological Development and Horizon 2020; the reference period is 2007–2020. The study is developed through three methodological steps. First, we consider the networked scientific institutions, stating a link between two organizations when they are partners in the same funded project. In doing so, we build yearly complex networks and compute four nodal centrality measures with relevant informative content for each of them. Second, we implement a rank-size procedure on each network and each centrality measure, testing four meaningful classes of parametric curves to fit the ranked data; at the end of this step, we derive the best-fit curve and the calibrated parameters. Third, we perform a clustering procedure based on the best-fit curves of the ranked data to identify regularities and deviations among years of research and scientific institutions. The joint employment of the three methodological approaches allows a clear view of research activity in Europe in recent years.
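The rank-size step can be illustrated with the simplest parametric family, a pure power law fitted on the log-log scale. The study tests four classes of curves, so this is only a sketch of the general procedure with an assumed functional form.

```python
import numpy as np

def fit_rank_size(values):
    """Least-squares fit of the rank-size curve s(r) = a * r**(-b).

    Sorts the (positive) values into decreasing order, assigns ranks
    1, 2, ..., and regresses log s on log r. Returns the calibrated (a, b).
    """
    s = np.sort(np.asarray(values, float))[::-1]   # sizes, largest first
    r = np.arange(1, len(s) + 1)                   # ranks 1, 2, ...
    slope, intercept = np.polyfit(np.log(r), np.log(s), 1)
    return float(np.exp(intercept)), float(-slope)
```

The calibrated pairs (a, b), one per year and per centrality measure, could then serve as the features on which the final clustering step operates.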
Distribution-based entropy weighting clustering of skewed and heavy tailed time series
The goal of clustering is to identify common structures in a data set by forming groups of homogeneous objects. The observed characteristics of many economic time series have motivated the development of classes of distributions that can accommodate properties such as heavy tails and skewness. Thanks to its flexibility, the skewed exponential power distribution (also called the skewed generalized error distribution) provides a unified and general framework for clustering possibly skewed and heavy-tailed time series. This paper develops a model-based clustering procedure, assuming that the time series are generated by the same underlying probability distribution but with different parameters. Moreover, we propose to optimally combine the estimated parameters to form the clusters with an entropy weighting k-means approach. The usefulness of the proposal is shown by means of an application to financial time series, demonstrating also how the obtained clusters can be used to form portfolios of stocks.