Maximum Margin Clustering for State Decomposition of Metastable Systems

Authors: Allwein, Becker, Berglund, Biancalani, Bowman, Boyd, Chema, Chodera, Crammer, Daura, Deuflhard, Elmer, Genova, Glättli, Groningen, Hao Wu, Hastie, Horn, Jain, Keller, Kellogg, Kloeden, Kwak, McGibbon, Mehrmann, Noé, Nüske, Prinz, Pryor, Pérez-Hernández, Rahimi, Sarich, Schwantes, Shalev-Shwartz, Shao, Sorin, Swope, Vapnik, Wu, Xu, Yao, Zhang
Publication date: 31/12/2014

When studying a metastable dynamical system, a prime concern is how to decompose the phase space into a set of metastable states. Decomposing metastable states from simulation or experimental data, however, remains a challenge. The simplest and most popular approach is geometric clustering, which builds on classical clustering techniques. It has two prerequisites: (1) the data come from simulations or experiments in global equilibrium, and (2) the coordinate system is appropriately selected. Recently, kinetic clustering, which is based on phase space discretization and transition probability estimation, has drawn much attention because it applies to more general cases, but choosing the discretization policy is difficult. In this paper, a new decomposition method called maximum margin metastable clustering is proposed. It converts metastable state decomposition into a semi-supervised learning problem, so that the large margin technique can be used to search for the optimal decomposition without phase space discretization. Several simulation examples illustrate the effectiveness of the proposed method.
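The transition probability estimation step that the abstract attributes to kinetic clustering can be sketched as follows. This is a minimal illustration of the discretization-based baseline, not the maximum margin method the paper proposes. The function name `estimate_transition_matrix`, the toy trajectory, and the choice of a plain row-normalized count estimator are all assumptions made for the example.

```python
def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts of a discrete trajectory.

    dtraj is a sequence of state indices in range(n_states); lag is the
    number of steps between the start and end of a counted transition.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that never appears as a starting point gets a zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical trajectory already discretized into two states.
dtraj = [0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
T = estimate_transition_matrix(dtraj, n_states=2)
```

The result depends on how the trajectory was discretized in the first place, which is the difficulty the abstract points to.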

arXiv.org e-Print Archive

Crossref

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Local and Global Error Models to Improve Uncertainty Quantification

Authors: Josset Laureline, Lunati Ivan
Publication date: 18/06/2018

In groundwater applications, Monte Carlo methods are employed to model the uncertainty in geological parameters. However, their brute-force application becomes computationally prohibitive for highly detailed geological descriptions, complex physical processes, and a large number of realizations. The Distance Kernel Method (DKM) overcomes this issue by clustering the realizations in a multidimensional space based on the flow responses obtained from an approximate (computationally cheaper) model; the uncertainty is then estimated from the exact responses, which are computed for only one representative realization per cluster (the medoid). Usually, DKM is employed to reduce the number of realizations considered when estimating the uncertainty. We propose to also use the information from the approximate responses for uncertainty quantification. The subset of exact solutions provided by DKM is then employed to construct an error model and correct the potential bias of the approximate model. Two error models are devised. Both employ the difference between approximate and exact medoid solutions, but they differ in the way medoid errors are interpolated to correct the whole set of realizations. The Local Error Model rests upon the clustering defined by DKM and can be seen as a natural way to account for intra-cluster variability; the Global Error Model employs a linear interpolation of all medoid errors regardless of the cluster to which a given realization belongs. These error models are evaluated for an idealized pollution problem in which the uncertainty of the breakthrough curve needs to be estimated. For this numerical test case, we demonstrate that the error models improve the uncertainty quantification provided by the DKM algorithm and are effective in correcting the bias of the estimate computed solely from the approximate (MsFV) results.
The framework presented here is not specific to the methods considered and can be applied to other combinations of approximate models and techniques for selecting a subset of realizations.
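The difference between the two error models can be illustrated with a deliberately simplified sketch. It assumes scalar responses instead of breakthrough curves, a piecewise-constant local correction (each realization receives the error of its own cluster's medoid), and a global correction that linearly interpolates medoid errors over the approximate response value. None of these choices is stated in the abstract; the function names and the numbers are hypothetical.

```python
def medoid_errors(approx, exact_at_medoids):
    """Error (exact minus approximate) at each medoid index."""
    return {m: exact - approx[m] for m, exact in exact_at_medoids.items()}


def local_correction(approx, labels, medoid_of_cluster, errors):
    """Assumed local model: add the error of the realization's own medoid."""
    return [a + errors[medoid_of_cluster[c]] for a, c in zip(approx, labels)]


def global_correction(approx, errors):
    """Assumed global model: interpolate all medoid errors linearly over
    the approximate response value, holding the end values constant."""
    pts = sorted((approx[m], e) for m, e in errors.items())

    def interp(x):
        if x <= pts[0][0]:
            return pts[0][1]
        if x >= pts[-1][0]:
            return pts[-1][1]
        for (x0, e0), (x1, e1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                if x1 == x0:
                    return e0
                return e0 + (e1 - e0) * (x - x0) / (x1 - x0)

    return [a + interp(a) for a in approx]


# Hypothetical data: four realizations, two clusters, medoids at indices 0 and 3.
approx = [1.0, 2.0, 3.0, 4.0]
labels = [0, 0, 1, 1]
medoid_of_cluster = {0: 0, 1: 3}
exact_at_medoids = {0: 1.5, 3: 5.0}

errors = medoid_errors(approx, exact_at_medoids)
local = local_correction(approx, labels, medoid_of_cluster, errors)
global_ = global_correction(approx, errors)
```

In this toy setting both corrections reproduce the exact values at the medoids and differ only for the realizations in between.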

RERO DOC Digital Library

An adaptive version of k-medoids to deal with the uncertainty in clustering heterogeneous data using an intermediary fusion approach

Authors: A Oliva, A Strehl, Aalaa Mojahed, B Khaleghi, Beatriz de la Iglesia, BV Dasarathy, D Hall, DJ Berndt, E Acar, G Salton, GRG Lanckriet, H-S Park, L Kaufman, LR Dice, M Žitnik, MA Abidi, MH Vliet van, N-EE Faouzi, OA Akeem, P Pavlidis, RA Baeza-Yates, S Jaccard, TN Manjunath, TY Chan, WM Rand, Y Shi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

This paper introduces Hk-medoids, a modified version of the standard k-medoids algorithm. The modification extends the algorithm to the problem of clustering complex heterogeneous objects that are described by a diversity of data types, e.g. text, images, structured data and time series. We first propose an intermediary fusion approach, SMF, which calculates fused similarities between objects by taking into account the similarities between their component elements, using appropriate similarity measures. The fused approach entails uncertainty for incomplete objects and for objects whose distances diverge across the different components. Our implementation of Hk-medoids works with the fused distances and deals with the uncertainty in the fusion process. We experimentally evaluate the potential of our proposed algorithm using five datasets with different combinations of data types that define the objects. Our results show the feasibility of our algorithm, and they also show a performance improvement over the original SMF approach combined with a standard k-medoids that does not take uncertainty into account. In addition, from a theoretical point of view, our proposed algorithm has lower computational complexity than the popular PAM implementation.
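For orientation, a standard k-medoids run on a precomputed distance matrix might look like the sketch below. It is a plain alternating (assign, then update) variant, not Hk-medoids and not PAM, and it ignores the uncertainty handling the abstract describes. The function names, the initial medoids, and the toy one-dimensional data standing in for fused distances are assumptions.

```python
def assign(dist, medoids):
    """Label each object with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, initial_medoids, max_iter=100):
    """Alternating k-medoids on a square distance matrix (list of lists)."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the previous medoid if its cluster became empty.
                new_medoids.append(old)
                continue
            # The new medoid minimizes the total distance to its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical objects on a line; absolute difference stands in for a fused distance.
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in values] for a in values]
medoids, labels = k_medoids(dist, initial_medoids=[0, 1])
```

Because the algorithm only reads `dist`, any fused distance matrix could be passed in without changing the code.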

Crossref

University of East Anglia digital repository