On Instance Weighted Clustering Ensembles

Author: Helian Na
Lilley Mariana
Moggridge Paul
Sun Yi
Publication date: 06/10/2023

© ESANN, 2023. This is the accepted manuscript version of an article which has been published in final form at: www.esann.org/proceedings/2023

Ensemble clustering is a technique which combines multiple clustering results, and instance weighting is a technique which highlights important instances in a dataset. Both techniques are known to enhance clustering performance and robustness. In this research, ensembles and instance weighting are integrated with the spectral clustering algorithm. We believe this is the first attempt at creating diversity in the generative mechanism using density based instance weighting for a spectral ensemble. The proposed approach is empirically validated using synthetic datasets, comparing against spectral clustering and a spectral ensemble with random instance weighting. Results show that using the instance weighted sub-sampling approach as the generative mechanism for an ensemble of spectral clustering leads to improved clustering performance on datasets with imbalanced clusters. Peer reviewed
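The abstract above describes instance-weighted sub-sampling as the generative mechanism of the ensemble. As a rough illustration only, the sketch below draws several weighted sub-samples of instance indices, one per ensemble member. The density-based weights and the spectral clustering step from the paper are not reproduced here: `weights` is a hypothetical input, the function names are invented for this example, and the sampling scheme (Efraimidis-Spirakis weighted sampling without replacement) is a swapped-in choice, not necessarily the authors' method.

```python
import random
from typing import List, Sequence


def weighted_subsample(
    weights: Sequence[float], size: int, rng: random.Random
) -> List[int]:
    """Draw `size` distinct instance indices, favouring larger weights.

    Assumption: weighted sampling without replacement via the
    Efraimidis-Spirakis key u ** (1 / w); the paper may sample differently.
    """
    if size > len(weights):
        raise ValueError("size cannot exceed the number of instances")
    if any(w <= 0 for w in weights):
        raise ValueError("all weights must be strictly positive")
    # One random key per instance; larger weights tend to give larger keys.
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return sorted(i for _, i in keys[:size])


def ensemble_subsamples(
    weights: Sequence[float], size: int, members: int, seed: int = 0
) -> List[List[int]]:
    """Generate one weighted sub-sample of indices per ensemble member."""
    rng = random.Random(seed)
    return [weighted_subsample(weights, size, rng) for _ in range(members)]
```

Each returned index list would then be clustered separately (for example with spectral clustering) and the resulting partitions combined by a consensus function; both of those steps are left out of this sketch.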

University of Hertfordshire Research Archive

Combining Multiple Clusterings via Crowd Agreement Estimation and Multi-Granularity Link Analysis

Author: Huang Dong
Lai Jian-Huang
Wang Chang-Dong
Publication venue: 'Elsevier BV'
Publication date: 03/06/2016

The clustering ensemble technique aims to combine multiple clusterings into a probably better and more robust clustering and has received increasing attention in recent years. Existing clustering ensemble approaches have two main limitations. Firstly, many approaches lack the ability to weight the base clusterings without access to the original data and can be affected significantly by low-quality, or even ill, clusterings. Secondly, they generally focus on the instance level or cluster level in the ensemble system and fail to integrate multi-granularity cues into a unified model. To address these two limitations, this paper proposes to solve the clustering ensemble problem via crowd agreement estimation and multi-granularity link analysis. We present the normalized crowd agreement index (NCAI) to evaluate the quality of base clusterings in an unsupervised manner and thus weight the base clusterings in accordance with their clustering validity. To explore the relationship between clusters, the source aware connected triple (SACT) similarity is introduced with regard to their common neighbors and the source reliability. Based on NCAI and multi-granularity information collected among base clusterings, clusters, and data instances, we further propose two novel consensus functions, termed weighted evidence accumulation clustering (WEAC) and graph partitioning with multi-granularity link analysis (GP-MGLA) respectively. The experiments are conducted on eight real-world datasets. The experimental results demonstrate the effectiveness and robustness of the proposed methods.

Comment: The MATLAB source code of this work is available at: https://www.researchgate.net/publication/28197031
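To make the idea of weighting base clusterings concrete, the following is a minimal sketch of a weighted co-association matrix, the kind of structure that evidence accumulation methods build on. It is not the paper's WEAC implementation: the NCAI computation is not reproduced, so `weights` is a hypothetical input standing in for per-clustering quality scores, and the function name is an assumption made for this example.

```python
from typing import List, Sequence


def weighted_co_association(
    clusterings: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> List[List[float]]:
    """Build a weighted co-association matrix.

    Entry (i, j) is the weight-normalized share of base clusterings that
    place instances i and j in the same cluster.

    Assumption: `weights` are externally supplied, non-negative quality
    scores (hypothetical stand-ins for an index such as NCAI).
    """
    if len(clusterings) != len(weights):
        raise ValueError("need exactly one weight per base clustering")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(clusterings[0])
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(clusterings, weights):
        if len(labels) != n:
            raise ValueError("all clusterings must label the same instances")
        share = w / total
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += share
    return matrix
```

A consensus partition would typically be obtained by running a hierarchical or graph-based clustering on this matrix; that step is omitted here.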

arXiv.org e-Print Archive

CiteSeerX

Reconstructing the world trade multiplex: the role of intensive and extensive biases

Author: Fagiolo Giorgio
Garlaschelli Diego
Mastrandrea Rossana
Squartini Tiziano
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2014

In economic and financial networks, the strength of each node always has an important economic meaning, such as the size of supply and demand, import and export, or financial exposure. Constructing null models of networks matching the observed strengths of all nodes is crucial in order to either detect interesting deviations of an empirical network from economically meaningful benchmarks or reconstruct the most likely structure of an economic network when the latter is unknown. However, several studies have proved that real economic networks and multiplexes are topologically very different from configurations inferred only from node strengths. Here we provide a detailed analysis of the World Trade Multiplex by comparing it to an enhanced null model that simultaneously reproduces the strength and the degree of each node. We study several temporal snapshots and almost one hundred layers (commodity classes) of the multiplex and find that the observed properties are systematically well reproduced by our model. Our formalism allows us to introduce the (static) concept of extensive and intensive bias, defined as a measurable tendency of the network to prefer either the formation of extra links or the reinforcement of link weights, with respect to a reference case where only strengths are enforced. Our findings complement the existing economic literature on (dynamic) intensive and extensive trade margins. More generally, they show that real-world multiplexes can be strongly shaped by layer-specific local constraints.

arXiv.org e-Print Archive

Crossref

HAL AMU

Archivio della ricerca della Scuola IMT Alti Studi Lucca

Archivio della ricerca della Scuola Superiore Sant'Anna

IMT Institutional Repository

On Thermalization in Classical Scalar Field Theory

Author: Aarts
Aoki
Bettencourt
Bonini
Christof Wetterich
García-Bellido
Gert Aarts
Gian Franco Bonini
Heinz
Khlebnikov
Parisi
Prokopec
Weinberg
Wetterich
Publication venue: 'Elsevier BV'
Publication date: 01/01/2000

Thermalization of classical fields is investigated in a \phi^4 scalar field theory in 1+1 dimensions, discretized on a lattice. We numerically integrate the classical equations of motion using initial conditions sampled from various nonequilibrium probability distributions. Time-dependent expectation values of observables constructed from the canonical momentum are compared with thermal ones. It is found that a closed system, evolving from one initial condition, thermalizes to high precision in the thermodynamic limit, in a time-averaged sense. For ensembles consisting of many members with the same energy, we find that expectation values become stationary - and equal to the thermal values - in the limit of infinitely many members. Initial ensembles with a nonzero (noncanonical) spread in the energy density or other conserved quantities evolve to noncanonical stationary ensembles. In the case of a narrow spread, asymptotic values of primary observables are only mildly affected. In contrast, fluctuations and connected correlation functions will differ substantially from the canonical values. This raises doubts about the use of a straightforward expansion in terms of 1PI-vertex functions to study thermalization.

Comment: 17 pages with 6 eps figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

Cronfa at Swansea University

CERN Document Server

LinkCluE: A MATLAB Package for Link-Based Cluster Ensembles

Author: Natthakan Iam-on
Simon Garrett

Cluster ensembles have emerged as a powerful meta-learning paradigm that provides improved accuracy and robustness by aggregating several input data clusterings. In particular, link-based similarity methods have recently been introduced with superior performance to the conventional co-association approach. This paper presents a MATLAB package, LinkCluE, that implements the link-based cluster ensemble framework. A variety of functional methods for evaluating clustering results, based on both internal and external criteria, are also provided. Additionally, the underlying algorithms together with the sample uses of the package with interesting real and synthetic datasets are demonstrated herein.

Research Papers in Economics

Low-temperature behaviour of social and economic networks

Author: Bianconi
Boguñá
Burda
Diego Garlaschelli
Guido Caldarelli
Li
Sebastian Ahnert
Squartini
Thomas Fink
Publication venue: 'MDPI AG'
Publication date: 01/01/2013

Real-world social and economic networks typically display a number of particular topological properties, such as a giant connected component, a broad degree distribution, the small-world property and the presence of communities of densely interconnected nodes. Several models, including ensembles of networks also known in social science as Exponential Random Graphs, have been proposed with the aim of reproducing each of these properties in isolation. Here we define a generalized ensemble of graphs by introducing the concept of graph temperature, controlling the degree of topological optimization of a network. We consider the temperature-dependent version of both existing and novel models and show that all the aforementioned topological properties can be simultaneously understood as the natural outcomes of an optimized, low-temperature topology. We also show that seemingly different graph models, as well as techniques used to extract information from real networks, are all particular low-temperature cases of the same generalized formalism. One such technique allows us to extend our approach to real weighted networks. Our results suggest that a low graph temperature might be a ubiquitous property of real socio-economic networks, placing conditions on the diffusion of information across these systems.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

Archivio della ricerca della Scuola IMT Alti Studi Lucca

IMT Institutional Repository