Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score

Author: Coraggio Luca
Coretto Pietro
Publication date: 02/05/2022

Cluster analysis requires many decisions: the clustering method and the implied reference model, the number of clusters and, often, several hyper-parameters and algorithm tunings. In practice, one produces several partitions, and a final one is chosen based on validation or selection criteria. There exists an abundance of validation methods that, implicitly or explicitly, assume a certain clustering notion. Moreover, they are often restricted to operate on partitions obtained from a specific method. In this paper, we focus on groups that can be well separated by quadratic or linear boundaries. The reference cluster concept is defined through the quadratic discriminant score function and parameters describing clusters' size, center and scatter. We develop two cluster-quality criteria called quadratic scores. We show that these criteria are consistent with groups generated from a general class of elliptically-symmetric distributions. The quest for this type of group is common in applications. The connection with likelihood theory for mixture models and model-based clustering is investigated. Based on bootstrap resampling of the quadratic scores, we propose a selection rule that allows choosing among many clustering solutions. The proposed method has the distinctive advantage that it can compare partitions that cannot be compared with other state-of-the-art methods. Extensive numerical experiments and the analysis of real data show that, even if some competing methods turn out to be superior in some setups, the proposed methodology achieves a better overall performance. Comment: Supplemental materials are included at the end of the paper.

arXiv.org e-Print Archive

Archivio della ricerca - Università degli studi di Napoli Federico II

Archivio della Ricerca - Università di Salerno
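
The abstract above defines its cluster concept through the quadratic discriminant score and per-cluster size, center and scatter parameters. As a minimal sketch, the following computes the textbook quadratic discriminant score from Gaussian discriminant analysis, log(size) - 0.5 log det(scatter) - 0.5 (x - center)' scatter^{-1} (x - center), and assigns a point to the highest-scoring cluster. The function names and the sample parameters are illustrative assumptions; the paper's two criteria and its bootstrap selection rule are not reproduced here.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def quadratic_score(x, size, center, scatter):
    """Textbook quadratic discriminant score of point x for one cluster."""
    L = cholesky(scatter)
    d = [xi - ci for xi, ci in zip(x, center)]
    # Forward substitution solves L z = d; z.z is the squared Mahalanobis distance.
    z = []
    for i in range(len(d)):
        z.append((d[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(len(d)))
    return math.log(size) - 0.5 * log_det - 0.5 * sum(v * v for v in z)

def assign(x, clusters):
    """Index of the cluster (size, center, scatter) with the largest score at x."""
    scores = [quadratic_score(x, *c) for c in clusters]
    return max(range(len(clusters)), key=scores.__getitem__)

# Hypothetical two-cluster configuration, for illustration only.
clusters = [
    (0.5, [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    (0.5, [4.0, 4.0], [[2.0, 0.5], [0.5, 1.0]]),
]
labels = [assign(p, clusters) for p in ([0.2, -0.1], [3.8, 4.1])]
```

Because the scatter matrices differ between clusters, the boundary where two scores are equal is quadratic; with a shared scatter matrix it becomes linear.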

Murnaghan-Nakayama Rule The Explanation and Usage of the Algorithm

Author: Sandal Elias
Publication venue: 'UiT The Arctic University of Norway'
Publication date: 15/05/2023

Character values are not easy to calculate, so it is important to find good algorithms that ease these calculations. In the 20th century, the mathematicians Murnaghan and Nakayama developed a rule for calculating character values indexed by partitions. This rule was later named the Murnaghan-Nakayama rule after its two authors. The Murnaghan-Nakayama rule is a combinatorial method for computing character values of irreducible representations of symmetric groups, which makes it an important part of representation theory. One version is the recursive Murnaghan-Nakayama rule, in which border strips and diagrams are used to calculate the character values of representations on a given composition. This algorithm is quite fast in these calculations. The Murnaghan-Nakayama rule can also be considered a central algorithm in the representation theory of symmetric groups. It is a fascinating and powerful algorithm with a strong connection to both combinatorics and representation theory.

Munin - Open Research Archive
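
The recursive rule described in the abstract above can be sketched in a few lines. This is an illustrative implementation, not the thesis's own code: it encodes the shape by its beta-numbers (the standard abacus encoding), where removing a border strip of size r corresponds to lowering one beta-number by r onto an unoccupied value, and the strip's height is the number of beta-numbers jumped over. The names `mn_character` and `_mn` are assumptions.

```python
from functools import lru_cache

def mn_character(shape, cycle_type):
    """Character value of the irreducible S_n representation `shape`
    on permutations of cycle type `cycle_type` (recursive Murnaghan-Nakayama rule).

    `shape` must be a weakly decreasing sequence of positive integers.
    """
    if sum(shape) != sum(cycle_type):
        raise ValueError("shape and cycle type must have the same total size")
    rows = len(shape)
    # Beta-numbers: row length plus the number of rows below it (all distinct).
    betas = frozenset(part + (rows - 1 - i) for i, part in enumerate(shape))
    return _mn(betas, tuple(cycle_type))

@lru_cache(maxsize=None)
def _mn(betas, cycles):
    if not cycles:
        return 1  # every cell has been removed: empty shape
    r, rest = cycles[0], cycles[1:]
    total = 0
    for b in betas:
        target = b - r
        # A border strip of size r exists iff b - r is non-negative and unoccupied.
        if target >= 0 and target not in betas:
            height = sum(1 for c in betas if target < c < b)
            total += (-1) ** height * _mn((betas - {b}) | {target}, rest)
    return total

# Character table row of the two-dimensional representation of S_3.
row = [mn_character((2, 1), mu) for mu in [(1, 1, 1), (2, 1), (3,)]]
```

For the shape (2, 1) this yields the values 2, 0 and -1 on the identity, the transpositions and the 3-cycles, matching the usual character table of S_3.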

Partitioning of Uniform Dependency Algorithms for Parallel Execution on MIMD/Systolic Systems

Author: Fortes Jose A. B.
Shang Weijia
Publication venue: 'Purdue University (bepress)'
Publication date: 01/04/1988

An algorithm can be modeled as an index set and a set of dependence vectors. Each index vector in the index set indexes a computation of the algorithm. If the execution of a computation depends on the execution of another computation, then this dependency is represented as the difference between the index vectors of the computations. The dependence matrix is the matrix whose columns are the dependence vectors. An independent partition of the index set is one in which there are no dependencies between computations that belong to different blocks of the partition. This report considers uniform dependence algorithms with an arbitrary index set and proposes two very simple methods to find independent partitions of the index set. Each method has advantages over the other for certain kinds of applications, and both outperform previously proposed approaches in terms of computational complexity and/or optimality. Also, lower and upper bounds on the cardinality of the maximal independent partitions are given. For some algorithms it is shown that the cardinality of the maximal partition is equal to the greatest common divisor of some subdeterminants of the dependence matrix. In an MIMD/multiple systolic array computation environment, if different blocks of an independent partition are assigned to different processors/arrays, the communications between processors/arrays will be reduced to zero. This is significant because communications usually dominate the overhead in MIMD machines. Some issues of mapping partitioned algorithms into MIMD/systolic systems are addressed. Based on the theory of partitioning, a new method is proposed to test if a system of linear Diophantine equations has integer solutions.

Purdue E-Pubs
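
To make the notion of an independent partition in the abstract above concrete, the sketch below builds one by brute force on a small finite index set: two index points are placed in the same block whenever one depends on the other through a dependence vector, using a union-find structure. This is an illustration of the definition only, not either of the two methods proposed in the report; the function name and the sample index set are assumptions.

```python
from itertools import product

def independent_partition(index_set, dependence_vectors):
    """Split a finite index set into blocks with no dependence between blocks."""
    points = set(index_set)
    parent = {p: p for p in points}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for p in points:
        for d in dependence_vectors:
            q = tuple(a - b for a, b in zip(p, d))  # p depends on p - d
            if q in points:
                parent[find(p)] = find(q)  # merge the two blocks

    blocks = {}
    for p in points:
        blocks.setdefault(find(p), []).append(p)
    return [sorted(b) for b in blocks.values()]

# Hypothetical 6 x 6 index set with dependence vectors (2, 0) and (0, 2).
grid = list(product(range(6), repeat=2))
blocks = independent_partition(grid, [(2, 0), (0, 2)])
```

With these two dependence vectors the index points fall into four blocks according to the parity of each coordinate, so four processors could run the blocks without exchanging data.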

Mesoscopic Community Structure of Financial Markets Revealed by Price and Sign Fluctuations

Author: Almog Assaf
Besamusca Ferry
Garlaschelli Diego
MacMahon Mel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

The mesoscopic organization of complex systems, from financial markets to the brain, is intermediate between the microscopic dynamics of individual units (stocks or neurons, in the mentioned cases) and the macroscopic dynamics of the system as a whole. The organization is determined by "communities" of units whose dynamics, represented by time series of activity, are more strongly correlated internally than with the rest of the system. Recent studies have shown that the binary projections of various financial and neural time series exhibit nontrivial dynamical features that resemble those of the original data. This implies that a significant piece of information is encoded into the binary projection (i.e. the sign) of such increments. Here, we explore whether the binary signatures of multiple time series can replicate the same complex community organization of the financial market as the original weighted time series. We adopt a method that has been specifically designed to detect communities from cross-correlation matrices of time series data. Our analysis shows that the simpler binary representation leads to a community structure that is almost identical to that obtained using the full weighted representation. These results confirm that binary projections of financial time series contain significant structural information. Comment: 15 pages, 7 figures

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

Archivio della ricerca della Scuola IMT Alti Studi Lucca

Leiden University Scholarly Publications

Dynamic programming for graphs on surfaces

Author: B. Courcelle
B. Mohar
E.D. Demaine
F. Dorn
F. Flajolet
F.V. Fomin
H.L. Bodlaender
J.A. Telle
N. Robertson
P. Flajolet
P. Seymour
S. Arnborg
S. Cabello
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

We provide a framework for the design and analysis of dynamic programming algorithms for surface-embedded graphs on n vertices and branchwidth at most k. Our technique applies to general families of problems where standard dynamic programming runs in $2^{O(k \cdot \log k)}$. Our approach combines tools from topological graph theory and analytic combinatorics. Postprint (updated version)

Crossref

UPCommons. Portal del coneixement obert de la UPC

Efficient Algorithms for Searching the Minimum Information Partition in Integrated Information Theory

Author: Kanai Ryota
Kitazono Jun
Oizumi Masafumi
Publication venue: 'MDPI AG'
Publication date: 13/02/2018

The ability to integrate information in the brain is considered to be an essential property for cognition and consciousness. Integrated Information Theory (IIT) hypothesizes that the amount of integrated information ($\Phi$) in the brain is related to the level of consciousness. IIT proposes that to quantify information integration in a system as a whole, integrated information should be measured across the partition of the system at which information loss caused by partitioning is minimized, called the Minimum Information Partition (MIP). The computational cost for exhaustively searching for the MIP grows exponentially with system size, making it difficult to apply IIT to real neural data. It has been previously shown that if a measure of $\Phi$ satisfies a mathematical property, submodularity, the MIP can be found in a polynomial order by an optimization algorithm. However, although the first version of $\Phi$ is submodular, the later versions are not. In this study, we empirically explore to what extent the algorithm can be applied to the non-submodular measures of $\Phi$ by evaluating the accuracy of the algorithm in simulated data and real neural data. We find that the algorithm identifies the MIP in a nearly perfect manner even for the non-submodular measures. Our results show that the algorithm allows us to measure $\Phi$ in large systems within a practical amount of time.

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals
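
The exponential cost of the exhaustive MIP search mentioned in the abstract above can be seen in a small sketch that enumerates every bipartition of a system and keeps the one with the smallest loss. The loss used here is the total coupling weight cut by the partition, a simple stand-in chosen for illustration; it is not one of the IIT measures of $\Phi$, and the function names and the weight matrix are assumptions. The polynomial-time optimization algorithm studied in the paper is not reproduced.

```python
from itertools import combinations

def bipartitions(n):
    """Yield each split of range(n) into two non-empty parts exactly once."""
    rest = range(1, n)
    for k in range(n - 1):
        for extra in combinations(rest, k):
            part_a = (0,) + extra  # element 0 is pinned to avoid mirrored duplicates
            part_b = tuple(i for i in rest if i not in extra)
            yield part_a, part_b

def cut_weight(weights, part_a, part_b):
    """Stand-in loss: total coupling weight between the two parts."""
    return sum(weights[i][j] for i in part_a for j in part_b)

def minimum_partition(n, loss):
    """Exhaustive search over all 2**(n - 1) - 1 bipartitions."""
    return min(bipartitions(n), key=lambda parts: loss(*parts))

# Hypothetical 4-unit system: units 0-1 and 2-3 are strongly coupled pairs.
weights = [
    [0.0, 5.0, 0.1, 0.2],
    [5.0, 0.0, 0.3, 0.1],
    [0.1, 0.3, 0.0, 4.0],
    [0.2, 0.1, 4.0, 0.0],
]
best = minimum_partition(4, lambda a, b: cut_weight(weights, a, b))
n_candidates = sum(1 for _ in bipartitions(10))
```

Here the search separates the two strongly coupled pairs, and the number of candidate bipartitions roughly doubles with every unit added, which is why exhaustive search stops being practical for large systems.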