487 research outputs found
A novel ensemble clustering for operational transients classification with application to a nuclear power plant turbine
The objective of the present work is to develop a novel approach for combining multiple base clusterings of operational transients of industrial equipment into an ensemble when the number of clusters in the final consensus clustering is unknown. A measure of pairwise similarity is used to build the co-association matrix that describes the similarity among the different base clusterings. Then, a Spectral Clustering technique from the literature, embedding the unsupervised K-Means algorithm, is applied to the co-association matrix to find the optimum number of clusters of the final consensus clustering, based on the Silhouette validity index. The proposed approach is developed with reference to an artificial case study designed to mimic the signal trend behavior of a Nuclear Power Plant (NPP) turbine during shutdown. The results of the artificial case have been compared with those achieved by a state-of-the-art approach known as Cluster-based Similarity Partitioning and Serial Graph Partitioning and Fill-reducing Matrix Ordering Algorithms (CSPA-METIS). The comparison shows that the proposed approach identifies a final consensus clustering that classifies the transients with better accuracy and robustness than the CSPA-METIS approach. The approach is then validated on an industrial case concerning 149 shutdown transients of an NPP turbine.
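As an illustration of the co-association step described in this abstract, here is a minimal sketch. The abstract does not say which pairwise similarity measure the authors use, so the sketch assumes the common choice of counting how often two items fall in the same cluster; the function name `co_association` and the sample labels are hypothetical. The spectral clustering and Silhouette steps are not shown.

```python
from itertools import combinations


def co_association(labelings):
    """Return the n x n co-association matrix of several base clusterings.

    labelings: list of label lists, one per base clustering, all of length n.
    Entry [i][j] is the fraction of base clusterings that put items i and j
    in the same cluster (an assumed similarity measure, not the paper's).
    """
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all base clusterings must label the same n items")
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            counts[i][i] += 1  # an item always co-occurs with itself
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


# Three hypothetical base clusterings of four transients.
base = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
C = co_association(base)
```

Label values only matter within one base clustering, which is why the third clustering (with labels swapped) still agrees with the first.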
The complexity of the characteristic and the minimal polynomial
We investigate the complexity of (1) computing the characteristic polynomial, the minimal polynomial, and all the invariant factors of an integer matrix, and of (2) verifying them, when the coefficients are given as input. It is known that each coefficient of the characteristic polynomial of a matrix A is computable in GapL, and the constant term, the determinant of A, is complete for GapL. We show that the verification of the characteristic polynomial is complete for complexity class C=L (exact counting logspace). We show that each coefficient of the minimal polynomial of a matrix A can be computed in AC0(GapL), the AC0-closure of GapL, and there is a coefficient which is hard for GapL. Furthermore, the verification of the minimal polynomial is in AC0(C=L) and is hard for C=L. The hardness result extends to (computing and verifying) the system of all invariant factors of a matrix
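To make the two tasks in this abstract concrete (computing the coefficients, and verifying claimed coefficients), here is a small sketch using the Faddeev-LeVerrier recurrence. This is an ordinary sequential algorithm chosen for illustration; it is not the logspace-counting construction the paper analyses, and the function names are hypothetical. The convention is det(xI - A), so the constant term equals the determinant up to the sign (-1)^n.

```python
def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(x*I - A) for a square integer matrix."""
    n = len(a)
    coeffs = [0] * (n + 1)
    coeffs[n] = 1
    m = [[0] * n for _ in range(n)]  # M_0 is the zero matrix
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        am = [[sum(a[i][t] * m[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -trace(A * M_k) / k; the division is exact for integer A
        trace = sum(a[i][t] * m[t][i] for i in range(n) for t in range(n))
        quotient, remainder = divmod(-trace, k)
        assert remainder == 0
        coeffs[n - k] = quotient
    return coeffs


def verify_char_poly(a, claimed):
    """Verification task: do the claimed coefficients match those of A?"""
    return char_poly(a) == list(claimed)
```

For example, `char_poly([[2, 1], [1, 3]])` yields the coefficients of x^2 - 5x + 5, listed from the constant term upward.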
A study and evaluation of image analysis techniques applied to remotely sensed data
An analysis of phenomena causing nonlinearities in the transformation from Landsat multispectral scanner coordinates to ground coordinates is presented. Experimental results comparing rms errors at ground control points indicated a slight improvement when a nonlinear (8-parameter) transformation was used instead of an affine (6-parameter) transformation. Using a preliminary ground truth map of a test site in Alabama covering the Mobile Bay area and six Landsat images of the same scene, several classification methods were assessed. A methodology was developed for automatic change detection using classification/cluster maps. A coding scheme was employed for generation of change depiction maps indicating specific types of changes. Inter- and intraseasonal data of the Mobile Bay test area were compared to illustrate the method. A beginning was made in the study of data compression by applying a Karhunen-Loeve transform technique to a small section of the test data set. The second part of the report provides a formal documentation of the several programs developed for the analysis and assessments presented
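The affine (6-parameter) transformation and the rms error at ground control points mentioned in this abstract can be sketched as follows. This is a generic least-squares fit under stated assumptions, not the report's own programs: the model is u = a0 + a1*x + a2*y and v = b0 + b1*x + b2*y, and all function names and sample points are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(n, b):
    """Solve the 3 x 3 system n * s = b by Cramer's rule."""
    d = det3(n)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear or too few")
    solution = []
    for col in range(3):
        m = [row[:] for row in n]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution


def fit_affine(src, dst):
    """Least-squares fit of the six affine parameters from control points."""
    rows = [(1.0, x, y) for x, y in src]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs_u = [sum(r[i] * u for r, (u, _) in zip(rows, dst)) for i in range(3)]
    rhs_v = [sum(r[i] * v for r, (_, v) in zip(rows, dst)) for i in range(3)]
    return solve3(normal, rhs_u), solve3(normal, rhs_v)


def rms_error(src, dst, params):
    """Root-mean-square residual distance at the control points."""
    (a0, a1, a2), (b0, b1, b2) = params
    total = 0.0
    for (x, y), (u, v) in zip(src, dst):
        total += (a0 + a1 * x + a2 * y - u) ** 2 + (b0 + b1 * x + b2 * y - v) ** 2
    return math.sqrt(total / len(src))


# Hypothetical control points generated from a known affine map.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
dst = [(10 + 2 * x + 0.5 * y, -3 - x + 4 * y) for x, y in src]
params = fit_affine(src, dst)
```

With real control points the rms error would be nonzero, and comparing it against the residual of a higher-order model is the kind of comparison the abstract reports.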
Self-attention Dual Embedding for Graphs with Heterophily
Graph Neural Networks (GNNs) have been highly successful for the node
classification task. GNNs typically assume graphs are homophilic, i.e.
neighboring nodes are likely to belong to the same class. However, a number of
real-world graphs are heterophilic, and this leads to much lower classification
accuracy using standard GNNs. In this work, we design a novel GNN which is
effective for both heterophilic and homophilic graphs. Our work is based on
three main observations. First, we show that node features and graph topology
provide different amounts of informativeness in different graphs, and therefore
they should be encoded independently and prioritized in an adaptive manner.
Second, we show that allowing negative attention weights when propagating graph
topology information improves accuracy. Finally, we show that asymmetric
attention weights between nodes are helpful. We design a GNN which makes use of
these observations through a novel self-attention mechanism. We evaluate our
algorithm on real-world graphs containing thousands to millions of nodes and
show that we achieve state-of-the-art results compared to existing GNNs. We
also analyze the effectiveness of the main components of our design on
different graphs. Comment: 9 pages, 15 figures
Graph Interpolation via Fast Fused-Gromovization
Graph data augmentation has proven to be effective in enhancing the
generalizability and robustness of graph neural networks (GNNs) for graph-level
classifications. However, existing methods mainly focus on augmenting the graph
signal space and the graph structure space independently, overlooking their
joint interaction. This paper addresses this limitation by formulating the
problem as an optimal transport problem that aims to find an optimal strategy
for matching nodes between graphs considering the interactions between graph
structures and signals. To tackle this problem, we propose a novel graph mixup
algorithm dubbed FGWMixup, which leverages the Fused Gromov-Wasserstein (FGW)
metric space to identify a "midpoint" of the source graphs. To improve the
scalability of our approach, we introduce a relaxed FGW solver that accelerates
FGWMixup by improving its convergence rate. Extensive experiments conducted on five datasets,
utilizing both classic (MPNNs) and advanced (Graphormers) GNN backbones,
demonstrate the effectiveness of FGWMixup in improving the generalizability and
robustness of GNNs
Efficient Two-Level Swarm Intelligence Approach for Multiple Sequence Alignment
This paper proposes two-level particle swarm optimization (TL-PSO), an efficient PSO variant that addresses optimization problems with two levels. Level one optimizes the dimension for the entire swarm, whereas level two optimizes each particle's position. The problem addressed here is multiple sequence alignment (MSA), one of the most challenging problems of this kind. TL-PSO deals with the arduous task of determining the exact sequence length together with the most suitable gap positions in MSA. The two levels considered here are: obtaining the optimal sequence length in level one, and attaining the optimum gap positions for maximal alignment score in level two. The performance of TL-PSO has been assessed through a comparative study on two kinds of benchmark datasets, DNA and RNA. The efficiency of the proposed approach is evaluated with four popular scoring schemes at specific parameters. TL-PSO alignments are compared with four PSO variants, i.e. S-PSO, M-PSO, ED-MPSO and CPSO-Sk, and two leading alignment programs, i.e. ClustalW and T-Coffee, at different alignment scores. The results obtained demonstrate the competence of TL-PSO in terms of accuracy and indicate which scoring scheme performs better
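To illustrate what "gap positions" and "alignment score" mean in this abstract, here is a minimal sketch of a sum-of-pairs score for a candidate alignment. It does not implement TL-PSO itself. The abstract does not name its four scoring schemes, so the match, mismatch and gap values below are assumptions, and the function names are hypothetical.

```python
from itertools import combinations


def insert_gaps(seq, gap_positions, length):
    """Place '-' at the given column indices so the row has the target length."""
    gaps = set(gap_positions)
    if len(seq) + len(gaps) != length or any(g < 0 or g >= length for g in gaps):
        raise ValueError("gap positions do not fit the target length")
    residues = iter(seq)
    return "".join("-" if i in gaps else next(residues) for i in range(length))


def sum_of_pairs(rows, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of equal-length aligned rows (assumed scoring values)."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must have equal length")
    score = 0
    for column in zip(*rows):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap against gap contributes nothing
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


# Two candidate gap placements for the same three hypothetical sequences.
good = [insert_gaps("ACT", [2], 4), "ACGT", insert_gaps("AGT", [1], 4)]
poor = [insert_gaps("ACT", [3], 4), "ACGT", insert_gaps("AGT", [3], 4)]
```

In an optimizer of the kind the abstract describes, a score like this would serve as the fitness that candidate gap placements are compared on.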
Significant Scales in Community Structure
Many complex networks show signs of modular structure, uncovered by community
detection. Although many methods succeed in revealing various partitions, it
remains difficult to detect at what scale some partition is significant. This
problem shows foremost in multi-resolution methods. We here introduce an
efficient method for scanning for resolutions in one such method. Additionally,
we introduce the notion of "significance" of a partition, based on subgraph
probabilities. Significance is independent of the exact method used, so could
also be applied in other methods, and can be interpreted as the gain in
encoding a graph by making use of a partition. Using significance, we can
determine "good" resolution parameters, which we demonstrate on benchmark
networks. Moreover, optimizing significance itself also shows excellent
performance. We demonstrate our method on voting data from the European
Parliament. Our analysis suggests the European Parliament has become
increasingly ideologically divided and that nationality plays no role. Comment: To appear in Scientific Reports
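The abstract defines significance only as being "based on subgraph probabilities". The sketch below assumes the form commonly given for this measure: the sum over communities of C(n_c, 2) * D(p_c || p), where p_c is the edge density inside community c, p is the density of the whole graph, and D is the binary Kullback-Leibler divergence. Treat that formula, the function names and the toy graph as assumptions to be checked against the paper.

```python
import math
from collections import Counter


def binary_kl(q, p):
    """Binary KL divergence D(q || p), with 0 * log(0) taken as 0."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total


def significance(n_nodes, edges, membership):
    """Assumed form: sum over communities of C(n_c, 2) * D(p_c || p)."""
    def pairs(k):
        return k * (k - 1) // 2

    p = len(edges) / pairs(n_nodes)
    if not 0 < p < 1:
        raise ValueError("graph density must lie strictly between 0 and 1")
    sizes = Counter(membership)
    internal = Counter(membership[u] for u, v in edges
                       if membership[u] == membership[v])
    score = 0.0
    for community, n_c in sizes.items():
        if n_c < 2:
            continue  # a single node has no internal pairs
        p_c = internal[community] / pairs(n_c)
        score += pairs(n_c) * binary_kl(p_c, p)
    return score


# Toy undirected graph: two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = significance(6, edges, [0, 0, 0, 1, 1, 1])
one_block = significance(6, edges, [0, 0, 0, 0, 0, 0])
```

Under this assumed definition, a partition whose communities are no denser than the graph as a whole scores zero, and denser communities score higher.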