    VoG: Summarizing and Understanding Large Graphs

    How can we succinctly describe a million-node graph with a few simple sentences? How can we measure the "importance" of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a "vocabulary" of subgraph-types that often occur in real graphs (e.g., stars, cliques, chains), and, from a set of subgraphs, to find the most succinct description of a graph in terms of this vocabulary. We measure success in a well-founded way by means of the Minimum Description Length (MDL) principle: a subgraph is included in the summary if it decreases the total description length of the graph. Our contributions are three-fold: (a) formulation: we provide a principled encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop VoG, an efficient method to minimize the description cost; and (c) applicability: we report experimental results on multi-million-edge real graphs, including Flickr and the Notre Dame web graph.

    Comment: SIAM International Conference on Data Mining (SDM) 201
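
    The MDL selection rule stated above admits a compact sketch. The following greedy loop is an illustration, not the authors' implementation; description_length is a hypothetical function returning the two-part MDL cost (bits for the summary plus bits for the remaining error):

        def build_summary(candidates, graph, description_length):
            # Greedily keep a candidate subgraph only if it lowers the total
            # description length of the graph under the current summary.
            summary = []
            best = description_length(summary, graph)   # cost of the empty summary
            for subgraph in candidates:                 # e.g. ordered by quality
                cost = description_length(summary + [subgraph], graph)
                if cost < best:
                    summary.append(subgraph)
                    best = cost
            return summary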

    Clustering for Different Scales of Measurement - the Gap-Ratio Weighted K-means Algorithm

    This paper describes a method for clustering data that are spread out over large regions and whose dimensions are on different scales of measurement. The algorithm was developed for a robotics application consisting of sorting and storing objects in an unsupervised way. The toy dataset used to validate the application consists of Lego bricks of different shapes and colors. The uncontrolled lighting conditions and the use of RGB color features respectively yield data with a large spread and with different scales of measurement across dimensions. To handle the combination of these two characteristics in the data, we have developed a new weighted K-means algorithm, called gap-ratio K-means, which weights each dimension of the feature space before running the K-means algorithm. The weight associated with a feature is proportional to the ratio between the biggest gap separating two consecutive data points and the average of all the other gaps. This method is compared with two other variants of K-means on the Lego bricks clustering problem as well as on two other common classification datasets.

    Comment: 13 pages, 6 figures, 2 tables. This paper is under the review process for AIAP 201
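
    The weighting rule translates directly into code. The sketch below is one reading of the abstract's description, not the authors' implementation; the lack of weight normalization and the choice of the number of clusters are assumptions:

        import numpy as np
        from sklearn.cluster import KMeans

        def gap_ratio_weights(X):
            # For each feature: sort its values, take the gaps between consecutive
            # points, and weight the feature by (largest gap) / (mean of the others).
            weights = np.ones(X.shape[1])
            for j in range(X.shape[1]):
                gaps = np.diff(np.sort(X[:, j]))
                if gaps.size < 2:
                    continue
                k = gaps.argmax()
                rest = np.delete(gaps, k)
                if rest.mean() > 0:
                    weights[j] = gaps[k] / rest.mean()
            return weights

        X = np.random.rand(200, 3)       # stand-in for RGB color features
        labels = KMeans(n_clusters=4, n_init=10).fit_predict(X * gap_ratio_weights(X))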

    Structural Equation Modeling and simultaneous clustering through the Partial Least Squares algorithm

    The identification of different homogeneous groups of observations and their appropriate analysis in PLS-SEM has become a critical issue in many application fields. Usually, both SEM and PLS-SEM assume the homogeneity of all units on which the model is estimated, and the segmentation approaches present in the literature consist in estimating separate models for each segment of statistical units, where the units have been assigned to a priori defined segments. However, these approaches are not fully acceptable because no causal structure among the variables is postulated. In other words, a modeling approach should be used in which the obtained clusters are homogeneous with respect to the structural causal relationships. In this paper, a new methodology for simultaneous non-hierarchical clustering and PLS-SEM is proposed. It is motivated by the fact that the sequential approach of first applying SEM or PLS-SEM and then running a clustering algorithm such as K-means on the latent scores may fail to find the correct clustering structure existing in the data. A simulation study and an application on real data are included to evaluate the performance of the proposed methodology.
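
    The sequential baseline the authors argue against is easy to state in code. The sketch below uses scikit-learn's PLSRegression as a stand-in for a full PLS-SEM estimation (an assumption: PLS-SEM fits a structural model among latent variables, which PLSRegression does not capture) and then clusters the latent scores with K-means:

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 6))    # toy manifest indicators (exogenous side)
        Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(200, 3))

        pls = PLSRegression(n_components=2).fit(X, Y)
        scores = pls.transform(X)        # latent scores of the statistical units
        labels = KMeans(n_clusters=3, n_init=10).fit_predict(scores)
        # The paper's point: clusters found this way need not be homogeneous with
        # respect to the structural relations; the proposed method estimates the
        # clusters and the PLS model simultaneously instead.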

    The quantum correlation between the selection of the problem and that of the solution sheds light on the mechanism of the quantum speed up

    In classical problem solving, there is of course correlation between the selection of the problem on the part of Bob (the problem setter) and that of the solution on the part of Alice (the problem solver). In quantum problem solving, this correlation becomes quantum. This means that Alice contributes to selecting 50% of the information that specifies the problem. As the solution is a function of the problem, this gives Alice advance knowledge of 50% of the information that specifies the solution. Both the quadratic and the exponential speed-ups are explained by the fact that quantum algorithms start from this advance knowledge.

    Comment: Earlier version submitted to QIP 2011. Further clarified section 1, "Outline of the argument"; submitted to Phys Rev A, 16 pages
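
    A back-of-envelope check of the quadratic case (an illustration added here, not taken from the paper): write the size of an unstructured search space as $N = 2^n$. A classical solver needs $O(N)$ oracle queries, whereas a solver that starts out knowing half of the $n$ bits specifying the solution only has to search the residual space of size

        \[
        N' = 2^{n/2} = \sqrt{2^{n}} = \sqrt{N},
        \]

    which matches the $O(\sqrt{N})$ query count of Grover's algorithm.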

    A novel ensemble method for electric vehicle power consumption forecasting: Application to the Spanish system

    The use of electric vehicles across the world has become one of the most challenging issues for environmental policies. Accelerating climate change and the expected depletion of fossil fuels turn the use of such non-polluting cars into a priority for most developed countries. However, this use has raised major concerns for power companies, since they must adapt their generation to a new scenario in which electric vehicles will dramatically modify the generation curve. In this paper, a novel approach based on ensemble learning is proposed. In particular, the ARIMA, GARCH and PSF algorithms are combined to forecast the electric vehicle power consumption in Spain. It is worth noting that the studied consumption time series is non-stationary, which adds difficulty to the forecasting process. Thus, an ensemble is proposed that dynamically weights all algorithms over time. The proposal has been implemented for a real case, namely the Spanish Control Centre for the Electric Vehicle. The performance of the approach is assessed by means of the WAPE (weighted absolute percentage error), showing robust and promising results for this research field.

    Ministerio de Economía y Competitividad, projects ENE2016-77650-R, PCIN-2015-04 and TIN2017-88209-C2-R
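
    The abstract does not spell out the dynamic weighting rule, so the sketch below makes a common assumption: at each time step, every model's weight is inversely proportional to its WAPE over a sliding window of recent observations. The forecasts mapping and the window length are hypothetical:

        import numpy as np

        def wape(actual, predicted):
            # Weighted absolute percentage error: sum(|error|) / sum(|actual|).
            return np.abs(actual - predicted).sum() / np.abs(actual).sum()

        def ensemble_forecast(forecasts, actual, t, window=24):
            # forecasts: dict mapping a model name (e.g. "ARIMA", "GARCH", "PSF")
            # to its array of per-step predictions; actual: observed consumption.
            # Assumes t >= 1 so that the error window is non-empty.
            lo = max(0, t - window)
            inv_err = {m: 1.0 / max(wape(actual[lo:t], p[lo:t]), 1e-9)
                       for m, p in forecasts.items()}
            total = sum(inv_err.values())
            return sum(inv_err[m] / total * forecasts[m][t] for m in forecasts)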

    Extending CKKW-merging to One-Loop Matrix Elements

    We extend earlier schemes for merging tree-level matrix elements with parton showers to also include merging with one-loop matrix elements. In this paper we make a first study of how to include one-loop corrections, not only for events with a given jet multiplicity, but simultaneously for several different jet multiplicities. Results are presented for the simplest non-trivial case of hadronic events at LEP, as a proof of concept.