
2 research outputs found

An Effective and Efficient Graph Representation Learning Approach for Big Graphs

Author: Cuzzocrea Alfredo
Joaristi Mikel
Leung Carson K.
Serra Edoardo
Soufargi Selim
Publication venue: 'IUScholarWorks'
Publication date: 01/01/2021

In the Big Data era, large graph datasets are becoming increasingly popular due to their capability to integrate and interconnect large sources of data in many fields, e.g., social media, biology, and communication networks. Graph representation learning is a flexible tool that automatically extracts features for each node of a graph. These features can be directly used for machine learning tasks. Designing graph representation learning approaches whose features preserve the structural information of the graph is still an open problem, especially in the context of large-scale graphs. In this paper, we propose a new fast and scalable structural representation learning approach called SparseStruct. Our approach uses a sparse internal representation for each node, and we formally prove its ability to preserve structural information. Thanks to a lightweight algorithm where each iteration costs only linear time in the number of edges, SparseStruct is able to easily process large graphs. In addition, it improves on the state of the art in terms of prediction and classification accuracy, while also providing strong robustness to noisy data.

Boise State University - ScholarWorks

Approximate Range-Sum Query Answering on Data Cubes with Probabilistic Guarantees

Author: Alfredo Cuzzocrea
Wei Wang
Publication date: 01/01/2007

Approximate range aggregate queries are one of the most frequent and useful kinds of queries for Decision Support Systems (DSS), as they are widely used in many data analysis tasks. Traditionally, sampling-based techniques have been proposed to tackle this problem. However, their effectiveness degrades when the underlying data distribution is skewed. Another approach, based on outlier management, can limit the effect of data skew but fails to address other requirements of approximate range aggregate queries, such as error guarantees and query processing efficiency. In this paper, we present a technique that provides approximate answers to range aggregate queries on OLAP data cubes efficiently, with theoretical guarantees on the errors. Our basic idea is to build different data structures to manage outliers and the rest of the data. Carefully chosen outliers are organized in a quad-tree based indexing data structure to provide efficient access for query processing. A query-workload adaptive, tree-like synopsis data structure, called Tunable Partition-Tree (TP-Tree), is proposed to organize samples extracted from non-outlier data. Our experiments clearly demonstrate the merits of our technique by comparing it with previous well-known techniques.
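
The split between outliers and sampled data described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses one-dimensional data, plain lists instead of a quad-tree or TP-Tree, and a simple "largest magnitude" rule to pick outliers. The names `build_synopsis` and `approx_range_sum` and all parameters are hypothetical, and the sketch gives no error guarantees.

```python
import random


def build_synopsis(cells, num_outliers, sample_size, seed=0):
    """Split (position, value) cells into exact outliers and a uniform sample.

    Assumption: outliers are simply the largest-magnitude values. The paper
    selects outliers more carefully; this rule is for illustration only.
    """
    ranked = sorted(cells, key=lambda c: abs(c[1]), reverse=True)
    outliers = ranked[:num_outliers]
    rest = ranked[num_outliers:]

    rng = random.Random(seed)
    k = min(sample_size, len(rest))
    sample = rng.sample(rest, k)
    # Each sampled cell stands in for len(rest) / k non-outlier cells.
    scale = len(rest) / k if k else 0.0
    return outliers, sample, scale


def approx_range_sum(synopsis, lo, hi):
    """Estimate the sum of values whose position lies in [lo, hi]."""
    outliers, sample, scale = synopsis
    # Outliers are stored exactly, so their contribution has no sampling error.
    exact_part = sum(v for p, v in outliers if lo <= p <= hi)
    # Non-outlier data is estimated by scaling up the sampled values.
    sampled_part = scale * sum(v for p, v in sample if lo <= p <= hi)
    return exact_part + sampled_part


# Skewed toy data: 100 cells of value 1.0, except one large value at position 10.
cells = [(i, 1.0) for i in range(100)]
cells[10] = (10, 1000.0)

synopsis = build_synopsis(cells, num_outliers=1, sample_size=20)
estimate = approx_range_sum(synopsis, 0, 19)
```

Because the large value at position 10 is kept as an outlier, it contributes exactly to the estimate, and only the remaining small values are subject to sampling error.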

Archivio istituzionale della ricerca - Università di Trieste

CiteSeerX