Network Sampling: From Static to Streaming Graphs

Authors: Ahmed Nesreen K.; Kompella Ramana; Neville Jennifer
Publication date: 13/11/2012

Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network sampling, by highlighting the different objectives, population and units of interest, and classes of network sampling methods. In addition, we propose a spectrum of computational models for network sampling methods, ranging from the traditionally studied model based on the assumption of a static domain to a more challenging model that is appropriate for streaming domains. We design a family of sampling methods based on the concept of graph induction that generalize across the full spectrum of computational models (from static to streaming) while efficiently preserving many of the topological properties of the input graphs. Furthermore, we demonstrate how traditional static sampling algorithms can be modified for graph streams for each of the three main classes of sampling methods: node, edge, and topology-based sampling. Our experimental results indicate that our proposed family of sampling methods more accurately preserves the underlying properties of the graph for both static and streaming graphs. Finally, we study the impact of network sampling algorithms on the parameter estimation and performance evaluation of relational classification algorithms.
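As a rough illustration of what "edge sampling with graph induction" can look like on a static edge list, here is a minimal sketch. It is a simplification written for this listing, not the authors' algorithm: the function name, the `fraction` parameter, and the use of a fixed per-edge sampling probability are all assumptions.

```python
import random

def edge_sampling_with_induction(edges, fraction, rng=None):
    """Sample edges independently, then induce the subgraph on their endpoints.

    edges    -- list of (u, v) pairs (hypothetical input format)
    fraction -- per-edge sampling probability in [0, 1] (assumed parameter)
    """
    rng = rng or random.Random()

    # Step 1: edge sampling. Keep the endpoints of every sampled edge.
    nodes = set()
    for u, v in edges:
        if rng.random() < fraction:
            nodes.update((u, v))

    # Step 2: graph induction. Add every edge whose endpoints were both sampled,
    # including edges that were not picked in step 1.
    induced = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, induced
```

The second pass is what distinguishes induction from plain edge sampling: it recovers edges between already-sampled nodes. A streaming variant cannot revisit earlier edges, so it could only induce edges that arrive after both endpoints have been sampled.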

arXiv.org e-Print Archive

CiteSeerX

Density-biased clustering based on reservoir sampling

Author: Kittisak Kerdprasop
Publication venue: School of Computer Engineering, Institute of Engineering, Suranaree University of Technology
Publication date: 01/01/2000

Suranaree University of Technology Intellectual Repository

Density-biased clustering based on reservoir sampling

Authors: Kittisak Kerdprasop; Nittaya Kerdprasop; Pairote Sattayatham
Publication venue: School of Mathematics, Institute of Science, Suranaree University of Technology
Publication date: 01/01/2000

Suranaree University of Technology Intellectual Repository

Quality Assessment of Linked Datasets using Probabilistic Approximation

Authors: A Hogan; AZ Broder; BH Bloom; C Guéret; JS Vitter; P Hitzler
Publication date: 17/03/2015

With the increasing application of Linked Open Data, assessing the quality of datasets by computing quality metrics becomes an issue of crucial importance. For large and evolving datasets, an exact, deterministic computation of the quality metrics is too time-consuming or expensive. We employ probabilistic techniques such as Reservoir Sampling, Bloom Filters and Clustering Coefficient estimation for implementing a broad set of data quality metrics in an approximate but sufficiently accurate way. Our implementation is integrated in the comprehensive data quality assessment framework Luzzu. We evaluated its performance and accuracy on Linked Open Datasets of broad relevance.

Comment: 15 pages, 2 figures, to appear in the ESWC 2015 proceedings.
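For readers unfamiliar with reservoir sampling, the sketch below shows the classic single-pass scheme (often called Algorithm R), which keeps a uniform random sample of `k` items from a stream of unknown length. It is a generic illustration, not code from Luzzu; the function and parameter names are assumptions.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir
```

Memory use stays at `k` items regardless of stream length, which is why the technique suits large or evolving datasets where an exact pass over stored data is too costly.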

arXiv.org e-Print Archive

Crossref

Fraunhofer-ePrints

Graph Sample and Hold: A Framework for Big-Graph Analytics

Authors: Ahmed Nesreen K.; Duffield Nick; Kompella Ramana; Neville Jennifer
Publication date: 16/03/2014

Sampling is a standard approach in big-graph analytics; the goal is to efficiently estimate the graph properties by consulting a sample of the whole population. A perfect sample is assumed to mirror every property of the whole population. Unfortunately, such a perfect sample is hard to collect in complex populations such as graphs (e.g., web graphs and social networks), where an underlying network connects the units of the population. Therefore, a good sample will be representative in the sense that graph properties of interest can be estimated with a known degree of accuracy. While previous work focused particularly on sampling schemes used to estimate certain graph properties (e.g., triangle count), much less is known for the case when we need to estimate various graph properties with the same sampling scheme. In this paper, we propose a generic stream sampling framework for big-graph analytics, called Graph Sample and Hold (gSH). To begin, the proposed framework samples from massive graphs sequentially in a single pass, one edge at a time, while maintaining a small state. We then show how to produce unbiased estimators for various graph properties from the sample. Given that the graph analysis algorithms will run on a sample instead of the whole population, the runtime complexity of these algorithms is kept under control. Moreover, given that the estimators of graph properties are unbiased, the approximation error is kept under control. Finally, we show the performance of the proposed framework (gSH) on various types of graphs, such as social graphs, among others.

arXiv.org e-Print Archive

CiteSeerX

Combined electronic nose and tongue for a flavour sensing system

Authors: Cole Marina; Covington James A.; Gardner J. W.
Publication venue: Elsevier Science SA
Publication date: 07/03/2011

We present a novel, smart sensing system developed for the flavour analysis of liquids. The system comprises both a so-called "electronic tongue" based on shear horizontal surface acoustic wave (SH-SAW) sensors analysing the liquid phase and a so-called "electronic nose" based on chemFET sensors analysing the gaseous phase. Flavour is generally understood to be the overall experience from the combination of oral and nasal stimulation and is principally derived from a combination of the human senses of taste (gustation) and smell (olfaction). Thus, by combining two types of microsensors, an artificial flavour sensing system has been developed. Initial tests conducted with different liquid samples, i.e. water, orange juice and milk (of different fat content), resulted in 100% discrimination using principal components analysis, although it was found that there was little contribution from the electronic nose. Therefore, further flavour experiments were designed to demonstrate the potential of the combined electronic nose/tongue flavour system. Consequently, experiments were conducted on low vapour pressure taste-biased solutions and high vapour pressure, smell-biased solutions. Only the combined flavour analysis system could achieve 100% discrimination between all the different liquids. We believe that this is the first report of a SAW-based analysis system that determines flavour through the combination of both liquid and headspace analysis.

Warwick Research Archives Portal Repository

Massive Scale Streaming Graphs: Evolving Network Analysis and Mining

Author: Shazia Tabassum
Publication date: 06/05/2020

Repositório Aberto da Universidade do Porto