Node Classification in Uncertain Graphs
In many real applications that use and analyze networked data, the links in
the network graph may be erroneous, or derived from probabilistic techniques.
In such cases, the node classification problem can be challenging, since the
unreliability of the links may affect the final results of the classification
process. If the information about link reliability is not used explicitly, the
classification accuracy in the underlying network may be affected adversely. In
this paper, we focus on situations that require the analysis of the uncertainty
that is present in the graph structure. We study the novel problem of node
classification in uncertain graphs, by treating uncertainty as a first-class
citizen. We propose two techniques based on a Bayes model and automatic
parameter selection, and show that the incorporation of uncertainty in the
classification process as a first-class citizen is beneficial. We
experimentally evaluate the proposed approach using different real data sets,
and study the behavior of the algorithms under different conditions. The
results demonstrate the effectiveness and efficiency of our approach.
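The abstract above does not spell out its Bayes model, so the following is only a minimal sketch of how link probabilities could enter a Bayesian neighbor vote. It assumes a simple mixture: with probability p the link exists and the neighbor's label agrees with the node's class at a fixed `homophily` rate; otherwise the neighbor is uninformative. The function name, the `homophily` parameter, and the data layout are all hypothetical and not taken from the paper.

```python
from collections import Counter

def classify_node(node, edges, labels, homophily=0.8):
    """Pick a class for `node` from its labeled neighbors on an uncertain graph.

    edges:  dict node -> list of (neighbor, existence_probability)
    labels: dict node -> known class label
    homophily: assumed chance that a real link joins same-class nodes
    """
    if not labels:
        raise ValueError("at least one labeled node is required")
    classes = sorted(set(labels.values()))
    k = len(classes)
    if k == 1:
        return classes[0], {classes[0]: 1.0}

    prior = Counter(labels.values())
    n_labeled = sum(prior.values())
    scores = {}
    for c in classes:
        score = prior[c] / n_labeled  # class prior from label frequencies
        for nbr, p in edges.get(node, []):
            if nbr not in labels:
                continue
            # Likelihood of the neighbor's label if the link really exists.
            if labels[nbr] == c:
                agree = homophily
            else:
                agree = (1.0 - homophily) / (k - 1)
            # Mix "link exists" (weight p) with "link absent, label uninformative".
            score *= p * agree + (1.0 - p) / k
        scores[c] = score

    z = sum(scores.values())
    posterior = {c: s / z for c, s in scores.items()}
    return max(posterior, key=posterior.get), posterior

# Toy example: one reliable link to a "red" node, two weak links to "blue" nodes.
labels = {"a": "red", "b": "blue", "c": "blue"}
uncertain = {"x": [("a", 0.9), ("b", 0.2), ("c", 0.1)]}
certain = {"x": [("a", 1.0), ("b", 1.0), ("c", 1.0)]}
label_uncertain, _ = classify_node("x", uncertain, labels)
label_certain, _ = classify_node("x", certain, labels)
```

In this toy case the two calls disagree: using the link probabilities favors the single reliable neighbor, while treating every link as certain lets the two weak links outvote it.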
Collaboration in an Open Data eScience: A Case Study of Sloan Digital Sky Survey
Current science and technology have produced more and more publicly
accessible scientific data. However, little is known about how the open data
trend impacts a scientific community, specifically in terms of its
collaboration behaviors. This paper aims to enhance our understanding of the
dynamics of scientific collaboration in the open data eScience environment via
a case study of co-author networks of an active and highly cited open data
project, called Sloan Digital Sky Survey. We visualized the co-authoring
networks and measured their properties over time at three levels: author,
institution, and country levels. We compared these measurements to a random
network model and also compared results across the three levels. The study
found that 1) the collaboration networks of the SDSS community transformed from
random networks to small-world networks; 2) the number of author-level
collaboration instances has not changed much over time, while the number of
collaboration instances at the other two levels has increased over time; 3)
pairwise institutional collaboration has become common in recent years. The open
data trend may have both positive and negative impacts on scientific
collaboration.
Comment: iConference 201
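The comparison against a random network model mentioned in the abstract above can be illustrated with the two usual small-world indicators: average clustering coefficient C and average shortest-path length L, set against the common random-graph approximations C_random ≈ <k>/n and L_random ≈ ln(n)/ln(<k>). This is a sketch under those standard assumptions, not the study's actual procedure; the function names and the ring-lattice test graph are hypothetical.

```python
import math
from collections import deque
from itertools import combinations

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean BFS distance over all reachable ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def small_world_summary(adj):
    """Observed C and L next to rough random-graph baselines (needs <k> > 1)."""
    n = len(adj)
    mean_degree = sum(len(nbrs) for nbrs in adj.values()) / n
    return {
        "C": average_clustering(adj),
        "L": average_path_length(adj),
        "C_random": mean_degree / n,
        "L_random": math.log(n) / math.log(mean_degree),
    }

def ring_lattice(n, half_k):
    """Each node is linked to its `half_k` nearest neighbors on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, half_k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

summary = small_world_summary(ring_lattice(20, 2))
```

A network is usually called small-world when C is much larger than C_random while L stays close to L_random; tracking both ratios per time slice would show a drift from random toward small-world structure.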
Fast Search for Dynamic Multi-Relational Graphs
Acting on time-critical events by processing ever growing social media or
news streams is a major technical challenge. Many of these data sources can be
modeled as multi-relational graphs. Continuous queries or techniques to search
for rare events that typically arise in monitoring applications have been
studied extensively for relational databases. This work is dedicated to answering
the question that emerges naturally: how can we efficiently execute a
continuous query on a dynamic graph? This paper presents an exact subgraph
search algorithm that exploits the temporal characteristics of representative
queries for online news or social media monitoring. The algorithm is based on a
novel data structure called the Subgraph Join Tree (SJ-Tree) that leverages the
structural and semantic characteristics of the underlying multi-relational
graph. The paper concludes with extensive experimentation on several real-world
datasets that demonstrates the validity of this approach.
Comment: SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM), 201
A Semantic Model for Selective Knowledge Discovery over OAI-PMH Structured Resources
This work presents OntoOAI, a semantic model for the selective discovery of knowledge about resources structured with the OAI-PMH protocol. The aim is to verify the feasibility, and account for the limitations, of applying Semantic Web technologies to data sets for selective knowledge discovery, understood as the process of finding resources that were not explicitly requested by a user but are potentially useful based on their context. OntoOAI is tested with a combination of three sources of information: Redalyc.org, the portal of the Network of Journals of Latin America and the Caribbean, Spain, and Portugal; the institutional repository of Roskilde University (called RUDAR); and DBPedia. Its application verifies that it is feasible to use semantic technologies to achieve selective knowledge discovery and illustrates the limitations of using OAI-PMH data for this purpose.
Exploring Communities in Large Profiled Graphs
Given a graph G and a vertex q of G, the community search (CS) problem
aims to efficiently find a subgraph of G whose vertices are closely related
to q. Communities are prevalent in social and biological networks, and can be
used in product advertisement and social event recommendation. In this paper,
we study profiled community search (PCS), where CS is performed on a profiled
graph. This is a graph in which each vertex has labels arranged in a
hierarchical manner. Extensive experiments show that PCS can identify
communities with themes that are common to their vertices, and is more
effective than existing CS approaches. As a naive solution for PCS is highly
expensive, we have also developed a tree index, which facilitates efficient and
online solutions for PCS.
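The abstract above does not state which cohesiveness measure or index structure is used, so the sketch below only illustrates the general idea of community search with a label constraint. It assumes a k-core notion of cohesion (a common choice in community search work) and assumes each vertex profile is a flat set of labels that already includes all ancestor labels, so that hierarchical containment reduces to set containment. The function name and the `theme` parameter are hypothetical; this is the naive baseline, not the tree index.

```python
from collections import deque

def community_search(adj, q, k, profiles=None, theme=frozenset()):
    """Return the connected k-core containing q among vertices matching `theme`.

    adj:      dict vertex -> set of neighbors (undirected)
    profiles: optional dict vertex -> set of labels (ancestors included)
    theme:    labels every community member must carry
    """
    if profiles is None:
        alive = set(adj)
    else:
        alive = {v for v in adj if theme <= profiles.get(v, set())}
    if q not in alive:
        return set()

    # Peel vertices that have fewer than k surviving neighbors.
    degree = {v: sum(1 for w in adj[v] if w in alive) for v in alive}
    queue = deque(v for v in alive if degree[v] < k)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue  # already peeled via an earlier queue entry
        alive.discard(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < k:
                    queue.append(w)
    if q not in alive:
        return set()

    # Collect the connected component of q inside the surviving k-core.
    community = {q}
    frontier = deque([q])
    while frontier:
        v = frontier.popleft()
        for w in adj[v]:
            if w in alive and w not in community:
                community.add(w)
                frontier.append(w)
    return community

adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"},
    "x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"},
}
profiles = {"a": {"cs", "cs/ml"}, "b": {"cs", "cs/ml"}, "c": {"cs"}}
plain = community_search(adj, "a", 2)
themed = community_search(adj, "a", 1, profiles, frozenset({"cs/ml"}))
```

Each call rescans the whole graph, which hints at why a naive solution is expensive and why an index is attractive for online queries.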
Automatic Metadata Generation using Associative Networks
In spite of its tremendous value, metadata is generally sparse and
incomplete, thereby hampering the effectiveness of digital information
services. Many of the existing mechanisms for the automated creation of
metadata rely primarily on content analysis which can be costly and
inefficient. The automatic metadata generation system proposed in this article
leverages resource relationships generated from existing metadata as a medium
for propagation from metadata-rich to metadata-poor resources. Because of its
independence from content analysis, it can be applied to a wide variety of
resource media types and is shown to be computationally inexpensive. The
proposed method operates through two distinct phases. Occurrence and
co-occurrence algorithms first generate an associative network of repository
resources leveraging existing repository metadata. Second, using the
associative network as a substrate, metadata associated with metadata-rich
resources is propagated to metadata-poor resources by means of a discrete-form
spreading activation algorithm. This article discusses the general framework
for building associative networks, an algorithm for disseminating metadata
through such networks, and the results of an experiment and validation of the
proposed method using a standard bibliographic dataset.
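The two phases described in the abstract above can be sketched roughly as follows. The abstract does not give the exact occurrence and co-occurrence algorithms or the activation parameters, so this example assumes that two resources are linked once for each metadata value they share, and that energy spreads in discrete hops, split by edge weight and damped by a fixed `decay`. The field names, `decay`, and `max_hops` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_associative_network(records, field):
    """Phase 1: link resources sharing a value in `field`; weight = shared count."""
    by_value = defaultdict(set)
    for rid, meta in records.items():
        for value in meta.get(field, ()):
            by_value[value].add(rid)
    weights = defaultdict(lambda: defaultdict(float))
    for rids in by_value.values():
        for a, b in combinations(sorted(rids), 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
    return {a: dict(nbrs) for a, nbrs in weights.items()}

def spread_metadata(weights, records, field, decay=0.5, max_hops=2):
    """Phase 2: push each resource's `field` values to nearby resources.

    Returns dict resource -> {value: accumulated energy}.
    """
    received = defaultdict(lambda: defaultdict(float))
    for source, meta in records.items():
        values = meta.get(field, ())
        if not values:
            continue  # metadata-poor resources emit nothing
        frontier = {source: 1.0}
        for _ in range(max_hops):
            next_frontier = defaultdict(float)
            for node, energy in frontier.items():
                nbrs = weights.get(node, {})
                total = sum(nbrs.values())
                if total == 0:
                    continue
                for nbr, w in nbrs.items():
                    next_frontier[nbr] += decay * energy * w / total
            for node, energy in next_frontier.items():
                if node != source:
                    for value in values:
                        received[node][value] += energy
            frontier = next_frontier
    return {node: dict(vals) for node, vals in received.items()}

# Toy repository: r2 and r4 have authors but no keywords.
records = {
    "r1": {"authors": {"A", "B"}, "keywords": {"graphs"}},
    "r2": {"authors": {"A"}, "keywords": set()},
    "r3": {"authors": {"B", "C"}, "keywords": {"metadata"}},
    "r4": {"authors": {"C"}, "keywords": set()},
}
network = build_associative_network(records, "authors")
suggested = spread_metadata(network, records, "keywords")
```

Ranking the received values by accumulated energy gives candidate metadata for each metadata-poor resource. No content analysis is involved, which is consistent with the abstract's point about media independence.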