Grammar-Based Geodesics in Semantic Networks
A geodesic is the shortest path between two vertices in a connected network.
The geodesic is the kernel of various network metrics including radius,
diameter, eccentricity, closeness, and betweenness. These metrics are the
foundation of much network research and thus, have been studied extensively in
the domain of single-relational networks (both in their directed and undirected
forms). However, geodesics for single-relational networks do not translate
directly to multi-relational, or semantic networks, where vertices are
connected to one another by any number of edge labels. Here, a more
sophisticated method for calculating a geodesic is necessary. This article
presents a technique for calculating geodesics in semantic networks with a
focus on semantic networks represented according to the Resource Description
Framework (RDF). In this framework, a discrete "walker" utilizes an abstract
path description called a grammar to determine which paths to include in its
geodesic calculation. The grammar-based model forms a general framework for
studying geodesic metrics in semantic networks.
Comment: First draft written in 200
Node Ranking in Labeled Directed Graphs
Our work is motivated by the problem of ranking hyperlinked documents for a given query. Given an arbitrary directed graph with edge and node labels, we present a new flow-based model and an efficient method to dynamically rank the nodes of this graph with respect to any of the original labels. Ranking documents for a given query in a hyperlinked document set and ranking authors or articles for a given topic in a citation database are typical applications of our method. We outline the structural conditions that the graph must satisfy for our ranking to differ from traditional PageRank. We have built a system using two indices that is capable of dynamically ranking documents for any given query. We validate our system and method with experiments on several datasets: a crawl of the IBM Intranet (12 million pages), a crawl of the Web (30 million pages), and the DBLP citation dataset. We compare our method to traditional PageRank and to existing schemes for topic-biased ranking that require a classifier. In these experiments, we demonstrate that our method is well suited for fine-grained ranking and performs better than the existing schemes. We also demonstrate that our system can obtain an improved ranking with very little impact on query time.
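The flow-based model itself is not reproduced here, but the general idea of ranking nodes "with respect to a label" can be sketched with a personalized-PageRank-style power iteration in which teleportation is restricted to nodes carrying the target label. This is an illustrative stand-in under that assumption, not the paper's method, and all names below are hypothetical:

```python
def label_biased_rank(edges, node_labels, label, damping=0.85, iters=50):
    """Random-walk ranking biased toward a label: the walker teleports
    only to nodes that carry `label`.
    edges: list of (u, v); node_labels: dict node -> set of labels."""
    nodes = sorted(node_labels)
    out = {n: [] for n in nodes}
    for u, v in edges:
        out[u].append(v)
    seeds = [n for n in nodes if label in node_labels[n]]
    base = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(base)
    for _ in range(iters):
        # Teleport mass goes only to the labeled seed nodes.
        new = {n: (1 - damping) * base[n] for n in nodes}
        for u in nodes:
            if out[u]:
                share = damping * rank[u] / len(out[u])
                for v in out[u]:
                    new[v] += share
            else:
                # Dangling node: redistribute its mass to the seed set.
                for s in seeds:
                    new[s] += damping * rank[u] / len(seeds)
        rank = new
    return rank

ranks = label_biased_rank(
    [("a", "b"), ("b", "c"), ("c", "a")],
    {"a": {"topic"}, "b": set(), "c": set()},
    "topic",
)
# nodes closer to the "topic" seed receive more probability mass
```

Unlike a classifier-based topic-biased scheme, nothing here is trained: the bias comes entirely from where the walk is allowed to restart, which is why such rankings coincide with plain PageRank only under special structural conditions on the graph.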
A model for handling approximate, noisy or incomplete labeling in text classification
We introduce a Bayesian model, BayesANIL, that is capable of estimating the uncertainties associated with the labeling process. Given a labeled or partially labeled training corpus of text documents, the model estimates the joint distribution of training documents and class labels using a generalization of the Expectation-Maximization algorithm. These estimates can be used in standard classification models to reduce error rates. Since uncertainties in the labeling are taken into account, the model provides an elegant mechanism for dealing with noisy labels. We provide an intuitive modification to the EM iterations that re-estimates the empirical distribution in order to reinforce feature values in unlabeled data and to reduce the influence of noisily labeled examples. Considerable improvement in the classification accuracies of two popular classification algorithms on standard labeled datasets, with and without artificially introduced noise and in the presence and absence of unlabeled data, indicates that this may be a promising method to reduce the burden of manual labeling.
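The key mechanism is treating each training label as a distribution over classes rather than a hard assignment, and letting EM re-estimate those distributions. The following is a minimal sketch in that spirit (a soft-label EM over a multinomial naive-Bayes model), not BayesANIL's exact formulation; the function name and smoothing choices are assumptions:

```python
import math
from collections import defaultdict

def soft_label_em(docs, init_probs, n_classes=2, iters=10, alpha=1.0):
    """EM where each document's label is a probability vector.
    docs: list of token lists; init_probs: per-document class
    distributions (noisy labels become soft priors, unlabeled
    documents get a uniform vector). Returns re-estimated posteriors."""
    post = [list(p) for p in init_probs]
    vocab = sorted({w for d in docs for w in d})
    for _ in range(iters):
        # M-step: class priors and word counts weighted by soft labels.
        prior = [sum(p[c] for p in post) / len(docs) for c in range(n_classes)]
        counts = [defaultdict(float) for _ in range(n_classes)]
        totals = [0.0] * n_classes
        for d, p in zip(docs, post):
            for w in d:
                for c in range(n_classes):
                    counts[c][w] += p[c]
                    totals[c] += p[c]
        # E-step: Laplace-smoothed posterior over classes per document.
        new_post = []
        for d in docs:
            logp = [math.log(prior[c] + 1e-12) +
                    sum(math.log((counts[c][w] + alpha) /
                                 (totals[c] + alpha * len(vocab))) for w in d)
                    for c in range(n_classes)]
            m = max(logp)
            ex = [math.exp(lp - m) for lp in logp]
            z = sum(ex)
            new_post.append([e / z for e in ex])
        post = new_post
    return post
```

Because a noisily labeled document enters the M-step with weight split across classes, a single wrong label contributes less to the parameter estimates than it would under hard labeling, which is the intuition behind re-estimating the empirical distribution during the iterations.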