Fast Shortest Path Distance Estimation in Large Networks
We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications.
In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks.
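The offline and query steps described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes an unweighted, undirected graph stored as an adjacency dictionary, uses breadth-first search for the offline distances, and combines them with the triangle-inequality upper bound d(s, l) + d(l, t). The function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_index(graph, landmarks):
    """Offline step: one BFS per landmark, stored as landmark -> {node: distance}."""
    return {lm: bfs_distances(graph, lm) for lm in landmarks}


def estimate_distance(index, s, t):
    """Query step: smallest d(s, l) + d(l, t) over all landmarks l (an upper bound)."""
    candidates = [d[s] + d[t] for d in index.values() if s in d and t in d]
    return min(candidates) if candidates else float("inf")


# Hypothetical toy graph: the path 0 - 1 - 2 - 3 - 4, with node 2 as the only landmark.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
index = build_index(graph, landmarks=[2])
```

In this toy graph the estimate for the pair (0, 4) is exact because the landmark lies on the shortest path, while the estimate for (0, 1) overshoots the true distance, which is why landmark placement matters.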
We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a memory budget for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which selects landmarks at random.
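One way to picture the two families of heuristics is the sketch below. It is only an illustration under stated assumptions: node degree stands in for centrality, and "far away from one another" is approximated by skipping the direct neighbours of landmarks already chosen. The function name, the `spread` flag, and the toy graph are hypothetical and not taken from the paper.

```python
def select_landmarks(graph, k, spread=True):
    """Pick up to k landmarks by descending degree (an assumed centrality proxy).

    With spread=True, neighbours of an already chosen landmark are skipped,
    which pushes the selected landmarks apart.
    """
    # Sort by degree, highest first; ties are broken by node id for determinism.
    ranked = sorted(graph, key=lambda n: (-len(graph[n]), n))
    chosen, blocked = [], set()
    for node in ranked:
        if len(chosen) == k:
            break
        if spread and node in blocked:
            continue
        chosen.append(node)
        blocked.update(graph[node])
    return chosen


# Hypothetical toy graph with two adjacent hubs (nodes 0 and 1).
graph = {
    0: [1, 2, 3],
    1: [0, 4, 5],
    2: [0],
    3: [0, 6],
    4: [1],
    5: [1],
    6: [3],
}
plain = select_landmarks(graph, 2, spread=False)
separated = select_landmarks(graph, 2, spread=True)
```

Without the spreading rule both hubs are chosen even though they are adjacent; with it, the second landmark is taken from outside the first hub's neighbourhood.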
Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection.
Yahoo! Research (internship)
A practical guide to density matrix embedding theory in quantum chemistry
Density matrix embedding theory (DMET) provides a theoretical framework to
treat finite fragments in the presence of a surrounding molecular or bulk
environment, even when there is significant correlation or entanglement between
the two. In this work, we give a practically oriented and explicit description
of the numerical and theoretical formulation of DMET. We also describe in
detail how to perform self-consistent DMET optimizations. We explore different
embedding strategies with and without a self-consistency condition in hydrogen
rings, beryllium rings, and a sample SN2 reaction. The source code
for the calculations in this work can be obtained from
https://github.com/sebwouters/qc-dmet.
Comment: 41 pages, 10 figures
Multi-provider network service embedding
[no abstract]
Deliverable DJRA1.2. Solutions and protocols proposal for the network control, management and monitoring in a virtualized network context
This deliverable presents several research proposals for the FEDERICA network on topics such as monitoring, routing, signalling, resource discovery, and isolation. For each topic, one or more possible solutions are elaborated, explaining their background, operation, and implications. This deliverable goes further into the research aspects within FEDERICA. First of all, the architecture of the control plane for the FEDERICA infrastructure will be defined. Several possibilities could be implemented, using the basic FEDERICA infrastructure as a starting point. The focus of this document is on the intra-domain aspects of the control plane and their properties; some inter-domain aspects are also addressed. The main objective of this deliverable is to create and implement the prototype/tool for the FEDERICA slice-oriented control system using the appropriate framework. The deliverable defines in depth the containers exchanged between entities and their syntax, preparing this tool for the future implementation of any kind of control-plane algorithm, whether to apply UPB policies or to configure it by hand. We opt for an open solution despite the real-time limitations it could entail (for instance, when opening web-service connections or applying fast recovery mechanisms). The application being developed is the central element of the control plane, and additional features must be added to it. From a functional point of view, this control plane is composed of several procedures that provide a reliable application and that include mechanisms or algorithms to discover resources and assign them to the user. To achieve this, several topics must be researched in order to propose new protocols for the virtual infrastructure.
The topics and necessary features covered in this document include resource discovery, resource allocation, signalling, routing, isolation and monitoring. All these topics must be researched in order to find a good solution for the FEDERICA network. Some of these algorithms have started to be analyzed and will be expanded in the next deliverable. Current standardization and existing solutions have been investigated in order to find a good solution for FEDERICA. Resource discovery is an important issue within the FEDERICA network, as manual resource discovery is not an option due to scalability requirements. Furthermore, no standardization exists, so knowledge must be obtained from related work. Ideally, the proposed solutions for these topics should not only be adequate for this specific infrastructure but also be applicable to other virtualized networks.
Postprint (published version)
Greedy routing and virtual coordinates for future networks
At the core of the Internet, routers are continuously struggling with
ever-growing routing and forwarding tables. Although hardware advances
do accommodate such a growth, we anticipate new requirements e.g. in
data-oriented networking where each content piece has to be referenced
instead of hosts, such that current approaches relying on global
information will not be viable anymore, no matter the hardware
progress. In this thesis, we investigate greedy routing methods that
can achieve routing performance similar to today's while using far fewer
resources and relying on local information only. To this end, we
add specially crafted name spaces to the network in which virtual
coordinates represent the addressable entities. Our scheme enables participating
routers to make forwarding decisions using only neighbourhood information,
as the overarching pseudo-geometric name space structure already
organizes and incorporates "vicinity" at a global level.
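The forwarding rule described above can be illustrated with a short sketch. It is not the thesis's scheme: it assumes plain two-dimensional Euclidean coordinates, and the function name, coordinates, and topology are hypothetical. Each node hands the packet to the neighbour closest to the destination and gives up when no neighbour is strictly closer than itself.

```python
import math


def greedy_route(coords, neighbours, source, target):
    """Forward greedily using only the current node's neighbour coordinates.

    Returns (path, delivered). delivered is False when the current node has
    no neighbour strictly closer to the target than itself.
    """
    path = [source]
    current = source
    while current != target:
        here = math.dist(coords[current], coords[target])
        best = min(
            neighbours[current],
            key=lambda n: math.dist(coords[n], coords[target]),
            default=None,
        )
        if best is None or math.dist(coords[best], coords[target]) >= here:
            return path, False  # no progress possible from this node
        current = best
        path.append(current)
    return path, True


# Hypothetical coordinates and links: x is close to t in space but not linked to it.
coords = {"s": (0, 0), "x": (1, 0), "t": (3, 0), "y": (0, 2), "z": (3, 2)}
neighbours = {"s": ["x", "y"], "x": ["s"], "y": ["s", "z"], "z": ["y", "t"], "t": ["z"]}

stuck_path, stuck_ok = greedy_route(coords, neighbours, "s", "t")
good_path, good_ok = greedy_route(coords, neighbours, "y", "t")
```

In this toy topology the route from `y` reaches `t`, while the route from `s` stops at `x`, a node that is geometrically close to the destination but has no neighbour that is closer still.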
A first challenge in applying greedy routing on virtual
coordinates to future networks is that of "routing dead-ends":
local minima that arise from the difficulty of assigning coordinates
consistently. In this context, we propose a routing recovery scheme
based on a multi-resolution embedding of the network in low-dimensional Euclidean spaces.
The recovery is performed by routing greedily on a blurrier view of the network. The
different network detail levels are obtained through the embedding of
clustering levels of the graph. When compared with
higher-dimensional embeddings of a given network, our method shows a
significant reduction in routing failures for similar header and
control-state sizes.
A second challenge to the application of virtual coordinates and
greedy routing to future networks is the support of
"customer-provider" as well as "peering" relationships between
participants, resulting in a differentiated services
environment. Although an application of greedy routing within such a
setting would combine two very common fields of today's networking
literature, such a scenario has, surprisingly, not been studied so
far. In this context we propose two approaches to address this scenario.
In a first approach we implement a path-vector protocol similar to
that of BGP on top of a greedy embedding of the network. This allows
each node to build a spatial map associated with each of its
neighbours indicating the accessible regions. Routing is then
performed through the use of a decision-tree classifier taking the
destination coordinates as input. When applied to a real-world dataset
(the CAIDA 2004 AS graph), we demonstrate a compression ratio of up to 40% for
the routing control information at the network's core, as well as a computationally efficient
decision process comparable to methods such as binary trees and tries.
In a second approach, we take inspiration from consensus-finding in social
sciences and transform the three-dimensional distance data structure
(where the third dimension encodes the service differentiation) into a
two-dimensional matrix on which classical embedding tools can be used.
This transformation is achieved by agreeing on a set of
constraints on the inter-node distances guaranteeing an
administratively-correct greedy routing. The computed distances are
also enhanced to encode multipath support. We demonstrate good
greedy routing performance as well as above 90% satisfaction of multipath constraints
when relying on the obtained non-embedded distances on synthetic datasets.
As various embeddings of the consensus distances do not fully exploit their multipath potential, the use of compression techniques such as transform coding to
approximate the obtained distances allows for better routing performance
- …