2,514 research outputs found
Graph-based Semi-Supervised & Active Learning for Edge Flows
We present a graph-based semi-supervised learning (SSL) method for learning
edge flows defined on a graph. Specifically, given flow measurements on a
subset of edges, we want to predict the flows on the remaining edges. To this
end, we develop a computational framework that imposes certain constraints on
the overall flows, such as (approximate) flow conservation. These constraints
render our approach different from classical graph-based SSL for vertex labels,
which posits that tightly connected nodes share similar labels and leverages
the graph structure accordingly to extrapolate from a few vertex labels to the
unlabeled vertices. We derive bounds for our method's reconstruction error and
demonstrate its strong performance on synthetic and real-world flow networks
from transportation, physical infrastructure, and the Web. Furthermore, we
provide two active learning algorithms for selecting informative edges on which
to measure flow, which has applications for optimal sensor deployment. The
first strategy selects edges to minimize the reconstruction error bound and
works well on flows that are approximately divergence-free. The second approach
clusters the graph and selects bottleneck edges that cross cluster-boundaries,
which works well on flows with global trends
Pregelix: Big(ger) Graph Analytics on A Dataflow Engine
There is a growing need for distributed graph processing systems that are
capable of gracefully scaling to very large graph datasets. Unfortunately, this
challenge has not been easily met due to the intense memory pressure imposed by
process-centric, message passing designs that many graph processing systems
follow. Pregelix is a new open source distributed graph processing system that
is based on an iterative dataflow design that is better tuned to handle both
in-memory and out-of-core workloads. As such, Pregelix offers improved
performance characteristics and scaling properties over current open source
systems (e.g., we have seen up to 15x speedup compared to Apache Giraph and up
to 35x speedup compared to distributed GraphLab), and makes more effective use
of available machine resources to support Big(ger) Graph Analytics
GraphR: Accelerating Graph Processing Using ReRAM
This paper presents GRAPHR, the first ReRAM-based graph processing
accelerator. GRAPHR follows the principle of near-data processing and explores
the opportunity of performing massive parallel analog operations with low
hardware and energy cost. The analog computation is suit- able for graph
processing because: 1) The algorithms are iterative and could inherently
tolerate the imprecision; 2) Both probability calculation (e.g., PageRank and
Collaborative Filtering) and typical graph algorithms involving integers (e.g.,
BFS/SSSP) are resilient to errors. The key insight of GRAPHR is that if a
vertex program of a graph algorithm can be expressed in sparse matrix vector
multiplication (SpMV), it can be efficiently performed by ReRAM crossbar. We
show that this assumption is generally true for a large set of graph
algorithms. GRAPHR is a novel accelerator architecture consisting of two
components: memory ReRAM and graph engine (GE). The core graph computations are
performed in sparse matrix format in GEs (ReRAM crossbars). The
vector/matrix-based graph computation is not new, but ReRAM offers the unique
opportunity to realize the massive parallelism with unprecedented energy
efficiency and low hardware cost. With small subgraphs processed by GEs, the
gain of performing parallel operations overshadows the wastes due to sparsity.
The experiment results show that GRAPHR achieves a 16.01x (up to 132.67x)
speedup and a 33.82x energy saving on geometric mean compared to a CPU baseline
system. Com- pared to GPU, GRAPHR achieves 1.69x to 2.19x speedup and consumes
4.77x to 8.91x less energy. GRAPHR gains a speedup of 1.16x to 4.12x, and is
3.67x to 10.96x more energy efficiency compared to PIM-based architecture.Comment: Accepted to HPCA 201
Theories for influencer identification in complex networks
In social and biological systems, the structural heterogeneity of interaction
networks gives rise to the emergence of a small set of influential nodes, or
influencers, in a series of dynamical processes. Although much smaller than the
entire network, these influencers were observed to be able to shape the
collective dynamics of large populations in different contexts. As such, the
successful identification of influencers should have profound implications in
various real-world spreading dynamics such as viral marketing, epidemic
outbreaks and cascading failure. In this chapter, we first summarize the
centrality-based approach in finding single influencers in complex networks,
and then discuss the more complicated problem of locating multiple influencers
from a collective point of view. Progress rooted in collective influence
theory, belief-propagation and computer science will be presented. Finally, we
present some applications of influencer identification in diverse real-world
systems, including online social platforms, scientific publication, brain
networks and socioeconomic systems.Comment: 24 pages, 6 figure
On-line failure prediction in safety-critical systems
In safety-critical systems such as Air Traffic Control system, SCADA systems, Railways Control Systems, there has been a rapid transition from monolithic systems to highly modular ones, using off-the-shelf hardware and software applications possibly developed by different manufactures. This shift increased the probability that a fault occurring in an application propagates to others with the risk of a failure of the entire safety-critical system. This calls for new tools for the on-line detection of anomalous behaviors of the system, predicting thus a system failure before it happens, allowing the deployment of appropriate mitigation policies. The paper proposes a novel architecture, namely CASPER, for online failure prediction that has the distinctive features to be (i) black-box: no knowledge of applications internals and logic of the system is required (ii) non-intrusive: no status information of the components is used such as CPU or memory usage; The architecture has been implemented to predict failures in a real Air Traffic Control System. CASPER exhibits high degree of accuracy in predicting failures with low false positive rate. The experimental validation shows how operators are provided with predictions issued a few hundred of seconds before the occurrence of the failure
Online failure prediction in air traffic control systems
This thesis introduces a novel approach to online failure prediction for mission critical distributed systems that has the distinctive features to be black-box, non-intrusive and online. The approach combines Complex Event Processing (CEP) and Hidden Markov Models (HMM) so as to analyze symptoms of failures that might occur in the form of anomalous conditions of performance metrics identified for such purpose. The thesis presents an architecture named CASPER, based on CEP and HMM, that relies on sniffed information from the communication network of a mission critical system, only, for predicting anomalies that can lead to software failures. An instance of Casper has been implemented, trained and tuned to monitor a real Air Traffic Control (ATC) system developed by Selex ES, a Finmeccanica Company. An extensive experimental evaluation of CASPER is presented. The obtained results show (i) a very low percentage of false positives over both normal and under stress conditions, and (ii) a sufficiently high failure prediction time that allows the system to apply appropriate recovery procedures
Online failure prediction in air traffic control systems
This thesis introduces a novel approach to online failure prediction for mission critical distributed systems that has the distinctive features to be black-box, non-intrusive and online. The approach combines Complex Event Processing (CEP) and Hidden Markov Models (HMM) so as to analyze symptoms of failures that might occur in the form of anomalous conditions of performance metrics identified for such purpose. The thesis presents an architecture named CASPER, based on CEP and HMM, that relies on sniffed information from the communication network of a mission critical system, only, for predicting anomalies that can lead to software failures. An instance of Casper has been implemented, trained and tuned to monitor a real Air Traffic Control (ATC) system developed by Selex ES, a Finmeccanica Company. An extensive experimental evaluation of CASPER is presented. The obtained results show (i) a very low percentage of false positives over both normal and under stress conditions, and (ii) a sufficiently high failure prediction time that allows the system to apply appropriate recovery procedures
Near-Optimal Motion Planning Algorithms Via A Topological and Geometric Perspective
Motion planning is a fundamental problem in robotics, which involves finding a path for an autonomous system, such as a robot, from a given source to a destination while avoiding collisions with obstacles. The properties of the planning space heavily influence the performance of existing motion planning algorithms, which can pose significant challenges in handling complex regions, such as narrow passages or cluttered environments, even for simple objects. The problem of motion planning becomes deterministic if the details of the space are fully known, which is often difficult to achieve in constantly changing environments. Sampling-based algorithms are widely used among motion planning paradigms because they capture the topology of space into a roadmap. These planners have successfully solved high-dimensional planning problems with a probabilistic-complete guarantee, i.e., it guarantees to find a path if one exists as the number of vertices goes to infinity. Despite their progress, these methods have failed to optimize the sub-region information of the environment for reuse by other planners. This results in re-planning overhead at each execution, affecting the performance complexity for computation time and memory space usage.
In this research, we address the problem by focusing on the theoretical foundation of the algorithmic approach that leverages the strengths of sampling-based motion planners and the Topological Data Analysis methods to extract intricate properties of the environment. The work contributes a novel algorithm to overcome the performance shortcomings of existing motion planners by capturing and preserving the essential topological and geometric features to generate a homotopy-equivalent roadmap of the environment. This roadmap provides a mathematically rich representation of the environment, including an approximate measure of the collision-free space. In addition, the roadmap graph vertices sampled close to the obstacles exhibit advantages when navigating through narrow passages and cluttered environments, making obstacle-avoidance path planning significantly more efficient.
The application of the proposed algorithms solves motion planning problems, such as sub-optimal planning, diverse path planning, and fault-tolerant planning, by demonstrating the improvement in computational performance and path quality. Furthermore, we explore the potential of these algorithms in solving computational biology problems, particularly in finding optimal binding positions for protein-ligand or protein-protein interactions.
Overall, our work contributes a new way to classify routes in higher dimensional space and shows promising results for high-dimensional robots, such as articulated linkage robots. The findings of this research provide a comprehensive solution to motion planning problems and offer a new perspective on solving computational biology problems
- …