Scalable Exact Parent Sets Identification in Bayesian Networks Learning with Apache Spark
In Machine Learning, the parent set identification problem is to find a set
of random variables that best explains a selected variable given the data and some
predefined scoring function. This problem is a critical component of structure
learning of Bayesian networks and Markov blanket discovery, and thus has many
practical applications, ranging from fraud detection to clinical decision
support. In this paper, we introduce a new distributed memory approach to the
exact parent sets assignment problem. To achieve scalability, we derive
theoretical bounds to constrain the search space when the MDL scoring function is
used, and we reorganize the underlying dynamic programming such that the
computational density is increased and fine-grained synchronization is
eliminated. We then design an efficient realization of our approach in the Apache
Spark platform. Through experimental results, we demonstrate that the method
maintains strong scalability on a 500-core standalone Spark cluster, and it can
be used to efficiently process data sets with 70 variables, far beyond the
reach of the currently available solutions.
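To make the parent set identification problem in this abstract concrete, here is a minimal single-machine sketch. It is not the paper's distributed dynamic-programming method: it simply enumerates every candidate parent set and keeps the one with the lowest MDL score. The function names `mdl_score` and `best_parent_set`, the data layout (a list of tuples of discrete values), and the `arity` argument are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mdl_score(data, child, parents, arity):
    """MDL score (lower is better) of `child` given `parents`.

    data  : list of tuples of discrete values, one tuple per sample (assumed layout)
    arity : number of states of each variable, indexed like the tuples
    """
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marginal = Counter(tuple(row[p] for p in parents) for row in data)
    # Fit term: n times the empirical conditional entropy H(child | parents), in bits.
    fit = -sum(c * log2(c / marginal[pa]) for (pa, _), c in joint.items())
    # Penalty term: (log2 n / 2) times the number of free parameters.
    n_params = arity[child] - 1
    for p in parents:
        n_params *= arity[p]
    return fit + 0.5 * log2(n) * n_params


def best_parent_set(data, child, arity, max_size=None):
    """Exhaustive search over parent sets; exponential in the number of variables."""
    candidates = [v for v in range(len(arity)) if v != child]
    if max_size is None:
        max_size = len(candidates)
    best_score, best_parents = mdl_score(data, child, (), arity), ()
    for k in range(1, max_size + 1):
        for parents in combinations(candidates, k):
            score = mdl_score(data, child, parents, arity)
            if score < best_score:
                best_score, best_parents = score, parents
    return best_score, best_parents
```

For example, if variable 2 is the XOR of variables 0 and 1, neither variable alone lowers the score, but the pair does, so the search selects both. The exponential cost of this enumeration is the reason bounds on the search space matter.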
An Integer Programming approach to Bayesian Network Structure Learning
We study the problem of learning a Bayesian Network structure from data using an Integer Programming approach. We study the existing approaches, and in particular some recent works that formulate the problem as an Integer Programming model. By discussing some weaknesses of the existing approaches, we propose an alternative solution, based on a statistical sparsification of the search space. Results show that our approach can lead to promising results, especially for large networks.
Exact Computation of Influence Spread by Binary Decision Diagrams
Evaluating influence spread in social networks is a fundamental procedure to
estimate the word-of-mouth effect in viral marketing. There are numerous
studies about this topic; however, under the standard stochastic cascade
models, the exact computation of influence spread is known to be #P-hard. Thus,
the existing studies have used Monte-Carlo simulation-based approximations to
avoid exact computation.
We propose the first algorithm to compute influence spread exactly under the
independent cascade model. The algorithm first constructs binary decision
diagrams (BDDs) for all possible realizations of influence spread, then
computes influence spread by dynamic programming on the constructed BDDs. To
construct the BDDs efficiently, we designed a new frontier-based search-type
procedure. The constructed BDDs can also be used to solve other
influence-spread related problems, such as random sampling without rejection,
conditional influence spread evaluation, dynamic probability update, and
gradient computation for probability optimization problems.
We conducted computational experiments to evaluate the proposed algorithm.
The algorithm successfully computed influence spread on real-world networks
with a hundred edges in a reasonable time, which is infeasible with the
naive algorithm. We also conducted an experiment to evaluate the accuracy of
the Monte-Carlo simulation-based approximation by comparing it with the exact
influence spread obtained by the proposed algorithm. Comment: WWW'1
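For contrast with the BDD-based method described in this abstract, the naive exact computation it mentions can be sketched as follows. This is not the paper's algorithm: it enumerates every live-edge realization of the graph under the independent cascade model, weights the number of nodes reachable from the seeds by the probability of that realization, and sums. The function name `exact_influence_spread` and the edge format `(u, v, p)` are assumptions made for illustration.

```python
from itertools import product


def exact_influence_spread(n_nodes, edges, seeds):
    """Expected number of activated nodes under the independent cascade model.

    edges : list of directed edges (u, v, p), where p is the probability
            that u activates v (assumed format)
    seeds : iterable of initially active nodes
    Runs in O(2^m) time for m edges, so it is only usable on tiny graphs.
    """
    total = 0.0
    for live in product((False, True), repeat=len(edges)):
        prob = 1.0
        adj = {u: [] for u in range(n_nodes)}
        for (u, v, p), is_live in zip(edges, live):
            prob *= p if is_live else 1.0 - p
            if is_live:
                adj[u].append(v)
        # Depth-first search over live edges from the seed set.
        reached = set(seeds)
        stack = list(reached)
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in reached:
                    reached.add(v)
                    stack.append(v)
        total += prob * len(reached)
    return total
```

On a directed path 0 -> 1 -> 2 with probability 0.5 on each edge and seed node 0, the spread is 1 + 0.5 + 0.25 = 1.75. With a hundred edges the loop would need 2^100 iterations, which shows why a compact representation of the realizations is required.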
Cellular network capacity and coverage enhancement with MDT data and Deep Reinforcement Learning
Recent years have witnessed a remarkable increase in the availability of data and computing resources in communication networks. This contributed to the rise of data-driven over model-driven algorithms for network automation. This paper investigates a Minimization of Drive Tests (MDT)-driven Deep Reinforcement Learning (DRL) algorithm to optimize coverage and capacity by tuning antenna tilts on a cluster of cells from TIM's cellular network. We jointly utilize MDT data, electromagnetic simulations, and network Key Performance Indicators (KPIs) to define a simulated network environment for the training of a Deep Q-Network (DQN) agent. Some tweaks have been introduced to the classical DQN formulation to improve the agent's sample efficiency, stability, and performance. In particular, a custom exploration policy is designed to introduce soft constraints at training time. Results show that the proposed algorithm outperforms baseline approaches like DQN and best-first search in terms of long-term reward and sample efficiency. Our results indicate that MDT-driven approaches constitute a valuable tool for autonomous coverage and capacity optimization of mobile radio networks.
Improving Computer Network Operations Through Automated Interpretation of State
Networked systems today are hyper-scaled entities that provide core functionality for distributed services and applications spanning personal, business, and government use. It is critical to maintain correct operation of these networks to avoid adverse business outcomes. The advent of programmable networks has provided much needed fine-grained network control, enabling providers and operators alike to build some innovative networking architectures and solutions. At the same time, they have given rise to new challenges in network management. These architectures, coupled with a multitude of devices, protocols, virtual overlays on top of physical data-plane etc. make network management a highly challenging task. Existing network management methodologies have not evolved at the same pace as the technologies and architectures. Current network management practices do not provide adequate solutions for highly dynamic, programmable environments. We have a long way to go in developing management methodologies that can meaningfully contribute to networks becoming self-healing entities. The goal of my research is to contribute to the design and development of networks towards transforming them into self-healing entities.
Network management includes a multitude of tasks, not limited to diagnosis and troubleshooting, but also performance engineering and tuning, security analysis, etc. This research explores novel methods of utilizing network state to enhance networking capabilities. It is constructed around hypotheses based on careful analysis of practical deficiencies in the field. I try to generate real-world impact with my research by tackling problems that are prevalent in deployed networks, and that bear practical relevance to the current state of networking. The overarching goal of this body of work is to examine various approaches that could help enhance network management paradigms, providing administrators with a better understanding of the underlying state of the network, thus leading to more informed decision-making. The research looks into two distinct areas of network management, troubleshooting and routing, presenting novel approaches to accomplishing certain goals in each of these areas, and demonstrating that they can indeed enhance the network management experience.
Contagion Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis and Network Algorithms
This monograph provides an overview of the mathematical theories and
computational algorithm design for contagion source detection in large
networks. By leveraging network centrality as a tool for statistical inference,
we can accurately identify the source of contagions, trace their spread, and
predict future trajectories. This approach provides fundamental insights into
surveillance capability and asymptotic behavior of contagion spreading in
networks. Mathematical theory and computational algorithms are vital to
understanding contagion dynamics, improving surveillance capabilities, and
developing effective strategies to prevent the spread of infectious diseases
and misinformation. Comment: Suggested Citation: Chee Wei Tan and Pei-Duo Yu (2023), "Contagion
Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis
and Network Algorithms", Foundations and Trends in Networking: Vol. 13: No.
2-3, pp 107-251. http://dx.doi.org/10.1561/130000006
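As an illustration of using network centrality for source detection, the sketch below picks the infected node with the smallest total hop distance to all other infected nodes (a distance-centrality estimator). This is one simple centrality choice and is not claimed to be the specific estimator developed in the monograph. The function name `distance_center`, the adjacency-dict format, and the restriction of paths to infected nodes are assumptions made for illustration.

```python
from collections import deque


def distance_center(adj, infected):
    """Estimate the contagion source as the distance center of the infected set.

    adj      : dict mapping each node to a list of neighbors (undirected graph)
    infected : iterable of nodes observed to be infected
    Ties are broken in favor of the smallest node label.
    """
    infected = set(infected)

    def total_distance(src):
        # Breadth-first search restricted to infected nodes.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in infected and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(infected):
            return float("inf")  # some infected node is unreachable from src
        return sum(dist.values())

    return min(sorted(infected), key=total_distance)
```

On the path graph 0 - 1 - 2 - 3 - 4 with infected nodes {1, 2, 3}, the estimator returns node 2, the middle of the infected segment.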