5 research outputs found

    A multi-agent system with distributed bayesian reasoning for network fault diagnosis

    This paper presents an innovative approach to distributed Bayesian inference using a multi-agent architecture. The final goal is to deal with uncertainty in network diagnosis, but the solution can be applied in other fields. The validation testbed was a P2P streaming video service. An assessment of the work is presented to show its advantages compared with traditional manual processes and with previous systems.

    Fault diagnosis for IP-based network with real-time conditions

    BACKGROUND: Fault diagnosis techniques have been based on many paradigms, which derive from diverse areas and serve different purposes: obtaining a representation model of the network for fault localization, selecting optimal probe sets for monitoring network devices, reducing fault detection time, and detecting faulty components in the network. Although several solutions for diagnosing network faults exist, challenges remain: a fault diagnosis solution must always be available and able to process data in a timely manner, because stale results impair the quality and speed of informed decision-making. In addition, there is no non-invasive technique that continuously diagnoses network symptoms without leaving the system vulnerable to failures, nor a technique resilient to the network's dynamic changes, which can cause new failures with different symptoms.

    AIMS: This thesis proposes a model for the continuous and timely diagnosis of faults in IP-based networks, independent of the network structure and based on data analytics techniques.

    METHOD(S): The point of departure of this research was the hypothesis of a fault propagation phenomenon that allows failure symptoms to be observed at a higher network level than the fault origin. Thus, to construct the model, monitoring data were collected from an extensive campus network in which impact link failures were induced at different instants and with different durations. These data correspond to parameters widely used in actual network management. The collected data allowed us to understand the faults' behavior and how they manifest at a peripheral level. Based on this understanding and a data analytics process, the first three modules of our model, named PALADIN, were proposed (Identify, Collection and Structuring); they define the peripheral data collection and the data pre-processing needed to describe the network's state at a given moment. These modules give the model the ability to structure the data considering the delays of the multiple responses that the network delivers to a single monitoring probe and the multiple network interfaces that a peripheral device may have. The result is a structured data stream that is ready to be analyzed. This analysis required an incremental learning framework that respects the dynamic nature of networks. It comprises three elements: an incremental learning algorithm, a data rebalancing strategy, and a concept drift detector. This framework is the fourth module of the PALADIN model, named Diagnosis. To evaluate the PALADIN model, the Diagnosis module was implemented with 25 different incremental algorithms, ADWIN as the concept-drift detector, and SMOTE (adapted to the streaming scenario) as the rebalancing strategy. In addition, a dataset (the SOFI dataset) was built with the first modules of the PALADIN model; these data are the incoming data stream of the Diagnosis module used to evaluate its performance. The PALADIN Diagnosis module performs an online classification of network failures, so it is a learning model that must be evaluated in a stream context. Prequential evaluation is the most widely used method for this task, so we adopt it to evaluate the model's performance over time through several stream evaluation metrics.

    RESULTS: First, this research provides evidence of the phenomenon of impact fault propagation, which makes it possible to detect fault symptoms at the peripheral level of a monitored network. This translates into non-invasive monitoring of the network. Second, the PALADIN model is the major contribution in the fault detection context because it covers two aspects: an online learning model that continuously processes the network symptoms and detects internal failures, and concept-drift detection and data stream rebalancing components that make it resilient to dynamic network changes. Third, it is well known that the number of available real-world datasets for imbalanced stream classification is still too small, and that number is smaller still in the networking context. The SOFI dataset obtained with the first modules of the PALADIN model adds to that number and encourages work on imbalanced data streams and on network fault diagnosis.

    CONCLUSIONS: The proposed model contains the necessary elements for the continuous and timely diagnosis of IP-based network faults; it introduces the idea of periodic monitoring of peripheral network elements and uses data analytics techniques to process the collected data. Based on the analysis, processing, and classification of peripherally collected data, it can be concluded that PALADIN achieves its objective. The results indicate that peripheral monitoring allows faults in the internal network to be diagnosed; in addition, the diagnosis process needs an incremental learning process, concept-drift detection elements, and a rebalancing strategy. The experiments showed that PALADIN makes it possible to learn from the network manifestations and diagnose internal network failures. This was verified with 25 different incremental algorithms, ADWIN as the concept-drift detector, and SMOTE (adapted to the streaming scenario) as the rebalancing strategy. This research illustrates that it is unnecessary to monitor all the internal network elements to detect a network's failures; instead, it is enough to choose the peripheral elements to be monitored. Furthermore, with proper processing of the collected status and traffic descriptors, it is possible to learn from the arriving data using incremental learning in cooperation with data rebalancing and concept drift approaches. This proposal continuously diagnoses the network symptoms without leaving the system vulnerable to failures while being resilient to the network's dynamic changes.

    Doctoral Program in Computer Science and Technology, Universidad Carlos III de Madrid. Committee chair: José Manuel Molina López. Secretary: Juan Carlos Dueñas López. Member: Juan Manuel Corchado Rodrígue
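
    The prequential (test-then-train) evaluation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the PALADIN implementation: the toy classifier `MajorityPerSymptom`, the symptom and label strings, and the four-item stream are all hypothetical, and the ADWIN drift detector and the streaming SMOTE rebalancing step are omitted. The only point shown is the ordering: each arriving sample is first used to test the model and only afterwards to train it.

```python
from collections import Counter, defaultdict


class MajorityPerSymptom:
    """Toy incremental classifier (hypothetical, for illustration only).

    It predicts the label seen most often so far for a given symptom.
    """

    def __init__(self):
        self.counts = defaultdict(Counter)

    def predict(self, symptom):
        seen = self.counts.get(symptom)  # .get avoids creating empty entries
        return seen.most_common(1)[0][0] if seen else None

    def learn(self, symptom, label):
        self.counts[symptom][label] += 1


def prequential_accuracy(stream, model):
    """Test-then-train loop: returns the running accuracy after each sample."""
    correct = 0
    history = []
    for n, (symptom, label) in enumerate(stream, start=1):
        if model.predict(symptom) == label:  # 1) test on the new sample
            correct += 1
        model.learn(symptom, label)          # 2) then train on that sample
        history.append(correct / n)
    return history


# Hypothetical stream of (symptom, fault label) pairs.
stream = [
    ("timeout", "link_down"),
    ("ok", "normal"),
    ("timeout", "link_down"),
    ("ok", "normal"),
]
history = prequential_accuracy(stream, MajorityPerSymptom())
print(history)
```

    Because every sample is scored before the model has seen its label, the running accuracy tracks how the model performs over time on unseen data, which is why the scheme suits a stream context.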

    A Lightweight Approach to Distributed Network Diagnosis under Uncertainty

    A probabilistic approach to G-PON self healing

    Fiber To The Home (FTTH) rollout is a priority for telecom operators, both to support new fixed broadband services and to improve customer experience. G-PON is the most common technology choice; its deployment creates new challenges in fault diagnosis, repair and self-healing, because the infrastructure differs greatly from traditional copper pairs. This article presents an experience based on a probabilistic approach to the problem. The approach was evaluated in a lab environment to overcome the uncertainties of this scenario, and the results indicate that it is suitable for live networks.

    García Algarra, J.; González Ordás, J.; Arozarena, P.; Alfonso, R.; Carrera, Á. (2014). Un enfoque probabilístico en la autorreparación de redes G-PON. Revista Iberoamericana de Automática e Informática industrial. 11(1):80-85. https://doi.org/10.1016/j.riai.2013.11.005