13 research outputs found

    On using pretext tasks to learn representations from network logs

    Learning meaningful representations from network data is critical to ease the adoption of AI as a cornerstone for processing network logs. Since a large portion of such data is textual, Natural Language Processing (NLP) appears as an obvious candidate to learn their representations. Indeed, the literature proposes impressive applications of NLP to textual network data. However, in the absence of labels, objectively evaluating the goodness of the learned representations is still an open problem. We call for a systematic adoption of domain-specific pretext tasks to select the best representation from network data. Relying on such tasks enables us to evaluate different representations on side machine learning problems and, ultimately, to unveil the best candidate representations for the more interesting downstream tasks for which labels are scarce or unavailable. We apply pretext tasks in the analysis of logs collected from SSH honeypots. Here, a cumbersome downstream task is to cluster events that exhibit a similar attack pattern. We propose the following pipeline: first, we represent the input data using a classic NLP-based approach. Then, we design pretext tasks to objectively evaluate the representation goodness and to select the best one. Finally, we use the best representation to solve the unsupervised task, which uncovers interesting behaviours and attack patterns. All in all, our proposal can be generalized to other text-based network logs beyond honeypots.

    Design of Analysis Modules

    This public deliverable describes the design and specification of a first set of basic analysis modules for addressing the use cases identified in WP1. The document focuses on the required algorithms, which take as input the measurements and analysis provided by the lower layers (WP2 and WP3) of the mPlane architecture and provide more advanced analysis and answers towards resolving the problem addressed by each use case. These analysis modules include both stream and batch processing algorithms and address issues such as classification, estimation, prediction, detection, correlation and diagnosis.
    mPlane – an Intelligent Measurement Plane for Future Network and Application Management

    Cross-check of Analysis Modules and Reasoner Interactions

    This deliverable presents an extended set of Analysis Modules, including both improvements made to those presented in deliverable D4.1 and new analysis algorithms designed and developed to address the use cases. The deliverable also provides a complete workflow description for the different use cases, covering both stream processing for real-time monitoring applications and batch processing for “off-line” analysis. This workflow description specifies the iterative interaction loop between WP2, WP3, T4.1, and T4.2, thereby allowing for a cross-check of the analysis modules and the reasoner interactions.
    mPlane – an Intelligent Measurement Plane for Future Network and Application Management