
    Features and Machine Learning Systems for Structured and Sequential Data

    Modern web and communication technology relies heavily on sequential and structured data for process execution and communication protocols. Because of the complex properties of this data, manual analysis and problem detection are too time-consuming and expensive to be feasible. Consequently, features and automatic learning systems for this type of data are highly sought after. To address this need, the thesis proposes features and systems for learning on structured, sequential and temporal data, in both abstract and concrete form. The focus is on analyses in the fields of IT security and Quality of Service, applied to analysis data of malware binaries and JavaScript code as well as to mobile network communication data. The proposed features and feature combinations cover various statistical, non-behavioral and behavioral, stateless, stateful, structural and temporal concepts, and are used both individually and in a complementary manner, e.g. via hierarchical or ensemble approaches. The proposed learning systems are evaluated against competitive approaches, where they outperform commonly used and state-of-the-art methods, including approaches using neural networks. Practically relevant aspects are also addressed in depth, such as high levels of automation to extend the scope of system application, different re-training procedures, and the calibration of metrics relevant to the specific domain. To improve the interpretability of the system processes and their results, and to increase the reliability of and trust in the system, different visualization approaches are proposed, focussing on interpretable and transparent feature projections and relevance analyses. These additional discussions further support a potential adaptation of the proposed ideas to concrete application scenarios.