3 research outputs found

    Interpretable Sequence Classification via Discrete Optimization

    Sequence classification is the task of predicting a class label given a sequence of observations. In many applications, such as healthcare monitoring or intrusion detection, early classification is crucial to prompt intervention. In this work, we learn sequence classifiers that favour early classification from an evolving observation trace. While many state-of-the-art sequence classifiers are neural networks, and in particular LSTMs, our classifiers take the form of finite state automata and are learned via discrete optimization. Our automata-based classifiers are interpretable, supporting explanation, counterfactual reasoning, and human-in-the-loop modification, and they have strong empirical performance. Experiments over a suite of goal recognition and behaviour classification datasets show our learned automata-based classifiers to have test performance comparable to LSTM-based classifiers, with the added advantage of being interpretable.
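    The abstract includes no code, so the following is a minimal, hypothetical sketch of how a learned automaton of this kind could classify an evolving trace early. The states, alphabet, and labels below are invented for illustration, and the automaton is hand-coded rather than learned via discrete optimization as in the paper.

```python
# Illustrative sketch only: a hand-coded finite state automaton standing in
# for one learned via discrete optimization. States, observations, and
# class labels are hypothetical.
from typing import Dict, Optional, Tuple

class AutomatonClassifier:
    def __init__(self,
                 transitions: Dict[Tuple[str, str], str],
                 state_labels: Dict[str, str],
                 start: str):
        self.transitions = transitions    # (state, observation) -> next state
        self.state_labels = state_labels  # states that commit to a class label
        self.start = start

    def classify_early(self, trace) -> Optional[str]:
        """Consume observations one at a time; return a label as soon as
        the automaton reaches a labelled state (early classification)."""
        state = self.start
        for obs in trace:
            # Missing transitions self-loop, i.e. the observation is ignored.
            state = self.transitions.get((state, obs), state)
            if state in self.state_labels:
                return self.state_labels[state]
        return None  # undecided after the full trace

# Hypothetical automaton: commit to "intrusion" after two failed logins,
# to "benign" after a successful login.
clf = AutomatonClassifier(
    transitions={("q0", "fail"): "q1", ("q1", "fail"): "q_bad",
                 ("q0", "ok"): "q_good", ("q1", "ok"): "q_good"},
    state_labels={"q_bad": "intrusion", "q_good": "benign"},
    start="q0",
)
print(clf.classify_early(["fail", "fail", "ok"]))  # "intrusion", after only 2 steps
```

    Because the classifier is just a labelled transition table, a human can read off why a label was assigned and edit individual transitions directly, which is the interpretability advantage the abstract highlights.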

    Platforms for deployment of scalable on- and off-line data analytics.

    The ability to exploit the intelligence concealed in bulk data to generate actionable insights increasingly provides competitive advantages to businesses, government agencies, and charitable organisations. The burgeoning field of Data Science, and its related applications in Data Analytics, finds broader applicability with each passing year. This expansion of users and applications is matched by an explosion of tools, platforms, and techniques designed to exploit more types of data, in larger volumes, and at higher frequencies than ever before. This diversity of platforms and tools presents a new challenge for organisations aiming to integrate Data Science into their daily operations: designing an analytic for a particular platform necessarily involves “lock-in” to that specific implementation, with few opportunities for algorithmic portability. It is also increasingly difficult to find engineers who combine experience with this diverse suite of tools and a precise understanding of the domain in which they work: the semantics of the data, the nature of the queries and analyses to be executed, and the interpretation and presentation of results.

    The work presented in this thesis addresses these challenges by introducing a number of techniques that facilitate the creation of analytics for equivalent deployment across a variety of runtime frameworks and capabilities. In the first instance, this capability is demonstrated using the first Domain Specific Language, and associated runtime environments, to target multiple best-in-class frameworks for data analysis from both the streaming and off-line paradigms. This capability is extended with a new approach to modelling analytics based around a semantically rich type system. An analytic planner using this model is detailed, empowering domain experts to build their own scalable analyses without any specific programming or distributed-systems knowledge. This planning technique is then used to assemble complex ensembles of hybrid analytics, automatically applying multiple frameworks in a single workflow. Finally, the thesis demonstrates a novel approach to the speculative construction, compilation, and deployment of analytic jobs based on observing user interactions with an analytic planning system.
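    As a rough illustration of the type-driven planning idea the abstract describes, the sketch below chains analytics by matching the semantic type each one consumes to the type another produces. The analytic names and types are hypothetical; the thesis's actual type system and planner are far richer than this toy search.

```python
# Toy type-driven analytic planner: each analytic declares the semantic type
# it consumes and the type it produces; a breadth-first search chains
# analytics from the available data to the requested goal.
from collections import deque

# Hypothetical catalogue: (analytic name, consumes, produces).
ANALYTICS = [
    ("parse_logs",      "raw_logs",       "events"),
    ("sessionize",      "events",         "sessions"),
    ("score_anomalies", "sessions",       "anomaly_scores"),
    ("render_report",   "anomaly_scores", "report"),
]

def plan(available: str, goal: str):
    """Return a sequence of analytics whose types chain from `available`
    to `goal`, or None if no composition exists."""
    queue = deque([(available, [])])
    seen = {available}
    while queue:
        current_type, steps = queue.popleft()
        if current_type == goal:
            return steps
        for name, consumes, produces in ANALYTICS:
            if consumes == current_type and produces not in seen:
                seen.add(produces)
                queue.append((produces, steps + [name]))
    return None

print(plan("raw_logs", "report"))
# -> ['parse_logs', 'sessionize', 'score_anomalies', 'render_report']
```

    The point of the sketch is that once analytics carry machine-readable types, a domain expert only states the goal; the composition, and in the thesis the choice of runtime framework for each step, is derived automatically.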

    Planning-Based Reasoning for Automated Large-Scale Data Analysis

    In this paper, we apply planning-based reasoning to orchestrate the data analysis process automatically, focusing on two applications: early detection of health complications in critical care, and detection of anomalous behaviors of network hosts in enterprise networks. Our system uses expert knowledge and AI planning to reason about possibly incomplete, noisy, or inconsistent observations, derived from data by deploying an open set of analytics, and to generate plausible and consistent hypotheses about the state of the world. From these hypotheses, relevant actions are triggered, leading to the deployment of additional analytics, or the adaptation of existing ones, which produce new observations for further reasoning. Planning-based reasoning is enabled by knowledge models, obtained from domain experts, that describe entities in the world, their states, and their relationships to observations. To address the associated knowledge engineering challenges, we propose a modeling language named LTS++ and build an Integrated Development Environment for it. We also develop a process that supports and guides domain experts with no planning expertise in defining and constructing models. We use this modeling process to capture knowledge for the two applications and to collect user feedback. Furthermore, we conduct an empirical evaluation to demonstrate the feasibility of our approach and the benefits of planning-based reasoning in these applications at large, real-world scales. In particular, in the network monitoring scenario, we show that the system can dynamically deploy and manage analytics for the effective detection of anomalies and malicious behaviors, with lead times of over 15 minutes, in an enterprise network with over 2 million hosts (entities).
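    To make the observe-hypothesize-act loop concrete, here is a deliberately simplified sketch: each candidate hypothesis lists the observations it explains and the follow-up analytic it would trigger. The hypothesis names, observations, and scoring rule are all invented for illustration; the paper's LTS++ models and planner are substantially more expressive.

```python
# Toy hypothesis ranking for one round of the observe-hypothesize-act loop.
# All names and the scoring rule are hypothetical, not from the paper.
HYPOTHESES = {
    "port_scan":    {"explains": {"many_conns", "low_bytes"},  "trigger": "deep_packet_inspection"},
    "exfiltration": {"explains": {"many_conns", "high_bytes"}, "trigger": "flow_capture"},
    "benign_burst": {"explains": {"many_conns"},               "trigger": None},
}

def rank_hypotheses(observations):
    """Score each hypothesis by how many observations it explains,
    penalizing observations it leaves unexplained (a crude stand-in
    for the planner's cost-based ranking of hypotheses)."""
    def score(h):
        return len(h["explains"] & observations) - len(observations - h["explains"])
    return sorted(HYPOTHESES.items(), key=lambda kv: -score(kv[1]))

observations = {"many_conns", "low_bytes"}
best_name, best = rank_hypotheses(observations)[0]
print(best_name, "-> next analytic:", best["trigger"])
# -> port_scan -> next analytic: deep_packet_inspection
# The triggered analytic yields new observations, which feed the next round.
```

    In the paper this feedback loop runs continuously: new observations revise the hypothesis set, which in turn decides which analytics to deploy or adapt next.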