    Interpretable Categorization of Heterogeneous Time Series Data

    Understanding heterogeneous multivariate time series data is important in many applications ranging from smart homes to aviation. Learning models of heterogeneous multivariate time series that are also human-interpretable is challenging and not adequately addressed by the existing literature. We propose grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs extend decision trees with a grammar framework. Logical expressions derived from a context-free grammar are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. In particular, when a grammar based on temporal logic is used, we show that GBDTs can be used for the interpretable classification of high-dimensional and heterogeneous time series data. Furthermore, we show how GBDTs can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply GBDTs to analyze the classic Australian Sign Language dataset as well as data on near mid-air collisions (NMACs). The NMAC data comes from aircraft simulations used in the development of the next-generation Airborne Collision Avoidance System (ACAS X). Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data Mining (SDM) 201
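The idea of branching on grammar-derived logical expressions rather than attribute thresholds can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy grammar (`always` / `eventually` over a single comparison), the exhaustive enumeration of rules, the Gini impurity criterion, and the feature names `alt` and `speed`. The paper's own learning algorithm is not reproduced here.

```python
from itertools import product

# Toy grammar (hypothetical, for illustration only):
#   expr ::= op "(" feature cmp constant ")"
#   op   ::= "always" | "eventually"
#   cmp  ::= "<" | ">"
OPS = {"always": all, "eventually": any}
CMPS = {"<": lambda a, b: a < b, ">": lambda a, b: a > b}


def enumerate_rules(features, constants):
    """Yield every expression the toy grammar can derive."""
    return product(OPS, features, CMPS, constants)


def evaluate(rule, series):
    """Evaluate one rule on one multivariate time series (dict of lists)."""
    op, feat, cmp, const = rule
    return OPS[op](CMPS[cmp](v, const) for v in series[feat])


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(data, labels, features, constants):
    """Pick the rule that minimizes the weighted impurity of both branches."""
    def score(rule):
        true_branch = [y for x, y in zip(data, labels) if evaluate(rule, x)]
        false_branch = [y for x, y in zip(data, labels) if not evaluate(rule, x)]
        weighted = len(true_branch) * gini(true_branch) + len(false_branch) * gini(false_branch)
        return weighted / len(labels)
    return min(enumerate_rules(features, constants), key=score)


def format_rule(rule):
    """Render a rule as a human-readable expression."""
    op, feat, cmp, const = rule
    return f"{op}({feat} {cmp} {const})"


# Made-up example data: four short series with two features each.
data = [
    {"alt": [5, 6, 7], "speed": [1, 1, 1]},
    {"alt": [5, 2, 7], "speed": [1, 1, 1]},
    {"alt": [8, 9, 9], "speed": [2, 2, 2]},
    {"alt": [1, 9, 9], "speed": [2, 2, 2]},
]
labels = ["safe", "nmac", "safe", "nmac"]

rule = best_split(data, labels, ["alt", "speed"], [3])
print(format_rule(rule))
```

The chosen rule reads as a sentence about the whole series, which is the interpretability benefit the abstract describes; a full tree would apply `best_split` recursively to each branch.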

    Addressing the needs of traumatic brain injury with clinical proteomics.

    Background: Neurotrauma or injuries to the central nervous system (CNS) are a serious public health problem worldwide. Approximately 75% of all traumatic brain injuries (TBIs) are concussions or other mild TBI (mTBI) forms. Evaluation of concussion injury today is limited to an assessment of behavioral symptoms, often with delay and subject to motivation. Hence, there is an urgent need for an accurate chemical measure in biofluids to serve as a diagnostic tool for invisible brain wounds, to monitor severe patient trajectories, and to predict survival chances. Although a number of neurotrauma marker candidates have been reported, the broad spectrum of TBI limits the significance of small cohort studies. Specificity and sensitivity issues complicate the development of a conclusive diagnostic assay, especially for concussion patients. Thus, the neurotrauma field currently has no diagnostic biofluid test in clinical use. Content: We discuss the challenges of discovering new and validating identified neurotrauma marker candidates using proteomics-based strategies, including targeting, selection strategies and the application of mass spectrometry (MS) technologies and their potential impact on the neurotrauma field. Summary: Many studies use TBI marker candidates based on literature reports, yet progress in genomics and proteomics has started to provide neurotrauma protein profiles. Choosing meaningful marker candidates from such 'long lists' is still pending, as only a few can be taken through the process of preclinical verification and large-scale translational validation. Quantitative mass spectrometry targeting specific molecules rather than random sampling of the whole proteome, e.g., multiple reaction monitoring (MRM), offers an efficient and effective means to multiplex the measurement of several candidates in patient samples, thereby omitting the need for antibodies prior to clinical assay design. Sample preparation challenges specific to TBI are addressed. A tailored selection strategy combined with a multiplex screening approach is helping to arrive at diagnostically suitable candidates for clinical assay development. A surrogate marker test will be instrumental for critical decisions of TBI patient care and protection of concussion victims from repeated exposures that could result in lasting neurological deficits.

    NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings

    Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services that restrict the vocabulary for their descriptions, and (b) translation mechanisms for service retrieval to convert unstructured user requests to strongly-typed semantic representations. In our work, we argue that the effort of developing service descriptions, request translations, and matching mechanisms could be reduced using unrestricted natural language, allowing both (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although there are some natural language-based service composition approaches, they restrict service retrieval to syntactic/semantic matching. With recent developments in machine learning and natural language processing, we motivate the use of sentence embeddings by leveraging richer semantic representations of sentences for service description, matching and retrieval. Experimental results show that service composition development effort may be reduced by more than 44% while keeping a high precision/recall when matching high-level user requests with low-level service method invocations. Comment: This paper will appear at SCC'19 (IEEE International Conference on Services Computing) on July 1
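Matching a free-text request against free-text service descriptions can be sketched as nearest-neighbour retrieval by cosine similarity over embedding vectors. The sketch below is only an assumption-laden illustration: `embed` is a toy word-count stand-in, not a real sentence encoder and not the model used in the paper, and the service names and descriptions are hypothetical. In practice `embed` would be swapped for a trained sentence-embedding model, which is what would capture meaning beyond shared words.

```python
import math
from collections import Counter


def embed(sentence):
    """Stand-in embedding (assumption): lowercase word counts.

    A real system would call a sentence encoder here instead.
    """
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match(request, services):
    """Return the name of the service whose description is closest to the request."""
    query = embed(request)
    return max(services, key=lambda name: cosine(query, embed(services[name])))


# Hypothetical service registry: method name -> natural-language description.
services = {
    "WeatherService.getForecast": "get the weather forecast for a city",
    "CalendarService.addEvent": "add an event to the calendar",
}

best = match("what is the weather forecast in my city", services)
print(best)
```

Because retrieval depends only on `embed` and `cosine`, neither the request nor the service description has to follow a controlled vocabulary.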