74,094 research outputs found

    Finding Temporal Patterns in Noisy Longitudinal Data: A Study in Diabetic Retinopathy

    Get PDF
    This paper describes an approach to temporal pattern mining using the concept of user defined temporal prototypes to define the nature of the trends of interests. The temporal patterns are defined in terms of sequences of support values associated with identified frequent patterns. The prototypes are defined mathematically so that they can be mapped onto the temporal patterns. The focus for the advocated temporal pattern mining process is a large longitudinal patient database collected as part of a diabetic retinopathy screening programme, The data set is, in itself, also of interest as it is very noisy (in common with other similar medical datasets) and does not feature a clear association between specific time stamps and subsets of the data. The diabetic retinopathy application, the data warehousing and cleaning process, and the frequent pattern mining procedure (together with the application of the prototype concept) are all described in the paper. An evaluation of the frequent pattern mining process is also presented

    Interpretable Categorization of Heterogeneous Time Series Data

    Get PDF
    Understanding heterogeneous multivariate time series data is important in many applications ranging from smart homes to aviation. Learning models of heterogeneous multivariate time series that are also human-interpretable is challenging and not adequately addressed by the existing literature. We propose grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs extend decision trees with a grammar framework. Logical expressions derived from a context-free grammar are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. In particular, when a grammar based on temporal logic is used, we show that GBDTs can be used for the interpretable classi cation of high-dimensional and heterogeneous time series data. Furthermore, we show how GBDTs can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply GBDTs to analyze the classic Australian Sign Language dataset as well as data on near mid-air collisions (NMACs). The NMAC data comes from aircraft simulations used in the development of the next-generation Airborne Collision Avoidance System (ACAS X).Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data Mining (SDM) 201

    Understanding research dynamics

    Get PDF
    Rexplore leverages novel solutions in data mining, semantic technologies and visual analytics, and provides an innovative environment for exploring and making sense of scholarly data. Rexplore allows users: 1) to detect and make sense of important trends in research; 2) to identify a variety of interesting relations between researchers, beyond the standard co-authorship relations provided by most other systems; 3) to perform fine-grained expert search with respect to detailed multi-dimensional parameters; 4) to detect and characterize the dynamics of interesting communities of researchers, identified on the basis of shared research interests and scientific trajectories; 5) to analyse research performance at different levels of abstraction, including individual researchers, organizations, countries, and research communities

    Conceptual graph-based knowledge representation for supporting reasoning in African traditional medicine

    Get PDF
    Although African patients use both conventional or modern and traditional healthcare simultaneously, it has been proven that 80% of people rely on African traditional medicine (ATM). ATM includes medical activities stemming from practices, customs and traditions which were integral to the distinctive African cultures. It is based mainly on the oral transfer of knowledge, with the risk of losing critical knowledge. Moreover, practices differ according to the regions and the availability of medicinal plants. Therefore, it is necessary to compile tacit, disseminated and complex knowledge from various Tradi-Practitioners (TP) in order to determine interesting patterns for treating a given disease. Knowledge engineering methods for traditional medicine are useful to model suitably complex information needs, formalize knowledge of domain experts and highlight the effective practices for their integration to conventional medicine. The work described in this paper presents an approach which addresses two issues. First it aims at proposing a formal representation model of ATM knowledge and practices to facilitate their sharing and reusing. Then, it aims at providing a visual reasoning mechanism for selecting best available procedures and medicinal plants to treat diseases. The approach is based on the use of the Delphi method for capturing knowledge from various experts which necessitate reaching a consensus. Conceptual graph formalism is used to model ATM knowledge with visual reasoning capabilities and processes. The nested conceptual graphs are used to visually express the semantic meaning of Computational Tree Logic (CTL) constructs that are useful for formal specification of temporal properties of ATM domain knowledge. Our approach presents the advantage of mitigating knowledge loss with conceptual development assistance to improve the quality of ATM care (medical diagnosis and therapeutics), but also patient safety (drug monitoring)

    A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems

    Full text link
    We propose a set of compositional design patterns to describe a large variety of systems that combine statistical techniques from machine learning with symbolic techniques from knowledge representation. As in other areas of computer science (knowledge engineering, software engineering, ontology engineering, process mining and others), such design patterns help to systematize the literature, clarify which combinations of techniques serve which purposes, and encourage re-use of software components. We have validated our set of compositional design patterns against a large body of recent literature.Comment: 12 pages,55 reference

    Web Data Extraction, Applications and Techniques: A Survey

    Full text link
    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provided a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users and this offers unprecedented opportunities to analyze human behavior at a very large scale. We discuss also the potential of cross-fertilization, i.e., on the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.Comment: Knowledge-based System
    • …
    corecore