6 research outputs found

    Knowledge-based spatiotemporal linear abstraction

    Full text link
    We present a theoretical framework and a case study for reusing the same conceptual and computational methodology for both temporal abstraction and linear (unidimensional) space abstraction, in a domain (evaluation of traffic-control actions) significantly different from the one (clinical medicine) in which the method was originally used. The method, known as knowledge-based temporal abstraction, abstracts high-level concepts and patterns from time-stamped raw data using a formal theory of domain-specific temporal-abstraction knowledge. We applied this method, originally used to interpret time-oriented clinical data, to the domain of traffic control, in which the monitoring task requires linear pattern matching along both space and time. First, we reused the method to create unidimensional spatial abstractions over highways, given sensor measurements taken along each highway at the same time point. Second, we reused the method to create temporal abstractions of traffic behavior for the same space segments over consecutive time points. We defined the corresponding temporal-abstraction and spatial-abstraction domain-specific knowledge. Our results suggest that (1) the knowledge-based temporal-abstraction method is reusable over time and unidimensional space, as well as over significantly different domains; (2) the method can be generalized into a knowledge-based linear-abstraction method that solves tasks requiring abstraction of data along any linear distance measure; and (3) a spatiotemporal-abstraction method can be assembled from two copies of the generalized method and a spatial-decomposition mechanism, and is applicable to tasks requiring abstraction of time-oriented data into meaningful spatiotemporal patterns over a linear, decomposable space, such as traffic over a set of highways.
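The core state-abstraction step described above can be illustrated with a minimal sketch; the thresholds, state names, and data are hypothetical and not taken from the paper. Raw sensor values are mapped to qualitative states using domain-specific knowledge, and adjacent samples with the same state are merged into maximal intervals, regardless of whether the ordering dimension is time or distance along a highway.

```python
# Minimal sketch of the state-abstraction step of knowledge-based temporal
# (or linear spatial) abstraction. Thresholds and state names are illustrative.
from dataclasses import dataclass

@dataclass
class Interval:
    start: float   # time point, or position along the highway
    end: float
    state: str     # qualitative abstraction, e.g. "FREE_FLOW"

def classify(speed_kmh: float) -> str:
    """Domain-specific knowledge: map a raw sensor value to a qualitative state."""
    if speed_kmh >= 80:
        return "FREE_FLOW"
    if speed_kmh >= 40:
        return "SLOW"
    return "CONGESTED"

def abstract_series(points):
    """Merge ordered (coordinate, value) samples into maximal intervals of equal state.

    The coordinate may be time (temporal abstraction) or distance along a
    highway (linear spatial abstraction); the procedure is the same.
    """
    intervals = []
    for coord, value in points:
        state = classify(value)
        if intervals and intervals[-1].state == state:
            intervals[-1].end = coord          # extend the current interval
        else:
            intervals.append(Interval(coord, coord, state))
    return intervals

if __name__ == "__main__":
    readings = [(0, 95), (1, 88), (2, 52), (3, 45), (4, 30), (5, 28)]
    for iv in abstract_series(readings):
        print(iv)   # three intervals: FREE_FLOW, SLOW, CONGESTED
```

A spatiotemporal variant, as the abstract suggests, would run one copy of this procedure along each highway at a fixed time point and a second copy over the resulting spatial abstractions across consecutive time points.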

    Approximate Data Mining Techniques on Clinical Data

    Get PDF
    The past two decades have witnessed an explosion in the number of medical and healthcare datasets available to researchers and healthcare professionals. These data collection efforts call for appropriate data mining techniques and tools that can automatically extract relevant information from the data and thereby provide insight into the clinical behaviors and processes the data capture. Since such tools should support the decision-making activities of medical experts, the extracted information must be represented in a human-friendly way, that is, in a concise and easy-to-understand form. To this end, we propose a new framework that collects several new mining techniques and tools. These techniques focus on two main aspects: the temporal one and the predictive one. All of them were applied to clinical data, in particular ICU data from the MIMIC III database, which demonstrated the flexibility of the framework in retrieving different kinds of outcomes from the same overall dataset. The first two techniques rely on the concept of Approximate Temporal Functional Dependencies (ATFDs). ATFDs, with their suitable treatment of temporal information, have been proposed as a methodological tool for mining clinical data. An example of the knowledge derivable through such dependencies is "within 15 days, patients with the same diagnosis and the same therapy usually receive the same daily amount of drug". However, current ATFD models do not analyze the temporal evolution of the data, as in "for most patients with the same diagnosis, the same drug is prescribed after the same symptom". To this end, we propose a new kind of ATFD called Approximate Pure Temporally Evolving Functional Dependencies (APEFDs). Another limitation of this kind of dependency is that it cannot deal with quantitative data when some tolerance on numerical values is allowed. This limitation arises in particular in clinical data warehouses, where analysis and mining have to consider one or more measures related to quantitative data (such as lab test results and vital signs) with respect to multiple dimensional (alphanumeric) attributes (such as patient, hospital, physician, diagnosis) and some time dimensions (such as the day since hospitalization and the calendar date). For this scenario, we introduce a new kind of ATFD, named Multi-Approximate Temporal Functional Dependency (MATFD), which considers dependencies between dimensions and quantitative measures in temporal clinical data. These new dependencies may provide knowledge such as "within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range". The other techniques are based on pattern mining, which has also been proposed as a methodological tool for mining clinical data. However, many of the methods proposed so far focus on mining temporal rules that describe relationships between data sequences or instantaneous events, without considering the presence of more complex temporal patterns in the dataset. Such patterns, for instance the trend of a particular vital sign, are often very relevant for clinicians. Moreover, it is of great interest to discover whether some kind of event, such as a drug administration, is capable of changing these trends and how.
To this end, we propose a new kind of temporal pattern, called Trend-Event Patterns (TEPs), which focuses on events and their influence on trends that can be retrieved from measures such as vital signs. With TEPs we can express concepts such as "the administration of paracetamol to a patient with an increasing temperature leads to a decreasing trend in temperature after the administration occurs". We also analyzed another pattern mining technique that includes prediction. This technique discovers a compact set of patterns that aim to describe the condition (or class) of interest. Our framework relies on a classification model that considers and combines various predictive pattern candidates and selects only those that are important for improving the overall class prediction performance. We show that our classification approach achieves a significant reduction in the number of extracted patterns, compared with state-of-the-art methods based on the minimum predictive pattern mining approach, while preserving the overall classification accuracy of the model. For each of the techniques described above, we developed a tool to retrieve the corresponding kind of rule. All results were obtained by pre-processing and mining clinical data, in particular ICU data from the MIMIC III database.
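As a concrete illustration of the dependency-based techniques, the following minimal sketch checks how closely a set of records satisfies an ATFD-style rule of the form quoted above; the column names, the coarse 15-day bucketing of the temporal window, and the error measure are simplifying assumptions, not the exact definitions used in this work.

```python
# Simplified sketch of evaluating an approximate temporal functional dependency
# such as "within 15 days, patients with the same diagnosis and therapy usually
# receive the same daily amount of drug". Column names and windowing are assumptions.
from collections import Counter, defaultdict

def atfd_error(rows, window_days=15):
    """Fraction of rows that must be dropped for
    (diagnosis, therapy, window) -> daily_dose to hold exactly."""
    groups = defaultdict(list)
    for r in rows:
        bucket = r["day"] // window_days        # coarse 15-day window
        groups[(r["diagnosis"], r["therapy"], bucket)].append(r["daily_dose"])
    violations = sum(len(doses) - max(Counter(doses).values())
                     for doses in groups.values())
    return violations / len(rows)

rows = [
    {"day": 1, "diagnosis": "D1", "therapy": "T1", "daily_dose": 10},
    {"day": 3, "diagnosis": "D1", "therapy": "T1", "daily_dose": 10},
    {"day": 9, "diagnosis": "D1", "therapy": "T1", "daily_dose": 20},  # exception
    {"day": 2, "diagnosis": "D2", "therapy": "T2", "daily_dose": 5},
]
print(f"error = {atfd_error(rows):.2f}")   # 0.25
# The dependency is accepted as approximately valid if the error does not
# exceed a user-chosen threshold, e.g. 0.3.
```

The MATFD variant described in the abstract would additionally allow the dependent measure to vary within a fixed numerical range instead of requiring exact equality.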

    Extending concepts of complex event processing for the information-logistics processing of telemedical events (Erweiterung von Konzepten des Complex Event Processing zur informationslogistischen Verarbeitung telemedizinischer Ereignisse)

    Get PDF
    Early estimates for the healthcare sector forecast an increase in data from 500 petabytes in 2012 to 25,000 petabytes in 2020. BITKOM supports this and cites an annual data growth rate of 40-50%. Frost & Sullivan have estimated the data held within hospitals at 1 billion terabytes and forecast a data volume of 1.8 zettabytes for 2016. The available data are characterized by a high degree of heterogeneity. High-frequency real-time data in particular, such as those generated by vital-sign monitoring, are of great medical value but at the same time difficult to exploit. This thesis therefore develops concepts that enable the intelligent processing of heterogeneously distributed vital signs. The goal is to filter and condense these data in such a way that decision-supporting information emerges and the degree of information overload is reduced. To this end, concepts from the two research fields of information logistics and complex event processing are examined and combined into an event-processing system for telemedical events. Using temporal abstraction, complex events, so-called trend patterns, are generated from a temporally ordered set of simple events. By applying a user's formalized information needs, these patterns are turned into information tailored to that demand. The essential property of the system to be designed and implemented is the modularization of its processing routines, allowing simple adaptation to changing health conditions and thus reducing the necessary implementation effort. The conceptual and implementation results of this work are assessed in an evaluation using large, heterogeneous datasets. The focus is on demonstrating a demand-oriented condensation of data into information as well as a minimization of implementation effort.
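The temporal-abstraction step described above, deriving complex trend-pattern events from a stream of simple vital-sign events, can be sketched as follows; the event fields, window size, and slope threshold are illustrative assumptions, not the actual configuration of the system.

```python
# Minimal sketch: derive complex "trend pattern" events (increasing, decreasing,
# stable) from a temporally ordered stream of simple vital-sign events.
# Window size and slope threshold are illustrative assumptions.
from collections import deque

def trend_patterns(events, window=3, delta=0.2):
    """Yield (timestamp, trend) complex events from (timestamp, value) simple events."""
    buf = deque(maxlen=window)
    for ts, value in events:                 # events arrive ordered by time
        buf.append((ts, value))
        if len(buf) < window:
            continue
        slope = (buf[-1][1] - buf[0][1]) / (buf[-1][0] - buf[0][0])
        if slope > delta:
            yield ts, "INCREASING"
        elif slope < -delta:
            yield ts, "DECREASING"
        else:
            yield ts, "STABLE"

# A rising body-temperature series yields INCREASING trend events; downstream,
# such complex events would be filtered against a user's formalized
# information needs before being delivered.
stream = [(0, 37.0), (1, 37.4), (2, 37.9), (3, 38.5), (4, 38.6), (5, 38.6)]
for ts, trend in trend_patterns(stream):
    print(ts, trend)
```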