189 research outputs found

    Using rules of thumb to repair inconsistent knowledge


    Scalable and Declarative Information Extraction in a Parallel Data Analytics System

    Information extraction (IE) on very large data sets requires highly complex, scalable, and adaptive systems. Although numerous IE algorithms exist, their seamless and extensible combination in a scalable system is still a major challenge. This work presents a query-based IE system for a parallel data analytics platform that is configurable for specific application domains and scales to terabyte-sized text collections. First, configurable operators are defined for basic IE and web analytics tasks, from which complex IE tasks can be expressed in the form of declarative queries. All operators are characterized in terms of their properties to highlight the potential and importance of optimizing non-relational, user-defined operators (UDFs) in dataflows. Subsequently, we survey the state of the art in optimizing non-relational dataflows and show that comprehensive optimization of UDFs is still a challenge. Based on this observation, an extensible logical optimizer (SOFA) is introduced that incorporates the semantics of UDFs into the optimization process. SOFA analyzes a compact set of operator properties and combines automated analysis with manual UDF annotations to enable comprehensive optimization of dataflows. SOFA is able to logically optimize arbitrary dataflows from different application areas, resulting in significant runtime improvements compared to other techniques. Finally, the applicability of the presented system to terabyte-sized corpora is investigated. Here, we systematically evaluate the scalability and robustness of the employed methods and tools in order to pinpoint the most critical challenges in building an IE system for very large data sets.
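    The core idea of annotating UDFs with semantic properties so a logical optimizer can safely reorder a dataflow can be sketched as follows. This is a minimal illustration under assumed names (`Operator`, `can_swap`, `optimize` are hypothetical, not SOFA's actual API): each operator declares which fields it reads and writes, and a selective operator is pushed earlier whenever the swap is provably conflict-free.

```python
# Hypothetical sketch of property-annotated UDF reordering, not SOFA's real API.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Operator:
    name: str
    fn: Callable[[List[str]], List[str]]
    selective: bool = False                      # drops records -> cheaper to run early
    reads: set = field(default_factory=set)      # fields this UDF consumes
    writes: set = field(default_factory=set)     # fields this UDF produces

def can_swap(a: Operator, b: Operator) -> bool:
    # Two adjacent operators may swap if neither reads what the other writes.
    return not (a.reads & b.writes) and not (b.reads & a.writes)

def optimize(plan: List[Operator]) -> List[Operator]:
    # Push selective operators earlier whenever the swap is provably safe.
    plan = list(plan)
    changed = True
    while changed:
        changed = False
        for i in range(len(plan) - 1):
            a, b = plan[i], plan[i + 1]
            if b.selective and not a.selective and can_swap(a, b):
                plan[i], plan[i + 1] = b, a
                changed = True
    return plan
```

    For example, a selective keyword filter annotated as reading only `text` can be moved ahead of an expensive tagging UDF that writes `tags`, since the property analysis proves the two do not conflict.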

    A unifying approach to reasoning by analogy

    There has been, and still is, much interest in several disciplines in reasoning using analogy and similarity. Recent efforts in psychology and artificial intelligence have seen the development of general analogical reasoning mechanisms, which work on a variety of symbolic analogies from various domains. It is in the context of this work that this dissertation presents a unifying framework for analogy and similarity, which is designed to accommodate all current general theories and models of analogy. The approach places models of analogy into a unifying framework comprising seven stages and four types of similarity. This framework allows current models to be assessed and compared, and deficiencies observed. A new general model for analogy, which fits within this framework, is presented, overcoming many of the deficiencies observed in other models. A computer program which embodies most of the key features of the new model is described, and the results of its application to several example analogies are shown.

    Engineering Agile Big-Data Systems

    To be effective, data-intensive systems require extensive ongoing customisation to reflect changing user requirements, organisational policies, and the structure and interpretation of the data they hold. Manual customisation is expensive, time-consuming, and error-prone. In large complex systems, the value of the data can be such that exhaustive testing is necessary before any new feature can be added to the existing design. In most cases, the precise details of requirements, policies and data will change during the lifetime of the system, forcing a choice between expensive modification and continued operation with an inefficient design. Engineering Agile Big-Data Systems outlines an approach to dealing with these problems in software and data engineering, describing a methodology for aligning these processes throughout product lifecycles. It discusses tools which can be used to achieve these goals, and, in a number of case studies, shows how the tools and methodology have been used to improve a variety of academic and business systems.


    Decision Support Systems

    Decision support systems (DSS) have evolved over the past four decades from theoretical concepts into real-world computerized applications. DSS architecture contains three key components: a knowledge base, a computerized model, and a user interface. DSS simulate the cognitive decision-making functions of humans based on artificial intelligence methodologies (including expert systems, data mining, machine learning, connectionism, logistical reasoning, etc.) in order to perform decision support functions. The applications of DSS cover many domains, ranging from aviation monitoring, transportation safety, clinical diagnosis, weather forecasting, and business management to internet search strategy. By combining knowledge bases with inference rules, DSS are able to provide suggestions to end users to improve decisions and outcomes. This book is written as a textbook so that it can be used in formal courses examining decision support systems. It may be used by both undergraduate and graduate students from diverse computer-related fields. It will also be of value to established professionals as a text for self-study or for reference.
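    The "knowledge base plus inference rules" mechanism described above can be sketched as simple forward chaining: rules fire when all of their premises are known facts, adding their conclusion, until a fixed point is reached. This is an illustrative toy, not the API of any particular DSS product, and the rule names below are invented for the example.

```python
# Minimal forward-chaining sketch (illustrative, not a specific DSS product).
# Each rule is (premises, conclusion); a rule fires when every premise is a
# known fact, and firing adds the conclusion to the fact set.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and set(premises) <= facts:
                facts.add(conclusion)
                changed = True
    return facts
```

    A clinical-diagnosis-flavoured call might chain `["fever", "cough"] -> "suspect_flu"` and then `["suspect_flu", "dyspnea"] -> "refer_clinic"`, turning raw observations into a suggested action in two inference steps.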

    Fourth Conference on Artificial Intelligence for Space Applications

    Proceedings of a conference held in Huntsville, Alabama, on November 15-16, 1988. The Fourth Conference on Artificial Intelligence for Space Applications brings together diverse technical and scientific work in order to help those who employ AI methods in space applications to identify common goals and to address issues of general interest in the AI community. Topics include the following: space applications of expert systems in fault diagnostics, in telemetry monitoring and data collection, in design and systems integration, and in planning and scheduling; knowledge representation, capture, verification, and management; robotics and vision; adaptive learning; and automatic programming.

    Cyber-security Risk Assessment

    The cyber-security domain is inherently dynamic. Not only does the system configuration change frequently (with new releases and patches), but new attacks and vulnerabilities are also regularly discovered. The threat in cyber-security is human, and hence intelligent in nature. The attacker adapts to the situation, the target environment, and countermeasures. Attack actions are also driven by the attacker's exploratory nature, thought process, motivation, strategy, and preferences. Current security risk assessment is driven by cyber-security experts' theories about this attacker behavior. The goal of this dissertation is to automatically generate cyber-security risk scenarios by:

    * capturing diverse and dispersed cyber-security knowledge;
    * assuming that there are unknowns in the cyber-security domain and that new knowledge becomes available frequently;
    * emulating the attacker's exploratory nature, thought process, motivation, strategy, preferences, and his/her interaction with the target environment;
    * using cyber-security experts' theories about attacker behavior.

    The proposed framework is designed around the unique cyber-security domain requirements identified in this dissertation and overcomes the limitations of current risk-scenario generation frameworks. It automates risk-scenario generation by using knowledge as it becomes available (or changes). It supports observing, encoding, validating, and calibrating cyber-security experts' theories, and it can also be used to assist the red-teaming process. The framework generates ranked attack trees and encodes the attacker behavior theories; these can be used for prioritizing vulnerability remediation. The framework is currently being extended into an automated threat response framework that can analyze and recommend countermeasures. This extension contains behavior-driven countermeasures that use the attacker behavior theories to lead the attacker away from the system to be protected.
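    One common way such ranking works, shown here as a hedged sketch rather than the dissertation's actual model, is to compute a success probability for every subtree of an attack tree: AND gates require all child steps to succeed, OR gates require at least one. Subtrees with the highest success probability are then the first candidates for remediation. The `Node` structure and gate names below are illustrative assumptions.

```python
# Illustrative attack-tree scoring, not the dissertation's actual framework.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    name: str
    gate: str = "LEAF"             # "AND", "OR", or "LEAF"
    likelihood: float = 0.0        # leaf-level chance the attack step succeeds
    children: Optional[List["Node"]] = None

def success_prob(node: Node) -> float:
    if node.gate == "LEAF":
        return node.likelihood
    probs = [success_prob(c) for c in node.children]
    if node.gate == "AND":         # every child step must succeed
        p = 1.0
        for q in probs:
            p *= q
        return p
    p_fail = 1.0                   # OR: succeeds unless every child fails
    for q in probs:
        p_fail *= (1.0 - q)
    return 1.0 - p_fail
```

    Sorting a tree's top-level branches by `success_prob` gives a remediation priority order; for instance, an OR root over a 0.5-likelihood phishing leaf and a 0.5 × 0.5 scan-then-exploit AND chain scores 1 − 0.5 × 0.75 = 0.625 overall.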

    Deeper Down the Rabbit-Hole: Unfolding the Dynamics of Imagination Acts

    We study the dynamics of imagination acts at a philosophical, formal, and applied level. Our research is based on three theories that identify the mechanisms involved in imagination acts, and we show that all of them share a similar structure. We define the Logic of Imaginary Scenarios, in which we create a layer for imagination acts on top of a single-agent epistemic logic. While discussing the properties of this logic, we note that the way in which imaginary worlds are developed is oversimplified. A deeper analysis leads to the definition of a new theory especially suited for the dynamics of imagination acts, called the Common Frame for Imagination Acts, together with the Rhombus of Imagination. With this new theory at hand, we define the Logic of Imagination Acts, in which we introduce four different algorithms that allow for a much more modular account of imagination. Finally, we provide an implementation of a computer programme prototype that captures the algorithms defined by this latter logic.