
    Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

    We present a technical survey of state-of-the-art approaches to data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower-bounding techniques.
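
    The idea behind sampling-based data reduction can be shown with a minimal sketch (ours, not the survey's, assuming only NumPy): draw a uniform sample, attach the weight n/m to each sampled point, and evaluate costs on the weighted sample instead of the full set.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100_000, 5))           # full dataset
        m = 500                                     # reduced-set size
        idx = rng.choice(len(X), size=m, replace=False)
        C, w = X[idx], np.full(m, len(X) / m)       # sample + weights

        # the weighted cost on the sample approximates the full cost
        q = np.zeros(5)                             # a candidate center
        full_cost = ((X - q) ** 2).sum()
        sampled_cost = (w[:, None] * (C - q) ** 2).sum()
        print(full_cost, sampled_cost)              # close in expectation

    Coreset constructions replace the uniform distribution with importance (sensitivity) sampling so that the approximation guarantee holds for every candidate query, not just on average.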

    STEP: An Overview of a Future Interface for Product Data Exchange

    STEP (Standard for the Exchange of Product Model Data) is a standard format developed by the ISO for representing product-defining data (ISO TC 184/SC 4, NAM 96.4) within the overall complex of CIM (Computer Integrated Manufacturing) techniques, and it is slated to become a worldwide standard in 1993. This report gives an overview of the current state of STEP's development and describes the already largely stable parts in detail.

    A Multi-Agent Approach to Solving Fleet-Scheduling Problems


    Theoretical consideration of goal recognition aspects for understanding information in business letters

    Businesses are drowning in information: paper forms, e-mail, phone calls and other media outpace managers' capacity to handle and process information. Traditional computer systems do not support business workflows because of their inflexibility and their inability to understand information. A sophisticated understanding of the meaning of a business letter requires an understanding of why the sender wrote it. This paper describes some ideas for using goal recognition techniques as one method to initiate information understanding. It brings together two areas of cognition: goal recognition and document understanding. To this end, it gives an overview of how goal recognition techniques can be applied to discover the overall purpose of a letter and to explain coherently how the individual sentences are meant to achieve that purpose.
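
    As a hypothetical illustration of the underlying mechanism (not taken from the paper), goal recognition can be cast as Bayesian evidence combination: cue phrases observed in a letter update the score of each candidate sender goal. All goals, cue words and probabilities below are invented for the example.

        from math import log

        # P(cue | goal) and P(goal): invented numbers for illustration
        CUES = {
            "complaint": {"refund": 0.6, "defective": 0.7, "order": 0.4},
            "inquiry":   {"refund": 0.05, "defective": 0.1, "order": 0.5},
        }
        PRIOR = {"complaint": 0.3, "inquiry": 0.7}

        def recognize_goal(observed_cues):
            """Return the candidate goal with the highest log-score."""
            scores = {}
            for goal, cues in CUES.items():
                s = log(PRIOR[goal])
                for cue in observed_cues:
                    s += log(cues.get(cue, 0.01))  # smoothing for unseen cues
                scores[goal] = s
            return max(scores, key=scores.get)

        print(recognize_goal({"refund", "defective"}))   # -> complaint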

    Identifying landscape relevant natural language using actively crowdsourced landscape descriptions and sentence-transformers

    Natural language has proven to be a valuable source of data for various scientific inquiries, including landscape perception and preference research. However, large, high-quality, landscape-relevant corpora are scarce. Here we propose and discuss a natural language processing workflow to identify landscape-relevant documents in large collections of unstructured text. Using a small, curated, high-quality collection of actively crowdsourced landscape descriptions, we identify and extract similar documents from two different corpora (Geograph and WikiHow) using sentence-transformers and cosine similarity scores. We show that 1) sentence-transformers combined with cosine similarity calculations successfully identify similar documents in both Geograph and WikiHow, effectively opening the door to the creation of new landscape-specific corpora, 2) the proposed sentence-transformer approach outperforms traditional Term Frequency-Inverse Document Frequency (TF-IDF) based approaches, and 3) the identified documents capture topics similar to those of the original high-quality collection. The presented workflow is transferable to various scientific disciplines in need of domain-specific natural language corpora as underlying data.
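
    The core retrieval step can be sketched in a few lines with the public sentence-transformers API; the model name, example texts and similarity threshold below are our assumptions, not the paper's settings.

        from sentence_transformers import SentenceTransformer, util

        model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed model choice

        seeds = ["A winding footpath crosses open heather moorland."]  # curated landscape descriptions
        candidates = ["How to patch a bicycle tyre.",
                      "Rolling green hills behind a drystone wall."]   # e.g. WikiHow / Geograph texts

        seed_emb = model.encode(seeds, convert_to_tensor=True)
        cand_emb = model.encode(candidates, convert_to_tensor=True)

        # best cosine similarity to any seed document, per candidate
        scores = util.cos_sim(cand_emb, seed_emb).max(dim=1).values
        landscape_docs = [d for d, s in zip(candidates, scores) if s > 0.5]
        print(landscape_docs)   # keeps the drystone-wall description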

    Terminological reasoning with constraint handling rules

    Constraint handling rules (CHRs) are a flexible means to implement 'user-defined' constraints on top of existing host languages (like Prolog and Lisp). Recently, M. Schmidt-Schauß and G. Smolka proposed a new methodology for constructing sound and complete inference algorithms for terminological knowledge representation formalisms in the tradition of KL-ONE. We propose CHRs as a flexible implementation language for the consistency test of assertions, which is the basis for all terminological reasoning services. The implementation results in a natural combination of three layers: (i) a constraint layer that reasons in well-understood domains such as rationals or finite domains, (ii) a terminological layer providing a tailored, validated vocabulary, on which (iii) the application layer can rely. The flexibility of the approach will be illustrated by extending the formalism, its implementation and an application example (solving configuration problems) with attributes, a new quantifier and concrete domains.
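
    To give a feel for what such a consistency test does (a toy sketch of our own in Python, not the paper's CHR implementation), the following checks a set of concept assertions for a clash, unfolding conjunctions eagerly and branching on disjunctions:

        def consistent(assertions):
            """assertions: set of (individual, concept) pairs; a concept is
            an atom 'A', a negation ('not', 'A'), a conjunction
            ('and', C, D) or a disjunction ('or', C, D)."""
            todo, facts = list(assertions), set()
            while todo:
                x, c = todo.pop()
                if isinstance(c, tuple) and c[0] == "and":
                    todo += [(x, c[1]), (x, c[2])]
                elif isinstance(c, tuple) and c[0] == "or":
                    # branch: consistent if either disjunct completes
                    # without producing a clash
                    rest = facts | set(todo)
                    return any(consistent(rest | {(x, d)}) for d in c[1:])
                else:
                    facts.add((x, c))
            # clash: some individual is asserted to be both A and not A
            return not any((x, ("not", a)) in facts
                           for (x, a) in facts if isinstance(a, str))

        print(consistent({("x", ("and", "A", ("not", "A")))}))   # False
        print(consistent({("x", ("or", "A", ("not", "B")))}))    # True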

    The use of abstraction concepts for representing and structuring documents

    Due to the sheer number of documents available in modern offices, it is necessary to provide a multitude of methods for structuring knowledge, i.e., abstraction concepts. In order to achieve a uniform representation, such concepts should be considered in an integrated fashion so as to allow concise descriptions free of redundancy. In this paper, we present our approach towards an integration of methods of knowledge structuring. For this purpose, our view of abstraction concepts is briefly introduced using examples from the document world and compared with some existing systems. The main focus of this paper is to show the applicability of an integration of these abstraction concepts, as well as their built-in reasoning facilities, in supporting document processing and management.

    Self-adapting structuring and representation of space

    The objective of this report is to propose a syntactic formalism for space representation. Besides the well-known advantages of hierarchical data structures, the underlying approach has the additional strength of adapting itself to the spatial structure at hand. The formalism is called puzzletree because its generation results in a number of blocks which, in a certain order, reconstruct the original space like a puzzle. The strength of the approach lies not only in providing a compact representation of space (e.g., high compression), but also in offering an ideal basis for further knowledge-based modeling and recognition of objects. The approach may be applied to any higher-dimensional space (e.g., images, volumes). The report concentrates on the principles of puzzletrees by explaining the underlying heuristic for their generation with respect to 2D spaces, i.e., images, but also sketches their application to volume data. Furthermore, the paper outlines the use of puzzletrees to facilitate higher-level operations like image segmentation or object recognition. Finally, results are shown and a comparison to conventional region quadtrees is made.
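
    The contrast with a region quadtree can be made concrete with a toy sketch (ours, not the report's heuristic): instead of always splitting a block into four fixed quadrants, choose the axis and position whose two halves are most homogeneous.

        import numpy as np

        def decompose(img):
            """Adaptively split a 2-D array into homogeneous blocks."""
            if img.min() == img.max():              # homogeneous: leaf node
                return ("leaf", int(img.flat[0]), img.shape)
            best = None
            for axis in (0, 1):                     # try every split line
                for pos in range(1, img.shape[axis]):
                    a, b = np.split(img, [pos], axis=axis)
                    cost = len(np.unique(a)) + len(np.unique(b))
                    if best is None or cost < best[0]:
                        best = (cost, axis, pos)
            _, axis, pos = best
            a, b = np.split(img, [pos], axis=axis)
            return ("split", axis, pos, decompose(a), decompose(b))

        img = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [0, 0, 1, 1]])
        print(decompose(img))   # a single vertical cut yields two leaves

    A region quadtree needs five nodes for this image (a root plus four quadrant leaves); the adaptive cut needs three, which is the kind of saving the puzzletree aims for.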

    Digital Forensics AI: Evaluating, Standardizing and Optimizing Digital Evidence Mining Techniques

    The impact of AI on numerous sectors of our society and its successes over the years indicate that it can assist in resolving a variety of complex digital forensics investigative problems. Forensic analysis can make use of machine learning models' pattern detection and recognition capabilities to uncover hidden evidence in digital artifacts that would have been missed had the analysis been conducted manually. Numerous works have proposed ways of applying AI to digital forensics; nevertheless, scepticism regarding the opacity of AI has impeded the domain's adequate formalization and standardization. In this paper, we present three instruments critical to the development of sound machine-driven digital forensics methodologies. We cover various methods for evaluating, standardizing and optimizing techniques applicable to artificial intelligence models used in digital forensics. Additionally, we describe several applications of these instruments in digital forensics, emphasizing strengths and weaknesses that may be critical to the methods' admissibility in a judicial process.

    An Overview of Information Retrieval (IR) and NLP Methods for Text Classification

    This paper gives a brief overview of common approaches to information extraction from Information Retrieval (IR) and Natural Language Processing (NLP). The study was conducted primarily to evaluate the applicability of statistical and knowledge-based techniques to text classification. We distinguish between statistical, rule-based, concept-based, probabilistic and connectionist methods, and present well-known systems as examples of each. Both Information Retrieval and NLP systems assume correct ASCII text as input. In document analysis, however, this assumption does not hold: after a document has been optically scanned, its structure analyzed and its text recognized, word alternatives with recognition probabilities arise, and these must be taken into account in the partial content analysis, i.e., the extraction of information from text. We therefore conclude by discussing to what extent the above methods can, in principle, be transferred to document analysis. It should be stressed at the outset that this study motivates two prototypes for content-based document classification developed in the ALV project at DFKI: one uses statistical methods for automatic indexing; the other is based on a rule interpreter that propagates the weighted word hypotheses through a hierarchical network as evidence for concepts.
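
    The second prototype's propagation idea can be illustrated with a hypothetical sketch (concept names, cue words and weights are invented; this is not the DFKI system): OCR word hypotheses, each carrying a recognition probability, contribute weighted evidence to leaf concepts, which is then accumulated up the concept hierarchy.

        CONCEPT_CUES = {                  # leaf concepts and their cue words
            "invoice": {"amount": 0.9, "payment": 0.8},
            "order":   {"delivery": 0.7, "quantity": 0.6},
        }
        PARENT = {"invoice": "commerce", "order": "commerce"}

        def classify(word_hypotheses):
            """word_hypotheses: {word: OCR recognition probability}."""
            evidence = {}
            for concept, cues in CONCEPT_CUES.items():
                score = sum(p * cues.get(word, 0.0)
                            for word, p in word_hypotheses.items())
                evidence[concept] = score
                parent = PARENT.get(concept)
                if parent:                # propagate evidence upward
                    evidence[parent] = evidence.get(parent, 0.0) + score
            return evidence

        print(classify({"amount": 0.95, "delivery": 0.4}))
        # -> invoice scores highest; "commerce" aggregates both children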