22 research outputs found

    Approximate text generation from non-hierarchical representations in a declarative framework

    Get PDF
    This thesis is on Natural Language Generation. It describes a linguistic realisation system that translates the semantic information encoded in a conceptual graph into an English language sentence. The use of a non-hierarchically structured semantic representation (conceptual graphs) and an approximate matching between semantic structures allows us to investigate a more general version of the sentence generation problem where one is not pre-committed to a choice of the syntactically prominent elements in the initial semantics. We show clearly how the semantic structure is declaratively related to linguistically motivated syntactic representation — we use D-Tree Grammars which stem from work on Tree-Adjoining Grammars. The declarative specification of the mapping between semantics and syntax allows for different processing strategies to be exploited. A number of generation strategies have been considered: a pure topdown strategy and a chart-based generation technique which allows partially successful computations to be reused in other branches of the search space. Having a generator with increased paraphrasing power as a consequence of using non-hierarchical input and approximate matching raises the issue whether certain 'better' paraphrases can be generated before others. We investigate preference-based processing in the context of generation

    Acta Cybernetica : Tomus 7. Fasciculus 1.

    Get PDF

    Entwurf und Implementation einer auf Graph-Grammatiken beruhenden Sprache zur Funktions-Struktur-Modellierung von Pflanzen

    Get PDF
    Increasing biological knowledge requires more and more elaborate methods to translate the knowledge into executable model descriptions, and increasing computational power allows to actually execute these descriptions. Such a simulation helps to validate, extend and question the knowledge. For plant modelling, the well-established formal description language of Lindenmayer systems reaches its limits as a method to concisely represent current knowledge and to conveniently assist in current research. On one hand, it is well-suited to represent structural and geometric aspects of plant models - of which units is a plant composed, how are these connected, what is their location in 3D space -, but on the other hand, its usage to describe functional aspects - what internal processes take place in the plant structure, how does this interact with the structure - is not as convenient as desirable. This can be traced back to the underlying representation of structure as a linear chain of units, while the intrinsic nature of the structure is a tree or even a graph. Therefore, we propose to use graphs and graph grammars as a basis for plant modelling which combines structural and functional aspects. In the first part of this thesis, we develop the necessary theoretical framework. Starting with a presentation of the state of the art concerning Lindenmayer systems and graph grammars, we develop the formalism of relational growth grammars as a variant of graph grammars. We show that this formalism has a natural embedding of Lindenmayer systems which keeps all relevant properties, but represents branched structures directly as axial trees and not as linear chains with indirect encoding of branches. In the second part, we develop the main practical result, the XL programming language as an extension of the Java programming language by very general rule-based features. Short examples illustrate the application of the new language features. We describe the built-in pattern matching algorithm of the implemented run-time system for the XL programming language, and we sketch a possible implementation of an XL compiler. The third part is an application of relational growth grammars and the XL programming language. We show how the general XL interfaces can be customized for relational growth grammars. On top of this customization, several examples from a variety of disciplines demonstrate the usefulness of the developed formalism and language to describe plant growth, especially functional-structural plant models, but also artificial life, architecture or interactive games. Some examples operate on custom graphs like XML DOM trees or scene graphs of commercial 3D modellers, while the majority uses the 3D modelling platform GroIMP, a software developed in conjunction with this thesis. The appendix gives an overview of the GroIMP software. The practical usage of its plug-in for relational growth grammars is also illustrated.Das zunehmende Wissen über biologische Prozesse verlangt nach geeigneten Methoden, es in ausführbare Modelle zu übersetzen, und die zunehmende Rechenleistung der Computer ermöglicht es, diese Modelle auch tatsächlich auszuführen. Solche Simulationen dienen zur Validierung, Erweiterung und Hinterfragung des Wissens. Speziell für die Pflanzenmodellierung wurden Lindenmayer-Systeme mit Erfolg eingesetzt, jedoch stoßen diese bei aktuellen Modellierungsproblemen und Forschungsvorhaben an ihre Grenzen. Zwar sind sie gut geeignet, Pflanzenstruktur und Geometrie abzubilden - aus welchen Einheiten setzt sich eine Pflanze zusammen, wie sind diese verbunden, wie ist ihre räumliche Lage -, aber die lineare Datenstruktur erschwert die Integration von Funktionsmodellen, welche Prozesse innerhalb der verzweigten Struktur und des beanspruchten Raumes beschreiben. Daher wird in dieser Arbeit vorgeschlagen, anstelle der linearen Stuktur Graphen und Graph-Grammatiken als Grundlage für die kombinierte Funktions-Struktur-Modellierung von Pflanzen zu verwenden. Im ersten Teil der Dissertation wird der theoretische Unterbau entwickelt. Nach einer Vorstellung des aktuellen Wissensstandes auf dem Gebiet der Lindenmayer-Systeme und Graph-Grammatiken werden relationale Wachstumsgrammatiken eingeführt, die auf bekannten Mechanismen für parallele Graph-Grammatiken aufbauen und Lindenmayer-Systeme als Spezialfall enthalten, dabei jedoch verzweigte Strukturen direkt als axiale Bäume darstellen. Zur praktischen Anwendung wird im zweiten Teil die Programmiersprache XL entwickelt, die Java um allgemein gehaltene Sprachkonstrukte für Graph-Grammatiken erweitert. Kurze Beispiele zeigen die Anwendung der neuen Sprachmerkmale. Der Algorithmus zur Mustersuche wird erläutert, und die Implementation des XL-Compilers wird vorgestellt. Im dritten Teil werden mögliche Anwendungen relationaler Wachstumsgrammatiken aufgezeigt. Dazu werden zunächst die allgemeinen XL-Schnittstellen für relationale Wachstumsgrammatiken konkretisiert, um dieses System dann für Modelle aus verschiedenen Bereichen zu nutzen, darunter Funktions-Struktur-Modelle von Pflanzen, Künstliches Leben, Architektur und interaktive Spiele. Einige Beispiele nutzen spezifische Graphen wie XML-DOM-Bäume oder Szenengraphen kommerzieller 3D-Modellierprogramme, aber der überwiegende Teil baut auf der 3D-Plattform GroIMP auf, die zusammen mit dieser Dissertation entwickelt wurde. Im Anhang wird die Software GroIMP kurz vorgestellt und ihre praktische Anwendung für relationale Wachstumsgrammatiken erläutert

    Content-based Information Retrieval via Nearest Neighbor Search

    Get PDF
    Content-based information retrieval (CBIR) has attracted significant interest in the past few years. When given a search query, the search engine will compare the query with all the stored information in the database through nearest neighbor search. Finally, the system will return the most similar items. We contribute to the CBIR research the following: firstly, Distance Metric Learning (DML) is studied to improve retrieval accuracy of nearest neighbor search. Additionally, Hash Function Learning (HFL) is considered to accelerate the retrieval process. On one hand, a new local metric learning framework is proposed - Reduced-Rank Local Metric Learning (R2LML). By considering a conical combination of Mahalanobis metrics, the proposed method is able to better capture information like data\u27s similarity and location. A regularization to suppress the noise and avoid over-fitting is also incorporated into the formulation. Based on the different methods to infer the weights for the local metric, we considered two frameworks: Transductive Reduced-Rank Local Metric Learning (T-R2LML), which utilizes transductive learning, while Efficient Reduced-Rank Local Metric Learning (E-R2LML)employs a simpler and faster approximated method. Besides, we study the convergence property of the proposed block coordinate descent algorithms for both our frameworks. The extensive experiments show the superiority of our approaches. On the other hand, *Supervised Hash Learning (*SHL), which could be used in supervised, semi-supervised and unsupervised learning scenarios, was proposed in the dissertation. By considering several codewords which could be learned from the data, the proposed method naturally derives to several Support Vector Machine (SVM) problems. After providing an efficient training algorithm, we also study the theoretical generalization bound of the new hashing framework. In the final experiments, *SHL outperforms many other popular hash function learning methods. Additionally, in order to cope with large data sets, we also conducted experiments running on big data using a parallel computing software package, namely LIBSKYLARK

    Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design

    Get PDF
    The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across the particular software design domain include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interface

    Proceedings of the Fifth Meeting on Mathematics of Language : MOL5

    Get PDF

    Proceedings of the Fifth Meeting on Mathematics of Language : MOL5

    Get PDF

    Learning without labels and nonnegative tensor factorization

    Get PDF
    Supervised learning tasks like building a classifier, estimating the error rate of the predictors, are typically performed with labeled data. In most cases, obtaining labeled data is costly as it requires manual labeling. On the other hand, unlabeled data is available in abundance. In this thesis, we discuss methods to perform supervised learning tasks with no labeled data. We prove consistency of the proposed methods and demonstrate its applicability with synthetic and real world experiments. In some cases, small quantities of labeled data maybe easily available and supplemented with large quantities of unlabeled data (semi-supervised learning). We derive the asymptotic efficiency of generative models for semi-supervised learning and quantify the effect of labeled and unlabeled data on the quality of the estimate. Another independent track of the thesis is efficient computational methods for nonnegative tensor factorization (NTF). NTF provides the user with rich modeling capabilities but it comes with an added computational cost. We provide a fast algorithm for performing NTF using a modified active set method called block principle pivoting method and demonstrate its applicability to social network analysis and text mining.M.S.Committee Chair: Lebanon, Guy; Committee Co-Chair: Park, Haesun; Committee Member: Gray, Alexande
    corecore