143 research outputs found

    Multi-dimensional dependency grammar as multigraph description

    Extensible Dependency Grammar (XDG) is a new, modular grammar formalism for natural language. An XDG analysis is a multi-dimensional dependency graph, where each dimension represents a different aspect of natural language, e.g. syntactic function, predicate-argument structure, or information structure. Thus, XDG brings together two recent trends in computational linguistics: the increased application of ideas from dependency grammar and the idea of multi-layered linguistic description. In this paper, we tackle one of the stumbling blocks of XDG so far: its incomplete formalization. We present the first complete formalization of XDG, as a description language for multigraphs based on simply typed lambda calculus.

    Extensible Dependency Grammar: a modular grammar formalism based on multigraph description

    This thesis develops Extensible Dependency Grammar (XDG), a new grammar formalism combining dependency grammar, model-theoretic syntax, and Jackendoff's parallel grammar architecture. The design of XDG is strongly geared towards modularity: grammars can be modularly extended by any linguistic aspect such as grammatical functions, word order, predicate-argument structure, scope, information structure and prosody, where each aspect is modeled largely independently on a separate dimension. The intersective demands of the dimensions make many complex linguistic phenomena such as extraction in syntax, scope ambiguities in the semantics, and control and raising in the syntax-semantics interface simply fall out as by-products without further stipulation. This thesis makes three main contributions: 1. The first formalization of XDG as a multigraph description language in higher order logic, and investigations of its expressivity and computational complexity. 2. The first implementation of XDG, the XDG Development Kit (XDK), an extensive grammar development environment built around a constraint parser for XDG. 3. The first application of XDG to natural language, modularly modeling a fragment of English.

    Coalgebraic Reasoning with Global Assumptions in Arithmetic Modal Logics

    We establish a generic upper bound ExpTime for reasoning with global assumptions (also known as TBoxes) in coalgebraic modal logics. Unlike earlier results of this kind, our bound does not require a tractable set of tableau rules for the instance logics, so that the result applies to wider classes of logics. Examples are Presburger modal logic, which extends graded modal logic with linear inequalities over numbers of successors, and probabilistic modal logic with polynomial inequalities over probabilities. We establish the theoretical upper bound using a type elimination algorithm. We also provide a global caching algorithm that potentially avoids building the entire exponential-sized space of candidate states, and thus offers a basis for practical reasoning. This algorithm still involves frequent fixpoint computations; we show how these can be handled efficiently in a concrete algorithm modelled on Liu and Smolka's linear-time fixpoint algorithm. Finally, we show that the upper complexity bound is preserved under adding nominals to the logic, i.e. in coalgebraic hybrid logic. Comment: Extended version of conference paper in FCT 201

    An efficient algorithm for learning to rank from preference graphs

    In this paper, we introduce a framework for ranking cost functions of the regularized least-squares (RLS) type, and we propose three such cost functions. Further, we propose a kernel-based preference learning algorithm, which we call RankRLS, for minimizing these functions. It is shown that RankRLS has many computational advantages compared to ranking algorithms that are based on minimizing other types of costs, such as the hinge cost. In particular, we present efficient algorithms for training, parameter selection, multiple output learning, cross-validation, and large-scale learning. Circumstances under which these computational benefits make RankRLS preferable to RankSVM are considered. We evaluate RankRLS on four different types of ranking tasks using RankSVM and the standard RLS regression as the baselines. RankRLS outperforms the standard RLS regression and its performance is very similar to that of RankSVM, while RankRLS has several computational benefits over RankSVM.

    A Socio-mathematical and Structure-Based Approach to Model Sentiment Dynamics in Event-Based Text

    Natural language texts are often meant to express or impact the emotions of individuals. Recognizing the underlying emotions expressed in or triggered by textual content is essential if one is to arrive at an understanding of the full meaning that textual content conveys. Sentiment analysis (SA) researchers are becoming increasingly interested in investigating natural language processing techniques as well as emotion theory in order to detect, extract, and classify the sentiments that natural language text expresses. Most SA research is focused on the analysis of subjective documents from the writer’s perspective and their classification into categorical labels or sentiment polarity, in which text is associated with a descriptive label or a point on a continuum between two polarities. Researchers often perform sentiment or polarity classification tasks using machine learning (ML) techniques, sentiment lexicons, or hybrid-based approaches. Most ML methods rely on count-based word representations that fail to take word order into account. Despite the successful use of these flat word representations in topic-modelling problems, SA problems require a deeper understanding of sentence structure, since the entire meaning of words can be reversed through negations or word modifiers. On the other hand, approaches based on semantic lexicons are limited by the relatively small number of words they contain, which do not begin to embody the extensive and growing vocabulary on the Internet. The research presented in this thesis represents an effort to tackle the problem of sentiment analysis from a different viewpoint than those underlying current mainstream studies in this research area. A cross-disciplinary approach is proposed that incorporates affect control theory (ACT) into a structured model for determining the sentiment polarity of event-based articles from the perspectives of readers and interactants. 
A socio-mathematical theory, ACT provides valuable resources for handling interactions between words (event entities) and for predicting situational sentiments triggered by social events. ACT models human emotions arising from social event terms through the use of multidimensional representations that have been verified both empirically and theoretically. To model human emotions regarding textual content, the first step was to develop a fine-grained event extraction algorithm that extracts events and their entities from event-based textual information using semantic and syntactic parsing techniques. The results of the event extraction method were compared against a supervised learning approach on two human-coded corpora (a grammatically correct and a grammatically incorrect structured corpus). For both corpora, the semantic-syntactic event extraction method yielded a higher degree of accuracy than the supervised learning approach. The three-dimensional ACT lexicon was also augmented in a semi-supervised fashion using graph-based label propagation built from semantic and neural network word embeddings. The word embeddings were obtained through the training of commonly used count-based and neural-network-based algorithms on a single corpus, and each method was evaluated with respect to the reconstruction of a sentiment lexicon. The results show that, relative to other word embeddings and state-of-the-art methods, combining both semantic and neural word embeddings yielded the highest correlation scores and lowest error rates. Using the augmented lexicon and ACT mathematical equations, human emotions were modelled according to different levels of granularity (i.e., at the sentence and document levels). The initial stage involved the development of a proposed entity-based SA approach that models reader emotions triggered by event-based sentences. 
The emotions are modelled in a three-dimensional space based on reader sentiment toward different entities (e.g., subject and object) in the sentence. The new approach was evaluated using a human-annotated news-headline corpus; the results revealed the proposed method to be competitive with benchmark ML techniques. The second phase entailed the creation of a proposed ACT-based model for predicting the temporal progression of the emotions of the interactants and their optimal behaviour over a sequence of interactions. The model was evaluated using three different corpora: fairy tales, news articles, and a handcrafted corpus. The results produced by the proposed model demonstrate that, despite the challenging sentence structure, a reasonable agreement was achieved between the estimated emotions and behaviours and the corresponding ground truth

    Methods for taking semantic graphs apart and putting them back together again

    The thesis develops a competitive compositional semantic parser for Abstract Meaning Representation (AMR). This approach combines a neural model with mechanisms that echo ideas from compositional semantic construction in a new, simple dependency structure. The thesis first tackles the task of generating structured training data necessary for a compositional approach, by developing the linguistically motivated AM algebra. Encoding the terms over the AM algebra as dependency trees yields a simple semantic parsing model where neural tagging and dependency models predict interpretable, meaningful operations that construct the AMR. The model thereby achieves strong evaluation results and clear improvements over a less structured baseline model.

    Grammatical evolution of technical processes (Gramatička evolucija tehničkih procesa)

    The theory of technical systems explains technical evolution, engineering design, and product development as a response to societal needs that can be met through technical processes. This teleological view requires that the first step in developing a new product concept be the identification of the technical process, understood as the process in which, through the participation of the technical product, the effects needed for the purposeful transformation of operands are realized in accordance with the working principles on which the technical process is based. The aim of the research carried out for this doctoral thesis is to create computational support for precisely this initial step of the conceptual phase of product development. Computer-generated variants of operand transformation can provide a basis for a more thorough consideration of the possibilities for realizing the technical product. In line with the research methodology established in design science, the research was carried out in two phases: a theoretical phase, which defines a method for generating variants of operand transformation based on known working principles, and a practical phase, which develops a computational tool based on the defined method to a level that allows the research results to be validated. The theoretical phase concluded with the main scientific contributions of this dissertation: (1) a formal model of the technical process was defined, (2) a formal model of technical process synthesis based on graph grammars was defined, and (3) the ability to search transformation variants using the grammatical evolution algorithm [3] was introduced. The practical phase resulted in a computer implementation of the defined method for generating variants of operand transformation, within a computational tool designed and developed for that purpose.
During the research it was found that generalized and systematized knowledge about technical processes and working principles is not yet available in the field as a taxonomy or ontology detailed enough for the level that the defined method requires. For this reason, guidelines were proposed for the graph-grammar formalization of knowledge about technical processes and working principles (4).

    Extremal colorings and extremal satisfiability

    Combinatorial problems are often easy to state and hard to solve. Many graph coloring problems fall into this class, as does the satisfiability problem. The classical coloring problems consider colorings of objects such that two objects which are in a relation receive different colors, e.g., proper vertex-colorings, proper edge-colorings, or proper face-colorings of plane graphs. A generalization is to color the objects such that some predefined patterns are not monochromatic. Ramsey theory deals with the question of under what conditions such colorings can occur. A more restrictive version of colorings forces some substructures to be polychromatic, i.e., to receive all colors used in the coloring at least once. A true-false assignment to the boolean variables of a formula can also be seen as a 2-coloring of the literals, with the restriction that complementary literals receive different colors. Mostly, the hardness of such problems is made explicit by proving that they are NP-hard. This indicates that there might be no simple characterization of all solvable instances. Extremal questions then become quite handy, because they do not aim at a complete characterization, but rather focus on one parameter and ask for its minimum or maximum value. The goal of this thesis is to demonstrate this general approach on different problems in the area of graph colorings and satisfiability of boolean formulas. First, we consider graphs where all edge-2-colorings contain a monochromatic copy of some fixed graph H. Such graphs are called H-Ramsey graphs and we concentrate on their minimum degree. Its minimization is the question we are going to answer for H being a biregular bipartite graph, a forest, or a bipartite graph where the sizes of both partite sets are equal. Second, vertex-colorings of plane multigraphs are studied such that each face is polychromatic. 
A natural parameter to upper bound the number of colors which can be used in such a coloring is the size g of the smallest face. We show that every graph can be polychromatically colored with $\lfloor (3g-5)/4 \rfloor$ colors and there are examples for which this bound is almost tight. Third, we consider a variant of the satisfiability problem where only some (not necessarily all) assignments are allowed. A natural way to choose such a set of allowed assignments is to use a context-free language. If in addition the number of all allowed assignments of length n is lower bounded by $\Omega(\alpha^n)$ for some $\alpha > 1$, then this restricted satisfiability problem will be shown to be NP-hard. Otherwise, there are only polynomially many allowed assignments and the restricted satisfiability problem is proven to be polynomially solvable.

    Tool-supported identification of functional concerns in object-oriented code

    Concern identification aims to find the implementation of a functional concern in existing source code. In this work, concerns are described, using the Hierarchic Concern Model, as gray boxes containing subconcerns, inputs, and outputs. The inputs and outputs are used as concern seeds to identify data-oriented abstractions of concern implementations, called concern skeletons. The identification approach is based on context-free language reachability and supported by a tool called CoDEx.