    Amalia -- A Unified Platform for Parsing and Generation

    Contemporary linguistic theories (in particular, HPSG) are declarative in nature: they specify constraints on permissible structures, not how such structures are to be computed. Grammars designed under such theories are, therefore, suitable for both parsing and generation. However, practical implementations of such theories usually do not support bidirectional processing of grammars. We present a grammar development system that includes a compiler of grammars (for parsing and generation) to abstract machine instructions, and an interpreter for the abstract machine language. The generation compiler inverts input grammars (designed for parsing) to a form more suitable for generation. The compiled grammars are then executed by the interpreter using one control strategy, regardless of whether the grammar is the original or the inverted version. We thus obtain a unified, efficient platform for developing reversible grammars. Comment: 8 pages postscript

    An Abstract Machine for Unification Grammars

    This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with a well-defined meaning. It makes it possible, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of Amalia results in much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language. Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros file

    Language and environment for modelling the user interface of data-oriented software applications

    The development of interactive systems typically comprises the modeling, development and integration of several separate components. The User Interface (UI) is the component through which the user accesses system functionality, and its development typically also involves the creation of models. In Software Engineering, the common practice is to create a system model using the Unified Modelling Language (UML). However, UML does not offer a concrete notation for abstract UI modeling, and existing approaches outside the scope of UML do not complement the models typically used in Software Engineering, so there is no concrete correspondence between their elements. To overcome this problem, this Master's project proposes the xCAP language for modeling user interfaces, based on the Canonical Abstract Prototype language and on the correspondence of its concrete elements with the abstract elements or concepts of a User Interface Metamodel selected from the literature. As a complement to the xCAP language, and as part of this project, the MetaCAP web application was developed, with the goal of allowing the creation and editing of UI models of data-oriented software applications based on the xCAP language and integrated with the most commonly used UML models. In short, the xCAP language and the MetaCAP application aim to allow the different models that describe a system to be related to one another and to encourage the adoption of similar nomenclature in their construction. Master's degree in Software Engineering, Escola Superior de Tecnologia e Gestão, Instituto Politécnico de Viana do Castelo

    MOLIERE: Automatic Biomedical Hypothesis Generation System

    Hypothesis generation is becoming a crucial time-saving technique which allows biomedical researchers to quickly discover implicit connections between important concepts. Typically, these systems operate on domain-specific fractions of public medical data. MOLIERE, in contrast, utilizes information from over 24.5 million documents. At the heart of our approach lies a multi-modal and multi-relational network of biomedical objects extracted from several heterogeneous datasets from the National Center for Biotechnology Information (NCBI). These objects include, but are not limited to, scientific papers, keywords, genes, proteins, diseases, and diagnoses. We model hypotheses using Latent Dirichlet Allocation applied to abstracts found near shortest paths discovered within this network, and demonstrate the effectiveness of MOLIERE by performing hypothesis generation on historical data. Our network, implementation, and resulting data are all publicly available for the broad scientific community.
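
The shortest-path step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which shortest-path algorithm or edge weights are used, so Dijkstra's algorithm, the dict-of-dicts graph layout, the node names, and the weights below are all assumptions made for illustration; the subsequent topic-modeling step is not shown.

```python
import heapq


def shortest_path(graph, source, target):
    """Return the cheapest path from source to target, or None if unreachable.

    `graph` maps each node to a dict of {neighbor: non-negative edge weight}.
    This is plain Dijkstra's algorithm, chosen here as an assumption.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return None
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical toy network mixing object types (names and weights invented).
toy_network = {
    "gene:G1": {"paper:P1": 1.0, "paper:P2": 4.0},
    "paper:P1": {"keyword:K1": 1.0},
    "keyword:K1": {"disease:D1": 1.0},
    "paper:P2": {"disease:D1": 1.0},
    "disease:D1": {},
}

print(shortest_path(toy_network, "gene:G1", "disease:D1"))
```

In a pipeline of the kind the abstract describes, the papers lying on or near such a path would supply the abstracts passed to the topic model.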

    Semantic modelling for dynamic system recognition in non-intrusive industrial monitoring systems

    Industrial monitoring systems play an important role in decision making at all levels of a factory, from the shop floor to ERP systems, and so influence the overall efficiency of production. Given the trend towards mass customization and the constantly increasing pace at which new products, equipment and technologies are introduced to manufacturing, contemporary monitoring systems should be flexible enough to stay up to date with the manufacturing system. Monitoring systems such as the one offered by the European Commission project PLANTCockpit take the approach of extensively reconfigurable, loosely coupled systems. Unfortunately, configuring a monitoring system that can work at all levels of the automation hierarchy requires knowledge of all those levels, knowledge of integration technologies, and the tedious work of creating the configuration itself. This thesis offers an approach that automates the configuration process by employing knowledge bases. The approach includes the use of SOA at the device level, with semantically enhanced service descriptions (and the possibility of employing gateway devices for non-intrusiveness), the definition in the knowledge base of the metrics to be monitored by the system, and a set of algorithms and standards required to create the configuration of the monitoring system. The reusability of knowledge defined on devices and in the knowledge base simplifies the introduction of new devices and metrics and other reconfiguration of the monitoring system. A system implementing the proposed approach was developed in this thesis and was able to configure a monitoring system for a test bed.

    Can human association norm evaluate latent semantic analysis?

    This paper presents a comparison of a word association norm created by a psycholinguistic experiment with association lists generated by algorithms operating on text corpora. We compare lists generated by the Church and Hanks algorithm with lists generated by the LSA algorithm. An argument is presented on how those automatically generated lists reflect real semantic relations.
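
For readers unfamiliar with the Church and Hanks measure named in this abstract, it is commonly presented as an association ratio based on pointwise mutual information, I(x, y) = log2(P(x, y) / (P(x) P(y))). The sketch below is a minimal illustration under stated assumptions: the function name, the window size, the counting scheme, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def association_ratios(sentences, window=5):
    """Score ordered word pairs (x, y) by pointwise mutual information.

    A pair is counted when y follows x within `window` tokens. Probabilities
    are estimated by dividing raw counts by the total number of tokens,
    which is an illustrative simplification.
    """
    word_counts = Counter()
    pair_counts = Counter()
    total = 0
    for tokens in sentences:
        total += len(tokens)
        word_counts.update(tokens)
        for i, x in enumerate(tokens):
            # Look ahead at the tokens that follow x inside the window.
            for y in tokens[i + 1 : i + window]:
                pair_counts[(x, y)] += 1
    scores = {}
    for (x, y), f_xy in pair_counts.items():
        p_xy = f_xy / total
        p_x = word_counts[x] / total
        p_y = word_counts[y] / total
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical toy corpus; real studies use large text corpora.
corpus = [["doctor", "nurse"], ["doctor", "nurse"], ["bread", "butter"]]
scores = association_ratios(corpus)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Sorting a stimulus word's partners by this score gives an association list of the kind that could then be set against a human association norm.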

    Intonational Analysis of Polar Questions: A Comparative Investigation Between Standard Modern Greek and Standard British English

    There has been a considerable amount of research on the importance of suprasegmental phenomena in meaning-making, at both the word and the sentence level (Nespor and Vogel 2007; Ladd 2008; Nolan 2022). For approximating sentence meaning, intonation is considered the most precise tool of phonological analysis, as it provides all the necessary grammatical and pragmatic information (Baltazani 2003; Levis 2012; Nagy 2015). Drawing on a comparison between two intonationally dissimilar languages, Standard Modern Greek and Standard British English, this study undertakes a contrastive investigation of the tonal realization of direct polar questions in the two languages. All intonational choices (tone choice, placement of tone, and type of pausing) are considered tonal, while the basis of the research is the placement of the nuclear tone and the tonal movement before and at the end of the phrase (phrasal and boundary tones, respectively). First, an extensive presentation of the intonational patterns preferred in the production of yes/no questions in Standard Modern Greek and Standard British English sheds light on the major similarities and differences between the two languages. In addition, the auditory and acoustic analysis of thirty (30) authentic data samples, drawn from online sources and from direct audio recordings, provides up-to-date evidence for an accurate description of the intonational patterns of yes/no questions in Standard Modern Greek and Standard British English. Data selection, carried out in the same way for both languages, was based on the style of speech (6 polar questions of instructed speech and 6 polar questions of spontaneous speech per language) in order to pinpoint any variation in the intonational patterns of questions according to style. The analysis focuses on those prosodic aspects that are central to the intonational theory formulated by Cruttenden (1997) and Wells (2006), and it is consistent with the Autosegmental-Metrical model of analysis found in Arvaniti & Baltazani (2000) and Arvaniti, Ladd & Mennen (2006). The experimental data processing and analysis were conducted with the intonational annotation tool Praat (Boersma & Weenink 2001), which is widely accepted in comparable studies. Emphasis is also placed on the pragmatic interpretation of the prosodic features of the authentic data from the two languages, with a view to the pedagogical implications of this research. Finally, 6 native speakers of Standard Modern Greek were recorded producing one English polar question of instructed speech. In this way, the study shows whether and to what extent intonational interference from Standard Modern Greek occurs when native speakers of Standard Modern Greek communicate orally in English. The findings support the more recent approach to intonation as an aspect of oral expression that is closely tied to pragmatic interpretation, rather than to an a priori set of norms (Papazachariou 2004; Kotsifas 2009; Arvaniti 2022).