4,123 research outputs found

    Lightweight Call-Graph Construction for Multilingual Software Analysis

    Full text link
    Analysis of multilingual codebases is a topic of increasing importance. In prior work, we have proposed the MLSA (MultiLingual Software Analysis) architecture, an approach to the lightweight analysis of multilingual codebases, and have shown how it can be used to address the challenge of constructing a single call graph from multilingual software with mutual calls. This paper addresses the challenge of constructing monolingual call graphs in a lightweight manner (consistent with the objective of MLSA) which nonetheless yields sufficient information for resolving language interoperability calls. A novel approach is proposed which leverages information from a compiler-generated AST to provide the quality of call graph necessary, while the program itself is written using an Island Grammar that parses the AST providing the lightweight aspect necessary. Performance results are presented for a C/C++ implementation of the approach, PAIGE (Parsing AST using Island Grammar Call Graph Emitter) showing that despite its lightweight nature, it outperforms Doxgen, is robust to changes in the (Clang) AST, and is not restricted to C/C++.Comment: 10 page

    Partially-commutative context-free languages

    Get PDF
    The paper is about a class of languages that extends context-free languages (CFL) and is stable under shuffle. Specifically, we investigate the class of partially-commutative context-free languages (PCCFL), where non-terminal symbols are commutative according to a binary independence relation, very much like in trace theory. The class has been recently proposed as a robust class subsuming CFL and commutative CFL. This paper surveys properties of PCCFL. We identify a natural corresponding automaton model: stateless multi-pushdown automata. We show stability of the class under natural operations, including homomorphic images and shuffle. Finally, we relate expressiveness of PCCFL to two other relevant classes: CFL extended with shuffle and trace-closures of CFL. Among technical contributions of the paper are pumping lemmas, as an elegant completion of known pumping properties of regular languages, CFL and commutative CFL.Comment: In Proceedings EXPRESS/SOS 2012, arXiv:1208.244

    Wh-copying, phases, and successive cyclicity

    Get PDF

    Where the eye takes you: the processing of gender in codeswitching

    Get PDF
    Producción CientíficaLa alternancia de códigos posee gran potencial para explorar cómo interactúan dos sistemas lingüísticos en la mente del bilingüe. Exploramos esta situación de lenguas en contacto a través de datos de seguimiento ocular de bilingües de español L1 e inglés L2. Dado que las comunidades bilingües inglés-español muestran una clara tendencia a producir alternancia entre determinante y nombre (la window / the ventana), desde un punto de vista formal analizamos la direccionalidad de la alternancia y el tipo de mecanismo de concordancia de género implícita que se produce en el caso del determinante español (la/el window // el/la book). Los resultados muestran que se tardan más en procesar tanto la alternancia con determinante español como la del determinante español sin género analógico. Interpretamos estos resultados a la luz de propuestas formales de representación del género y argumentamos que la gramaticalidad del género en la L1 de los participantes determina los costes de procesamiento en este tipo de alternancia.Junta de Castilla y León - FEDER (project VA009P17)Ministerio de Ciencia, Innovación y Universidades - FEDER (project PGC2018-097693-B-I00

    Simulating the Machine Translation of Low-Resource Languages by Designing a Translator Between English and an Artificially Constructed Language

    Get PDF
    Natural language processing (NLP), or the use of computers to analyze natural language, is a field that relies heavily on syntax. It would seem intuitive that computers would thrive in this area due to their strict syntax requirements, but the syntax of natural languages leaves them unable to properly parse and generate sentences that seem normal to the average speaker. A subfield of NLP, machine translation, works mainly to computerize translation between different languages. Unfortunately, such translation is not without its weaknesses; language documentation is not created equal, and many low-resource languages—languages with relatively few kinds of documentation, most often written—are left with no way to effectively benefit from machine translation. As a step toward better translation processors for low-resource languages, this thesis examined the possibility of machine translation between high resource languages and low resource languages through an analysis of different machine learning techniques, and ultimately constructing a simple translator between English and an artificially constructed language using a context-free grammar (CFG)

    Correspondence in OT syntax and minimal link effects

    Get PDF
    The aim of this paper is the exploration of an optimality theoretic architecture for syntax that is guided by the concept of "correspondence": syntax is understood as the mechanism of "translating" underlying representations into a surface form. In minimalism, this surface form is called "Phonological Form" (PF). Both semantic and abstract syntactic information are reflected by the surface form. The empirical domain where this architecture is tested are minimal link effects, especially in the case of "wh"-movement. The OT constraints require the surface form to reflect the underlying semantic and syntactic representations as maximally as possible. The means by which underlying relations and properties are encoded are precedence, adjacency, surface morphology and prosodic structure. Information that is not encoded in one of these ways remains unexpressed, and gets lost unless it is recoverable via the context. Different kinds of information are often expressed by the same means. The resulting conflicts are resolved by the relative ranking of the relevant correspondence constraints

    Application of shape grammar theory to underground rail station design and passenger evacuation

    Get PDF
    This paper outlines the development of a computer design environment that generates station ‘reference’ plans for analysis by designers at the project feasibility stage. The developed program uses the theoretical concept of shape grammar, based upon principles of recognition and replacement of a particular shape to enable the generation of station layouts. The developed novel shape grammar rules produce multiple plans of accurately sized infrastructure faster than by traditional means. A finite set of station infrastructure elements and a finite set of connection possibilities for them, directed by regulations and the logical processes of station usage, allows for increasingly complex composite shapes to be automatically produced, some of which are credible station layouts at ‘reference’ block plan level. The proposed method of generating shape grammar plans is aligned to London Underground standards, in particular to the Station Planning Standards and Guidelines 5th edition (SPSG5 2007) and the BS-7974 fire safety engineering process. Quantitative testing is via existing evacuation modelling software. The prototype system, named SGEvac, has both the scope and potential for redevelopment to any other country’s design legislation
    corecore