36,979 research outputs found

    Towards the integration of functions, relations and types in an AI programming language

    This paper describes the design and implementation of the programming language PC-Life. The language integrates the functional and logic-oriented programming styles and features types supporting inheritance. This combination yields a language particularly suited to knowledge representation, especially for applications in computational linguistics.

    Language Processing and the Artificial Mind: Teaching Code Literacy in the Humanities

    Humanities majors often find themselves in jobs where they either manage programmers or work with them in close collaboration. These interactions often pose difficulties because specialists in literature, history, philosophy, and so on are not usually code literate. They do not understand what tasks computers are best suited to, or how programmers solve problems. Learning code literacy would be a great benefit to humanities majors, but the traditional computer science curriculum is heavily math-oriented, and students outside of science and technology majors are often math-averse. Yet they are often interested in language, linguistics, and science fiction. This thesis is a case study to explore whether computational linguistics and artificial intelligence provide a suitable setting for teaching basic code literacy. I researched, designed, and taught a course called “Language Processing and the Artificial Mind.” Instead of math, it focuses on language processing, artificial intelligence, and the formidable challenges that programmers face when trying to create machines that understand natural language. This thesis is a detailed description of the material, how the material was chosen, and the outcome for student learning. Student performance on exams indicates that students learned code literacy basics and important linguistic issues in natural language processing. An exit survey indicates that students found the course to be valuable, though a minority reacted negatively to the material on programming. Future studies should explore teaching code literacy with less programming and new ways to make coding more interesting to the target audience.

    Concurrent Lexicalized Dependency Parsing: The ParseTalk Model

    A grammar model for concurrent, object-oriented natural language parsing is introduced. Complete lexical distribution of grammatical knowledge is achieved by building upon the head-oriented notions of valency and dependency, while inheritance mechanisms are used to capture lexical generalizations. The underlying concurrent computation model relies upon the actor paradigm. We consider message passing protocols for establishing dependency relations and for handling ambiguity. (Comment: 90kB, 7 pages, PostScript)
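
    To make the message-passing idea concrete, here is a minimal, hypothetical Python sketch: each word runs as an independent task with a mailbox, and a dependency relation is established when a proposed dependent satisfies a head's valency constraint. The class, categories, and protocol below are illustrative assumptions, not the actual ParseTalk protocol.

        # Hypothetical sketch of actor-style dependency parsing:
        # word actors exchange asynchronous attachment messages and
        # accept dependents that match an open valency slot.
        import asyncio

        class WordActor:
            def __init__(self, word, category, valency):
                self.word = word
                self.category = category      # e.g. "det", "noun", "verb"
                self.valency = set(valency)   # categories this head still expects
                self.dependents = []
                self.mailbox = asyncio.Queue()

            async def run(self):
                while self.valency:
                    sender = await self.mailbox.get()     # asynchronous receive
                    if sender.category in self.valency:   # valency constraint check
                        self.valency.discard(sender.category)
                        self.dependents.append(sender)

        async def main():
            det  = WordActor("the", "det", [])
            noun = WordActor("parser", "noun", ["det"])
            verb = WordActor("runs", "verb", ["noun"])
            heads = [asyncio.create_task(a.run()) for a in (noun, verb)]
            await noun.mailbox.put(det)    # det proposes attachment to noun
            await verb.mailbox.put(noun)   # noun proposes attachment to verb
            await asyncio.gather(*heads)
            print([(d.word, h.word) for h in (noun, verb) for d in h.dependents])

        asyncio.run(main())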

    Generating Aspect-oriented Multi-document Summarization with Event-Aspect Model

    In this paper, we propose a novel approach to the automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use an extended LexRank algorithm to rank the sentences in each cluster, and Integer Linear Programming for sentence selection. Key features of our method include the automatic grouping of semantically related sentences and sentence ranking based on an extension of the random walk model. We also implement a new sentence compression algorithm which uses dependency trees instead of parse trees. We compare our method with four baseline methods. Quantitative evaluation based on the ROUGE metric demonstrates the effectiveness and advantages of our method.
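
    As a rough illustration of the ranking step, the Python sketch below implements plain LexRank, a damped random walk over a cosine-similarity graph of sentences. The authors' extended variant, the event-aspect LDA clustering, and the ILP selection step are not reproduced here, and the threshold and damping values are illustrative assumptions.

        # Minimal LexRank-style sentence ranking (not the paper's extension):
        # power iteration over a thresholded cosine-similarity graph.
        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer

        def lexrank_scores(sentences, damping=0.85, iters=50, threshold=0.1):
            # tf-idf rows are L2-normalized, so dot products are cosine similarities
            tfidf = TfidfVectorizer().fit_transform(sentences)
            sim = (tfidf @ tfidf.T).toarray()
            np.fill_diagonal(sim, 0.0)
            sim[sim < threshold] = 0.0                  # prune weak edges
            n = len(sentences)
            row_sums = sim.sum(axis=1, keepdims=True)
            # row-stochastic transitions; isolated sentences jump uniformly
            trans = np.where(row_sums > 0, sim / np.maximum(row_sums, 1e-12), 1.0 / n)
            scores = np.full(n, 1.0 / n)
            for _ in range(iters):                      # damped power iteration
                scores = (1.0 - damping) / n + damping * (trans.T @ scores)
            return scores

        print(lexrank_scores([
            "The model clusters sentences into aspects.",
            "Sentences are clustered into aspects by the model.",
            "Integer programming selects the final sentences.",
        ]))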

    Concurrent Lexicalized Dependency Parsing: A Behavioral View on ParseTalk Events

    The behavioral specification of an object-oriented grammar model is considered. The model is based on full lexicalization, head-orientation via valency constraints and dependency relations, inheritance as a means for non-redundant lexicon specification, and concurrency of computation. The computation model relies upon the actor paradigm, with concurrency entering through asynchronous message passing between actors. In particular, we elaborate on principles of how the global behavior of a lexically distributed grammar and its corresponding parser can be specified in terms of event type networks and event networks, respectively. (Comment: 68kB, 5 pages, PostScript)

    Teaching machine translation and translation technology: a contrastive study

    The Machine Translation course at Dublin City University is taught to undergraduate students in Applied Computational Linguistics, while Computer-Assisted Translation is taught on two translator-training programmes, one undergraduate and one postgraduate. Given the differing backgrounds of these sets of students, the course material, methods of teaching and assessment all differ. We report here on our experiences of teaching these courses over a number of years, which we hope will be of interest to lecturers of similar existing courses, as well as provide a reference point for others who may be considering the introduction of such material.

    Semantic Source Code Models Using Identifier Embeddings

    The emergence of online open source repositories in recent years has led to an explosion in the volume of openly available source code, coupled with metadata relating to a variety of software development activities. As a result, in line with recent advances in machine learning research, software maintenance activities are switching from symbolic formal methods to data-driven methods. In this context, the rich semantics hidden in source code identifiers provide opportunities for building semantic representations of code which can assist tasks of code search and reuse. To this end, we deliver, in the form of pretrained vector space models, distributed code representations for six popular programming languages, namely Java, Python, PHP, C, C++, and C#. The models are produced using fastText, a state-of-the-art library for learning word representations. Each model is trained on data from a single programming language; the code mined for producing all models amounts to over 13,000 repositories. We indicate dissimilarities between natural language and source code, as well as variations in coding conventions between the different programming languages we processed. We describe how these heterogeneities guided the data preprocessing decisions we took and the selection of the training parameters in the released models. Finally, we propose potential applications of the models and discuss their limitations. (Comment: 16th International Conference on Mining Software Repositories (MSR 2019): Data Showcase Track)
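
    As a rough sketch of how such identifier-based embeddings can be trained, the Python example below splits identifiers into subtokens and fits a subword-aware model using gensim's FastText implementation. The released models were built with the fastText library itself on mined repositories, so the toy corpus, tokenizer, and parameters here are illustrative assumptions.

        # Hypothetical sketch: subword-aware embeddings over identifier subtokens.
        import re
        from gensim.models import FastText

        def split_identifier(name):
            # Split camelCase and snake_case identifiers into lowercase subtokens.
            parts = re.split(r"_|(?<=[a-z0-9])(?=[A-Z])", name)
            return [p.lower() for p in parts if p]

        # Toy corpus: each identifier's subtokens form one training "sentence";
        # a real corpus would use token sequences mined from source files.
        corpus = [
            split_identifier(tok)
            for tok in ["getUserName", "parse_tree", "httpRequestHandler",
                        "userName", "requestParser", "getHttpRequest"]
        ]
        model = FastText(sentences=corpus, vector_size=100, window=5, min_count=1)
        print(model.wv.most_similar("user", topn=3))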