344 research outputs found

    Can Subcategorisation Probabilities Help a Statistical Parser?

    Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy. Comment: 9 pages, uses colacl.st

    Vocabulary Through Affixes and Word Families - A Computer-Assisted Language Learning Program for Adult ELL Students

    Vocabulary plays an important role in the language learning of English Language Learner (ELL) students. This work discusses the importance of metalinguistic awareness in teaching vocabulary to adult English Language Learners at an intermediate or advanced level of English proficiency, with an emphasis on learning vocabulary through word families and increased morphological awareness. The main contribution is a computer-based program that guides users through a series of interactive reading and vocabulary practice exercises, which allow them to explore and learn how certain words are connected through word families and how some of the most common affixes in English can affect the meaning and grammatical function of words. Unlike most existing Computer-Assisted Language Learning systems, the number of vocabulary practice exercises it can produce is unlimited, as is the range of reading materials it can analyze, including text supplied by the users.

    Collateral adjectives in English and related Issues

    An investigation into lemmatization in Southern Sotho

    Lemmatization refers to the process whereby a lexicographer assigns a specific place in a dictionary to a word which he regards as the most basic form amongst other related forms. The fact that in Bantu languages formative elements can be added to one another in an often seemingly interminable series until quite long words are produced evokes curiosity as far as lemmatization is concerned. Given the productive nature of Southern Sotho, it is interesting to observe how lexicographers handle the morphological complexities they normally face when arranging lexical items. This study has shown that some difficulties are encountered in adhering to the traditional method of alphabetization. It does not aim at proposing solutions but does point out some considerations which should be borne in mind in the process of lemmatization. African Languages. M.A. (African Languages)

    Investigating Linguistic Variation in English with the Internet and Corpora (Investigar a variação linguística em inglês com a Internet e Corpora)

    Master's degree in English Studies. This dissertation is concerned with tracking language changes which are seen to have come about largely through the use of the internet. They also involve cultural changes which are expressed through new word forms. The processes of word formation are examined in order to analyse the changes presented in the case studies carried out, as exemplified by Aitchison (1994). There is also a discussion of computer corpora and recourse to the production of specific corpora in order to examine language change. The results of the research carried out have implications for both teachers and language learners who need to be able to find out about modern English usage. The results go much further than this, however, and show the wider educational implications involved in the language changes studied here.

    Baltic Journal of English Language, Literature and Culture, Vol.10

    Office premises are generally in use for about 2,500 of the year's 8,760 hours. A common problem in offices is the thermal climate: it is too warm, too cold, or draughty. High temperatures, above about 26°C, contribute to fatigue and reduced concentration and make the air feel less fresh. The large variation in load between day and night can also result in the premises being over-ventilated at night and under-ventilated during the day. The aim of this thesis was to investigate and compare Ecoclime's comfort ceiling solution with various other heating and cooling systems in office premises, and to examine what advantages Ecoclime's comfort ceilings may offer in terms of comfort, cooling, ventilation, and energy use. The simulation program IDA ICE was used to simulate comfort and room temperatures for an office and a conference room in a building located in central Umeå. The simulation results indicate that Ecoclime's comfort ceilings lower the operative temperature and raise comfort, with a smaller share of dissatisfied occupants in the room than with other systems despite the same room temperature. To assess the share of dissatisfied occupants in a room, the comfort indices PMV (Predicted Mean Vote) and PPD (Predicted Percentage Dissatisfied) were used. The high passive capacity also contributes to lower energy use by ventilation fans if a VAV system with room-temperature control is used. In addition, a sensitivity analysis was carried out on the comfort ceilings, examining how the cooling capacity affects cooling times, temperature, and comfort. The sensitivity analysis shows that increasing or decreasing the cooling capacity by 10% affects the results most on a very hot day compared with a normally warm one. The difference in comfort was small, however, only 0.2 percentage points from the base case.

    D6.1: Technologies and Tools for Lexical Acquisition

    This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It includes descriptions of existing technologies and tools which can be built on and improved within PANACEA, as well as of new technologies and tools to be developed and integrated in the PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: Subcategorization frames (SCFs), Selectional Preferences (SPs), Lexical-semantic Classes (LCs), for both nouns and verbs, and Multi-Word Expressions (MWEs).

    D7.1. Criteria for evaluation of resources, technology and integration.

    This deliverable defines how evaluation is carried out at each integration cycle in the PANACEA project. As PANACEA aims at producing large-scale resources, evaluation becomes a critical and challenging issue. It is critical because it is important to assess the quality of the results that are to be delivered to users. It is challenging because the project explores rather new areas, and does so through a technical platform: some new methodologies will have to be explored, and some old ones adapted.

    Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface, but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html. In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.