    Towards an integrated representation of multiple layers of linguistic annotation in multilingual corpora

    In the proposed talk we discuss the application of a set of computational text analysis techniques to the analysis of the linguistic features of translations. The goal of this analysis is to test two hypotheses about the specific properties of translations: Baker's hypothesis of normalization (Baker, 1995) and Toury's law of interference (Toury, 1995). The corpus we analyze consists of English and German original texts and translations of those texts into German and English, respectively. The analysis task is complex in a number of respects. First, a multi-level analysis (clauses, phrases, words) has to be carried out; second, among the linguistic features selected for analysis are some rather abstract ones, ranging from functional-grammatical features, e.g., Subject, Adverbial of Time, etc., to semantic features, e.g., semantic roles such as Agent, Goal, Locative, etc.; third, monolingual and contrastive analyses are involved. This places certain requirements on the computational techniques to be employed regarding corpus encoding, linguistic annotation, and information extraction. We show how a combination of commonly available techniques can fulfill these requirements to a large degree and point out their limitations for application to the research questions raised. These techniques range from document encoding (TEI, XML) through automatic corpus annotation (notably part-of-speech tagging; Brants, 2000) and semi-automatic annotation (O'Donnell, 1995) to query systems as implemented in, e.g., the IMS Corpus Workbench (Christ, 1994), the MATE system (Mengel & Lezius, 2000) and the Gsearch system (Keller et al., 1999). Hosted by the Scholarly Text and Imaging Service (SETIS), the University of Sydney Library, and the Research Institute for Humanities and Social Sciences (RIHSS), the University of Sydney.

    Computer assisted audiometric evaluation system

    A computer-based audiometric evaluation system has been developed. The system makes use of an IBM PC/XT/AT compatible personal computer to perform pure tone and speech tests and comprises a plug-in card and custom software. The card contains pure tone and masking noise generators, together with amplifiers for a set of headphones and bone conduction transducer, patient and audiologist microphone amplifiers and a hand-held infra-red remote-control unit. A voice-operated gain-adjusting device on the audiologist's microphone eliminates the need for a sound pressure level meter during speech tests. The software-based user interface makes use of overlaid pop-up menus, context-sensitive assistance and a text editor on a graphics screen. Pure tone and speech data are acquired and displayed on a dynamic audiogram and speech discrimination gram respectively. This data may be stored and later retrieved from a patient database. Further audiometric tests may be incorporated at a later stage.

    Spartan Daily, February 8, 1978

    Volume 70, Issue 6 https://scholarworks.sjsu.edu/spartandaily/6297/thumbnail.jp

    Cascading Cache Layer in Content Management System

    Caching involves the temporary storage of data in a separate folder. Cascading is the arrangement of items in sequence from top to bottom. A cascading cache layer in a content management system places data in layers, sequenced in order of importance, and cached data are also removed in order of importance. Because caching is largely concerned with the input and output of content and data, a cascading management system is needed to make data easier to access. This work examines caching and how it works, and considers the various levels of caching in content management systems. It explains what cascading is in a content management system and why it matters, and how a cascading cache arranged in layers would make data access faster and more efficient.
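
    The layering and importance-ordered removal described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the class names `Layer` and `CascadingCache`, the per-layer capacities, and the numeric `importance` score are all hypothetical, and the demotion rule (the least important entry of a full layer moves one layer down and is dropped once it falls off the bottom) is only one plausible reading of "cascading".

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One cache layer (hypothetical): holds at most `capacity` entries."""
    capacity: int
    items: dict = field(default_factory=dict)  # key -> (importance, value)


class CascadingCache:
    """Illustrative cascading cache; layers are ordered from top to bottom."""

    def __init__(self, capacities):
        self.layers = [Layer(c) for c in capacities]

    def put(self, key, value, importance):
        self.remove(key)  # drop any stale copy held in a lower layer
        self._insert(0, key, importance, value)

    def _insert(self, level, key, importance, value):
        if level >= len(self.layers):
            return  # fell off the bottom layer: the entry is evicted
        layer = self.layers[level]
        layer.items[key] = (importance, value)
        if len(layer.items) > layer.capacity:
            # Demote the least important entry to the next layer down.
            victim = min(layer.items, key=lambda k: layer.items[k][0])
            v_importance, v_value = layer.items.pop(victim)
            self._insert(level + 1, victim, v_importance, v_value)

    def get(self, key):
        """Return (value, layer_index), or (None, None) on a miss."""
        for level, layer in enumerate(self.layers):
            if key in layer.items:
                return layer.items[key][1], level
        return None, None

    def remove(self, key):
        for layer in self.layers:
            layer.items.pop(key, None)
```

    With two layers of capacity one each, for example, inserting a low-importance page after a high-importance one leaves the important page in the top layer and pushes the other down a level. A real content management system would likely also weigh recency and size, which this sketch ignores.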

    TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies

    This paper presents extended work on the trilingual spoken language translation corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC), namely TriECCC. TriECCC is a simultaneous spoken language translation corpus with parallel resources of speech and text in three languages: Khmer, English, and French. This corpus has approximately [Formula: see text] thousand utterances, approximately [Formula: see text], [Formula: see text], and [Formula: see text] h in length of speech, and [Formula: see text], [Formula: see text] and [Formula: see text] million words in text, in Khmer, English, and French, respectively. We first report the baseline results of machine translation (MT) and speech translation (ST) systems, which show reasonable performance. We then investigate the use of the ROVER method to combine multiple MT outputs and fine-tune the pre-trained English–French MT models to enhance the Khmer MT systems. Experimental results show that ROVER is effective for combining English-to-Khmer and French-to-Khmer systems. Fine-tuning from both single and multiple parents yields effective improvements in BLEU scores for Khmer-to-English/French and English/French-to-Khmer MT systems.

    Capacity Development and Knowledge Management Toolkit—Africa RISING in the Ethiopian Highlands

    United States Agency for International Development

    Taylorism, targets and the pursuit of quantity and quality by call centre management

    The paper locates the rise of the call centre within the context of the development of Taylorist methods and technological change in office work in general. Managerial utilisation of targets to impose and measure employees' quantitative and qualitative performance is analysed in four case-study organisations. The paper concludes that call centre work reflects a paradigmatic re-configuration of customer servicing operations, and that the continuing application of Taylorist methods appears likely.

    Differences in the efficiency of lexical access in bilingual speakers of Croatian and English

    Studies to date have shown that both bilingual adults and bilingual children score lower than their monolingual peers on standardized productive and receptive vocabulary measures. Since there is no significant difference in the size of the conceptual vocabulary between these groups, such results imply more effortful lexical access in bilingual speakers. Several models have been constructed in attempts to explain these discrepancies, focusing on spreading activation and selection mechanisms in lexical access. There is, however, no conclusive evidence supporting any one of the models, and there are no studies which test for differences in performance between simultaneous and sequential bilingual speakers. We therefore tested the efficiency of lexical access in Croatian-English simultaneous and sequential bilinguals, using a picture vocabulary test, in order to determine whether there is a significant difference in receptive vocabulary performance between these two groups and in comparison to their monolingual peers.
Two groups of bilingual speakers took part in the testing, ten simultaneous and ten sequential, along with two control groups of ten native monolingual speakers each of Croatian and English. Each bilingual group was given the test in both Croatian and English. The results showed that the monolingual control groups scored about equally (91.4% for Croatian and 92.1% for English) and outperformed both bilingual groups. The lowest score was achieved by the sequential bilinguals on the test in their non-dominant language, that is, the one they acquired as a second language (L2). Their results on the test in the language they consider their first (L1) are on a par with those of the simultaneous bilinguals. The results also show a significant negative correlation, among sequential bilinguals, between test performance and the age at which they began acquiring the second language. It can be concluded that bilingual speakers do indeed score somewhat lower than monolingual speakers on receptive vocabulary tests, that is, on measures of lexical access, while at the same time sequential bilinguals perform better in their first language than even simultaneous bilinguals. With this in mind, we consider the frequency-based model (Gollan/Acenas 2004; Gollan/Montoya/Werner 2002) the most suitable, because it attributes such differences in performance to the reduced frequency of use of all words in the bilingual lexicon, and reduced frequency of use leads to reduced functional frequency, that is, slower lexical access.
As for sequential bilinguals, it appears that, besides functional frequency, performance on measures of lexical access is also influenced by the age at which second-language acquisition began and by the length of time each language has been in use, which means that research on lexical access in bilingual speakers should take into account the difference between simultaneous and sequential bilinguals.