74 research outputs found

    An autoencoder-based neural network model for selectional preference: evidence from pseudo-disambiguation and cloze tasks

    Intuitively, some predicates have a better fit with certain arguments than others. Usage-based models of language emphasize the importance of semantic similarity in shaping the structure of constructions (form and meaning). In this study, we focus on modeling the semantics of transitive constructions in Finnish and present an autoencoder-based neural network model trained on semantic vectors based on Word2vec. This model builds on the distributional hypothesis, according to which semantic information is primarily shaped by contextual information. Specifically, we focus on the realization of the object. The performance of the model is evaluated in two tasks: a pseudo-disambiguation task and a cloze task. Additionally, we contrast the performance of the autoencoder with a previously implemented neural model. In general, the results show that our model performs very well on these tasks in comparison to the other models. The results are discussed in terms of usage-based construction grammar.
    Summary (translated from Estonian). Aki-Juhani Kyröläinen, M. Juhani Luotolahti and Filip Ginter: An autoencoder-based neural network model for selectional preference. Intuitively, some arguments seem to fit certain predicates better than others. Usage-based models of language emphasize the importance of semantic similarity in shaping the structure of constructions (both form and meaning). In this study we model the semantics of Finnish transitive constructions and present a neural network model, an autoencoder. The model builds on the hypothesis of distributional semantics, according to which semantic information is shaped primarily by context. More specifically, the study focuses on the object. We evaluate the model with both a pseudo-disambiguation task and a cloze task. We compare the results of the autoencoder with previously developed neural network models and show that our model performs very well compared with the other models. We present the results in the context of usage-based construction grammar.
    Keywords: neural network; autoencoder; semantic vector; usage-based model; Finnish
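    The pseudo-disambiguation task named in the abstract can be illustrated with a minimal sketch. It assumes the common setup in which each attested verb-object pair is matched with a confounder object and the model is counted as correct when it scores the attested object higher. The toy vectors, the cosine scoring function and all names below are assumptions made for illustration; they stand in for the Word2vec vectors and the autoencoder, whose details the abstract does not give.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pseudo_disambiguation_accuracy(triples, score):
    """Fraction of (verb, attested, confounder) triples in which the
    attested object receives a strictly higher score than the confounder."""
    if not triples:
        raise ValueError("need at least one triple")
    correct = sum(
        1
        for verb, attested, confounder in triples
        if score(verb, attested) > score(verb, confounder)
    )
    return correct / len(triples)

# Invented 2-d vectors (assumption): placeholders for real semantic vectors.
TOY_VECTORS = {
    "eat": (1.0, 0.1),
    "read": (0.1, 1.0),
    "bread": (0.9, 0.2),
    "book": (0.2, 0.9),
}

def toy_score(verb, obj):
    """Stand-in plausibility score (assumption): plain cosine similarity,
    not the autoencoder described in the abstract."""
    return cosine(TOY_VECTORS[verb], TOY_VECTORS[obj])

TRIPLES = [("eat", "bread", "book"), ("read", "book", "bread")]
accuracy = pseudo_disambiguation_accuracy(TRIPLES, toy_score)
print(accuracy)
```

    Any scoring function with the same two-argument signature can be passed in place of `toy_score`, which keeps the evaluation loop independent of the model being tested.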

    Dependency profiles as a tool for big data analysis of linguistic constructions: a case study of emoticons

    This study presents a methodological toolbox for big data analysis of linguistic constructions by introducing dependency profiles, i.e., co-occurrences of linguistic elements with syntax information. These were operationalized by reconstructing sentences as delexicalized syntactic biarcs, subtrees of dependency analyses. As a case study, we utilize these dependency profiles to explore usage patterns associated with emoticons, the graphic representations of facial expressions. These are said to be characteristic of Computer-Mediated Communication, but are typically studied only in restricted corpora. To analyze the 3.7-billion-token Finnish Internet Parsebank we use as data, we apply clustering and support vector machines. The results show that emoticons are associated with three typical usage patterns: stream of the writer’s consciousness, narrative constructions, and elements guiding the interaction and expressing the writer’s reactions by means of interjections and discourse particles. Additionally, the more frequent emoticons, such as :), are used differently than the less frequent ones, such as ^_^.
    Summary (translated from Estonian). Veronika Laippala, Aki-Juhani Kyröläinen, Jenna Kanerva, Juhani Luotolahti and Filip Ginter: Dependency profiles as a tool for big data analysis of linguistic constructions: a study of emoticons. In this study we present a methodological “toolkit” for analyzing linguistic constructions on the basis of big data, applying dependency profiles. A dependency profile is a representation of the co-occurrence of linguistic elements that incorporates syntactic information. To this end, sentences are constructed as subtrees of dependency analyses in which syntactic information is represented by means of (bi)arcs between words. In the article we apply dependency profiles to identify the usage patterns of emoticons. Graphic representations of facial expressions are characteristic of computer-mediated communication, which is usually studied on the basis of a restricted corpus, whereas we apply clustering and support vector machines to the 3.7-billion-word Finnish Internet Parsebank. It turns out that the use of emoticons is associated with three main usage patterns: the writer’s stream of consciousness, narrative constructions, and interjections and discourse particles that guide the interaction and express the writer’s reactions. In addition, frequent emoticons such as :) have more varied uses than rare emoticons such as ^_^.
    Keywords: dependency profiles; usage-based syntax; computer-mediated communication; emoticons; web corpus; Finnish
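    As a rough illustration of what a delexicalized biarc might look like, the sketch below enumerates every two-arc subtree of a small dependency tree and replaces the words with their part-of-speech tags. The abstract does not specify the exact representation, so the token format, the tag and relation labels, and the names `delexicalized_biarcs` and `EXAMPLE` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def delexicalized_biarcs(tokens):
    """Enumerate two-arc subtrees of a dependency tree.

    tokens: list of (index, pos, head_index, deprel); head_index 0 is the root.
    Words are not used at all, so the output is delexicalized by construction.
    """
    pos = {index: tag for index, tag, _, _ in tokens}
    children = {}
    for index, _, head, rel in tokens:
        if head != 0:
            children.setdefault(head, []).append((index, rel))

    biarcs = []
    for head, kids in children.items():
        # Shape 1: one head with two dependents.
        for (c1, r1), (c2, r2) in combinations(kids, 2):
            biarcs.append(("siblings", pos[head], (r1, pos[c1]), (r2, pos[c2])))
        # Shape 2: head -> dependent -> dependent of the dependent.
        for child, rel in kids:
            for grandchild, grel in children.get(child, []):
                biarcs.append(
                    ("chain", pos[head], (rel, pos[child]), (grel, pos[grandchild]))
                )
    return biarcs

# Hypothetical parse of "I love your hat :)" (labels are assumptions).
EXAMPLE = [
    (1, "PRON", 2, "nsubj"),
    (2, "VERB", 0, "root"),
    (3, "PRON", 4, "nmod:poss"),
    (4, "NOUN", 2, "obj"),
    (5, "SYM", 2, "discourse"),
]

biarcs = delexicalized_biarcs(EXAMPLE)
profile = Counter(biarcs)  # counts of each biarc type: a toy "dependency profile"
for biarc, count in profile.items():
    print(count, biarc)
```

    Counting such patterns over the sentences that contain a given emoticon would give one frequency vector per emoticon, which is the kind of input that clustering or a support vector machine could then work with.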

    Proceedings of the First Conference on Machine Translation (WMT)


    Proceedings of the Third Workshop on Discourse in Machine Translation


    Exercise training improves biventricular oxidative metabolism and left ventricular efficiency in patients with dilated cardiomyopathy

    Objectives: The aim of this study was to determine the effect of exercise training on myocardial oxidative metabolism and efficiency in patients with idiopathic dilated cardiomyopathy (DCM) and mild heart failure (HF). Background: Exercise training is known to improve exercise tolerance and quality of life in patients with chronic HF. However, little is known about how exercise training may influence myocardial energetics. Methods: Twenty clinically stable patients with DCM (New York Heart Association classes I through III) were prospectively separated into a training group (five-month training program; n = 9) and a non-trained control group (n = 11). Oxidative metabolism in both the right and left ventricles (RV and LV) was measured using [11C]acetate and positron emission tomography. Myocardial work power was measured using echocardiography. Myocardial efficiency for forward work was calculated as myocardial work power per mass divided by LV oxidative metabolism. Results: Significant improvements were noted in exercise capacity (VO2) and ejection fraction in the training group, whereas no changes were observed in the non-trained group. Exercise training reduced both RV and LV oxidative metabolism and elicited a significant increase in LV forward work efficiency, although no significant changes were observed in the non-trained group. Conclusions: Exercise training improves exercise tolerance and LV function. This is accompanied by a decrease in biventricular oxidative metabolism and enhanced forward work efficiency. Therefore, exercise training elicits an energetically favorable improvement in myocardial function and exercise tolerance in patients with DCM.

    Circulating N-terminal brain natriuretic peptide and cardiac function in response to acute systemic hypoxia in healthy humans

    Background: As it remains unclear whether hypoxia of cardiomyocytes could trigger the release of brain natriuretic peptide (BNP) in humans, we investigated whether breathing a normobaric hypoxic gas mixture increases circulating NT-proBNP in healthy male subjects. Methods: Ten healthy young men (age 29 ± 5 yrs, BMI 24.7 ± 2.8 kg/m²) breathed a normobaric hypoxic gas mixture (11% O2/89% N2) for one hour. Venous blood samples were obtained immediately before, during, and 2 and 24 hours after hypoxic exposure. Cardiac function and the flow velocity profile in the middle left anterior descending coronary artery (LAD) were measured by Doppler echocardiography. Results: Arterial oxygen saturation decreased steadily from a baseline value of 99 ± 1% after the initiation of the hypoxia challenge and reached a steady-state level of 73 ± 6% within 20-30 minutes. Cardiac output increased from 6.0 ± 1.2 to 8.1 ± 1.6 L/min and ejection fraction from 67 ± 4% to 75 ± 6% (both p < 0.001). Peak diastolic flow velocity in the LAD increased from 0.16 ± 0.04 to 0.28 ± 0.07 m/s, while its diameter remained unchanged. In the whole study group, NT-proBNP was similar to baseline (60 ± 32 pmol/ml) at all time points. However, at 24 h, the concentration of NT-proBNP was higher (34 ± 18%) in five subjects and lower (17 ± 17%, p = 0.002 between the groups) in f

    Proceedings of the 12th Web as Corpus Workshop

    The web presents unprecedented opportunities for large-scale collection of text in many languages. However, two critical steps in the development of web corpora remain challenging: the identification of clean text from source HTML and the assignment of genre or register information to the documents. In this paper, we evaluate a multilingual approach to this end. Our starting points are the Swedish and French Common Crawl datasets gathered for the 2017 CoNLL shared task, particularly the URLs. We 1) fetch HTML pages based on the URLs and run boilerplate removal, 2) train a classifier to further clean out undesired text fragments, and 3) annotate text registers. We compare boilerplate removal against the CoNLL texts, and find an improvement. For the further cleaning of undesired material, the best results are achieved using Multilingual BERT with monolingual fine-tuning. However, our results are promising also in a cross-lingual setting, without fine-tuning on the target language. Finally, the register annotations show that most of the documents belong to a relatively small set of registers, which are relatively similar in the two languages. A number of additional flags in the annotation are, however, necessary to reflect the wide range of linguistic variation associated with the documents.