3,506 research outputs found

    Eesti keele üldvaldkonna tekstide laia kattuvusega automaatne sündmusanalüüs

    Get PDF
    Seoses tekstide suuremahulise digitaliseerimisega ning digitaalse tekstiloome järjest laiema levikuga on tohutul hulgal loomuliku keele tekste muutunud ja muutumas masinloetavaks. Masinloetavus omab potentsiaali muuta tekstimassiivid inimeste jaoks lihtsamini hallatavaks, nt lubada rakendusi nagu automaatne sisukokkuvõtete tegemine ja tekstide põhjal küsimustele vastamine, ent paraku ei ulatu praegused automaatanalüüsi võimalused tekstide sisu tegeliku mõistmiseni. Oletatakse, tekstide sisu mõistvale automaatanalüüsile viib meid lähemale sündmusanalüüs – kuna paljud tekstid on narratiivse ülesehitusega, tõlgendatavad kui „sündmuste kirjeldused”, peaks tekstidest sündmuste eraldamine ja formaalsel kujul esitamine pakkuma alust mitmete „teksti mõistmist” nõudvate keeletehnoloogia rakenduste loomisel. Käesolevas väitekirjas uuritakse, kuivõrd saab eestikeelsete tekstide sündmusanalüüsi käsitleda kui avatud sündmuste hulka ja üldvaldkonna tekste hõlmavat automaatse lingvistilise analüüsi ülesannet. Probleemile lähenetakse eesti keele automaatanalüüsi kontekstis uudsest, sündmuste ajasemantikale keskenduvast perspektiivist. Töös kohandatakse eesti keelele TimeML märgendusraamistik ja luuakse raamistikule toetuv automaatne ajaväljendite tuvastaja ning ajasemantilise märgendusega (sündmusviidete, ajaväljendite ning ajaseoste märgendusega) tekstikorpus; analüüsitakse korpuse põhjal inimmärgendajate kooskõla sündmusviidete ja ajaseoste määramisel ning lõpuks uuritakse võimalusi ajasemantika-keskse sündmusanalüüsi laiendamiseks geneeriliseks sündmusanalüüsiks sündmust väljendavate keelendite samaviitelisuse lahendamise näitel. Töö pakub suuniseid tekstide ajasemantika ja sündmusstruktuuri märgenduse edasiarendamiseks tulevikus ning töös loodud keeleressurssid võimaldavad nii konkreetsete lõpp-rakenduste (nt automaatne ajaküsimustele vastamine) katsetamist kui ka automaatsete märgendustööriistade edasiarendamist.  Due to massive scale digitalisation processes and a switch from traditional means of written communication to digital written communication, vast amounts of human language texts are becoming machine-readable. Machine-readability holds a potential for easing human effort on searching and organising large text collections, allowing applications such as automatic text summarisation and question answering. However, current tools for automatic text analysis do not reach for text understanding required for making these applications generic. It is hypothesised that automatic analysis of events in texts leads us closer to the goal, as many texts can be interpreted as stories/narratives that are decomposable into events. This thesis explores event analysis as broad-coverage and general domain automatic language analysis problem in Estonian, and provides an investigation starting from time-oriented event analysis and tending towards generic event analysis. We adapt TimeML framework to Estonian, and create an automatic temporal expression tagger and a news corpus manually annotated for temporal semantics (event mentions, temporal expressions, and temporal relations) for the language; we analyse consistency of human annotation of event mentions and temporal relations, and, finally, provide a preliminary study on event coreference resolution in Estonian news. The current work also makes suggestions on how future research can improve Estonian event and temporal semantic annotation, and the language resources developed in this work will allow future experimentation with end-user applications (such as automatic answering of temporal questions) as well as provide a basis for developing automatic semantic analysis tools

    Learning Sentence-internal Temporal Relations

    Get PDF
    In this paper we propose a data intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects

    How Free is Free Indirect Discourse? Empirical Approaches to the Anchoring Mechanisms of Perspective-taking

    Get PDF
    This dissertation presents a discussion and empirical investigation of the anchoring mechanisms of free indirect discourse. Its main focus is on the claim that a discourse referent must be sufficiently activated in a linguistic context in order to serve as the anchor for a sentence in free indirect discourse mode. This issue becomes particularly pressing whenever more than one discourse referent is available as the perspectival center. I want to argue that whenever several referents compete, the referent with the highest activation is preferred as the perspectival center, while a sentence in FID mode anchored to a less activated referent sounds rather unnatural. To approach this claim, I provide a number of examples that illustrate that the anchoring of free indirect discourse is related to linguistic activation. The observations indicate that: (i) referents in subject position are preferred as anchors over referents in object position, (ii) referents that are introduced with a proper name are preferred as anchors over referents that are introduced with an indefinite noun phrase, (iii) referents that are activated in a larger context are preferred over referents that are activated in the sentence preceding the free indirect discourse, and (iv) referents that are assigned particular verbal features are preferred over competing referents. In order to account for these observations, I present the results of a series of psycholinguistic experiments that indicate an effect of grammatical function, referential expression, global activation, and verbal features assigned to the referents by the verb in the preceding context on the anchoring of free indirect discourse. Ultimately, the findings presented in this thesis indicate that the anchoring of free indirect discourse is not arbitrary but determined by referential activation

    Chapter From the Lab to the Real World: Affect Recognition Using Multiple Cues and Modalities

    Get PDF
    Interdisciplinary concept of dissipative soliton is unfolded in connection with ultrafast fibre lasers. The different mode-locking techniques as well as experimental realizations of dissipative soliton fibre lasers are surveyed briefly with an emphasis on their energy scalability. Basic topics of the dissipative soliton theory are elucidated in connection with concepts of energy scalability and stability. It is shown that the parametric space of dissipative soliton has reduced dimension and comparatively simple structure that simplifies the analysis and optimization of ultrafast fibre lasers. The main destabilization scenarios are described and the limits of energy scalability are connected with impact of optical turbulence and stimulated Raman scattering. The fast and slow dynamics of vector dissipative solitons are exposed

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

    Beat gestures influence which speech sounds you hear

    Get PDF
    Beat gestures - spontaneously produced biphasic movements of the hand - are among the most frequently encountered co-speech gestures in human communication. They are closely temporally aligned to the prosodic characteristics of the speech signal, typically occurring on lexically stressed syllables. Despite their prevalence across speakers of the world's languages, how beat gestures impact spoken word recognition is unclear. Can these simple 'flicks of the hand' influence speech perception? Across six experiments, we demonstrate that beat gestures influence the explicit and implicit perception of lexical stress (e.g., distinguishing OBject from obJECT), and in turn, can influence what vowels listeners hear. Thus, we provide converging evidence for a manual McGurk effect: even the simplest 'flicks of the hands' influence which speech sounds we hear

    Doing time : inducing temporal graphs

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.Includes bibliographical references (leaves 51-53).We consider the problem of constructing a directed acyclic graph that encodes temporal relations found in a text. The unit of our analysis is a temporal segment, a fragment of text that maintains temporal coherence. The strength of our approach lies in its ability to simultaneously optimize pairwise ordering preferences and global constraints on the graph topology. Our learning method achieves 83% F-measure in temporal segmentation and 84% accuracy in inferring temporal relations between two segments.by Philip James Bramsen.M.Eng
    corecore