    Anaphora resolution in children and adults: an eye-movement analysis

    This eye-tracking study uses the visual world paradigm to explore anaphora resolution strategies in children and adults. The pro-drop feature of Croatian was used to indicate a switch in topic, while visual cues in the accompanying pictures provided different contexts. The results suggest that both children and adults rely more on visual cues, but when these cues are not available, the two groups behave differently. Adults tend to choose the Agent as the antecedent, while children's behaviour is unclear, as they perform at chance. These results are in line with similar studies in Italian and Spanish regarding the pro-drop feature as an indicator of a change in topic, as well as with French and Greek studies regarding developmental changes in the reliance on linguistic and non-linguistic information.
The visual world paradigm is a method in which linguistic information is presented to the participant simultaneously through the visual and auditory channels. It links the direction of the participant's attention to the auditorily presented linguistic information. This study examined the strategies that adult speakers of Croatian and children use to determine the antecedent of an anaphor. The experiment consisted of two sentences and a matching picture. The first sentence introduced two referents and a joint action. The second gave additional information that matched the content of the visual cue ("Unuk je pozvao djeda da zajedno beru šljive. Na glavu je stavio kapu da se zaštiti od sunca." 'The grandson invited his grandfather to pick plums together. He put a cap on his head to protect himself from the sun.', with the cap as the visual cue). Information structure and visual context were manipulated. The change in the information structure of the sentence was based on the pro-drop feature of Croatian, i.e. the change in topicalization was signalled by the use of an overt pronoun ("On je na glavu stavio kapu..." 'He put a cap on his head...'). The visual cue appeared on one of the referents or on both, i.e. it was not available as a cue.
The eye-movement analysis showed, on the one hand, that when the visual cue is present there is no difference between the participant groups in determining the antecedent of the anaphor: in that case both adults and children are guided by the cue. On the other hand, when the visual cue is absent, adult speakers rely more on information structure, while the children's behaviour reveals no particular strategy. The pro-drop feature of Croatian did not prove significant in any experimental condition. With respect to the difference in information structure signalled by the use of a pronoun, these results can be compared with similar studies of Italian and Spanish. With respect to the differences between adults and children, the results are consistent with those from studies of Greek and French.

    Robustness in Coreference Resolution

    Coreference resolution is the task of determining different expressions of a text that refer to the same entity. The resolution of coreferring expressions is an essential step for automatic interpretation of the text. While coreference information is beneficial for various NLP tasks like summarization, question answering, and information extraction, state-of-the-art coreference resolvers are barely used in any of these tasks. The problem is the lack of robustness in coreference resolution systems. A coreference resolver that gets higher scores on the standard evaluation set does not necessarily perform better than the others on a new test set. In this thesis, we introduce robustness in coreference resolution by (1) introducing a reliable evaluation framework for recognizing robust improvements, and (2) proposing a solution that results in robust coreference resolvers. As the first step of setting up the evaluation framework, we introduce a reliable evaluation metric, called LEA, that overcomes the drawbacks of the existing metrics. We analyze LEA based on various types of errors in coreference outputs and show that it results in reliable scores. In addition to an evaluation metric, we also introduce an evaluation setting in which we disentangle coreference evaluations from parsing complexities. Coreference resolution is affected by parsing complexities for detecting the boundaries of expressions that have complex syntactic structures. We reduce the effect of parsing errors in coreference evaluation by automatically extracting a minimum span for each expression. We then emphasize the importance of out-of-domain evaluations and generalization in coreference resolution and discuss the reasons behind the poor generalization of state-of-the-art coreference resolvers. Finally, we show that enhancing state-of-the-art coreference resolvers with linguistic features is a promising approach for making coreference resolvers robust across domains. 
Incorporating linguistic features with all their values does not improve performance. However, we introduce an efficient pattern mining approach, called EPM, that mines all feature-value combinations that are discriminative for coreference relations. We then incorporate only those discriminative feature-values. By employing EPM feature-values, performance improves significantly across various domains.

    What to talk about, and how: studies on prominence and patterns of coreference

    The concept of prominence has been variously defined, and it overlaps with other ideas in both theoretical and cognitive linguistics, such as activation, emphasis, or accessibility. Moreover, prominence has an important role in the interpretation and production of language, influencing what anaphoric patterns are produced and/or seen as most likely, and what referring expressions are chosen to express coreference. This thesis presents psycholinguistic, crosslinguistic studies on prominence and coreference, grouped into two parts: one on the surface form and repercussions of prominence, the other on prominence as seen in different components of meaning. The first study, on English, surveys how prominence is expressed in cleft constructions by extracting emphasis markers and "formal" features within clefts from two corpora at different registers, exploring the patterns in which syntactic marking, graphical emphasis markers, and the variants of contraction, pronoun and complementiser work together to express prominence. The second study uses the same cleft structure in Italian, and focusses on two factors affecting prominence: information structure and sentence boundary. It then analyses the next-mention choices that writers make, and how these choices are realised in referring expressions. Moving to prominence in smaller linguistic components, the studies in the third section analyse event and entity coreference in English, French, German, Italian, and Spanish, using different referring expressions and features of the verb (aspect and causative-inchoative alternation) as proxies to manipulate the prominence of entities versus the events in which they are involved.
Finally, the fourth and last section investigates number conceptualisation in named entities in the same five languages: in coreference, speakers have to choose whether to index the entity according to its morphosyntactic or notional number, marking agreement on the pronoun accordingly. The prominence of grammatical and semantic number in the speakers' indexing of referents is shown to vary crosslinguistically and with the formality of a text, as well as with features of the entity. Overall, the results of this research show a varied interplay between prominence and patterns of coreference, with different manifestations at different levels of linguistic structure and results that can sometimes be extended crosslinguistically.