61,017 research outputs found

    Extracting protein-protein interactions from text using rich feature vectors and feature selection

    Because of the intrinsic complexity of natural language, automatically extracting accurate information from text remains a challenge. We have applied rich feature vectors derived from dependency graphs to predict protein-protein interactions using machine learning techniques. We present the first extensive analysis of applying feature selection in this domain, and show that it can produce more cost-effective models. For the first time, our technique was also evaluated on several large-scale cross-dataset experiments, which offers a more realistic view on model performance. During benchmarking, we encountered several fundamental problems hindering comparability with other methods. We present a set of practical guidelines to set up a meaningful evaluation. Finally, we have analysed the feature sets from our experiments before and after feature selection, and evaluated the contribution of both lexical and syntactic information to our method. The gained insight will be useful to develop better performing methods in this domain.

    Can monolinguals be like bilinguals? Evidence from dialect switching

    Bilinguals rely on cognitive control mechanisms like selective activation and inhibition of lexical entries to prevent intrusions from the non-target language. We present cross-linguistic evidence that these mechanisms also operate in bidialectals. Thirty-two native German speakers who sometimes use the Öcher Platt dialect, and thirty-two native English speakers who sometimes use the Dundonian Scots dialect completed a dialect-switching task. Naming latencies were higher for switch than for non-switch trials, and lower for cognate compared to non-cognate nouns. Switch costs were symmetrical, regardless of whether participants actively used the dialect or not. In contrast, sixteen monodialectal English speakers, who performed the dialect-switching task after being trained on the Dundonian words, showed asymmetrical switch costs with longer latencies when switching back into Standard English. These results are reminiscent of findings for balanced vs. unbalanced bilinguals, and suggest that monolingual dialect speakers can recruit control mechanisms in similar ways as bilinguals.

    Where is the locus of difficulty in recognizing foreign-accented words? Neighborhood density and phonotactic probability effects on the recognition of foreign-accented words by native English listeners

    This series of experiments (1) examined whether native listeners experience recognition difficulty in all kinds of foreign-accented words or only in a subset of words with certain lexical and sub-lexical characteristics: neighborhood density and phonotactic probability; (2) identified the locus of foreign-accented word recognition difficulty; and (3) investigated how accent-induced mismatches impact the lexical retrieval process. Experiments 1 and 4 examined the recognition of native-produced and foreign-accented words varying in neighborhood density with auditory lexical decision and perceptual identification tasks respectively, which emphasize the lexical level of processing. Findings from Experiment 1 revealed increased accent-induced processing cost in reaction times, especially for words with many similar sounding words, implying that native listeners increase their reliance on top-down lexical knowledge during foreign-accented word recognition. Analysis of perception errors from Experiment 4 found the misperceptions in the foreign-accented condition to be more similar to the target words than those in the native-produced condition. This suggests that accent-induced mismatches tend to activate similar sounding words as alternative word candidates, which possibly pose increased lexical competition for the target word and result in greater processing costs for foreign-accented word recognition at the lexical level. Experiments 2 and 3 examined the sub-lexical processing of the foreign-accented words varying in neighborhood density and phonotactic probability respectively with a same-different matching task, which emphasizes the sub-lexical level of processing. Findings from both experiments revealed no extra processing costs, in either reaction times or accuracy rates, for the foreign-accented stimuli, implying that the sub-lexical processing of the foreign-accented words is as good as that of the native-produced words. Taken together, the overall recognition difficulty of foreign-accented stimuli, as well as the differentially increased processing difficulty for accented dense words (observed in Experiment 1), mainly stems from the lexical level, due to the increased lexical competition posed by the similar sounding word candidates.

    Communicating with Cost-based Implicature: a Game-Theoretic Approach to Ambiguity

    A game-theoretic approach to linguistic communication predicts that speakers can meaningfully use ambiguous forms in a discourse context in which only one of several available referents has a costly unambiguous form and in which rational interlocutors share knowledge of production costs. If a speaker produces a low-cost ambiguous form to avoid using the high-cost unambiguous form, a rational listener will infer that the high-cost entity was the intended entity, or else the speaker would not have risked ambiguity. We report data from two studies in which pairs of speakers show alignment of their use of ambiguous forms based on this kind of shared knowledge. These results extend the analysis of cost-based pragmatic inferencing beyond that previously associated only with fixed lexical hosts.