Lexical coverage in ELF
The aim of this study was to determine how much vocabulary is needed to understand English in contexts where it is spoken internationally as a lingua franca (ELF). This information is critical to inform vocabulary size targets for second language (L2) learners of English. The current research consensus, based on native-English-speaker data, is that 6,000–7,000 word families plus proper nouns are needed. However, since English has become a global lingua franca, native speakers of English have become a minority: today, there are around two billion speakers of English worldwide, of whom fewer than a quarter are native speakers. This means that non-native speakers of English are more likely to interact with other non-native speakers than with native speakers. Thus, findings based solely on native-speaker data may not provide the most accurate information for setting vocabulary size targets for L2 learners of English. Indeed, this information needs to be supplemented with data from competent non-native speakers of English who can represent a legitimate model for L2 learners of English.
This study uses the largest freely available corpus of general, spoken ELF in Europe: the one-million-word Vienna-Oxford International Corpus of English (VOICE). The word family was used as the lexical counting unit, and the lexical coverage of VOICE was calculated for various thresholds of the most frequent word families in the corpus. A comparative analysis was carried out to determine the lexical coverage of VOICE provided by frequency-ranked word lists based on data from the British National Corpus and the Corpus of Contemporary American English.
The main findings of this study indicate that fewer than 3,000–4,000 word families plus proper nouns can provide the lexical resources needed to understand English in international contexts where it is spoken as a lingua franca. This is approximately half the number of word families (i.e. 6,000–7,000 word families plus proper nouns) that scholars have claimed are needed to understand spoken English. The findings of this study represent a substantial saving in vocabulary size targets for L2 learners of English who wish to be functional in understanding English spoken as an international lingua franca.
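As an illustration of the coverage calculation described in this abstract, the following is a minimal sketch, not the study's own procedure. It assumes a tokenised transcript and a hypothetical `family_of` mapping from word forms to word-family headwords; the toy sentence and the function name `coverage_by_threshold` are invented for the example, and no VOICE data is used.

```python
from collections import Counter


def coverage_by_threshold(tokens, family_of, thresholds):
    """Return the share of running words covered by the top-N word families.

    tokens     : list of word tokens from a corpus (toy data here)
    family_of  : hypothetical dict mapping a word form to its family headword
    thresholds : iterable of N values (numbers of most frequent families)
    """
    # Map each token to its family; unlisted forms count as their own family.
    families = [family_of.get(t.lower(), t.lower()) for t in tokens]
    counts = Counter(families)
    total = sum(counts.values())
    if total == 0:
        return {n: 0.0 for n in thresholds}
    # Family frequencies sorted from most to least frequent.
    ranked = [freq for _, freq in counts.most_common()]
    return {n: sum(ranked[:n]) / total for n in thresholds}


# Toy example: 11 running words, with three forms folded into two families.
tokens = "the cat sat on the mat and the cats were sitting".split()
family_of = {"cats": "cat", "sat": "sit", "sitting": "sit"}
coverage = coverage_by_threshold(tokens, family_of, [1, 2, 3])
for n, share in coverage.items():
    print(f"top {n} families cover {share:.1%} of tokens")
```

With a real corpus, the thresholds would be values such as 1,000, 2,000 and 3,000 families, and proper nouns would need to be handled separately, as the abstract indicates.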
Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of the relationship and the key features underpinning it. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms outperforming linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.
Understanding person acquisition using an interactive activation and competition network
Face perception is one of the most developed visual skills that humans display, and recent work has attempted to examine the mechanisms involved in face perception by noting how neural networks achieve the same performance. The purpose of the present paper is to extend this approach to look not just at human face recognition, but also at human face acquisition. Experiment 1 presents empirical data to describe the acquisition over time of appropriate representations for newly encountered faces. These results are compared with those of Simulation 1, in which a modified IAC network capable of modelling the acquisition process is generated. Experiment 2 and Simulation 2 explore the mechanisms of learning further, and it is demonstrated that the acquisition of a set of associated new facts is easier than the acquisition of individual facts in isolation from one another. This is explained in terms of the advantage gained from additional inputs and mutual reinforcement of developing links within an interactive neural network system.
SuperIdentity: fusion of identity across real and cyber domains
Under both benign and malign circumstances, people now manage a spectrum of identities across both real-world and cyber domains. Our belief, however, is that all these instances ultimately track back for an individual to reflect a single ‘SuperIdentity’. This paper outlines the assumptions underpinning the SuperIdentity Project, describing the innovative use of data fusion to incorporate novel real-world and cyber cues into a rich framework appropriate for modern identity. The proposed combinatorial model will support a robust identification or authentication decision, with confidence indexed both by the level of trust in data provenance, and the diagnosticity of the identity factors being used. Additionally, the exploration of correlations between factors may underpin the more intelligent use of identity information so that known information may be used to predict previously hidden information. With modern living supporting the ‘distribution of identity’ across real and cyber domains, and with criminal elements operating in increasingly sophisticated ways in the hinterland between the two, this approach is suggested as a way forwards, and is discussed in terms of its impact on privacy, security, and the detection of threat.
Towards automated eyewitness descriptions: describing the face, body and clothing for recognition
A fusion approach to person recognition is presented here, outlining the automated recognition of targets from human descriptions of face, body and clothing. Three novel results are highlighted. First, the present work stresses the value of comparative descriptions (he is taller than…) over categorical descriptions (he is tall). Second, it stresses the primacy of the face over body and clothing cues for recognition. Third, the present work unequivocally demonstrates the benefit gained through the combination of cues: recognition from face, body and clothing taken together far outstrips recognition from any of the cues in isolation. Moreover, recognition from body and clothing taken together nearly equals the recognition possible from the face alone. These results are discussed with reference to the intelligent fusion of information within police investigations. However, they also signal a potential new era in which automated descriptions could be provided without the need for human witnesses at all.
Unfamiliar voice identification: effect of post-event information on accuracy and voice ratings
This study addressed the effect of misleading post-event information (PEI) on voice ratings, identification accuracy, and confidence, as well as the link between verbal recall and accuracy. Participants listened to a dialogue between male and female targets, then read misleading information about voice pitch. Participants engaged in verbal recall, rated voices on a feature checklist, and made a lineup decision. Accuracy rates were low, especially on target-absent lineups. Confidence and accuracy were unrelated, but the number of facts recalled about the voice predicted later lineup accuracy. There was a main effect of misinformation on ratings of target voice pitch, but there was no effect on identification accuracy or confidence ratings. As voice lineup evidence from earwitnesses is used in courts, the findings have potential applied relevance.
The modulatory effect of semantic familiarity on the audiovisual integration of face-name pairs
Who am I?: Representing the self offline and in different online contexts
The present paper examines the extent to which self-presentation may be affected by the context in which it is undertaken. Individuals were asked to complete the Twenty Statements Test both privately and publicly, but were given an opportunity to withhold any of their personal information before it was made public. Four contexts were examined: an offline context (face-to-face), an un-contextualized general online context, and two specific online contexts (dating or job-seeking). The results suggested that participants were willing to disclose substantially less personal information online than offline. Moreover, disclosure decreased as the online context became more specific, and those in the job-seeking context disclosed the least information. Surprisingly, individual differences in personality did not predict disclosure behavior. Instead, the results are set in the context of audience visibility and social norms, and implications for self-presentation in digital contexts are discussed.
May I Speak Freely?: The Difficulty in Vocal Identity Processing across Free and Scripted Speech
