    Collaborative tagging as a knowledge organisation and resource discovery tool

    Purpose - The purpose of the paper is to provide an overview of the collaborative tagging phenomenon and to explore some of the reasons for its emergence. Design/methodology/approach - The paper reviews the related literature and discusses some of the problems associated with, and the potential of, collaborative tagging approaches to knowledge organisation and general resource discovery. A definition of controlled vocabularies is proposed and used to assess the efficacy of collaborative tagging. An exposition of the collaborative tagging model is provided and a review of the major contributions to the tagging literature is presented. Findings - There are numerous difficulties with collaborative tagging systems (e.g. low precision, lack of collocation) that originate from the absence of the properties that characterise controlled vocabularies. However, such systems cannot be dismissed. Librarians and information professionals have lessons to learn from the interactive and social aspects exemplified by collaborative tagging systems, as well as from their success in engaging users with information management. The future co-existence of controlled vocabularies and collaborative tagging is predicted, with each appropriate for use within distinct information contexts: formal and informal. Research limitations/implications - Librarians and information professionals should play a leading role in research that assesses the efficacy of collaborative tagging for information storage, organisation and retrieval, and should seek to influence the future development of collaborative tagging systems. Practical implications - The paper indicates clear areas in which digital libraries and repositories could innovate in order to engage users with information more effectively. Originality/value - At the time of writing there were no literature reviews summarising the main contributions to collaborative tagging research and debate.

    From corpus-based collocation frequencies to readability measure

    This paper provides a broad overview of three separate but related areas of research. Firstly, corpus linguistics is a growing discipline that applies analytical results from large language corpora to a wide variety of problems in linguistics and related disciplines. Secondly, readability research, as the name suggests, seeks to understand what makes texts more or less comprehensible to readers, and aims to apply this understanding to issues such as text rating and the matching of texts to readers. Thirdly, collocation is a language feature that occurs when particular words are used together frequently for other than purely grammatical reasons. The intersection of these three areas provides the basis for ongoing research within the Department of Computer and Information Sciences at the University of Strathclyde and is the motivation for this overview. Specifically, we aim, through analysis of collocation frequencies in major corpora, to gain valuable insight into the content of texts, which we believe will, in turn, provide a novel basis for estimating text readability.
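
    The abstract sketches a research direction rather than a finished formula, but the underlying idea can be illustrated: score a text by how many of its adjacent word pairs are frequent collocations in a reference corpus, on the assumption that texts built from familiar collocations read more easily. A minimal Python sketch follows; the threshold and the scoring rule are purely illustrative assumptions, not values from the paper.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

def familiarity_score(text_tokens, reference_bigram_counts, threshold=5):
    """Illustrative readability proxy: the proportion of a text's adjacent word
    pairs that are frequent collocations in a reference corpus. The threshold
    of 5 occurrences is an arbitrary assumption, not a published value."""
    pairs = list(pairwise(text_tokens))
    if not pairs:
        return 0.0
    familiar = sum(1 for pair in pairs if reference_bigram_counts[pair] >= threshold)
    return familiar / len(pairs)

# Toy reference counts; real work would draw on a major corpus such as the BNC.
reference = Counter(pairwise(
    "the results of the study show that the results were clear".split()))

print(familiarity_score("the results of the study".split(), reference, threshold=1))
# 1.0 -- every adjacent pair in the sample also occurs in the toy reference corpus
```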

    Classification of grammatical collocation errors in texts by learners of Spanish

    Arbitrary recurrent word combinations (collocations) are key to language learning, yet even advanced students have difficulty using them, so efficient collocation-aiding tools would be of great help. Existing “collocation checkers”, however, still struggle to offer corrections for miscollocations: they attempt to correct without distinguishing between the different types of error and, as a consequence, provide heterogeneous lists of collocations as suggestions. They also focus solely on lexical errors, leaving aside grammatical ones; the former attract more attention, but the latter cannot be ignored if the goal is a comprehensive collocation-aiding tool able to correct all kinds of miscollocations. We propose an approach to automatically classifying the grammatical collocation errors made by US learners of Spanish, as a starting point for the design of correction strategies targeted at each type of error. This work has been funded by the Spanish Ministry of Science and Competitiveness (MINECO) through predoctoral grant BES-2012-057036, in the framework of the HARenES project, under contract number FFI2011-30219-C02-02.
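
    The abstract does not describe the error taxonomy or the classification method itself, so the following Python sketch is only a hypothetical illustration of the task: map a learner's miscollocation to an error category, here with invented categories and hand-written rules standing in for whatever the paper actually proposes.

```python
# Hypothetical sketch: the categories and rules below are invented for illustration;
# they are not the taxonomy or classifier used in the paper.

GRAMMATICAL_ERROR_RULES = {
    # collocation base -> (expected pattern, category assigned when it is violated)
    "depender": ("depender de", "wrong_preposition"),
    "soñar":    ("soñar con",   "wrong_preposition"),
    "jugar":    ("jugar a",     "missing_preposition"),
}

def classify_miscollocation(learner_phrase: str) -> str:
    """Assign an (invented) grammatical error category to a learner's collocation."""
    for base, (expected, category) in GRAMMATICAL_ERROR_RULES.items():
        if learner_phrase.startswith(base) and not learner_phrase.startswith(expected):
            return category
    return "unclassified"

print(classify_miscollocation("depender en los padres"))  # wrong_preposition
print(classify_miscollocation("jugar el fútbol"))         # missing_preposition
```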

    ColloCaid: A real-time tool to help academic writers with English collocations

    Writing is a cognitively challenging activity that can benefit from lexicographic support, and academic writing in English presents a particular challenge, given the extent to which English is used for this purpose. The ColloCaid tool, currently under development, responds to this challenge. It is intended to assist writers of academic English by providing collocation suggestions, and by alerting writers to unconventional collocational choices, as they write. The underlying collocational data are based on a carefully curated set of about 500 collocational bases (nouns, verbs, and adjectives) characteristic of academic English, together with their collocates and illustrative examples. These data have been derived from state-of-the-art corpora of academic English and from academic vocabulary lists. The manual curation by expert lexicographers and the reliance on specifically academic-English textual resources are what distinguish ColloCaid from existing collocation resources. A further characteristic of ColloCaid is its strong emphasis on usability: the tool draws on dictionary-user research, findings in information visualization, and usability testing specific to ColloCaid to determine the optimal number of collocation prompts and the best way to present them to the user.
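
    ColloCaid's actual data model and editor integration are not described in the abstract, so the Python sketch below is only an assumption-labelled illustration of the general idea: a curated mapping from collocation bases to collocates and examples, queried as the writer types a base word.

```python
# Illustrative sketch only: the structure and entries below are assumptions,
# not ColloCaid's real dataset or API.

COLLOCATION_DATA = {
    # base word -> {grammatical relation: [(collocate, example), ...]}
    "argument": {
        "verb + argument": [("put forward", "The author puts forward a compelling argument.")],
        "adjective + argument": [("compelling", "a compelling argument")],
    },
    "research": {
        "verb + research": [("conduct", "We conducted research on reader behaviour.")],
    },
}

def suggest_collocates(base, relation=None):
    """Return (collocate, example) suggestions for a base word as it is typed."""
    entry = COLLOCATION_DATA.get(base.lower(), {})
    if relation is not None:
        return entry.get(relation, [])
    return [item for items in entry.values() for item in items]

for collocate, example in suggest_collocates("argument"):
    print(f"{collocate}: {example}")
```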

    Candidate knowledge? Exploring epistemic claims in scientific writing: a corpus-driven approach

    In this article I argue that the study of the linguistic aspects of epistemology has become unhelpfully focused on the corpus-based study of hedging, and that a corpus-driven approach can help to improve upon this. Focusing on a corpus of texts from one discourse community (that of genetics) and identifying frequent tri-lexical clusters containing highly frequent lexical items identified as keywords, I undertake an inductive analysis to identify patterns of epistemic significance. Several of these patterns are shown to be hedging devices, and the whole-corpus frequencies of the most salient of them, candidate and putative, are then compared with the whole-corpus frequencies of comparable wordforms and clusters of epistemic significance. Finally, I interviewed a ‘friendly geneticist’ in order to check my interpretation of some of the terms used and to obtain an expert reading of the overall findings. In summary, I argue that the highly unexpected patterns of hedging found in genetics demonstrate the value of adopting a corpus-driven approach and constitute an advance in our current understanding of how to approach the relationship between language and epistemology.
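
    The corpus-driven step described here, identifying frequent tri-lexical clusters that contain keywords, is straightforward to sketch. A minimal Python illustration with toy data follows; the study itself worked over a corpus of genetics research writing.

```python
from collections import Counter

def frequent_trigrams_with_keywords(tokens, keywords, min_count=2):
    """Count tri-lexical clusters (trigrams) and keep the frequent ones that
    contain at least one of the given keywords."""
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    keywords = set(keywords)
    return {tri: n for tri, n in trigrams.items()
            if n >= min_count and keywords & set(tri)}

# Toy token stream; "candidate" and "putative" are keywords discussed in the article.
tokens = ("a candidate gene for a candidate gene for "
          "the putative role of the putative role of").split()
print(frequent_trigrams_with_keywords(tokens, ["candidate", "putative"]))
```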

    To What Extent is Collocation Knowledge Associated with Oral Proficiency? A Corpus-Based Approach to Word Association

    This study examined the relationship between second language (L2) learners’ collocation knowledge and their oral proficiency. A new approach to measuring collocation was adopted: responses were elicited through a word association task, and corpus-based measures (absolute frequency count, t-score, MI score) were used to analyse the degree to which stimulus words and responses were collocated. Oral proficiency was measured using human judgements together with objective measures of fluency (articulation rate, silent pause ratio, filled pause ratio) and lexical richness (diversity, frequency, range). Forty Japanese university students completed a word association task and a spontaneous speaking task (picture narrative). Results indicated that speakers who used more low-frequency collocations in the word association task (i.e. who had lower collocation frequency scores) spoke faster, with fewer silent pauses, and were perceived to be more fluent. Speakers who provided more strongly associated collocations (as measured by MI) used more sophisticated lexical items and were perceived to be more lexically proficient. Collocation knowledge remained a unique predictor after the influence of learners’ vocabulary size (i.e. knowledge of single-word items) was taken into account. These findings support the key role that collocation plays in oral proficiency and provide important insights into L2 speech development from the perspective of phraseological competence.
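
    The corpus-based association measures named here (MI score and t-score) can be computed directly from raw frequency counts. A minimal Python sketch follows, assuming a simple co-occurrence count with no window or span adjustment; the study's exact counting scheme is not given in the abstract, and the counts used below are invented for illustration.

```python
import math

def collocation_scores(f_node, f_collocate, f_pair, corpus_size):
    """MI score and t-score for a word pair from raw corpus counts.

    f_node      -- corpus frequency of the stimulus (node) word
    f_collocate -- corpus frequency of the response (collocate) word
    f_pair      -- observed co-occurrence frequency of the pair
    corpus_size -- total number of tokens in the corpus
    """
    expected = (f_node * f_collocate) / corpus_size
    mi = math.log2(f_pair / expected)                   # strength of association
    t_score = (f_pair - expected) / math.sqrt(f_pair)   # confidence of association
    return mi, t_score

# Invented counts: a pair observed 120 times in a 100-million-token corpus
mi, t = collocation_scores(f_node=5_000, f_collocate=8_000, f_pair=120,
                           corpus_size=100_000_000)
print(f"MI = {mi:.2f}, t-score = {t:.2f}")  # MI = 8.23, t-score = 10.92
```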

    Interactive rewriting for non-native speakers of English

    Doctoral thesis (Information Sciences), Tohoku University.

    What can screen capture reveal about students’ use of software tools when undertaking a paraphrasing task?

    Previous classroom observations, and examination of students’ written drafts, had suggested that when summarising or paraphrasing source texts some of our students were using software tools (for example the copy-paste function and synonym lookup) in possibly unhelpful ways. To test these impressions we used screen capture software to record 20 university students paraphrasing a short text using the word-processing package on a networked PC, and analysed how they used the software to complete the task. Participants displayed variable proficiency with word-processing tools, and very few accessed external sites. The most frequently used tool was the synonym finder, yet some of the better writers (assessed in terms of their paraphrase quality) made little use of software aids. We discuss how teachers of academic writing could help students make more efficient and judicious use of commonly available tools, and suggest further uses of screen capture in teaching and researching academic writing.

    Fifty years of spellchecking

    A short history of spellchecking from the late 1950s to the present day, describing its development through dictionary lookup, affix stripping, correction, confusion sets, and edit distance to the use of gigantic databases.
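
    The edit-distance stage of that history is simple enough to sketch: rank dictionary words by Levenshtein distance to the misspelt input and suggest the nearest ones. A minimal Python illustration follows; affix stripping, confusion sets, and the database-scale methods the survey covers are out of scope here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein (insert/delete/substitute) distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute ca -> cb
        prev = curr
    return prev[-1]

def suggest(word: str, dictionary: set[str], max_distance: int = 2) -> list[str]:
    """Dictionary-lookup correction: return known words within max_distance edits,
    nearest first."""
    if word in dictionary:
        return [word]
    ranked = sorted((edit_distance(word, w), w) for w in dictionary)
    return [w for d, w in ranked if d <= max_distance]

print(suggest("speling", {"spelling", "spieling", "speaking", "splint"}, max_distance=1))
# ['spelling', 'spieling']
```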