917 research outputs found

    Analyzing collaborative learning processes automatically

    In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis, both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction, both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project, which has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important part of making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.

    Bilingual newsgroups in Catalonia: a challenge for machine translation

    This paper presents a linguistic analysis of a corpus of messages written in Catalan and Spanish, which come from several informal newsgroups on the Universitat Oberta de Catalunya (Open University of Catalonia; henceforth, UOC) Virtual Campus. The surrounding environment is one of extensive bilingualism and contact between Spanish and Catalan. The study was carried out as part of the INTERLINGUA project conducted by the UOC's Internet Interdisciplinary Institute (IN3). Its main goal is to ascertain the linguistic characteristics of the e-mail register in the newsgroups in order to assess their implications for the creation of an online machine translation environment. The results shed empirical light on the relevance of characteristics of the e-mail register, the impact of language contact and interference, and their implications for the use of machine translation for CMC data in order to facilitate cross-linguistic communication on the Internet.

    Internet... the final frontier: an ethnographic account: exploring the cultural space of the Net from the inside

    The research project 'The Internet as a space for interaction', which completed its mission in Autumn 1998, studied the constitutive features of network culture and network organisation. Special emphasis was given to the dynamic interplay of technical and social conventions regarding both the Net’s organisation and its change. The ethnographic perspective chosen studied the Internet from the inside. Research concentrated upon three fields of study: the hegemonial operating technology of net nodes (UNIX), the network’s basic transmission technology (the Internet Protocol IP), and a popular communication service (Usenet). The project’s final report includes the results of the three branches explored. Drawing upon the development in the three fields, it is shown that changes that come about on the Net are neither anarchic nor arbitrary. Instead, the decentrally organised Internet is based upon technically and organisationally distributed forms of coordination within which individual preferences collectively attain the power of developing into definitive standards.

    Informatics Research Institute (IRIS) March 2009 newsletter

    This is the first newsletter following the outcome of the Research Assessment Exercise, which confirmed IRIS as one of the leading multidisciplinary research institutes that bring together expertise in social, technological and computational aspects of information systems. Research Fortnight ranked IRIS activities in the top two submissions in the country, with 75% of activities at international level and 25% at world leading level. The reviewers were particularly impressed with the Research Environment, which was highlighted as having 50% of activities at world leading level. I’d like to thank all members of IRIS whose commitment to pursuing high quality research has contributed to this success. This newsletter highlights some activities immediately following the RAE, showing that we are not content with the excellent RAE results but are building further on our successful research. It includes examples of important research events that we are organising, publications in major outlets, funded projects and students who have successfully completed their PhDs.

    Grouping related attributes

    Grouping objects that are described by attributes, or clustering, is a central notion in data mining. On the other hand, the similarity or relationships between attributes themselves are equally important but relatively unexplored. Such groups of attributes are also known as directories, concept hierarchies or topics, depending on the underlying data domain. The similarities between the two problems of grouping objects and attributes might suggest that traditional clustering techniques are applicable. This thesis argues that traditional clustering techniques fail to adequately capture the solution we seek. It also explores domain-independent techniques for grouping attributes. The notion of similarity between attributes, and therefore clustering in categorical datasets, has not received adequate attention. This issue has seen renewed interest in the knowledge discovery community, spurred on by the requirements of personalization of information and online search technology. The problem is broken down into (a) quantification of this notion of similarity and (b) the subsequent formation of groups, retaining attributes similar enough in the same group based on metrics that we will attempt to derive. Both aspects of the problem are carefully studied. The thesis also analyzes existing domain-independent approaches to building distance measures, proposing and analyzing several such measures for quantifying similarity, thereby providing a foundation for future work in grouping relevant attributes. The theoretical results are supported by experiments carried out on a variety of datasets from the text-mining, web-mining, social networks and transaction analysis domains. The results indicate that traditional clustering solutions are inadequate within this problem framework. They also suggest a direction for the development of distance measures for the quantification of the concept of similarity between categorical attributes.

    CEAI: CCM based Email Authorship Identification Model

    In this paper we present a model for email authorship identification (EAI) by employing a Cluster-based Classification (CCM) technique. Traditionally, stylometric features have been successfully employed in various authorship analysis tasks; we extend the traditional feature set to include some more interesting and effective features for email authorship identification (e.g. the last punctuation mark used in an email, the tendency of an author to use capitalization at the start of an email, or the punctuation after a greeting or farewell). We also included content features selected by Info Gain feature selection. It is observed that the use of such features in the authorship identification process has a positive impact on the accuracy of the authorship identification task. We performed experiments to justify our arguments and compared the results with other baseline models. Experimental results reveal that the proposed CCM-based email authorship identification model, along with the proposed feature set, outperforms the state-of-the-art support vector machine (SVM)-based models, as well as the models proposed by Iqbal et al. [1, 2]. The proposed model attains an accuracy rate of 94% for 10 authors, 89% for 25 authors, and 81% for 50 authors on the Enron dataset, while 89.5% accuracy has been achieved on a real email dataset constructed by the authors. The results on the Enron dataset have been achieved with quite a large number of authors compared to the models proposed by Iqbal et al. [1, 2].
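The email-specific stylometric cues named in the abstract above (last punctuation mark, capitalization at the start of an email, punctuation after a greeting) could be extracted roughly as in the following minimal sketch. It is an illustration only, not the CEAI authors' implementation: the function names, the small greeting word list, and the placeholder strings `<no-greeting>` and `<none>` are all assumptions made here.

```python
import re
import string

# Hypothetical greeting list; the abstract does not say which greetings are recognised.
GREETING_RE = re.compile(
    r"^\s*(hi|hello|dear|hey)\b[^\n,.:;!\-]*([,.:;!\-]?)",
    re.IGNORECASE,
)


def last_punctuation(text):
    """Return the last ASCII punctuation character in the email, or '' if none."""
    for ch in reversed(text):
        if ch in string.punctuation:
            return ch
    return ""


def starts_capitalized(text):
    """True if the first non-whitespace character of the email is uppercase."""
    stripped = text.lstrip()
    return bool(stripped) and stripped[0].isupper()


def greeting_punctuation(text):
    """Return the punctuation mark that closes an opening greeting line.

    '<no-greeting>' means no greeting from the assumed list was found;
    '<none>' means a greeting was found but no punctuation followed it.
    """
    match = GREETING_RE.match(text)
    if match is None:
        return "<no-greeting>"
    return match.group(2) or "<none>"


def email_style_features(text):
    """Collect the three illustrative features into one dictionary."""
    return {
        "last_punct": last_punctuation(text),
        "starts_capital": starts_capitalized(text),
        "greeting_punct": greeting_punctuation(text),
    }
```

A feature dictionary like this would typically be one-hot encoded and concatenated with conventional stylometric and content features before being passed to a classifier; that step is left out because the abstract does not describe it.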

    Proceedings of the fifth UK/BCS symposium on knowledge discovery and data mining

    These are the proceedings of a one-day symposium on Knowledge Discovery and Data Mining held at the Salford Lowry in 2009. The topics covered included some of the most important and exciting issues in the field. There were presentations on fundamental research topics such as how data mining is changing the very nature of scientific methods, the challenges of time series data mining, use of social network analysis for classification of messages, knowledge discovery from case data, and development of a unifying framework for feature selection methods. There were also presentations describing the lessons learned from real world case studies in detecting financial crime, profiling electricity usage, image processing, credit scoring, and predicting internet shopping patterns.

    Collaborative trails in e-learning environments

    This deliverable focuses on collaboration within groups of learners, and hence collaborative trails. We begin by reviewing the theoretical background to collaborative learning and looking at the kinds of support that computers can give to groups of learners working collaboratively. We then look more deeply at some of the issues in designing environments to support collaborative learning trails and at tools and techniques, including collaborative filtering, that can be used for analysing collaborative trails. We then review the state of the art in supporting collaborative learning in three different areas: experimental academic systems, systems using mobile technology (which are also generally academic), and commercially available systems. The final part of the deliverable presents three scenarios that show where technology that supports groups working collaboratively and producing collaborative trails may be heading in the near future.

    Internet... the final frontier: an ethnographic account ; exploring the cultural space of the net from the inside

    The research project 'The Internet as a space for interaction', which completed its mission in Autumn 1998, studied the constitutive features of network culture and network organisation. Special emphasis was given to the dynamic interplay of technical and social conventions regarding both the net's organisation and its change. The ethnographic perspective chosen studied the Internet from the inside. To study processes of institution-building and the forms of their transformation, research concentrated upon three fields: the hegemonial operating technology of net nodes (UNIX), the network’s basic transmission technology (the Internet Protocol IP), and a popular communication service (Usenet). The project's final report includes the results of the three branches explored. Drawing upon the development in the three fields, it is shown that changes that come about on the Net are neither anarchic nor arbitrary. Instead, the decentrally organised Internet is based upon technically and organisationally distributed forms of coordination within which individual preferences collectively attain the power of developing into definitive standards. (author's abstract)

    The Information Practices of People Living with Depression: Constructing Credibility and Authority

    Depressive episodes and chronic depression often provide the impetus for both online and offline everyday life information-seeking and sharing and the seeking of support. While allopathic medication, psychiatric, and other biomedical services are the standard treatments for depression, people often use complementary and alternative medicine (CAM) to supplement or supplant biomedical treatments. Depression is a nebulous disorder with varying causes, illness trajectories, and a wide variety of potentially effective treatments. Often, treating and managing depression forms a project for life (Wikgren, 2001) where the need for information is continuous. In the present study, I have used a constructionist, discourse analytic approach as outlined by Potter (1996) and Wooffitt (1992) to analyze the messages posted to three online newsgroups devoted to depression, CAM, and the practices of biomedicine and to analyze the transcripts from 10 semi-structured interviews with participants who self-identified as currently having depression or who have suffered from depression in the past. I have sought to understand how people justify using, or not using, CAM to treat depression. Specifically, I have investigated how people with depression use information in discourse to justify healthcare decisions and to create credible and authoritative accounts; how people with depression conceptualize CAM therapies, mainstream medicine, and depression and how these conceptualizations are represented in the discursive constructions of individuals as competent information-seekers and users; and I have investigated the information practices (e.g., everyday life information-seeking, sharing, and use) of people living with depression. 
My findings show that while expert, biomedical information sources and knowledge are most often drawn upon and referred to by newsgroup posters and interviewees to warrant claims, people used a variety of discursive strategies and regular speech patterns to create credible and authoritative accounts, to portray themselves as competent information-seekers and users, to support their claims for either using or forgoing a certain treatment, and to counter the authoritative knowledge of biomedicine. In addition, my findings emphasize the importance of orienting information discussed in Savolainen’s (1995) everyday life information-seeking (ELIS) model. For many people with depression, information was used to maintain a sense of coherence (related to “mastery of life” within the ELIS model) and to create meaning in addition to solving practical problems. My findings suggest that an additional information-seeking principle to those outlined by Harris and Dewdney (1994) deserves further research attention: people seek information that is congruent with their worldview and values.