Improving automated email tagging with implicit feedback
Machine learning systems are generally trained offline using ground truth data that has been labeled by experts. However, these batch training methods are not a good fit for many applications, especially in the cases where complete ground truth data is not available for offline training. In addition, batch methods do not perform well in applications where the learning system is expected to quickly adapt to changes in the data with a non-stationary distribution and also remain resistant to label noise. Online learning algorithms provide solutions to these challenges, but these algorithms often assume that the ground truth is available after making every prediction.
In this thesis, we describe the 'online email tagging' problem where an underlying algorithm predicts a set of user-defined tags for an incoming email message. The email client user interface displays the predicted tags for the message, and the user doesn't need to do anything unless those predictions are wrong (in which case, the user can delete the incorrect tags and add the missing tags). This means that the learning algorithm never receives confirmation that its predictions are correct - it only receives feedback when it makes a mistake. This violates the assumption of most online learning algorithms, and can lead to slower and less effective learning. In many cases, the learning algorithm would benefit from positive feedback, i.e., confirmation of correct predictions.
One could assume that if the user never changes any tag, then the predictions are correct. But users sometimes forget to correct the tags, presumably because they are focused on the content of the email messages and fail to notice incorrect and missing tags. The aim of this thesis is to determine whether implicit feedback can provide useful additional training examples to the email prediction subsystem of TaskTracer, known as TAPE (Tag Assistant for Productive Email). Our hypothesis is that the more time a user spends working on an email message, the more likely it is that the user will notice tag errors and correct them. If, after the user has spent enough time working on an email message, no corrections have been made, then perhaps it is safe for the learning system to treat the predicted tags as being correct and train accordingly. We propose four algorithms (and three baselines) for incorporating implicit feedback into the TAPE email tag predictor. These algorithms are then evaluated using (i) email interaction and tag correction events collected from 14 user-study participants as they performed email-directed tasks while using TAPE, and (ii) case studies on real knowledge workers using TAPE to manage their own email messages. The results show that implicit feedback produces important increases in training feedback and therefore significantly reduces subsequent prediction errors, despite the fact that implicit feedback is not perfect. We conclude that implicit feedback mechanisms can provide a useful performance boost for online email tagging systems. Finally, we perform a simulation study to show how tags could provide services to help with information re-finding and several common tasks that users often need to perform within the email system. Our simulation results show that tag services have the potential to greatly reduce the number of clicks required to perform these tasks. Keywords: email tagging, implicit feedback, TaskTrace
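The hypothesis in the abstract above can be illustrated with a minimal sketch. It is not one of the thesis's four algorithms, whose details the abstract does not give. The `EmailInteraction` record, the `training_label` function, and the 30-second threshold are all hypothetical names and values chosen for illustration: an explicit correction is always used for training, and an uncorrected prediction is used only once the user has spent enough time on the message.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class EmailInteraction:
    """Hypothetical record of one user's interaction with one message."""
    predicted_tags: FrozenSet[str]
    seconds_spent: float
    # None means the user made no tag correction on this message.
    corrected_tags: Optional[FrozenSet[str]] = None


def training_label(
    event: EmailInteraction, min_seconds: float = 30.0
) -> Optional[Tuple[FrozenSet[str], str]]:
    """Return (tags, feedback_kind) to train on, or None to skip the message.

    min_seconds is an assumed threshold, not a value from the thesis.
    """
    if event.corrected_tags is not None:
        # Explicit feedback: the user fixed the tags, so trust the correction.
        return event.corrected_tags, "explicit"
    if event.seconds_spent >= min_seconds:
        # Implicit feedback: enough attention and no correction, so treat
        # the predicted tags as confirmed (this may occasionally be wrong).
        return event.predicted_tags, "implicit"
    # Too little attention to infer anything from the lack of a correction.
    return None


if __name__ == "__main__":
    glance = EmailInteraction(frozenset({"budget"}), seconds_spent=4.0)
    read = EmailInteraction(frozenset({"budget"}), seconds_spent=95.0)
    print(training_label(glance))
    print(training_label(read))
```

Because the implicit label can be wrong, a real system would probably need to tune the threshold per user or weight implicit examples less than explicit ones. That is a design choice this sketch leaves open.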
Emergent Capabilities for Collaborative Teams in the Evolving Web Environment
This paper reports on our investigation of the latest advances for the Social Web, Web 2.0 and the Linked Data Web. These advances are discussed in terms of the latest capabilities that are available (or being made available) on the Web at the time of writing this paper. Such capabilities can be of significant benefit to teams, especially those comprised of multinational, geographically-dispersed team members. The specific context of coalition members in a rapidly formed diverse military context such as disaster relief or humanitarian aid is considered, where close working between non-government organisations and non-military teams will help to achieve results as quickly and efficiently as possible. The heterogeneity one finds in such teams, coupled with a lack of dedicated private network infrastructure, poses a number of challenges for collaboration, and the current paper represents an attempt to assess whether nascent Web-based capabilities can support such teams in terms of both their collaborative activities and their access to (and sharing of) information resources
Local search engine with global content based on domain specific knowledge
In the growing need for information we have come to rely on search engines. The use of large scale search engines, such as Google, is as common as surfing the World Wide Web. We are impressed with the capabilities of these search engines but still there is a need for improvement. A common problem with searching is the ambiguity of words. Their meaning often depends on the context in which they are used or varies across specific domains. To resolve this we propose a domain specific search engine that is globally oriented. We intend to provide content classification according to the target domain concepts, access to privileged information, personalization and custom ranking functions. Domain specific concepts have been formalized in the form of an ontology. The paper describes our approach to a centralized search service for domain specific content. The approach uses automated indexing for various content sources that can be found in the form of a relational database, web service, web portal or page, various document formats and other structured or unstructured data. The gathered data is tagged with various approaches and classified against the domain classification. The indexed data is accessible through a highly optimized and personalized search service.
ECSCW 2013 Adjunct Proceedings The 13th European Conference on Computer Supported Cooperative Work 21 - 25. September 2013, Paphos, Cyprus
This volume presents the adjunct proceedings of ECSCW 2013. While the proceedings published by Springer Verlag contain the core of the technical program, namely the full papers, the adjunct proceedings include contributions on work in progress, workshops and master classes, demos and videos, the doctoral colloquium, and keynotes, thus indicating what our field may become in the future.
Analyzing collaborative learning processes automatically
In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project, which has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important piece of the work towards making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in