
    Cross-language Wikipedia Editing of Okinawa, Japan

    This article analyzes users who edit Wikipedia articles about Okinawa, Japan, in English and Japanese. It finds these users are among the most active and dedicated users in their primary languages, where they make many large, high-quality edits. However, when these users edit in their non-primary languages, they tend to make edits of a different type that are overall smaller in size and more often restricted to the narrow set of articles that exist in both languages. Design changes to motivate wider contributions from users in their non-primary languages and to encourage multilingual users to transfer more information across language divides are presented. Comment: In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2015. AC

    A Controllable Model of Grounded Response Generation

    Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses. Attempts to boost informativeness alone come at the expense of factual accuracy, as attested by pretrained language models' propensity to "hallucinate" facts. While this may be mitigated by access to background knowledge, there is scant guarantee of relevance and informativeness in generated responses. We propose a framework that we call controllable grounded response generation (CGRG), in which lexical control phrases are either provided by a user or automatically extracted by a control phrase predictor from dialogue context and grounding knowledge. Quantitative and qualitative results show that, using this framework, a transformer-based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines. Comment: AAAI 202

    Algorithms and Architecture for Real-time Recommendations at News UK

    Recommendation systems are recognised as being hugely important in industry, and the area is now well understood. At News UK, there is a requirement to be able to quickly generate recommendations for users on news items as they are published. However, little has been published about systems that can generate recommendations in response to changes in recommendable items and user behaviour in a very short space of time. In this paper we describe a new algorithm for updating collaborative filtering models incrementally, and demonstrate its effectiveness on clickstream data from The Times. We also describe the architecture that allows recommendations to be generated on the fly, and how we have made each component scalable. The system is currently being used in production at News UK. Comment: Accepted for presentation at AI-2017 Thirty-seventh SGAI International Conference on Artificial Intelligence. Cambridge, England 12-14 December 201
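To make the idea of incremental updating concrete, here is a minimal sketch of an item-item co-occurrence model that is adjusted one click at a time instead of being retrained in batch. It is not the algorithm from the paper, whose details the abstract does not give; the class `IncrementalCooccurrence` and its methods `add_click` and `recommend` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class IncrementalCooccurrence:
    """Illustrative item-item co-occurrence model updated per click event.

    Assumption: this is a simple stand-in technique, not the News UK algorithm.
    """

    def __init__(self):
        # Items each user has clicked so far.
        self.user_items = defaultdict(set)
        # cooc[a][b] = number of users who clicked both a and b.
        self.cooc = defaultdict(lambda: defaultdict(int))

    def add_click(self, user, item):
        """Fold one click into the model without touching unrelated counts."""
        seen = self.user_items[user]
        if item in seen:
            return  # repeat clicks do not change the counts
        for other in seen:
            self.cooc[item][other] += 1
            self.cooc[other][item] += 1
        seen.add(item)

    def recommend(self, user, k=3):
        """Return up to k unseen items ranked by summed co-occurrence."""
        seen = self.user_items.get(user, set())
        scores = defaultdict(int)
        for item in seen:
            for other, count in self.cooc.get(item, {}).items():
                if other not in seen:
                    scores[other] += count
        # Sort by descending score, then by item id for a stable order.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked[:k]]


model = IncrementalCooccurrence()
for user, item in [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "c"), ("u3", "a")]:
    model.add_click(user, item)
first = model.recommend("u3")

# A newly observed user shifts the ranking immediately.
model.add_click("u4", "a")
model.add_click("u4", "c")
second = model.recommend("u3")
```

Each click costs time proportional to the number of items that user has already seen, so a newly published article can appear in recommendations as soon as one reader has clicked it.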

    Enhanced Search for Educational Resources - A Perspective and a Prototype from ccLearn

    Users of search tools who seek educational materials on the Internet are typically presented with either a web-scale search (e.g., Google or Yahoo) or a specialized, site-specific tool. The specialized search tools often rely upon custom data fields, such as user-entered ratings, to provide additional value. As currently designed, these systems are generally too labor intensive to manage and scale up beyond a single site or set of resources. However, custom (or structured) data of some form is necessary if search outcomes for educational materials are to be improved. For example, design criteria and evaluative metrics are crucial attributes for educational resources, and these currently require human labeling and verification. Thus, one challenge is to design a search tool that capitalizes on available structured data (also called metadata) but is not crippled if the data are missing. This information should be amenable to repurposing by anyone, which means that it must be archived in a manner that can be discovered and leveraged easily. In this paper, we describe the extent to which DiscoverEd, a prototype developed by ccLearn, meets the design challenge of a scalable, enhanced search platform for educational resources. We then explore some of the key challenges regarding enhanced search for topic-specific Internet resources generally. We conclude by illustrating some possible future developments and third-party enhancements to the DiscoverEd prototype.

    Towards Knowledge-Based Personalized Product Description Generation in E-commerce

    Quality product descriptions are critical for providing a competitive customer experience on an e-commerce platform. An accurate and attractive description not only helps customers make an informed decision but also improves the likelihood of purchase. However, crafting a successful product description is tedious and highly time-consuming. Due to its importance, automating product description generation has attracted considerable interest from both research and industrial communities. Existing methods mainly use templates or statistical methods, and their performance could be rather limited. In this paper, we explore a new way to generate personalized product descriptions by combining the power of neural networks and a knowledge base. Specifically, we propose a KnOwledge Based pErsonalized (or KOBE) product description generation model in the context of e-commerce. In KOBE, we extend the encoder-decoder framework, the Transformer, to a sequence modeling formulation using self-attention. In order to make the description both informative and personalized, KOBE considers a variety of important factors during text generation, including product aspects, user categories, and the knowledge base. Experiments on real-world datasets demonstrate that the proposed method outperforms the baseline on various metrics. KOBE can achieve an improvement of 9.7% over state-of-the-art methods in terms of BLEU. We also present several case studies as anecdotal evidence to further support the effectiveness of the proposed approach. The framework has been deployed in Taobao, the largest online e-commerce platform in China. Comment: KDD 2019 Camera-ready. Website: https://sites.google.com/view/kobe201

    Firefox Extension to Add Contacts, Events, and View Addresses

    Users of the Firefox browser have the ability to download plugins to manage their contacts. This usually involves typing or copying the details from some source to add contacts. Event and meeting invitations are sent by mail and are added to the user’s calendar once the user accepts the invitation. Users viewing address data on websites are limited to the mapping capabilities provided by the webpage they are viewing. We developed a Firefox extension that allows the user to select portions of text with contact or event information and add it as a contact or an event in the calendar of their existing mail client application, such as Microsoft Outlook or Thunderbird. The data is automatically parsed to pick up relevant information: name, street address, phone number, and email address for contacts, and street addresses and event dates for events. The extension also allows users to right-click on a webpage that has a tabular display of addresses and view these addresses on a maps application such as Google Maps.
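The parsing step described above can be illustrated with a short sketch. It is not the extension's code (which the abstract does not show and which would run as JavaScript inside the browser); the function `parse_contact`, the two regular expressions, and the rule that the first non-empty line is the name are all assumptions made for this example. The phone pattern only covers ten-digit North American numbers.

```python
import re

# Assumed patterns for illustration; real-world contact text is far messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def parse_contact(selected_text):
    """Pull a name, phone number, and email address out of selected text.

    Assumption: the first non-empty line of the selection is the name.
    Fields that cannot be found are returned as None.
    """
    lines = [ln.strip() for ln in selected_text.splitlines() if ln.strip()]
    email = EMAIL_RE.search(selected_text)
    phone = PHONE_RE.search(selected_text)
    return {
        "name": lines[0] if lines else None,
        "phone": phone.group(0) if phone else None,
        "email": email.group(0) if email else None,
    }


selection = "Jane Doe\n123 Main St\n(408) 555-0134\njane.doe@example.com"
contact = parse_contact(selection)
```

Returning `None` for missing fields lets the caller decide whether to prompt the user or save a partial contact.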

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    Creating digital library collections with Greenstone

    The Greenstone digital library software is a comprehensive system for building and distributing digital library collections. It provides a way of organizing information based on metadata and publishing it on the Internet. This paper introduces Greenstone and explains how librarians use it to create and customize digital library collections. Through an end-user interface, they add documents and metadata to collections, create new collections whose structure mirrors existing ones, and build collections and put them in place for users to view. More advanced users can design and customize new collection structures.