Cross-language Wikipedia Editing of Okinawa, Japan

Author: Grosjean F.
Hong L.
Kittur A.
Milne D.
Zuckerman E.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

This article analyzes users who edit Wikipedia articles about Okinawa, Japan, in English and Japanese. It finds these users are among the most active and dedicated users in their primary languages, where they make many large, high-quality edits. However, when these users edit in their non-primary languages, they tend to make edits of a different type that are overall smaller in size and more often restricted to the narrow set of articles that exist in both languages. Design changes to motivate wider contributions from users in their non-primary languages and to encourage multilingual users to transfer more information across language divides are presented. Comment: In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2015. AC

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

A Controllable Model of Grounded Response Generation

Author: Brockett Chris
Dolan Bill
Galley Michel
Gao Jianfeng
Gao Xiang
Hajishirzi Hannaneh
Koncel-Kedziorski Rik
Ostendorf Mari
Quirk Chris
Wu Zeqiu
Zhang Yizhe
Publication date: 18/05/2021

Current end-to-end neural conversation models inherently lack the flexibility to impose semantic control in the response generation process, often resulting in uninteresting responses. Attempts to boost informativeness alone come at the expense of factual accuracy, as attested by pretrained language models' propensity to "hallucinate" facts. While this may be mitigated by access to background knowledge, there is scant guarantee of relevance and informativeness in generated responses. We propose a framework that we call controllable grounded response generation (CGRG), in which lexical control phrases are either provided by a user or automatically extracted by a control phrase predictor from dialogue context and grounding knowledge. Quantitative and qualitative results show that, using this framework, a transformer-based model with a novel inductive attention mechanism, trained on a conversation-like Reddit dataset, outperforms strong generation baselines. Comment: AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Algorithms and Architecture for Real-time Recommendations at News UK

Author: Bailey Dion
Clarke Daoud
Pajak Tom
Rodriguez Carlos
Publication date: 15/09/2017

Recommendation systems are recognised as being hugely important in industry, and the area is now well understood. At News UK, there is a requirement to be able to quickly generate recommendations for users on news items as they are published. However, little has been published about systems that can generate recommendations in response to changes in recommendable items and user behaviour in a very short space of time. In this paper we describe a new algorithm for updating collaborative filtering models incrementally, and demonstrate its effectiveness on clickstream data from The Times. We also describe the architecture that allows recommendations to be generated on the fly, and how we have made each component scalable. The system is currently being used in production at News UK. Comment: Accepted for presentation at AI-2017 Thirty-seventh SGAI International Conference on Artificial Intelligence. Cambridge, England 12-14 December 201

arXiv.org e-Print Archive

Crossref

Enhanced Search for Educational Resources - A Perspective and a Prototype from ccLearn

Author: Ahrash Bissell
Jane Park
Mike Linksvayer
Nathan Yergler
Publication venue: Creative Commons
Publication date: 07/07/2009

Users of search tools who seek educational materials on the Internet are typically presented with either a web-scale search (e.g., Google or Yahoo) or a specialized, site-specific tool. The specialized search tools often rely upon custom data fields, such as user-entered ratings, to provide additional value. As currently designed, these systems are generally too labor intensive to manage and scale up beyond a single site or set of resources. However, custom (or structured) data of some form is necessary if search outcomes for educational materials are to be improved. For example, design criteria and evaluative metrics are crucial attributes for educational resources, and these currently require human labeling and verification. Thus, one challenge is to design a search tool that capitalizes on available structured data (also called metadata) but is not crippled if the data are missing. This information should be amenable to repurposing by anyone, which means that it must be archived in a manner that can be discovered and leveraged easily. In this paper, we describe the extent to which DiscoverEd, a prototype developed by ccLearn, meets the design challenge of a scalable, enhanced search platform for educational resources. We then explore some of the key challenges regarding enhanced search for topic-specific Internet resources generally. We conclude by illustrating some possible future developments and third-party enhancements to the DiscoverEd prototype.

IssueLab

Confusion and information triggered by photos in persona profiles

Author: An Jisun
Jansen Bernard J.
Jung Soon-Gyo
Kwak Haewoon
Nielsen Lene
Salminen Joni
Publication venue: Elsevier BV
Publication date: 01/09/2019

Institutional Knowledge at Singapore Management University

The IT University of Copenhagen's Repository

Towards Knowledge-Based Personalized Product Description Generation in E-commerce

Author: Asghar Nabiha
Devlin Jacob
Gehring Jonas
Kingma Diederik P
Lample Guillaume
Li Jiwei
Nair Vinod
Sun Maosong
van der Maaten Laurens
Vaswani Ashish
Wang Jinpeng
Publication venue: Association for Computing Machinery (ACM)
Publication date: 05/06/2019

Quality product descriptions are critical for providing a competitive customer experience on an e-commerce platform. An accurate and attractive description not only helps customers make an informed decision but also improves the likelihood of purchase. However, crafting a successful product description is tedious and highly time-consuming. Due to its importance, automating product description generation has attracted considerable interest from both research and industrial communities. Existing methods mainly use templates or statistical methods, and their performance could be rather limited. In this paper, we explore a new way to generate personalized product descriptions by combining the power of neural networks and a knowledge base. Specifically, we propose a KnOwledge Based pErsonalized (or KOBE) product description generation model in the context of e-commerce. In KOBE, we extend the encoder-decoder framework, the Transformer, to a sequence modeling formulation using self-attention. In order to make the description both informative and personalized, KOBE considers a variety of important factors during text generation, including product aspects, user categories, and the knowledge base. Experiments on real-world datasets demonstrate that the proposed method outperforms the baseline on various metrics. KOBE can achieve an improvement of 9.7% over the state of the art in terms of BLEU. We also present several case studies as anecdotal evidence to further support the effectiveness of the proposed approach. The framework has been deployed in Taobao, the largest online e-commerce platform in China. Comment: KDD 2019 Camera-ready. Website: https://sites.google.com/view/kobe201

arXiv.org e-Print Archive

Crossref

Firefox Extension to Add Contacts, Events, and View Addresses

Author: Rao Vijay
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2008

Users of the Firefox browser have the ability to download plugins to manage their contacts. This usually involves typing or copying the details from some source to add contacts. Event and meeting invitations are sent by mail and are added to the user’s calendar once the user accepts the invitation. Users viewing address data on websites are limited to the mapping capabilities provided by the webpage they are viewing. We developed a Firefox extension that allows the user to select portions of text with contact or event information and add it as a contact or an event in the calendar of their existing mail client application, such as Microsoft Outlook or Thunderbird. The data is automatically parsed to pick up relevant information: name, street address, phone number, and email address in the case of contacts, and street addresses and event dates in the case of events. The extension also allows users to right-click on a webpage that has a tabular display of addresses and view these addresses in a maps application such as Google Maps.

SJSU ScholarWorks

BlogForever D2.6: Data Extraction Methodology

Author: Banos V.
Davis R.
Gkotsis G.
Pincent E.
Stepanyan K.
Publication date: 25/10/2013

This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Creating digital library collections with Greenstone

Author: Bainbridge David
Witten Ian H.
Publication venue: Emerald
Publication date: 01/01/2005

The Greenstone digital library software is a comprehensive system for building and distributing digital library collections. It provides a way of organizing information based on metadata and publishing it on the Internet. This paper introduces Greenstone and explains how librarians use it to create and customize digital library collections. Through an end-user interface, they add documents and metadata to collections, create new collections whose structure mirrors existing ones, and build collections and put them in place for users to view. More advanced users can design and customize new collection structures.

CiteSeerX

Research Commons@Waikato