6,206 research outputs found

    A User-Centered Concept Mining System for Query and Document Understanding at Tencent

    Full text link
    Concepts embody the knowledge of the world and facilitate the cognitive processes of human beings. Mining concepts from web documents and constructing the corresponding taxonomy are core research problems in text understanding and support many downstream tasks such as query analysis, knowledge base construction, recommendation, and search. However, we argue that most prior studies extract formal and overly general concepts from Wikipedia or static web pages, which are not representing the user perspective. In this paper, we describe our experience of implementing and deploying ConcepT in Tencent QQ Browser. It discovers user-centered concepts at the right granularity conforming to user interests, by mining a large amount of user queries and interactive search click logs. The extracted concepts have the proper granularity, are consistent with user language styles and are dynamically updated. We further present our techniques to tag documents with user-centered concepts and to construct a topic-concept-instance taxonomy, which has helped to improve search as well as news feeds recommendation in Tencent QQ Browser. We performed extensive offline evaluation to demonstrate that our approach could extract concepts of higher quality compared to several other existing methods. Our system has been deployed in Tencent QQ Browser. Results from online A/B testing involving a large number of real users suggest that the Impression Efficiency of feeds users increased by 6.01% after incorporating the user-centered concepts into the recommendation framework of Tencent QQ Browser.Comment: Accepted by KDD 201

    A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web

    Full text link
    Over the past decade, rapid advances in web technologies, coupled with innovative models of spatial data collection and consumption, have generated a robust growth in geo-referenced information, resulting in spatial information overload. Increasing 'geographic intelligence' in traditional text-based information retrieval has become a prominent approach to respond to this issue and to fulfill users' spatial information needs. Numerous efforts in the Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the Linking Open Data initiative have converged in a constellation of open knowledge bases, freely available online. In this article, we survey these open knowledge bases, focusing on their geospatial dimension. Particular attention is devoted to the crucial issue of the quality of geo-knowledge bases, as well as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic Network, is outlined as our contribution to this area. Research directions in information integration and Geographic Information Retrieval (GIR) are then reviewed, with a critical discussion of their current limitations and future prospects

    Live Social Semantics

    Get PDF
    Social interactions are one of the key factors to the success of conferences and similar community gatherings. This paper describes a novel application that integrates data from the semantic web, online social networks, and a real-world contact sensing platform. This application was successfully deployed at ESWC09, and actively used by 139 people. Personal profiles of the participants were automatically generated using several Web~2.0 systems and semantic academic data sources, and integrated in real-time with face-to-face contact networks derived from wearable sensors. Integration of all these heterogeneous data layers made it possible to offer various services to conference attendees to enhance their social experience such as visualisation of contact data, and a site to explore and connect with other participants. This paper describes the architecture of the application, the services we provided, and the results we achieved in this deployment

    A Model for Personalized Keyword Extraction from Web Pages using Segmentation

    Full text link
    The World Wide Web caters to the needs of billions of users in heterogeneous groups. Each user accessing the World Wide Web might have his / her own specific interest and would expect the web to respond to the specific requirements. The process of making the web to react in a customized manner is achieved through personalization. This paper proposes a novel model for extracting keywords from a web page with personalization being incorporated into it. The keyword extraction problem is approached with the help of web page segmentation which facilitates in making the problem simpler and solving it effectively. The proposed model is implemented as a prototype and the experiments conducted on it empirically validate the model's efficiency.Comment: 6 Pages, 2 Figure

    City Data Fusion: Sensor Data Fusion in the Internet of Things

    Full text link
    Internet of Things (IoT) has gained substantial attention recently and play a significant role in smart city application deployments. A number of such smart city applications depend on sensor fusion capabilities in the cloud from diverse data sources. We introduce the concept of IoT and present in detail ten different parameters that govern our sensor data fusion evaluation framework. We then evaluate the current state-of-the art in sensor data fusion against our sensor data fusion framework. Our main goal is to examine and survey different sensor data fusion research efforts based on our evaluation framework. The major open research issues related to sensor data fusion are also presented.Comment: Accepted to be published in International Journal of Distributed Systems and Technologies (IJDST), 201

    Social media analytics: a survey of techniques, tools and platforms

    Get PDF
    This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and News services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discussed the requirement of an experimental computational environment for social media research and presents as an illustration the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics either in their research or business. The data retrieval techniques that are presented in this paper are valid at the time of writing this paper (June 2014), but they are subject to change since social media data scraping APIs are rapidly changing

    DARIAH and the Benelux

    Get PDF
    corecore