A User-Centered Concept Mining System for Query and Document Understanding at Tencent
Concepts embody the knowledge of the world and facilitate the cognitive
processes of human beings. Mining concepts from web documents and constructing
the corresponding taxonomy are core research problems in text understanding and
support many downstream tasks such as query analysis, knowledge base
construction, recommendation, and search. However, we argue that most prior
studies extract formal and overly general concepts from Wikipedia or static web
pages, which do not represent the user perspective. In this paper, we
describe our experience of implementing and deploying ConcepT in Tencent QQ
Browser. It discovers user-centered concepts at the right granularity
conforming to user interests, by mining a large amount of user queries and
interactive search click logs. The extracted concepts have the proper
granularity, are consistent with user language styles and are dynamically
updated. We further present our techniques to tag documents with user-centered
concepts and to construct a topic-concept-instance taxonomy, which has helped
to improve search as well as news feeds recommendation in Tencent QQ Browser.
We performed extensive offline evaluation to demonstrate that our approach
could extract concepts of higher quality compared to several other existing
methods. Our system has been deployed in Tencent QQ Browser. Results from
online A/B testing involving a large number of real users suggest that the
Impression Efficiency of feeds users increased by 6.01% after incorporating the
user-centered concepts into the recommendation framework of Tencent QQ Browser.
Comment: Accepted by KDD 201
The Evolving Landscape of Internet Control
Over the past two years, we have undertaken several studies at the Berkman Center designed to better understand the control of the Internet in less open societies. During the years we've been engaged in this research, we have seen many incidents that have highlighted the continuing role of the Internet as a battleground for political control, including partial or total Internet shutdowns in China, Iran, Egypt, Libya, and Syria; many hundreds of documented DDoS, hacking, and other cyber attacks against political sites; continued growth in the number of countries that filter the Internet; and dozens of well-documented cases of on- and offline persecution of online dissidents. The energy dedicated to these battles for control of the Internet on both the government and dissident sides indicates, if nothing else, that both sides think that the Internet is a critical space for political action. In this paper, we offer an overview of our research in the context of these changes in the methods used to control online speech, and some thoughts on the challenges to online speech in the immediate future.
Youth and Digital Media: From Credibility to Information Quality
Building upon a process- and context-oriented information quality framework, this paper seeks to map and explore what we know about the ways in which users aged 18 and under search for information online, how they evaluate information, and how their related practices of content creation, levels of new literacies, general digital media usage, and social patterns affect these activities. A review of selected literature at the intersection of digital media, youth, and information quality -- primarily works from library and information science, sociology, education, and selected ethnographic studies -- reveals patterns in youth's information-seeking behavior, but also highlights the importance of contextual and demographic factors both for search and evaluation. Looking at the phenomenon from an information-learning and educational perspective, the literature shows that youth develop competencies for personal goals that sometimes do not transfer to school, and are sometimes not appropriate for school. Thus far, educational initiatives to educate youth about search, evaluation, or creation have depended greatly on the local circumstances for their success or failure.
Counterfactual Estimation and Optimization of Click Metrics for Search Engines
Optimizing an interactive system against a predefined online metric is
particularly challenging when the metric is computed from user feedback such
as clicks and payments. The key challenge is the counterfactual nature: in the
case of Web search, any change to a component of the search engine may result
in a different search result page for the same query, but we normally cannot
infer reliably from search logs how users would react to the new result page.
Consequently, it appears impossible to accurately estimate online metrics that
depend on user feedback, unless the new engine is run to serve users and
compared with a baseline in an A/B test. This approach, while valid and
successful, is unfortunately expensive and time-consuming. In this paper, we
propose to address this problem using causal inference techniques, under the
contextual-bandit framework. This approach effectively allows one to run
(potentially infinitely) many A/B tests offline from search logs, making it
possible to estimate and optimize online metrics quickly and inexpensively.
Focusing on an important component in a commercial search engine, we show how
these ideas can be instantiated and applied, and obtain very promising results
that suggest the wide applicability of these techniques.
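
The abstract above does not say which estimator the authors use. As a rough illustration of what offline evaluation from logs can look like under a contextual-bandit framing, the sketch below uses inverse propensity scoring (IPS), a standard technique chosen here as an assumption rather than as the paper's method. The names `ips_estimate` and `simulate_logs`, the log layout `(context, action, reward, propensity)`, and the toy click model are all hypothetical.

```python
import random


def ips_estimate(logs, target_policy):
    """Inverse propensity scoring estimate of a target policy's mean reward.

    logs: list of (context, action, reward, propensity) tuples, where
    propensity is the probability that the logging policy chose `action`
    in `context`. This layout is an assumption made for illustration.
    """
    if not logs:
        raise ValueError("logs must not be empty")
    total = 0.0
    for context, action, reward, propensity in logs:
        # Only records where the target policy agrees with the logged
        # action contribute; they are reweighted by 1 / propensity.
        if target_policy(context) == action:
            total += reward / propensity
    return total / len(logs)


def simulate_logs(n, seed=0):
    """Toy log generator: a uniform-random logging policy over two actions.

    The user "clicks" (reward 1.0) exactly when the action equals the
    context. This click model is made up for the example.
    """
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        context = rng.randint(0, 1)
        action = rng.randint(0, 1)  # each action has propensity 0.5
        reward = 1.0 if action == context else 0.0
        logs.append((context, action, reward, 0.5))
    return logs


if __name__ == "__main__":
    logs = simulate_logs(20000)
    # Candidate policy to evaluate offline: always pick action == context.
    print(ips_estimate(logs, lambda context: context))
```

In this toy setup the candidate policy would earn a click on every query if it were deployed, and the IPS estimate computed from the randomized logs should land close to that value without the policy ever serving users. The estimate is only meaningful when the logging policy gives nonzero probability to every action the candidate policy might take.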