4 research outputs found

    Reconstruction of Objects Using Lineage

    No full text
    We study the problem of object reconstruction based on lineage, using photographs as our driving application. In addition to standard forward reconstructions, our model allows inverse transformations, reconstructions that exploit properties (e.g., commutativity), and imperfect reconstructions. With these additions, our model provides many more options for recovering a lost object. However, to choose among many possibly imperfect reconstructions, we need to carefully account for the accompanying "degradation." In this paper, we propose a model for measuring degradation and a set of composition rules that help us measure the quality of reconstructions. Given this model, we propose an efficient algorithm for finding reconstructions and illustrate how it strikes a balance between efficiency and the quality of the produced results.

    Domain bias in web search

    No full text
    This paper uncovers a new phenomenon in web search that we call domain bias: a user’s propensity to believe that a page is more relevant just because it comes from a particular domain. We provide evidence of the existence of domain bias in click activity as well as in human judgments via a comprehensive collection of experiments. We begin by studying the difference between domains that a search engine surfaces and that users click. Surprisingly, we find that despite changes in the overall distribution of surfaced domains, there has not been a comparable shift in the distribution of clicked domains. Users seem to have learned the landscape of the internet and their click behavior has thus become more predictable over time. Next, we run a blind domain test, akin to a Pepsi/Coke taste test, to determine whether domains can shift a user’s opinion of which page is more relevant. We find that domains can actually flip a user’s preference about 25% of the time. Finally, we demonstrate the existence of systematic domain preferences, even after factoring out confounding issues such as position bias and relevance, two factors that have been used extensively in past work to explain user behavior. The existence of domain bias has numerous consequences including, for example, the importance of discounting click activity from reputable domains.

    Clustering query refinements by user intent

    No full text
    We address the problem of clustering the refinements of a user search query. The clusters computed by our proposed algorithm can be used to improve the selection and placement of the query suggestions proposed by a search engine, and can also serve to summarize the different aspects of information relevant to the original user query. Our algorithm clusters refinements based on their likely underlying user intents by combining document click and session co-occurrence information. At its core, our algorithm operates by performing multiple random walks on a Markov graph that approximates user search behavior. A user study performed on top search engine queries shows that our clusters are rated better than corresponding clusters computed using approaches that use only document click or only session co-occurrence information.
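The random-walk idea in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the click data, the node names, and the helper functions `build_graph`, `walk_distribution`, and `l1_distance` are hypothetical, and the sketch uses click information only (no session co-occurrence), so it is not the algorithm from the paper. It only shows how short random walks on a click graph can make refinements with similar click behavior look alike.

```python
from collections import defaultdict

def build_graph(clicks):
    """Build an undirected weighted graph from (refinement, document, count) triples."""
    graph = defaultdict(dict)
    for refinement, doc, count in clicks:
        graph[refinement][doc] = count
        graph[doc][refinement] = count
    return graph

def walk_distribution(graph, start, steps):
    """Distribution over nodes after `steps` random-walk steps from `start`.

    At each step the walker moves to a neighbor with probability
    proportional to the edge weight (a simple Markov chain).
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            total = sum(graph[node].values())
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += mass * weight / total
        dist = dict(nxt)
    return dist

def l1_distance(p, q):
    """L1 distance between two sparse distributions stored as dicts."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Hypothetical click counts: two refinements share a document, one does not.
clicks = [
    ("jaguar car", "doc_auto", 5),
    ("jaguar price", "doc_auto", 3),
    ("jaguar animal", "doc_zoo", 4),
]
graph = build_graph(clicks)

car = walk_distribution(graph, "jaguar car", steps=2)
price = walk_distribution(graph, "jaguar price", steps=2)
animal = walk_distribution(graph, "jaguar animal", steps=2)

# Refinements whose walks end up in similar places are candidates for one cluster.
print(l1_distance(car, price), l1_distance(car, animal))
```

In this toy graph the two car-related refinements produce identical two-step distributions, while the animal refinement stays disjoint from both, which is the kind of signal a clustering step could then group on.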

    Correcting for Missing Data in Information Cascades

    No full text
    Transmission of infectious diseases, propagation of information, and spread of ideas and influence through social networks are all examples of diffusion. In such cases we say that a contagion spreads through the network, a process that can be modeled by a cascade graph. Studying cascades and network diffusion is challenging due to missing data. Even a single missing observation in a sequence of propagation events can significantly alter our inferences about the diffusion process. We address the problem of missing data in information cascades. Specifically, given only a fraction C′ of the complete cascade C, our goal is to estimate the properties of the complete cascade C, such as its size or depth. To estimate the properties of C, we first formulate a k-tree model of cascades and analytically study its properties in the face of missing data. We then propose a numerical method that, given a cascade model and an observed cascade C′, can estimate properties of the complete cascade C. We evaluate our methodology using information propagation cascades in the Twitter network (70 million nodes and 2 billion edges), as well as information cascades arising in the blogosphere. Our experiments show that the k-tree model is an effective tool to study the effects of missing data in cascades. Most importantly, we show that our method (and the k-tree model) can accurately estimate properties of the complete cascade C even when 90% of the data is missing.
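To make the missing-data setting concrete, here is a minimal sketch under loudly stated assumptions: the complete cascade C is taken to be a full k-ary tree, each node is observed independently with a known probability, and the cascade size is recovered by simple rescaling. The functions `full_k_tree_size`, `observe`, and `naive_size_estimate` are hypothetical names, and this rescaling is a textbook estimator, not the numerical method proposed in the paper.

```python
import random

def full_k_tree_size(k: int, depth: int) -> int:
    """Number of nodes in a full k-ary tree with the root at depth 0."""
    return sum(k ** level for level in range(depth + 1))

def observe(size: int, keep_prob: float, rng: random.Random) -> int:
    """Count the nodes that survive when each is kept independently with keep_prob."""
    return sum(1 for _ in range(size) if rng.random() < keep_prob)

def naive_size_estimate(observed: int, keep_prob: float) -> float:
    """Rescale the observed count by the (assumed known) observation rate."""
    return observed / keep_prob

rng = random.Random(0)            # fixed seed so the run is repeatable
keep_prob = 0.1                   # 90% of the cascade is missing
true_size = full_k_tree_size(k=3, depth=8)
observed = observe(true_size, keep_prob, rng)
estimate = naive_size_estimate(observed, keep_prob)

print(true_size, observed, round(estimate))
```

Rescaling works for size because every node is equally likely to be dropped, but it says nothing about depth or other structural properties of C, which depend on which nodes are missing and how the survivors connect.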