820 research outputs found

    Heterogeneous Metric Learning of Categorical Data with Hierarchical Couplings

    Full text link
    © 1989-2012 IEEE. Learning appropriate metric is critical for effectively capturing complex data characteristics. The metric learning of categorical data with hierarchical coupling relationships and local heterogeneous distributions is very challenging yet rarely explored. This paper proposes a Heterogeneous mEtric Learning with hIerarchical Couplings (HELIC for short) for this type of categorical data. HELIC captures both low-level value-to-attribute and high-level attribute-to-class hierarchical couplings, and reveals the intrinsic heterogeneities embedded in each level of couplings. Theoretical analyses of the effectiveness and generalization error bound verify that HELIC effectively represents the above complexities. Extensive experiments on 30 data sets with diverse characteristics demonstrate that HELIC-enabled classification significantly enhances the accuracy (up to 40.93 percent), compared with five state-of-the-art baselines

    Search strategies of Wikipedia readers

    Get PDF
    The quest for information is one of the most common activity of human beings. Despite the the impressive progress of search engines, not to miss the needed piece of information could be still very tough, as well as to acquire specific competences and knowledge by shaping and following the proper learning paths. Indeed, the need to find sensible paths in information networks is one of the biggest challenges of our societies and, to effectively address it, it is important to investigate the strategies adopted by human users to cope with the cognitive bottleneck of finding their way in a growing sea of information. Here we focus on the case of Wikipedia and investigate a recently released dataset about users’ click on the English Wikipedia, namely the English Wikipedia Clickstream. We perform a semantically charged analysis to uncover the general patterns followed by information seekers in the multi-dimensional space of Wikipedia topics/categories. We discover the existence of well defined strategies in which users tend to start from very general, i.e., semantically broad, pages and progressively narrow down the scope of their navigation, while keeping a growing semantic coherence. This is unlike strategies associated to tasks with predefined search goals, namely the case of the Wikispeedia game. In this case users first move from the ‘particular’ to the ‘universal’ before focusing down again to the required target. The clear picture offered here represents a very important stepping stone towards a better design of information networks and recommendation strategies, as well as the construction of radically new learning paths

    Chiral symmetry and exclusive B decays in the SCET

    Get PDF
    We describe a chiral formalism for processes involving both energetic hadrons and soft Goldstone bosons, which extends the application of soft-collinear effective theory to multibody B decays. The nonfactorizable helicity amplitudes for heavy meson decays into multibody final states satisfy symmetry relations analogous to the large energy form factor relations, which are broken at leading order in Lambda/mb by calculable factorizable terms. We use the chiral effective theory to compute the leading corrections to these symmetry relations in B -> M_n pi ell\bar\nu and B -> M_n pi e+e- decays.Comment: 6 pages, 1 figure; typos correcte

    Why We Read Wikipedia

    Get PDF
    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines.Comment: Published in WWW'17; v2 fixes caption of Table
    • …
    corecore