
    FSSD - A Fast and Efficient Algorithm for Subgroup Set Discovery

    Subgroup discovery (SD) is the task of discovering interpretable patterns in the data that stand out w.r.t. some property of interest. Discovering patterns that accurately discriminate a class from the others is one of the most common SD tasks. Standard approaches in the literature are based on local pattern discovery, which is known to produce an overwhelmingly large number of redundant patterns. To solve this issue, pattern set mining has been proposed: instead of evaluating the quality of patterns separately, one should consider the quality of a pattern set as a whole. The goal is to provide a small pattern set that is diverse and discriminates the target class well. In this work, we introduce a novel formulation of the task of diverse subgroup set discovery where both discriminative power and diversity of the subgroup set are incorporated in the same quality measure. We propose an efficient and parameter-free algorithm, dubbed FSSD, based on a greedy scheme. FSSD uses several optimization strategies that enable it to provide a high-quality pattern set in a short amount of time.
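As a rough illustration of the idea of scoring a pattern set as a whole and growing it greedily, the following is a minimal sketch and not the FSSD algorithm itself. It assumes each candidate subgroup is given as the set of row indices it covers, and it uses the weighted relative accuracy (WRAcc) of the union of covered rows as a stand-in set-level quality measure. The names `union_wracc` and `greedy_subgroup_set` are hypothetical, and the quality measure used in the paper may differ.

```python
from typing import Dict, List, Set


def union_wracc(cover: Set[int], positives: Set[int], n: int) -> float:
    """WRAcc of the rows covered by a whole subgroup set (assumed measure)."""
    if not cover:
        return 0.0
    p_cover = len(cover) / n                      # share of rows covered
    p_pos = len(positives) / n                    # base rate of the target class
    p_pos_in_cover = len(cover & positives) / len(cover)
    return p_cover * (p_pos_in_cover - p_pos)


def greedy_subgroup_set(
    candidates: Dict[str, Set[int]], positives: Set[int], n: int
) -> List[str]:
    """Add the candidate that most improves set quality; stop when none does."""
    chosen: List[str] = []
    cover: Set[int] = set()
    best = 0.0
    while True:
        best_name = None
        best_quality = best
        for name, rows in candidates.items():
            if name in chosen:
                continue
            quality = union_wracc(cover | rows, positives, n)
            if quality > best_quality:            # strict gain required
                best_name, best_quality = name, quality
        if best_name is None:                     # no candidate helps: done
            return chosen
        chosen.append(best_name)
        cover |= candidates[best_name]
        best = best_quality


if __name__ == "__main__":
    # Toy data: 10 rows, rows 0-3 belong to the target class.
    toy = {"A": {0, 1}, "B": {0, 1, 4}, "C": {2, 3}, "D": {5, 6}}
    print(greedy_subgroup_set(toy, positives={0, 1, 2, 3}, n=10))
```

Because the score is computed on the union of covered rows, a candidate that overlaps heavily with subgroups already chosen adds little, so redundancy is penalised without a separate diversity parameter. The stopping rule (no strict improvement) likewise needs no user-set threshold.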

    Credibility of subgroup analyses by socioeconomic status in public health intervention evaluations: An underappreciated problem?

    There is increasing interest amongst researchers and policy makers in identifying the effect of public health interventions on health inequalities by socioeconomic status (SES). This issue is typically addressed in evaluation studies through subgroup analyses, where researchers test whether the effect of an intervention differs according to the socioeconomic status of participants. The credibility of such analyses is therefore crucial when making judgements about how an intervention is likely to affect health inequalities, although this issue appears to be rarely considered within public health. The aim of this study was therefore to assess the credibility of subgroup analyses in published evaluations of public health interventions. An established set of 10 credibility criteria for subgroup analyses was applied to a purposively sampled set of 21 evaluation studies that reported differential intervention effects by SES, the majority of which focussed on healthy eating interventions. While the majority of these studies were found to be otherwise of relatively high quality methodologically, only 8 of the 21 studies met at least 6 of the 10 credibility criteria for subgroup analysis. These findings suggest that the credibility of subgroup analyses conducted within evaluations of public health interventions’ impact on health inequalities may be an underappreciated problem. Keywords: Health inequalities, Health inequities, Equity and public health interventions, Policy impact by socioeconomic status

    Demographic Faultlines and Creativity In Diverse Groups

    Despite the oft-made argument that demographic diversity should enhance creativity, little is known about this relationship. We propose that group diversity, measured in terms of demographic faultlines, affects creativity through its effects on group members’ felt psychological safety to express their diverse ideas and the quality of information sharing that takes place across subgroup boundaries. Further, we propose that the relationship between faultlines and creativity will be moderated by task interdependence and equality of subgroup sizes. Finally, we provide suggestions for how organizations can establish norms for self-verification and use accountability techniques to enhance creativity in diverse groups.

    Why We Read Wikipedia

    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table

    Replication of linkage at chromosome 20p13 and identification of suggestive sex-differential risk loci for autism spectrum disorder.

    Background: Autism spectrum disorders (ASDs) are male-biased and genetically heterogeneous. While sequencing of sporadic cases has identified de novo risk variants, the heritable genetic contribution and mechanisms driving the male bias are less understood. Here, we aimed to identify familial and sex-differential risk loci in the largest available, uniformly ascertained, densely genotyped sample of multiplex ASD families from the Autism Genetics Resource Exchange (AGRE), and to compare results with earlier findings from AGRE. Methods: From a total sample of 1,008 multiplex families, we performed genome-wide, non-parametric linkage analysis in a discovery sample of 847 families, and separately on subsets of families with only male, affected children (male-only, MO) or with at least one female, affected child (female-containing, FC). Loci showing evidence for suggestive linkage (logarithm of odds ≥2.2) in this discovery sample, or in previous AGRE samples, were re-evaluated in an extension study utilizing all 1,008 available families. For regions with genome-wide significant linkage signal in the discovery stage, those families not included in the corresponding discovery sample were then evaluated for independent replication of linkage. Association testing of common single nucleotide polymorphisms (SNPs) was also performed within suggestive linkage regions. Results: We observed an independent replication of previously observed linkage at chromosome 20p13 (P < 0.01), while loci at 6q27 and 8q13.2 showed suggestive linkage in our extended sample. Suggestive sex-differential linkage was observed at 1p31.3 (MO), 8p21.2 (FC), and 8p12 (FC) in our discovery sample, and the MO signal at 1p31.3 was supported in our expanded sample. No sex-differential signals met replication criteria, and no common SNPs were significantly associated with ASD within any identified linkage regions. Conclusions: With few exceptions, analyses of subsets of families from the AGRE cohort identify different risk loci, consistent with extreme locus heterogeneity in ASD. Large samples appear to yield more consistent results, and sex-stratified analyses facilitate the identification of sex-differential risk loci, suggesting that linkage analyses in large cohorts are useful for identifying heritable risk loci. Additional work, such as targeted re-sequencing, is needed to identify the specific variants within these loci that are responsible for increasing ASD risk.