1,063 research outputs found

    Why We Read Wikipedia

    Wikipedia is one of the most popular sites on the Web, with millions of users relying on it to satisfy a broad range of information needs every day. Although it is crucial to understand what exactly these needs are in order to be able to meet them, little is currently known about why users visit Wikipedia. The goal of this paper is to fill this gap by combining a survey of Wikipedia readers with a log-based analysis of user activity. Based on an initial series of user surveys, we build a taxonomy of Wikipedia use cases along several dimensions, capturing users' motivations to visit Wikipedia, the depth of knowledge they are seeking, and their knowledge of the topic of interest prior to visiting Wikipedia. Then, we quantify the prevalence of these use cases via a large-scale user survey conducted on live Wikipedia with almost 30,000 responses. Our analyses highlight the variety of factors driving users to Wikipedia, such as current events, media coverage of a topic, personal curiosity, work or school assignments, or boredom. Finally, we match survey responses to the respondents' digital traces in Wikipedia's server logs, enabling the discovery of behavioral patterns associated with specific use cases. For instance, we observe long and fast-paced page sequences across topics for users who are bored or exploring randomly, whereas those using Wikipedia for work or school spend more time on individual articles focused on topics such as science. Our findings advance our understanding of reader motivations and behavior on Wikipedia and can have implications for developers aiming to improve Wikipedia's user experience, editors striving to cater to their readers' needs, third-party services (such as search engines) providing access to Wikipedia content, and researchers aiming to build tools such as recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table

    Asynchronous Training of Word Embeddings for Large Text Corpora

    Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is typically sequentially processed and parameters are synchronously updated. Distributed architectures for asynchronous training that have been proposed either focus on scaling vocabulary sizes and dimensionality or suffer from expensive synchronization latencies. In this paper, we propose a scalable approach to train word embeddings by partitioning the input space instead, in order to scale to massive text corpora without sacrificing the performance of the embeddings. Our training procedure does not involve any parameter synchronization except a final sub-model merge phase that typically executes in a few minutes. Our distributed training scales seamlessly to large corpus sizes, and models trained by our distributed procedure, which requires 1/10 of the time taken by the baseline approach, achieve comparable and sometimes up to 45% better performance on a variety of NLP benchmarks. Finally, we also show that our approach is robust to missing words in sub-models and is able to effectively reconstruct word representations. Comment: This paper contains 9 pages and has been accepted in the WSDM201

    Gender and Videogames: The political valency of Lara Croft

    The Face: Is Lara a feminist icon or a sexist fantasy? Toby Gard: Neither and a bit of both. Lara was designed to be a tough, self-reliant, intelligent woman. She confounds all the sexist cliches apart from the fact that she’s got an unbelievable figure. Strong, independent women are the perfect fantasy girls—the untouchable is always the most desirable (Interview with Lara’s creator Toby Gard in The Face magazine, June 1997)

    LACHESIS restricts gametic cell fate in the female gametophyte of Arabidopsis

    In flowering plants, the egg and sperm cells form within haploid gametophytes. The female gametophyte of Arabidopsis consists of two gametic cells, the egg cell and the central cell, which are flanked by five accessory cells. Both gametic and accessory cells are vital for fertilization; however, the mechanisms that underlie the formation of accessory versus gametic cell fate are unknown. In a screen for regulators of egg cell fate, we isolated the lachesis (lis) mutant which forms supernumerary egg cells. In lis mutants, accessory cells differentiate gametic cell fate, indicating that LIS is involved in a mechanism that prevents accessory cells from adopting gametic cell fate. The temporal and spatial pattern of LIS expression suggests that this mechanism is generated in gametic cells. LIS is homologous to the yeast splicing factor PRP4, indicating that components of the splice apparatus participate in cell fate decisions

    Wavelet Based Fractal Analysis of Airborne Pollen

    The most abundant biological particles in the atmosphere are pollen grains and spores. People with pollen allergies can protect themselves if they have information about future pollen content in the air. In spite of the importance of airborne pollen concentration forecasting, it has not been possible to predict pollen concentrations with great accuracy, and about 25% of the daily pollen forecasts have resulted in failures. Previous analyses of the dynamic characteristics of atmospheric pollen time series indicate that the system can be described by a low-dimensional chaotic map. We apply the wavelet transform to study the multifractal characteristics of an airborne pollen time series. We find persistence behaviour associated with low pollen concentration values and with the rarest events of highest pollen concentration values. The information and the correlation dimensions correspond to a chaotic system showing loss of information with time evolution. Comment: 11 pages, 7 figures

    Communication calls produced by electrical stimulation of four structures in the guinea pig brain

    One of the main central processes affecting the cortical representation of conspecific vocalizations is the collateral output from the extended motor system for call generation. Before starting to study this interaction we sought to compare the characteristics of calls produced by stimulating four different parts of the brain in guinea pigs (Cavia porcellus). By using anaesthetised animals we were able to reposition electrodes without distressing the animals. Trains of 100 electrical pulses were used to stimulate the midbrain periaqueductal grey (PAG), hypothalamus, amygdala, and anterior cingulate cortex (ACC). Each structure produced a similar range of calls, but in significantly different proportions. Two of the spontaneous calls (chirrup and purr) were never produced by electrical stimulation and although we identified versions of chutter, durr and tooth chatter, they differed significantly from our natural call templates. However, we were routinely able to elicit seven other identifiable calls. All seven calls were produced both during the 1.6 s period of stimulation and subsequently in a period which could last for more than a minute. A single stimulation site could produce four or five different calls, but the amygdala was much less likely to produce a scream, whistle or rising whistle than any of the other structures. These three high-frequency calls were more likely to be produced by females than males. There were also differences in the timing of the call production with the amygdala primarily producing calls during the electrical stimulation and the hypothalamus mainly producing calls after the electrical stimulation. For all four structures a significantly higher stimulation current was required in males than females. We conclude that all four structures can be stimulated to produce fictive vocalizations that should be useful in studying the relationship between the vocal motor system and cortical sensory representation

    Managed Aquifer Recharge as a Tool to Enhance Sustainable Groundwater Management in California

    A growing population and an increased demand for water resources have resulted in a global trend of groundwater depletion. Arid and semi-arid climates are particularly susceptible, often relying on groundwater to support large population centers or irrigated agriculture in the absence of sufficient surface water resources. In an effort to increase the security of groundwater resources, managed aquifer recharge (MAR) programs have been developed and implemented globally. MAR is the approach of intentionally harvesting and infiltrating water to recharge depleted aquifer storage. California is a prime example of this growing problem, with three cities that have over a million residents and an agricultural industry that was valued at 47 billion dollars in 2015. The present-day groundwater overdraft of over 100 km³ (since 1962) indicates a clear disparity between surface water supply and water demand within the state. In the face of groundwater overdraft and the anticipated effects of climate change, many new MAR projects are being constructed or investigated throughout California, adding to those that have existed for decades. Some common MAR types utilized in California include injection wells, infiltration basins (also known as spreading basins, percolation basins, or recharge basins), and low-impact development. An emerging MAR type that is actively being investigated is the winter flooding of agricultural fields using existing irrigation infrastructure and excess surface water resources, known as agricultural MAR. California therefore provides an excellent case study to look at the historical use and performance of MAR, ongoing and emerging challenges, novel MAR applications, and the potential for expansion of MAR. Effective MAR projects are an essential tool for increasing groundwater security, both in California and on a global scale.
    This chapter aims to provide an overview of the most common MAR types and applications within the State of California and neighboring semi-arid regions.

    Interpretation of DAS28 and its components in the assessment of inflammatory and non-inflammatory aspects of rheumatoid arthritis

    Background: DAS28 is interpreted as the inflammatory disease activity of RA. Non-inflammatory pain mechanisms can confound assessment. We aimed to examine the use of DAS28 components or DAS28-derived measures that have been published as indices of non-inflammatory pain mechanisms, to inform interpretation of disease activity. Methods: Data were used from multiple observational epidemiology studies of people with RA. Statistical characteristics of DAS28 components and derived indices were assessed using baseline and follow up data from British Society for Rheumatology Biologics Registry participants [1] commencing anti-TNF therapy (n = 10813), or [2] changing between non-biologic DMARDs (n=2992), [3] Early Rheumatoid Arthritis Network participants (n=813), and [4] participants in a cross-sectional study exploring fibromyalgia and pain thresholds (n=45). Repeatability was tested in 34 patients with active RA. Derived indices were the proportion of DAS28 attributable to patient-reported components (DAS28-P), tender-swollen difference and tender:swollen ratio. Pressure pain detection threshold (PPT) was used as an index of pain sensitisation. Results: DAS28, tender joint count, visual analogue scale, DAS28-P, tender-swollen difference and tender:swollen ratio were more strongly associated with pain, PPT and fibromyalgia status than were swollen joint count or erythrocyte sedimentation rate. DAS28-P, tender-swollen difference and tender:swollen ratio better predicted pain over 1 year than did DAS28 or its individual components. Conclusions: DAS28 is strongly associated both with inflammation and with patient-reported outcomes. DAS28-derived indices such as tender-swollen difference are associated with non-inflammatory pain mechanisms, can predict future pain and should inform how DAS28 is interpreted as an index of inflammatory disease activity in RA
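
The derived indices named in the abstract above can be illustrated with a short sketch. It assumes the widely used DAS28-ESR formula, which the abstract itself does not state, and takes DAS28-P to be the share of the total score contributed by the tender joint count and the patient's visual analogue scale. The function and parameter names are hypothetical and chosen for illustration only.

```python
import math


def das28_esr(tjc28, sjc28, esr_mm_h, vas_mm):
    """DAS28 from tender/swollen joint counts (0-28), ESR (mm/h), and
    patient global VAS (0-100 mm). Assumes the standard DAS28-ESR weights."""
    return (0.56 * math.sqrt(tjc28)
            + 0.28 * math.sqrt(sjc28)
            + 0.70 * math.log(esr_mm_h)
            + 0.014 * vas_mm)


def das28_p(tjc28, sjc28, esr_mm_h, vas_mm):
    """Proportion of DAS28 attributable to patient-reported components
    (tender joint count and VAS), under the assumption stated above."""
    patient_part = 0.56 * math.sqrt(tjc28) + 0.014 * vas_mm
    return patient_part / das28_esr(tjc28, sjc28, esr_mm_h, vas_mm)


def tender_swollen_difference(tjc28, sjc28):
    """Tender joint count minus swollen joint count."""
    return tjc28 - sjc28


def tender_swollen_ratio(tjc28, sjc28):
    """Tender:swollen ratio; None when no joints are swollen (undefined)."""
    return tjc28 / sjc28 if sjc28 > 0 else None


if __name__ == "__main__":
    # Illustrative inputs only, not data from the study.
    tjc, sjc, esr, vas = 9, 4, 20, 50
    print(round(das28_esr(tjc, sjc, esr, vas), 2))
    print(round(das28_p(tjc, sjc, esr, vas), 3))
    print(tender_swollen_difference(tjc, sjc), tender_swollen_ratio(tjc, sjc))
```

A higher DAS28-P or a larger tender-swollen difference means that more of the score comes from patient-reported components than from swelling or ESR, which is the pattern the abstract associates with non-inflammatory pain mechanisms.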