Why We Read Wikipedia
Wikipedia is one of the most popular sites on the Web, with millions of users
relying on it to satisfy a broad range of information needs every day. Although
it is crucial to understand what exactly these needs are in order to be able to
meet them, little is currently known about why users visit Wikipedia. The goal
of this paper is to fill this gap by combining a survey of Wikipedia readers
with a log-based analysis of user activity. Based on an initial series of user
surveys, we build a taxonomy of Wikipedia use cases along several dimensions,
capturing users' motivations to visit Wikipedia, the depth of knowledge they
are seeking, and their knowledge of the topic of interest prior to visiting
Wikipedia. Then, we quantify the prevalence of these use cases via a
large-scale user survey conducted on live Wikipedia with almost 30,000
responses. Our analyses highlight the variety of factors driving users to
Wikipedia, such as current events, media coverage of a topic, personal
curiosity, work or school assignments, or boredom. Finally, we match survey
responses to the respondents' digital traces in Wikipedia's server logs,
enabling the discovery of behavioral patterns associated with specific use
cases. For instance, we observe long and fast-paced page sequences across
topics for users who are bored or exploring randomly, whereas those using
Wikipedia for work or school spend more time on individual articles focused on
topics such as science. Our findings advance our understanding of reader
motivations and behavior on Wikipedia and can have implications for developers
aiming to improve Wikipedia's user experience, editors striving to cater to
their readers' needs, third-party services (such as search engines) providing
access to Wikipedia content, and researchers aiming to build tools such as
recommendation engines. Comment: Published in WWW'17; v2 fixes caption of Table
Asynchronous Training of Word Embeddings for Large Text Corpora
Word embeddings are a powerful approach for analyzing language and have been
widely popular in numerous tasks in information retrieval and text mining.
Training embeddings over huge corpora is computationally expensive because the
input is typically sequentially processed and parameters are synchronously
updated. Distributed architectures for asynchronous training that have been
proposed either focus on scaling vocabulary sizes and dimensionality or suffer
from expensive synchronization latencies.
In this paper, we propose a scalable approach to train word embeddings by
partitioning the input space instead in order to scale to massive text corpora
while not sacrificing the performance of the embeddings. Our training procedure
does not involve any parameter synchronization except a final sub-model merge
phase that typically executes in a few minutes. Our distributed training scales
seamlessly to large corpus sizes, and we obtain comparable performance, and sometimes
improvements of up to 45%, on a variety of NLP benchmarks using models trained
by our distributed procedure which requires of the time taken by the
baseline approach. Finally, we also show that our approach is robust to missing words in
sub-models and is able to effectively reconstruct word representations. Comment: This paper contains 9 pages and has been accepted in the WSDM201
Gender and Videogames: The political valency of Lara Croft
The Face: Is Lara a feminist icon or a sexist fantasy? Toby Gard: Neither and a bit of both. Lara was designed to be a tough, self-reliant, intelligent woman. She confounds all the sexist cliches apart from the fact that she’s got an unbelievable figure. Strong, independent women are the perfect fantasy girls—the untouchable is always the most desirable (Interview with Lara’s creator Toby Gard in The Face magazine, June 1997)
LACHESIS restricts gametic cell fate in the female gametophyte of Arabidopsis
In flowering plants, the egg and sperm cells form within haploid gametophytes. The female gametophyte of Arabidopsis consists of two gametic cells, the egg cell and the central cell, which are flanked by five accessory cells. Both gametic and accessory cells are vital for fertilization; however, the mechanisms that underlie the formation of accessory versus gametic cell fate are unknown. In a screen for regulators of egg cell fate, we isolated the lachesis (lis) mutant, which forms supernumerary egg cells. In lis mutants, accessory cells differentiate gametic cell fate, indicating that LIS is involved in a mechanism that prevents accessory cells from adopting gametic cell fate. The temporal and spatial pattern of LIS expression suggests that this mechanism is generated in gametic cells. LIS is homologous to the yeast splicing factor PRP4, indicating that components of the splice apparatus participate in cell fate decisions.
Wavelet Based Fractal Analysis of Airborne Pollen
The most abundant biological particles in the atmosphere are pollen grains
and spores. People with pollen allergies can protect themselves if they are
informed of future airborne pollen levels. In spite of the importance of
airborne pollen concentration forecasting, it has not been possible to predict
the pollen concentrations with great accuracy, and about 25% of the daily
pollen forecasts have resulted in failures. Previous analyses of the dynamic
characteristics of atmospheric pollen time series indicate that the system can
be described by a low dimensional chaotic map. We apply the wavelet transform
to study the multifractal characteristics of an airborne pollen time series.
We find persistence behaviour associated with low pollen concentration values
and with the rarest events of highest pollen concentration values. The
information and the correlation dimensions correspond to a chaotic system
showing loss of information with time evolution. Comment: 11 pages, 7 figures
Communication calls produced by electrical stimulation of four structures in the guinea pig brain
One of the main central processes affecting the cortical representation of conspecific vocalizations is the collateral output from the extended motor system for call generation. Before starting to study this interaction we sought to compare the characteristics of calls produced by stimulating four different parts of the brain in guinea pigs (Cavia porcellus). By using anaesthetised animals we were able to reposition electrodes without distressing the animals. Trains of 100 electrical pulses were used to stimulate the midbrain periaqueductal grey (PAG), hypothalamus, amygdala, and anterior cingulate cortex (ACC). Each structure produced a similar range of calls, but in significantly different proportions. Two of the spontaneous calls (chirrup and purr) were never produced by electrical stimulation and although we identified versions of chutter, durr and tooth chatter, they differed significantly from our natural call templates. However, we were routinely able to elicit seven other identifiable calls. All seven calls were produced both during the 1.6 s period of stimulation and subsequently in a period which could last for more than a minute. A single stimulation site could produce four or five different calls, but the amygdala was much less likely to produce a scream, whistle or rising whistle than any of the other structures. These three high-frequency calls were more likely to be produced by females than males. There were also differences in the timing of the call production with the amygdala primarily producing calls during the electrical stimulation and the hypothalamus mainly producing calls after the electrical stimulation. For all four structures a significantly higher stimulation current was required in males than females. We conclude that all four structures can be stimulated to produce fictive vocalizations that should be useful in studying the relationship between the vocal motor system and cortical sensory representation
Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset
Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental design are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for Bioinformatics.
Managed Aquifer Recharge as a Tool to Enhance Sustainable Groundwater Management in California
A growing population and an increased demand for water resources have resulted in a global trend of groundwater depletion. Arid and semi-arid climates are particularly susceptible, often relying on groundwater to support large population centers or irrigated agriculture in the absence of sufficient surface water resources. In an effort to increase the security of groundwater resources, managed aquifer recharge (MAR) programs have been developed and implemented globally. MAR is the approach of intentionally harvesting and infiltrating water to recharge depleted aquifer storage. California is a prime example of this growing problem, with three cities that have over a million residents and an agricultural industry that was valued at 47 billion dollars in 2015. The present-day groundwater overdraft of over 100 km³ (since 1962) indicates a clear disparity between surface water supply and water demand within the state. In the face of groundwater overdraft and the anticipated effects of climate change, many new MAR projects are being constructed or investigated throughout California, adding to those that have existed for decades. Some common MAR types utilized in California include injection wells, infiltration basins (also known as spreading basins, percolation basins, or recharge basins), and low-impact development. An emerging MAR type that is actively being investigated is the winter flooding of agricultural fields using existing irrigation infrastructure and excess surface water resources, known as agricultural MAR. California therefore provides an excellent case study to look at the historical use and performance of MAR, ongoing and emerging challenges, novel MAR applications, and the potential for expansion of MAR. Effective MAR projects are an essential tool for increasing groundwater security, both in California and on a global scale.
This chapter aims to provide an overview of the most common MAR types and applications within the State of California and neighboring semi-arid regions.
Interpretation of DAS28 and its components in the assessment of inflammatory and non-inflammatory aspects of rheumatoid arthritis
Background: DAS28 is interpreted as the inflammatory disease activity of RA. Non-inflammatory pain mechanisms can confound assessment. We aimed to examine the use of DAS28 components or DAS28-derived measures that have been published as indices of non-inflammatory pain mechanisms, to inform interpretation of disease activity.
Methods: Data were used from multiple observational epidemiology studies of people with RA. Statistical characteristics of DAS28 components and derived indices were assessed using baseline and follow up data from British Society for Rheumatology Biologics Registry participants [1] commencing anti-TNF therapy (n = 10813), or [2] changing between non-biologic DMARDs (n=2992), [3] Early Rheumatoid Arthritis Network participants (n=813), and [4] participants in a cross-sectional study exploring fibromyalgia and pain thresholds (n=45). Repeatability was tested in 34 patients with active RA. Derived indices were the proportion of DAS28 attributable to patient-reported components (DAS28-P), tender-swollen difference and tender:swollen ratio. Pressure pain detection threshold (PPT) was used as an index of pain sensitisation.
Results: DAS28, tender joint count, visual analogue scale, DAS28-P, tender-swollen difference and tender:swollen ratio were more strongly associated with pain, PPT and fibromyalgia status than were swollen joint count or erythrocyte sedimentation rate. DAS28-P, tender-swollen difference and tender:swollen ratio better predicted pain over 1 year than did DAS28 or its individual components.
Conclusions: DAS28 is strongly associated both with inflammation and with patient-reported outcomes. DAS28-derived indices such as tender-swollen difference are associated with non-inflammatory pain mechanisms, can predict future pain and should inform how DAS28 is interpreted as an index of inflammatory disease activity in RA.
