Faster K-Means Cluster Estimation
There has been considerable work on improving the popular clustering algorithm
`K-means' in terms of both mean squared error (MSE) and speed. However, most
k-means variants compute the distance from each data point to each
cluster centroid in every iteration. We propose a fast heuristic to overcome
this bottleneck with only a marginal increase in MSE. We observe that across all
iterations of K-means, a data point changes its membership only among a small
subset of clusters. Our heuristic predicts such clusters for each data point by
looking at nearby clusters after the first iteration of k-means. We augment
well-known variants of k-means with our heuristic to demonstrate its
effectiveness. For various synthetic and real-world datasets, our heuristic
achieves a speed-up of up to 3 times compared to efficient variants of
k-means. Comment: 6 pages, Accepted at ECIR 201
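The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the shortlist for each point is simply its `n_candidates` nearest centroids after the first full iteration, and the names `kmeans_with_candidates`, `update_centroids`, and `n_candidates` are hypothetical. Later iterations compare each point only against its shortlist instead of all k centroids.

```python
import numpy as np


def update_centroids(X, labels, centroids):
    """Recompute each centroid as the mean of its members (empty clusters keep their old centroid)."""
    new = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        if len(members):
            new[j] = members.mean(axis=0)
    return new


def kmeans_with_candidates(X, k, n_candidates, n_iter=20, seed=0):
    """Lloyd-style k-means restricted to a per-point shortlist after iteration 1.

    Assumption (not taken from the paper): the shortlist is the n_candidates
    centroids nearest to each point in the first iteration.
    """
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)

    # First iteration: full n x k distance computation.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    candidates = np.argsort(dists, axis=1)[:, :n_candidates]
    centroids = update_centroids(X, labels, centroids)

    rows = np.arange(len(X))
    for _ in range(n_iter - 1):
        # Later iterations: only n x n_candidates distances.
        cand_centroids = centroids[candidates]  # shape (n, n_candidates, d)
        cand_dists = np.linalg.norm(X[:, None, :] - cand_centroids, axis=2)
        new_labels = candidates[rows, cand_dists.argmin(axis=1)]
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        centroids = update_centroids(X, labels, centroids)

    return labels, centroids, candidates
```

With `n_candidates` equal to `k` the shortlist covers every cluster, so the sketch falls back to ordinary Lloyd iterations; smaller values trade some accuracy for fewer distance computations per iteration.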
Infant cortex responds to other humans from shortly after birth
A significant feature of the adult human brain is its ability to selectively process information about conspecifics. Much debate has centred on whether this specialization is primarily a result of phylogenetic adaptation, or whether the brain acquires expertise in processing social stimuli as a result of its being born into an intensely social environment. Here we study the haemodynamic response in cortical areas of newborns (1–5 days old) while they passively viewed dynamic human or mechanical action videos. We observed activation selective to a dynamic face stimulus over bilateral posterior temporal cortex, but no activation in response to a moving human arm. This selective activation to the social stimulus correlated with age in hours over the first few days post partum. Thus, even very limited experience of face-to-face interaction with other humans may be sufficient to elicit social stimulus activation of relevant cortical regions
CWRML: representing crop wild relative conservation and use data in XML
Background
Crop wild relatives are wild species that are closely related to crops. They are valuable as potential gene donors for crop improvement and may help to ensure food security for the future. However, they are becoming increasingly threatened in the wild and are inadequately conserved, both in situ and ex situ. Information about the conservation status and utilisation potential of crop wild relatives is diverse and dispersed, and no single agreed standard exists for representing such information; yet, this information is vital to ensure these species are effectively conserved and utilised. The European Community-funded project, European Crop Wild Relative Diversity Assessment and Conservation Forum, determined the minimum information requirements for the conservation and utilisation of crop wild relatives and created the Crop Wild Relative Information System, incorporating an eXtensible Markup Language (XML) schema to aid data sharing and exchange.
Results
Crop Wild Relative Markup Language (CWRML) was developed to represent the data necessary for crop wild relative conservation and ensure that they can be effectively utilised for crop improvement. The schema partitions data into taxon-, site-, and population-specific elements, to allow for integration with other more general conservation biology schemata which may emerge as accepted standards in the future. These elements are composed of sub-elements, which are structured in order to facilitate the use of the schema in a variety of crop wild relative conservation and use contexts. Pre-existing standards for data representation in conservation biology were reviewed and incorporated into the schema as restrictions on element data contents, where appropriate.
Conclusion
CWRML provides a flexible data communication format for representing in situ and ex situ conservation status of individual taxa as well as their utilisation potential. The development of the schema highlights a number of instances where additional standards-development may be valuable, particularly with regard to the representation of population-specific data and utilisation potential. As crop wild relatives are intrinsically no different to other wild plant species there is potential for the inclusion of CWRML data elements in the emerging standards for representation of biodiversity data
The importance of specifying and studying causal mechanisms in school-based randomised controlled trials: lessons from two studies of cross-age peer tutoring
Based on the experience of evaluating 2 cross-age peer-tutoring interventions, we argue that researchers need to pay greater attention to causal mechanisms within the context of school-based randomised controlled trials. Without studying mechanisms, researchers are less able to explain the underlying causal processes that give rise to results from randomised controlled trials. Studying implementation fidelity is necessary but not sufficient for causal explanation; the study of causal mechanisms through the application of mixed methods is also required. Due to the increasingly complicated nature of many classroom-based innovations that are subject to evaluation, and the potentially distal nature of hypothesised effects, particularly on attainment, programme theory and articulation of mechanisms are essential in enhancing causal explanation and promoting the accumulation of knowledge of what works and why in classroom settings
Durham Shared Maths Project. Evaluation report and executive summary
Published in July 2015, this report details the findings of the Durham University Shared Maths intervention project on pupils from 82 primary schools across four local authorities. The intervention was a cross-age peer tutoring programme, developed at Durham University, in which older pupils (Year 5/Year 6) work with younger pupils (Year 3/Year 4) to discuss and work through maths problems using a structured approach. The intervention was delivered by teachers, with training and support from a Local Co-ordinator, and participating pupils spent 20 minutes each week using the approach, for two blocks of 16 weeks over consecutive years
k is the Magic Number -- Inferring the Number of Clusters Through Nonparametric Concentration Inequalities
Most convex and nonconvex clustering algorithms come with one crucial
parameter: the k in k-means. To this day, there is not one generally
accepted way to accurately determine this parameter. Popular methods are simple
yet theoretically unfounded, such as searching for an elbow in the curve of a
given cost measure. In contrast, statistically founded methods often make
strict assumptions over the data distribution or come with their own
optimization scheme for the clustering objective. This limits either the set of
applicable datasets or clustering algorithms. In this paper, we strive to
determine the number of clusters by answering a simple question: given two
clusters, is it likely that they jointly stem from a single distribution? To
this end, we propose a bound on the probability that two clusters originate
from the distribution of the unified cluster, specified only by the sample mean
and variance. Our method is applicable as a simple wrapper to the result of any
clustering method minimizing the objective of k-means, which includes
Gaussian mixtures and Spectral Clustering. We focus in our experimental
evaluation on an application for nonconvex clustering and demonstrate the
suitability of our theoretical results. Our SpecialK clustering
algorithm automatically determines the appropriate value for k, without
requiring any data transformation or projection, and without assumptions on the
data distribution. Additionally, it is capable of deciding that the data consists
of only a single cluster, which many existing algorithms cannot
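For contrast, the elbow heuristic that this abstract calls simple yet theoretically unfounded can be sketched in a few lines. This is only one possible operationalisation, assumed here for illustration: it picks the k at which the cost curve bends most sharply, measured by the largest discrete second difference. The function name `elbow_k` and the cost values in the comment are hypothetical, and this is not the SpecialK method.

```python
import numpy as np


def elbow_k(costs):
    """Pick k at the sharpest bend of a cost curve.

    costs[i] is the clustering cost obtained with k = i + 1 clusters.
    The bend is measured by the discrete second difference
    costs[k-1] - 2*costs[k] + costs[k+1] (an assumed criterion).
    """
    costs = np.asarray(costs, dtype=float)
    if len(costs) < 3:
        raise ValueError("need costs for at least three values of k")
    second_diff = costs[:-2] - 2 * costs[1:-1] + costs[2:]
    # second_diff[i] is centred on costs[i + 1], i.e. on k = i + 2.
    return int(second_diff.argmax()) + 2


# Hypothetical costs for k = 1..5; the curve flattens after k = 3.
print(elbow_k([100, 60, 12, 10, 9]))  # prints 3
```

Because the second difference is only defined for interior points of the curve, this criterion can never return k = 1, which illustrates the limitation noted in the abstract's final sentence.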
Decentralized Estimation over Orthogonal Multiple-access Fading Channels in Wireless Sensor Networks - Optimal and Suboptimal Estimators
Optimal and suboptimal decentralized estimators in wireless sensor networks
(WSNs) over orthogonal multiple-access fading channels are studied in this
paper. Considering multiple-bit quantization before digital transmission, we
develop maximum likelihood estimators (MLEs) with both known and unknown
channel state information (CSI). When training symbols are available, we derive
an MLE that is a special case of the MLE with unknown CSI. It implicitly uses
the training symbols to estimate the channel coefficients and exploits the
estimated CSI in an optimal way. To reduce the computational complexity, we
propose suboptimal estimators. These estimators exploit both signal and data
level redundant information to improve the estimation performance. The proposed
MLEs reduce to traditional fusion based or diversity based estimators when
communications or observations are perfect. By introducing a general message
function, the proposed estimators can be applied when various analog or digital
transmission schemes are used. The simulations show that the estimators using
digital communications with multiple-bit quantization outperform the estimator
using analog-and-forwarding transmission in fading channels. When considering
the total bandwidth and energy constraints, the MLE using multiple-bit
quantization is superior to that using binary quantization at medium and high
observation signal-to-noise ratio levels
Don’t turn your back on the symptoms of psychosis : a proof-of-principle, quasi-experimental public health trial to reduce the duration of untreated psychosis in Birmingham, UK
Background: Reducing the duration of untreated psychosis (DUP) is an aspiration of international guidelines for first episode psychosis; however, public health initiatives have met with mixed results. Systematic reviews suggest that greater focus on the sources of delay within care pathways (which will vary between healthcare settings) is needed to achieve sustainable reductions in DUP (BJP 198: 256-263; 2011).
Methods/Design: A quasi-experimental trial comparing a targeted intervention area with a ‘detection as usual’ area in the same city. As this is a proof-of-principle trial, no a priori assumptions are made regarding effect size; the key outcome will be an estimate of the potential effect size for a definitive trial. DUP and the number of new cases will be collected over an 18-month period in target and control areas and compared; historical data on DUP, collected in both areas over the previous three years, will serve as a benchmark. The intervention will focus on reducing two significant DUP component delays within the overall care pathway: delays within the mental health service and help-seeking delay.
Discussion: This pragmatic trial will be the first to target known delays within the care pathway for those with a first episode of psychosis. If successful, this will provide a generalizable methodology that can be implemented in a variety of healthcare contexts with differing sources of delay.
Trial registration: http://www.controlled-trials.com/ISRCTN45058713
Keywords: Public mental health campaign, First-episode psychosis, Early detection, Duration of untreated psychosis, Youth mental healt