Organising a daily visual diary using multifeature clustering
The SenseCam is a prototype device from Microsoft that facilitates automatic capture of images of a person's life by integrating a colour camera, storage media and multiple sensors into a small wearable device. However, efficient search methods are required to reduce the user's burden of sifting through the thousands of images that are captured per day. In this paper, we describe experiments using colour spatiogram and block-based cross-correlation image features in conjunction with accelerometer sensor readings to cluster a day's worth of data into meaningful events, allowing the user to quickly browse a day's captured images. Two different low-complexity algorithms are detailed and evaluated for SenseCam image clustering.
Towards efficient music genre classification using FastMap
Automatic genre classification aims to correctly assign an unknown recording to a music genre. Recent studies use the Kullback-Leibler (KL) divergence to estimate music similarity and then perform classification using k-nearest neighbours (k-NN). However, this approach is not practical for large databases. We propose an efficient genre classifier that addresses the scalability problem. It uses a combination of a modified FastMap algorithm and the KL divergence to return the nearest neighbours and then uses 1-NN for classification. Our experiments showed that high accuracies are obtained while performing classification in less than 1/20 second per track.
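A minimal sketch of the KL-plus-nearest-neighbour baseline mentioned in this abstract may help make the idea concrete. It assumes, purely for illustration, that each track is summarised by a single univariate Gaussian given as (mean, variance); the function names are hypothetical, and the modified FastMap speed-up proposed in the paper is not reproduced here.

```python
import math


def symmetric_kl(p, q):
    """Symmetrised KL divergence between two univariate Gaussians.

    Each argument is a (mean, variance) pair; this track model is an
    assumption made for the sketch, not the paper's feature set.
    """
    (m1, v1), (m2, v2) = p, q
    kl_pq = 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)
    kl_qp = 0.5 * (math.log(v1 / v2) + (v2 + (m1 - m2) ** 2) / v1 - 1.0)
    return kl_pq + kl_qp


def nearest_neighbour_genre(query, labelled_tracks):
    """Return the genre of the labelled track closest to `query` (1-NN).

    `labelled_tracks` is a list of ((mean, variance), genre) pairs.
    This is a brute-force scan, i.e. the non-scalable baseline.
    """
    closest = min(labelled_tracks, key=lambda t: symmetric_kl(query, t[0]))
    return closest[1]


if __name__ == "__main__":
    library = [((0.0, 1.0), "rock"), ((5.0, 1.0), "jazz")]
    print(nearest_neighbour_genre((4.5, 1.2), library))
```

The linear scan over the whole library is what makes this baseline impractical for large databases; an embedding such as FastMap would be used to narrow the candidate set before the exact divergence is computed.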
Network Community Detection on Metric Space
Community detection in complex networks is an important problem that has
attracted much interest in recent years. In general, a community detection
algorithm chooses an objective function and identifies the communities of the
network by optimizing it, using various heuristics to solve the optimization
problem and extract the communities of interest to the user. In
this article, we demonstrate the procedure to transform a graph into points of
a metric space and develop the methods of community detection with the help of
a metric defined for a pair of points. We have also studied and analyzed the
community structure of the network therein. The results obtained with our
approach are very competitive with most of the well-known algorithms in the
literature, as demonstrated on a large collection of datasets. In addition,
the time taken by our algorithm is considerably lower than that of other
methods, which is consistent with the theoretical findings.
Class attendance, peer similarity, and academic performance in a large field study
Identifying the factors that determine academic performance is an essential
part of educational research. Existing research indicates that class attendance
is a useful predictor of subsequent course achievements. The majority of the
literature is, however, based on surveys and self-reports, methods which have
well-known systematic biases that lead to limitations on conclusions and
generalizability as well as being costly to implement. Here we propose a novel
method for measuring class attendance that overcomes these limitations by using
location and Bluetooth data collected from smartphone sensors. Based on
measured attendance data of nearly 1,000 undergraduate students, we demonstrate
that early and consistent class attendance strongly correlates with academic
performance. In addition, our novel dataset allows us to determine that
attendance among social peers was substantially correlated (0.5), suggesting
either an important peer effect or homophily with respect to attendance.
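As a rough illustration of how a peer correlation like the one reported above could be computed, the sketch below correlates each student's attendance rate with the mean attendance rate of their peers. The data layout, the function names, and the choice of Pearson's coefficient are assumptions made for this example; the abstract does not state how the 0.5 figure was obtained.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def peer_attendance_correlation(attendance, peers):
    """Correlate own attendance with the mean attendance of one's peers.

    attendance: dict mapping student id -> attendance rate in [0, 1]
    peers:      dict mapping student id -> list of peer ids
    Both structures are hypothetical stand-ins for the measured data.
    """
    own, peer_mean = [], []
    for student, friends in peers.items():
        rates = [attendance[f] for f in friends if f in attendance]
        if student in attendance and rates:
            own.append(attendance[student])
            peer_mean.append(sum(rates) / len(rates))
    return pearson(own, peer_mean)
```

A positive value alone cannot separate a peer effect from homophily, which is why the abstract leaves both explanations open.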
Using Machine Learning to Predict the Evolution of Physics Research
The advancement of science as outlined by Popper and Kuhn is largely
qualitative, but with bibliometric data it is possible and desirable to develop
a quantitative picture of scientific progress. Furthermore, it is also important
to allocate finite resources to research topics that have growth potential, to
accelerate the process from scientific breakthroughs to technological
innovations. In this paper, we address this problem of quantitative knowledge
evolution by analysing the APS publication data set from 1981 to 2010. We build
the bibliographic coupling and co-citation networks, use the Louvain method to
detect topical clusters (TCs) in each year, measure the similarity of TCs in
consecutive years, and visualize the results as alluvial diagrams. Having the
predictive features describing a given TC and its known evolution in the next
year, we can train a machine learning model to predict future changes of TCs,
i.e., their continuing, dissolving, merging and splitting. We found the number
of papers from certain journals, the degree, closeness, and betweenness to be
the most predictive features. Additionally, betweenness increases significantly
for merging events, and decreases significantly for splitting events. Our
results represent a first step from a descriptive understanding of the Science
of Science (SciSci) towards one that is ultimately prescriptive.
Comment: 24 pages, 10 figures, 4 tables, supplementary information is included
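The step of matching topical clusters across consecutive years and labelling their evolution can be sketched as follows. This is only an illustration under stated assumptions: each cluster is represented by a set of item identifiers (for example, the references its papers cite), similarity is measured with the Jaccard index, and the 0.2 threshold is arbitrary. The abstract does not specify the similarity measure or the matching rule, so none of these choices should be read as the authors' method.

```python
def jaccard(a, b):
    """Jaccard index of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_events(clusters_now, clusters_next, threshold=0.2):
    """Label each current-year cluster by its fate in the next year.

    clusters_now / clusters_next: dict mapping cluster name -> set of ids.
    `threshold` is a hypothetical cut-off for calling two clusters linked.
    """
    # Successors of each current cluster: next-year clusters similar enough.
    links = {
        name: [
            nxt
            for nxt, members_next in clusters_next.items()
            if jaccard(members, members_next) >= threshold
        ]
        for name, members in clusters_now.items()
    }
    # Invert the links to find how many predecessors each successor has.
    predecessors = {}
    for name, successors in links.items():
        for nxt in successors:
            predecessors.setdefault(nxt, []).append(name)

    events = {}
    for name, successors in links.items():
        if not successors:
            events[name] = "dissolving"
        elif len(successors) > 1:
            events[name] = "splitting"
        elif len(predecessors[successors[0]]) > 1:
            events[name] = "merging"
        else:
            events[name] = "continuing"
    return events
```

Labels produced this way could serve as the training targets for a classifier, with network features of each cluster (such as degree or betweenness) as inputs.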
Risks of Friendships on Social Networks
In this paper, we explore the risks of friends in social networks caused by
their friendship patterns, by using real life social network data and starting
from a previously defined risk model. Particularly, we observe that risks of
friendships can be mined by analyzing users' attitude towards friends of
friends. This allows us to give new insights into friendship and risk dynamics
on social networks.
Comment: 10 pages, 8 figures, 3 tables. To appear in the 2012 IEEE
International Conference on Data Mining (ICDM)