Online Data Stream Learning and Classification with Limited Labels
Mining data streams such as Internet traffic and network security is complex. Because streams are difficult to store, data stream analytics must be done in a single scan. This limits the time available to observe stream features and hence further complicates the data mining process. Traditional supervised data mining, with its batch-training nature, is not suitable for mining data streams. This paper proposes an algorithm for online data stream classification and learning with limited labels using selective self-training semi-supervised classification. The experimental results show that it achieves up to 99.6% average accuracy with 10% labeled data and 98.6% average accuracy with 1% labeled data. It can classify up to 34K instances per second.
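The idea of selective self-training over a single scan can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give: the nearest-centroid model, the inverse-distance confidence score, the `threshold` value, and the class name `SelfTrainingCentroids` are all assumptions chosen for illustration.

```python
import math


class SelfTrainingCentroids:
    """Nearest-centroid classifier updated one instance at a time (illustrative only)."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # hypothetical confidence cut-off for self-training
        self.sums = {}    # label -> per-feature running sums
        self.counts = {}  # label -> number of instances absorbed

    def _update(self, x, label):
        # Fold one instance into the running centroid of `label`.
        if label not in self.sums:
            self.sums[label] = [0.0] * len(x)
            self.counts[label] = 0
        self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
        self.counts[label] += 1

    def predict(self, x):
        """Return (label, confidence); (None, 0.0) before any label has been seen."""
        if not self.sums:
            return None, 0.0
        dists = {}
        for label, s in self.sums.items():
            centroid = [v / self.counts[label] for v in s]
            dists[label] = math.dist(x, centroid)
        best = min(dists, key=dists.get)
        if len(dists) == 1:
            # With a single known class there is nothing to compare against.
            return best, 0.0
        # Assumed confidence score: the winner's share of inverse distances.
        weights = {label: 1.0 / (d + 1e-9) for label, d in dists.items()}
        return best, weights[best] / sum(weights.values())

    def process(self, x, label=None):
        """Handle one stream instance exactly once and return the prediction."""
        predicted, confidence = self.predict(x)
        if label is not None:
            self._update(x, label)        # labeled instance: supervised update
        elif predicted is not None and confidence >= self.threshold:
            self._update(x, predicted)    # confident prediction: self-train on it
        return predicted                  # low-confidence instances are not absorbed


# Each instance is seen once; None marks an unlabeled instance.
stream = [((0.0, 0.0), "a"), ((10.0, 10.0), "b"),
          ((0.5, 0.5), None), ((9.5, 9.5), None), ((5.0, 5.0), None)]
model = SelfTrainingCentroids()
for features, label in stream:
    model.process(features, label)
```

In this sketch the two unlabeled points near a centroid are absorbed with their predicted labels, while the ambiguous midpoint falls below the threshold and is skipped, which is the "selective" part of the scheme.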
Stereo and ToF Data Fusion by Learning from Synthetic Data
Time-of-Flight (ToF) sensors and stereo vision systems are both capable of acquiring depth information, but they have complementary characteristics and issues. A more accurate representation of the scene geometry can be obtained by fusing the two depth sources. In this paper we present a novel framework for data fusion in which the contribution of the two depth sources is controlled by confidence measures that are jointly estimated using a Convolutional Neural Network. The two depth sources are fused by enforcing the local consistency of depth data, taking into account the estimated confidence information. The deep network is trained on a synthetic dataset, and we show that the classifier is able to generalize to different data, obtaining reliable estimates not only on synthetic data but also on real-world scenes. Experimental results show that the proposed approach increases the accuracy of the depth estimation on both synthetic and real data and that it is able to outperform state-of-the-art methods.
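As a rough illustration of how confidence measures can control the contribution of two depth sources, the sketch below uses a plain per-pixel confidence-weighted average. This is a deliberately simpler stand-in, not the locally consistent fusion described in the abstract, and the confidence maps are taken as given rather than estimated by a network. The function name `fuse_depth` and its parameters are hypothetical.

```python
def fuse_depth(depth_tof, depth_stereo, conf_tof, conf_stereo, eps=1e-9):
    """Fuse two depth maps (lists of rows) with per-pixel confidence weights.

    Simplified stand-in: a weighted average per pixel, with no
    local-consistency step. Confidences are assumed to be non-negative.
    """
    fused = []
    for row_t, row_s, row_ct, row_cs in zip(depth_tof, depth_stereo,
                                            conf_tof, conf_stereo):
        fused_row = []
        for d_t, d_s, c_t, c_s in zip(row_t, row_s, row_ct, row_cs):
            total = c_t + c_s
            if total < eps:
                # Neither source is trusted: fall back to the plain mean.
                fused_row.append(0.5 * (d_t + d_s))
            else:
                fused_row.append((c_t * d_t + c_s * d_s) / total)
        fused.append(fused_row)
    return fused


# Equal confidence averages the two depths; zero ToF confidence keeps stereo.
result = fuse_depth([[1.0, 2.0]], [[3.0, 4.0]],
                    [[1.0, 0.0]], [[1.0, 1.0]])
```

The fallback to the plain mean when both confidences vanish is one possible design choice; marking such pixels as invalid would be an equally reasonable alternative.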
Teaching and Learning Data Visualization: Ideas and Assignments
This article discusses how to make statistical graphics a more prominent
element of the undergraduate statistics curricula. The focus is on several
different types of assignments that exemplify how to incorporate graphics into
a course in a pedagogically meaningful way. These assignments include having
students deconstruct and reconstruct plots, copy masterful graphs, create
one-minute visual revelations, convert tables into `pictures', and develop
interactive visualizations with, e.g., the virtual earth as a plotting canvas.
In addition to describing the goals and details of each assignment, we also
discuss the broader topic of graphics and key concepts that we think warrant
inclusion in the statistics curricula. We advocate that more attention needs to
be paid to this fundamental field of statistics at all levels, from
introductory undergraduate through graduate level courses. With the rapid rise
of tools to visualize data, e.g., Google trends, GapMinder, ManyEyes, and
Tableau, and the increased use of graphics in the media, understanding the
principles of good statistical graphics, and having the ability to create
informative visualizations is an ever more important aspect of statistics
education.
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex