Evaluating Go Game Records for Prediction of Player Attributes
We propose a way of extracting and aggregating per-move evaluations from sets
of Go game records. The evaluations capture different aspects of the games such
as played patterns or statistics of sente/gote sequences. Using machine learning
algorithms, the evaluations can be utilized to predict different relevant
target variables. We apply this methodology to predict the strength and playing
style of the player (e.g. territoriality or aggressivity) with good accuracy.
We propose a number of possible applications including aiding in Go study,
seeding real-world ranks of internet players or tuning of Go-playing programs.
Information-Theoretic Active Learning for Content-Based Image Retrieval
We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode
active learning method for binary classification, and apply it for acquiring
meaningful user feedback in the context of content-based image retrieval.
Instead of combining different heuristics such as uncertainty, diversity, or
density, our method is based on maximizing the mutual information between the
predicted relevance of the images and the expected user feedback regarding the
selected batch. We propose suitable approximations to this computationally
demanding problem and also integrate an explicit model of user behavior that
accounts for possible incorrect labels and unnameable instances. Furthermore,
our approach does not only take the structure of the data but also the expected
model output change caused by the user feedback into account. In contrast to
other methods, ITAL turns out to be highly flexible and provides
state-of-the-art performance across various datasets, such as MIRFLICKR and
ImageNet.
Comment: GCPR 2018 paper (14 pages text + 2 pages references + 6 pages
appendix)
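
The mutual-information criterion described in this abstract can be illustrated on a much smaller problem. The sketch below is not the ITAL method itself, which selects whole batches and relies on approximations not given here. It assumes a single image with predicted relevance probability p_relevant and a hypothetical user who flips the label with probability p_mislabel, and computes the information that one piece of feedback carries about the true relevance. The function names and the symmetric-noise user model are assumptions made for illustration.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def feedback_information(p_relevant, p_mislabel):
    """Mutual information I(R; F) in bits between true relevance R and
    user feedback F, assuming the user flips the label with probability
    p_mislabel regardless of the true relevance (a toy user model)."""
    # Probability that the user answers "relevant".
    p_feedback = p_relevant * (1 - p_mislabel) + (1 - p_relevant) * p_mislabel
    # I(R; F) = H(F) - H(F | R); under symmetric noise H(F | R) = H(p_mislabel).
    return binary_entropy(p_feedback) - binary_entropy(p_mislabel)


def most_informative(relevance_probs, p_mislabel):
    """Index of the candidate whose feedback is expected to be most informative."""
    scores = [feedback_information(p, p_mislabel) for p in relevance_probs]
    return max(range(len(scores)), key=scores.__getitem__)
```

Under this toy model, an image whose predicted relevance is close to 0.5 yields the most information, and a less reliable user lowers the information of every query.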
Oversampling for Imbalanced Learning Based on K-Means and SMOTE
Learning from class-imbalanced data continues to be a common and challenging
problem in supervised learning as standard classification algorithms are
designed to handle balanced class distributions. While different strategies
exist to tackle this problem, methods which generate artificial data to achieve
a balanced class distribution are more versatile than modifications to the
classification algorithm. Such techniques, called oversamplers, modify the
training data, allowing any classifier to be used with class-imbalanced
datasets. Many algorithms have been proposed for this task, but most are
complex and tend to generate unnecessary noise. This work presents a simple and
effective oversampling method based on k-means clustering and SMOTE
oversampling, which avoids the generation of noise and effectively overcomes
imbalances between and within classes. Empirical results of extensive
experiments with 71 datasets show that training data oversampled with the
proposed method improves classification results. Moreover, k-means SMOTE
consistently outperforms other popular oversampling methods. An implementation
is made available in the Python programming language.
Comment: 19 pages, 8 figures
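
The general idea of combining k-means clustering with SMOTE can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the published implementation. The abstract does not state how clusters are chosen or how many samples each receives, so the rule used here (oversample only clusters in which minority samples outnumber majority samples, and split the new samples evenly among them) is an assumption, as are all function and parameter names.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def smote(minority, n_new, rng, k_neighbors=3):
    """Create n_new points by interpolating between a minority point
    and one of its nearest minority neighbors."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: dist2(base, p))[:k_neighbors]
        other = rng.choice(neighbors)
        t = rng.random()
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


def kmeans_smote(X, y, minority_label, k=2, seed=0):
    """Return synthetic minority points generated inside minority-dominated clusters."""
    rng = random.Random(seed)
    labels = kmeans(X, k, rng)
    n_minority = sum(1 for t in y if t == minority_label)
    n_needed = max(0, (len(y) - n_minority) - n_minority)
    # Assumed filter: keep clusters where minority samples outnumber the rest.
    eligible = []
    for c in range(k):
        members = [(p, t) for p, t, l in zip(X, y, labels) if l == c]
        mins = [p for p, t in members if t == minority_label]
        if len(mins) >= 2 and len(mins) > len(members) - len(mins):
            eligible.append(mins)
    new_points = []
    for i, mins in enumerate(eligible):
        # Assumed allocation: split the required samples evenly across clusters.
        share = n_needed // len(eligible) + (1 if i < n_needed % len(eligible) else 0)
        new_points.extend(smote(mins, share, rng))
    return new_points
```

Because interpolation happens only inside clusters dominated by the minority class, synthetic points are less likely to land in majority regions, which is the noise-avoidance property the abstract refers to.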
Object Proposals for Text Extraction in the Wild
Object Proposals is a recent computer vision technique receiving increasing
interest from the research community. Its main objective is to generate a
relatively small set of bounding box proposals that are most likely to contain
objects of interest. The use of Object Proposals techniques in the scene text
understanding field is innovative. Motivated by the success of powerful but
expensive techniques to recognize words in a holistic way, Object Proposals
techniques emerge as an alternative to the traditional text detectors.
In this paper we study to what extent the existing generic Object Proposals
methods may be useful for scene text understanding. Also, we propose a new
Object Proposals algorithm that is specifically designed for text and compare
it with other generic methods in the state of the art. Experiments show that
our proposal is superior in its ability to produce good-quality word
proposals in an efficient way. The source code of our method is made publicly
available.
Comment: 13th International Conference on Document Analysis and Recognition
(ICDAR 2015)