On the consistency of Multithreshold Entropy Linear Classifier
Multithreshold Entropy Linear Classifier (MELC) is a recent classifier idea
which employs information-theoretic concepts in order to create a multithreshold
maximum-margin model. In this paper we analyze its consistency over
multithreshold linear models and show that its objective function upper-bounds
the number of misclassified points, much as hinge loss does in
support vector machines. For further confirmation we also conduct
numerical experiments on five datasets.
Comment: Presented at Theoretical Foundations of Machine Learning 2015
(http://tfml.gmum.net), final version published in Schedae Informaticae
Journa
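For readers unfamiliar with the hinge-loss comparison made in this abstract, the following is a short worked inequality for the SVM side only. The symbols (labels y_i in {-1, +1}, a real-valued score f(x_i)) are assumptions introduced here for illustration; the corresponding bound for the MELC objective is given in the paper and is not reproduced.

```latex
% Assumed notation: y_i \in \{-1,+1\}, f(x_i) a real-valued classifier score.
% A point is misclassified when y_i f(x_i) \le 0, and then 1 - y_i f(x_i) \ge 1, so
\[
  \mathbf{1}\!\left[\, y_i f(x_i) \le 0 \,\right]
  \;\le\;
  \max\!\bigl(0,\; 1 - y_i f(x_i)\bigr).
\]
% Summing over the sample bounds the number of misclassified points:
\[
  \sum_{i=1}^{N} \mathbf{1}\!\left[\, y_i f(x_i) \le 0 \,\right]
  \;\le\;
  \sum_{i=1}^{N} \max\!\bigl(0,\; 1 - y_i f(x_i)\bigr).
\]
```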
Fast optimization of Multithreshold Entropy Linear Classifier
Multithreshold Entropy Linear Classifier (MELC) is a density-based model
which searches for a linear projection maximizing the Cauchy-Schwarz Divergence
of the dataset's kernel density estimates. Despite its good empirical results, one
of its drawbacks is the optimization speed. In this paper we analyze how one
can speed it up by solving an approximate problem. We analyze two methods,
both similar to approximate solutions of Kernel Density Estimation
querying, and provide adaptive schemes for selecting crucial parameters based
on a user-specified acceptable error. Furthermore, we show how one can exploit the
well-known conjugate gradient and L-BFGS optimizers despite the fact that the
original optimization problem should be solved on the sphere. All of the above methods
and modifications are tested on 10 real-life datasets from the UCI repository to
confirm their practical usability.
Comment: Presented at Theoretical Foundations of Machine Learning 2015
(http://tfml.gmum.net), final version published in Schedae Informaticae
Journa
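As a rough illustration of the quantity this abstract describes, the sketch below evaluates the Cauchy-Schwarz divergence between Gaussian kernel density estimates of two classes after a linear projection. It is a minimal, exact (quadratic-time) version, which is the slow computation that approximate methods would aim to avoid. The function names, the shared bandwidth `sigma`, and the choice of a Gaussian kernel are assumptions made here, not details taken from the paper. Dividing by the norm of `v` inside `project` is one generic way to make the objective insensitive to the length of `v`, so that an unconstrained optimizer could be applied; whether this matches the authors' treatment of the sphere constraint is also an assumption.

```python
import math


def gauss(d, var):
    """Density at d of a zero-mean 1D Gaussian with variance var."""
    return math.exp(-d * d / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def cross_information_potential(a, b, sigma):
    """Integral of the product of two Gaussian KDEs with bandwidth sigma.

    The product of two Gaussians with variance sigma^2 integrates to a
    Gaussian with variance 2 * sigma^2 evaluated at the difference of centers.
    """
    var = 2.0 * sigma * sigma
    return sum(gauss(x - y, var) for x in a for y in b) / (len(a) * len(b))


def cs_divergence(a, b, sigma):
    """Cauchy-Schwarz divergence between the KDEs of samples a and b."""
    ip_ab = cross_information_potential(a, b, sigma)
    ip_aa = cross_information_potential(a, a, sigma)
    ip_bb = cross_information_potential(b, b, sigma)
    return math.log(ip_aa) + math.log(ip_bb) - 2.0 * math.log(ip_ab)


def project(points, v):
    """Project points onto the direction of v (v is normalized internally)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [sum(p_i * v_i for p_i, v_i in zip(p, v)) / norm for p in points]


# Hypothetical toy data: two classes separated along the first coordinate.
pos = [(2.0, 0.0), (3.0, 1.0), (2.5, -1.0)]
neg = [(-2.0, 0.0), (-3.0, 1.0), (-2.5, -1.0)]

for v in [(1.0, 0.0), (0.0, 1.0)]:
    d = cs_divergence(project(pos, v), project(neg, v), sigma=1.0)
    print(v, d)
```

On this toy data the projection onto the first axis separates the classes and yields a large divergence, while the projection onto the second axis maps both classes to the same values and yields zero.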
Maximum Entropy Linear Manifold for Learning Discriminative Low-dimensional Representation
Representation learning is currently a very hot topic in modern machine
learning, mostly due to the great success of deep learning methods. In
particular, a low-dimensional representation which discriminates classes can not
only enhance the classification procedure but also make it faster and,
contrary to high-dimensional embeddings, can be used efficiently for visual
exploratory data analysis.
In this paper we propose Maximum Entropy Linear Manifold (MELM), a
multidimensional generalization of Multithreshold Entropy Linear Classifier
model which is able to find a low-dimensional linear data projection maximizing
discriminativeness of projected classes. As a result we obtain a linear
embedding which can be used for classification, class-aware dimensionality
reduction and data visualization. MELM provides highly discriminative 2D
projections of the data which can be used as a method for constructing robust
classifiers.
We provide both an empirical evaluation and some interesting theoretical
properties of our objective function, such as scale and affine transformation
invariance, connections with PCA, and bounding of the expected balanced accuracy
error.
Comment: submitted to ECMLPKDD 201
Extreme Entropy Machines: Robust information theoretic classification
Most existing classification methods are aimed at minimization of
empirical risk (through some simple point-based error measured with a loss
function) with added regularization. We propose to approach this problem in a
more information-theoretic way by investigating the applicability of entropy
measures as a classification model objective function. We focus on
Renyi's quadratic entropy and the related Cauchy-Schwarz Divergence, which lead to the
construction of Extreme Entropy Machines (EEM).
The main contribution of this paper is a model based on
information-theoretic concepts which on the one hand offers a new, entropic
perspective on known linear classifiers and on the other leads to the
construction of a very robust method competitive with state-of-the-art
non-information-theoretic ones (including Support Vector Machines and Extreme
Learning Machines).
Evaluation on numerous problems, ranging from small, simple ones from the UCI
repository to large (hundreds of thousands of samples), extremely
unbalanced (up to 100:1 class ratios) datasets, shows the wide applicability of
EEM in real-life problems and that it scales well
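The two information-theoretic quantities named in this abstract have standard textbook definitions, sketched below for two densities f and g. The notation (f, g, and the cross-entropy term written with a multiplication sign) is assumed here for illustration; the exact EEM objective built from these quantities is defined in the paper and is not reproduced.

```latex
% Renyi's quadratic entropy of a density f:
\[
  H_2(f) = -\log \int f(x)^2 \, dx .
\]
% Cauchy-Schwarz divergence between densities f and g:
\[
  D_{CS}(f, g)
  = -\log \frac{\left( \int f(x)\, g(x) \, dx \right)^{2}}
               {\int f(x)^2 \, dx \; \int g(x)^2 \, dx}
  = \log \int f^2 + \log \int g^2 - 2 \log \int f g .
\]
% Writing H_2^{\times}(f, g) = -\log \int f(x)\, g(x)\, dx for the quadratic cross entropy:
\[
  D_{CS}(f, g) = 2\, H_2^{\times}(f, g) - H_2(f) - H_2(g) .
\]
% By the Cauchy-Schwarz inequality, D_{CS}(f, g) \ge 0,
% with equality exactly when f = g almost everywhere.
```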
Feature Extraction and Classification of Automatically Segmented Lung Lesion Using Improved Toboggan Algorithm
The accurate detection of lung lesions from computed tomography (CT) scans is essential for clinical diagnosis, as it provides valuable information for the treatment of lung cancer. However, fully automatic lesion detection is difficult to achieve. Here, a novel segmentation algorithm is proposed: an improved toboggan algorithm with a three-step framework comprising automatic seed point selection, multi-constraint lesion extraction, and lesion refinement. Then, descriptors such as local binary pattern (LBP), wavelet, contourlet, and grey level co-occurrence matrix (GLCM) are applied to each region of interest of the segmented lung lesion image to extract texture features such as contrast, homogeneity, energy, and entropy, as well as statistical features such as mean, variance, standard deviation, and convolution of modulated and normal frequencies. Finally, support vector machine (SVM) and K-nearest neighbour (KNN) classifiers are applied to classify the abnormal region based on the extracted features, and their performance is compared. An accuracy of 97.8% is obtained using the SVM classifier, as compared with the KNN classifier. This approach does not require any human interaction for lesion detection. Thus, the improved toboggan algorithm can achieve precise lung lesion segmentation in CT images. The extracted features also help to classify the lesion region of the lungs efficiently
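To make the GLCM texture features mentioned in this abstract concrete, here is a minimal sketch that builds a normalized grey level co-occurrence matrix for one pixel offset and derives contrast, homogeneity, energy, and entropy from it. The function names, the single horizontal offset, and the particular feature conventions (homogeneity with 1 + |i - j| in the denominator, energy as the sum of squared probabilities, entropy in bits) are assumptions chosen for illustration; the paper may use different definitions or offsets.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence probabilities for the pixel offset (dx, dy).

    image is a list of rows of integer grey levels. The result maps each
    observed pair (level, neighbour_level) to its relative frequency.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Four common GLCM texture features (conventions assumed, see above)."""
    return {
        "contrast": sum(v * (i - j) ** 2 for (i, j), v in p.items()),
        "homogeneity": sum(v / (1 + abs(i - j)) for (i, j), v in p.items()),
        "energy": sum(v * v for v in p.values()),
        "entropy": -sum(v * math.log2(v) for v in p.values()),
    }


# Hypothetical 2x2 patches: a flat region and a checkerboard.
flat = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]

print(texture_features(glcm(flat)))
print(texture_features(glcm(checker)))
```

A flat patch has zero contrast and maximal homogeneity, while the checkerboard patch has higher contrast and non-zero entropy, which is the kind of difference a classifier such as SVM or KNN could exploit.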
Fast optimization of multithreshold entropy linear classifier
Multithreshold Entropy Linear Classifier (MELC) is a density-based
model which searches for a linear projection maximizing the Cauchy-Schwarz
Divergence of the dataset's kernel density estimates. Despite its good empirical
results, one of its drawbacks is the optimization speed. In this paper we analyze
how one can speed it up by solving an approximate problem. We
analyze two methods, both similar to approximate solutions of Kernel
Density Estimation querying, and provide adaptive schemes for selecting crucial
parameters based on a user-specified acceptable error. Furthermore, we show
how one can exploit the well-known conjugate gradient and L-BFGS optimizers
despite the fact that the original optimization problem should be solved on the
sphere. All of the above methods and modifications are tested on 10 real-life datasets
from the UCI repository to confirm their practical usability
Extreme entropy machines : robust information theoretic classification
Most existing classification methods are aimed
at minimization of empirical risk (through some simple
point-based error measured with a loss function) with added
regularization. We propose to approach the classification
problem by applying entropy measures as a model objective
function. We focus on Renyi’s quadratic entropy and the
related Cauchy-Schwarz Divergence, which lead to the
construction of extreme entropy machines (EEM). The
main contribution of this paper is a model based
on information-theoretic concepts which on the one
hand offers a new, entropic perspective on known linear
classifiers and on the other leads to the construction of a very
robust method competitive with state-of-the-art
non-information-theoretic ones (including Support Vector
Machines and Extreme Learning Machines). Evaluation on
numerous problems, ranging from small, simple ones from the
UCI repository to large (hundreds of thousands of
samples), extremely unbalanced (up to 100:1 class
ratios) datasets, shows the wide applicability of EEM in
real-life problems. Furthermore, it scales better than all
considered competitive methods
Connected image processing with multivariate attributes: an unsupervised Markovian classification approach
This article presents a new approach for constructing connected operators for image processing and analysis. It relies on a hierarchical Markovian unsupervised algorithm to classify the nodes of the traditional Max-Tree. This approach makes it possible to handle multivariate attributes naturally, in a robust non-local way. The technique is demonstrated on several image analysis tasks (filtering, segmentation, and source detection) on astronomical and biomedical images. The results show that the method is competitive despite its general formulation. This article also provides new insight into the field of hierarchical Markovian image processing, showing that morphological trees can advantageously replace traditional quadtrees
