65 research outputs found
Fast optimization of Multithreshold Entropy Linear Classifier
Multithreshold Entropy Linear Classifier (MELC) is a density-based model which searches for a linear projection maximizing the Cauchy-Schwarz Divergence of the dataset's kernel density estimation. Despite its good empirical results, one of its drawbacks is the optimization speed. In this paper we analyze how it can be sped up by solving an approximate problem. We analyze two methods, both similar to the approximate solutions of kernel density estimation querying, and provide adaptive schemes for selecting the crucial parameters based on a user-specified acceptable error. Furthermore, we show how one can exploit the well-known conjugate gradient and L-BFGS optimizers despite the fact that the original optimization problem should be solved on the sphere. All of the above methods and modifications are tested on 10 real-life datasets from the UCI repository to confirm their practical usability.
Comment: Presented at Theoretical Foundations of Machine Learning 2015 (http://tfml.gmum.net); final version published in Schedae Informaticae Journal
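The abstract's trick of reusing unconstrained optimizers for a problem constrained to the sphere can be illustrated with a minimal projected-gradient sketch. The objective below is a toy stand-in (maximize (a·w)² over unit vectors w), not the actual MELC Cauchy-Schwarz objective; `a`, `w`, and the step size `eta` are illustrative choices.

```python
import math

# Toy sketch: maximize f(w) = (a . w)^2 over the unit sphere by taking an
# ordinary Euclidean gradient step and then projecting back onto the sphere
# (renormalizing). This is the simplest way to reuse unconstrained-style
# updates for a sphere-constrained problem; the optimum here is w = +/- a/||a||.
a = [3.0, 4.0]
w = [1.0, 0.0]
eta = 0.01

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

for _ in range(200):
    g = [2 * dot(a, w) * ai for ai in a]        # gradient of (a . w)^2
    w = [wi + eta * gi for wi, gi in zip(w, g)]  # unconstrained step
    n = math.sqrt(dot(w, w))
    w = [wi / n for wi in w]                     # project back onto the sphere
```

In this 2-D example `w` converges to `a/||a|| = [0.6, 0.8]`; real implementations would instead hand a reparameterized or projected objective to CG or L-BFGS, as the paper discusses.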
K Means Clustering and Meanshift Analysis for Grouping the Data of Coal Term in Puslitbang tekMIRA
Indonesian government agencies under the Ministry of Energy and Mineral Resources face problems in classifying the coal data dictionary. This research groups the coal dictionary using the K-Means and Mean Shift algorithms. The K-Means algorithm is used to obtain cluster values on the character and word criteria. The Euclidean-distance data from the final K-Means iteration are combined with the Mean Shift algorithm, which recalculates the centroids by selecting different bandwidths. Grouping with the K-Means and Mean Shift algorithms yields different centroids, from which the optimum bandwidth value is found. The data dictionary in this research is sorted alphabetically.
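The K-Means-then-Mean-Shift pipeline the abstract describes can be sketched in one dimension. The data, initial centroids, and bandwidth below are hypothetical placeholders (e.g., word lengths), not the paper's coal-dictionary features.

```python
# Hypothetical 1-D sketch: cluster points with k-means, then refine each
# centroid with a flat-kernel mean-shift step, mirroring the abstract's
# "k-means output feeds mean shift" pipeline.
data = [2.0, 3.0, 2.5, 9.0, 10.0, 9.5]

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        groups = {c: [] for c in range(len(centers))}
        for p in points:
            # assign each point to its nearest centroid (Euclidean distance)
            nearest = min(range(len(centers)), key=lambda c: abs(p - centers[c]))
            groups[nearest].append(p)
        centers = [sum(g) / len(g) if g else centers[c] for c, g in groups.items()]
    return centers

def mean_shift_step(point, points, bandwidth):
    # one flat-kernel mean-shift pass: average the neighbours inside the window
    window = [p for p in points if abs(p - point) <= bandwidth]
    return sum(window) / len(window)

centers = kmeans_1d(data, [0.0, 5.0])                         # -> [2.5, 9.5]
refined = [mean_shift_step(c, data, bandwidth=2.0) for c in centers]
```

Varying `bandwidth` changes where the mean-shift step places the refined centroids, which is the comparison the paper uses to pick an optimum bandwidth.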
Estimating a Signal In the Presence of an Unknown Background
We describe a method for fitting distributions to data which only requires
knowledge of the parametric form of either the signal or the background but not
both. The unknown distribution is fit using a non-parametric kernel density
estimator. The method returns parameter estimates as well as errors on those
estimates. Simulation studies show that these estimates are unbiased and that
the errors are correct.
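The fit the abstract describes — a parametric signal plus a non-parametric KDE background — can be sketched minimally. All data, the Gaussian signal shape, the bandwidth, and the grid scan below are illustrative assumptions, not the paper's actual estimator.

```python
import math

def gauss(x, mu, sigma):
    # Gaussian density, used both as the parametric signal and the KDE kernel
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kde(x, sample, h):
    # non-parametric kernel density estimate with bandwidth h
    return sum(gauss(x, s, h) for s in sample) / len(sample)

# hypothetical data: a background-only sample (to build the KDE) and a mixed sample
background = [-2.0, -1.0, 0.5, 1.5, 2.0]
mixed = [0.0, 0.1, -0.1, 1.8, -1.9]

def neg_log_lik(frac, mu, sigma, h):
    # mixture density: frac * parametric signal + (1 - frac) * KDE background
    return -sum(math.log(frac * gauss(x, mu, sigma) + (1 - frac) * kde(x, background, h))
                for x in mixed)

# crude grid scan over the signal fraction; a real fit would also profile mu, sigma
fracs = [i / 20 for i in range(1, 20)]
best = min(fracs, key=lambda f: neg_log_lik(f, mu=0.0, sigma=0.2, h=0.5))
```

A real analysis would maximize the likelihood properly and propagate uncertainties, which is where the paper's unbiasedness and error-coverage claims come in.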
The Bregman Variational Dual-Tree Framework
Graph-based methods provide a powerful tool set for many non-parametric
frameworks in Machine Learning. In general, the memory and computational
complexity of these methods is quadratic in the number of examples in the data
which makes them quickly infeasible for moderate to large scale datasets. A
significant effort to find more efficient solutions to the problem has been
made in the literature. One of the state-of-the-art methods that has been
recently introduced is the Variational Dual-Tree (VDT) framework. Despite some
of its unique features, VDT is currently restricted only to Euclidean spaces
where the Euclidean distance quantifies the similarity. In this paper, we
extend the VDT framework beyond the Euclidean distance to more general Bregman
divergences that include the Euclidean distance as a special case. By
exploiting the properties of the general Bregman divergence, we show how the
new framework can maintain all the pivotal features of the VDT framework and
yet significantly improve its performance in non-Euclidean domains. We apply
the proposed framework to different text categorization problems and
demonstrate its benefits over the original VDT.
Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013)
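The abstract's claim that the Euclidean distance is a special case of the Bregman divergence can be made concrete with the standard definition D_F(x, y) = F(x) − F(y) − ⟨∇F(y), x − y⟩. The sketch below is a generic illustration of that identity, not code from the VDT framework.

```python
import math

def bregman(F, gradF, x, y):
    # D_F(x, y) = F(x) - F(y) - <grad F(y), x - y>
    return F(x) - F(y) - sum(g * (xi - yi) for g, xi, yi in zip(gradF(y), x, y))

# F(x) = ||x||^2 recovers the squared Euclidean distance
sq = lambda v: sum(c * c for c in v)
grad_sq = lambda v: [2 * c for c in v]

# F(x) = sum_i x_i log x_i (negative entropy) yields the generalized
# KL divergence, which reduces to KL for probability vectors
negent = lambda v: sum(c * math.log(c) for c in v)
grad_negent = lambda v: [math.log(c) + 1 for c in v]
```

For example, `bregman(sq, grad_sq, [1.0, 2.0], [3.0, 5.0])` equals the squared Euclidean distance (1−3)² + (2−5)² = 13, which is exactly the special case the VDT framework was previously restricted to.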