Search CORE

233 research outputs found

The use of positive and negative equivalence constraints in model-based clustering

Authors: Melnykov I., Melnykov V., Michael S.
Publication venue: Nazarbayev University
Publication date: 01/01/2014

Cluster analysis is a popular technique in statistics and computer science whose objective is to group similar observations into relatively distinct groups known as clusters. Semi-supervised model-based clustering assumes that some additional information about group memberships is available

Nazarbayev University Repository

ClickClust: An R Package for Model-Based Clustering of Categorical Sequences

Author: Melnykov Volodymyr
Publication venue: Foundation for Open Access Statistics
Publication date: 01/10/2016

The R package ClickClust is a new piece of software devoted to finite mixture modeling and model-based clustering of categorical sequences. As a special kind of time series, categorical sequences, also known as categorical time series, exhibit a time-dependent nature and are traditionally modeled by means of Markov chains. Clustering categorical sequences is an important problem with many applications; one of the best known is grouping sequences of visited sites or web pages, also known as clickstreams, which helps discover common navigation patterns and routes taken by users. This popular application is recognized in the package title, ClickClust. The paper discusses the methodological and algorithmic foundations of the package, which are based on finite mixtures of Markov models. The number of Markov chain states can often be large, leading to high-dimensional transition probability matrices, and the high number of model parameters can severely affect clustering performance. As a remedy, backward and forward selection algorithms are proposed for grouping states, which extends the original clustering problem to a biclustering framework. Other capabilities of ClickClust include estimation of the variance-covariance matrix of the model parameter estimates, prediction of future states visited, and construction of a display named click-plot that helps illustrate the obtained clustering solutions. All available functions and the utility of the package are thoroughly discussed and illustrated with multiple examples.

Directory of Open Access Journals

Journal of Statistical Software

Some theoretical contributions to the evaluation and assessment of finite mixture models with applications

Author: Melnykov Volodymyr
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2009

This dissertation develops theory and methodology for the evaluation and assessment of finite mixture models. New methods are developed for simulating finite mixture models that satisfy a pre-specified level of complexity, defined through the notion of pairwise overlap. The corresponding software is publicly available on CRAN. The dissertation also develops methodology for assessing significance in finite mixture models, with applications to model-based unsupervised and semi-supervised clustering frameworks. It concludes with an application of finite mixture models to two-dimensional gel electrophoresis.

Digital Repository @ Iowa State University (ISU)

Assessing Significance in Finite Mixture Models

Authors: Maitra Ranjan, Melnykov Volodymyr
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2018

A new method is proposed to quantify significance in finite mixture models. The basis for this methodology is an approach that calculates the p-value for testing a simpler model against a more complicated one in a way that circumvents the failure of the regularity conditions for likelihood ratio tests. The testing procedure allows pairwise comparison of any two mixture models, with failure to reject the null hypothesis implying an insignificant likelihood improvement under the more complex model. This leads to a comprehensive tool called a quantitation map, which displays significance and quantitatively summarizes all model comparisons. Among other applications, the map can be used to choose the best among a set of candidate mixture models. The performance of the procedure is illustrated on some classification datasets and in a comprehensive simulation study. The methodology is also applied to a study of voting preferences of senators in the 109th US Congress. Although the development of our testing strategy is based on large-sample theory, we note that it has impressive performance even in cases with moderate sample sizes.

Digital Repository @ Iowa State University (ISU)

Finite mixture models and model-based clustering

Authors: Maitra Ranjan, Melnykov Volodymyr
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2010

Finite mixture models have a long history in statistics, having been used to model population heterogeneity, to generalize distributional assumptions, and, lately, to provide a convenient yet formal framework for clustering and classification. This paper provides a detailed review of mixture models and model-based clustering. Recent trends as well as open problems in the area are also discussed.

Digital Repository @ Iowa State University (ISU)

CARP: Software for Fishing Out Good Clustering Algorithms

Authors: Maitra Ranjan, Melnykov Volodymyr
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2011

This paper presents the CLUSTERING ALGORITHMS’ REFEREE PACKAGE, or CARP, an open source GNU GPL-licensed C package for evaluating clustering algorithms. Calibrating the performance of such algorithms is important, and CARP addresses this need by generating datasets of different clustering complexity and by assessing the performance of the algorithm under evaluation in terms of its ability to classify each dataset relative to the true grouping. This paper briefly describes the software and its capabilities.

Digital Repository @ Iowa State University (ISU)

Unforgettable Luminaries of Printing

Author: Melnykov O. V. (Мельников О. В.)
Publication venue: Київ
Publication date: 01/01/2009

Electronic Archive of Kyiv Polytechnic Institute

Formalization of the Subject Domain of Managing the Publishing and Printing Industry

Author: Melnykov O. V. (Мельников О. В.)
Publication venue: Київ
Publication date: 01/01/2013

An information-logical model of the publishing and printing industry (PPI) is presented. It comprises a scheme for building an analytical data bank for the PPI and a set of structured databases.

Electronic Archive of Kyiv Polytechnic Institute

A Time of Work and Achievements (on the 80th Anniversary of the Ukrainian Academy of Printing)

Author: Melnykov O. V. (Мельников О. В.)
Publication venue: Київ
Publication date: 01/01/2010

Based on an analysis of archival sources, individual editions, and publications in periodicals and serials, the paper presents the organizational structure of the Ukrainian Academy of Printing and the specialities in which it trained specialists during 1930-2010.

Electronic Archive of Kyiv Polytechnic Institute