Search CORE

28 research outputs found

Transductive-Inductive Cluster Approximation Via Multivariate Chebyshev Inequality

Author: Sinha Shriprakash
Publication date: 19/06/2012

Approximating an adequate number of clusters in multidimensional data is an open area of research, given a level of compromise on the quality of acceptable results. The manuscript addresses the issue by formulating a transductive-inductive learning algorithm that uses the multivariate Chebyshev inequality. Considering the clustering problem in imaging, theoretical proofs for a particular level of compromise are derived to show that the reconstruction error converges to a finite value as (a) the number of unseen examples and (b) the number of clusters increase, respectively. Upper bounds for these error rates are also proved. Non-parametric estimates of these errors from a random sample of sequences empirically point to a stable number of clusters. Lastly, the algorithm can be generalized to multidimensional data sets from different fields. Comment: 16 pages, 5 figures
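For orientation, the general multivariate Chebyshev inequality states that a random vector X in R^N with mean μ and positive-definite covariance Σ satisfies P((X − μ)ᵀ Σ⁻¹ (X − μ) ≥ t²) ≤ N / t². The sketch below checks that bound empirically on synthetic data. It is only an illustration of the inequality, not the clustering algorithm from the manuscript; the function names, the Gaussian test data, and the diagonal covariance are all assumptions made for the example.

```python
import random


def mahalanobis_sq_diag(x, mean, var):
    """Squared Mahalanobis distance, assuming a diagonal covariance matrix."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def chebyshev_bound(dim, t):
    """Upper bound N / t^2 on the tail probability, capped at 1."""
    return min(1.0, dim / (t * t))


def empirical_tail(samples, mean, var, t):
    """Fraction of samples whose squared Mahalanobis distance is at least t^2."""
    hits = sum(1 for x in samples if mahalanobis_sq_diag(x, mean, var) >= t * t)
    return hits / len(samples)


# Hypothetical test data: independent Gaussian coordinates with known variances.
rng = random.Random(0)
mean = [0.0, 0.0, 0.0]
var = [1.0, 4.0, 0.25]
samples = [
    [rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
    for _ in range(20000)
]

for t in (2.0, 3.0, 4.0):
    tail = empirical_tail(samples, mean, var, t)
    bound = chebyshev_bound(len(mean), t)
    print(f"t={t}: empirical tail={tail:.4f}, Chebyshev bound={bound:.4f}")
```

The bound holds for any distribution with finite second moments, so it is typically loose for well-behaved data such as the Gaussian samples used here.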

arXiv.org e-Print Archive

CiteSeerX

Conceptual Foundations and Methodology for Creating an Inductive Technology of Objective Clustering

Author: Бабичев С.А.
Publication venue: Міжнародний науково-навчальний центр інформаційних технологій і систем НАН та МОН України
Publication date: 01/01/2016

The paper presents theoretical developments toward a methodology for the objective clustering of objects of complex nature, based on methods of inductive modeling of complex systems. An architecture for the inductive technology of objective clustering is developed in the form of a detailed scheme for the step-by-step implementation of the inductive modeling procedure for clustering objects of complex nature.

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

Nearly maximally predictive features and their dimensions

Authors: Crutchfield James P.; Marzen Sarah E.
Publication venue: 'American Physical Society (APS)'
Publication date: 27/02/2017

Scientific explanation often requires inferring maximally predictive features from a given data set. Unfortunately, the collection of minimal maximally predictive features for most stochastic processes is uncountably infinite. In such cases, one compromises and instead seeks nearly maximally predictive features. Here, we derive upper bounds on the rates at which the number and the coding cost of nearly maximally predictive features scale with desired predictive power. The rates are determined by the fractal dimensions of a process' mixed-state distribution. These results, in turn, show how widely used finite-order Markov models can fail as predictors and that mixed-state predictive features can offer a substantial improvement. Funding: United States Army Research Office (W911NF-13-1-0390, W911NF-12-1-0288)

arXiv.org e-Print Archive

DSpace@MIT

Crossref

eScholarship - University of California

Partition Decoupling for Multi-gene Analysis of Gene Expression Profiling Data

Authors: Braun Rosemary; Leibon Gregory; Pauls Scott; Rockmore Daniel
Publication date: 01/01/2011

We present the extension and application of a new unsupervised statistical learning technique, the Partition Decoupling Method (PDM), to gene expression data. Because it can reveal non-linear and non-convex geometries present in the data, the PDM is an improvement over typical gene expression analysis algorithms, permitting a multi-gene analysis that can reveal phenotypic differences even when the individual genes do not exhibit differential expression. Here, we apply the PDM to publicly available gene expression data sets and demonstrate that we are able to identify cell types and treatments with higher accuracy than is obtained through other approaches. By applying it in a pathway-by-pathway fashion, we demonstrate how the PDM may be used to find sets of mechanistically related genes that discriminate phenotypes. Comment: Revise

arXiv.org e-Print Archive

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

Image Segmentation using Sparse Subset Selection

Authors: Kamangar Farhad; Kheirandishfard Mohsen; Zohrizadeh Fariba
Publication date: 08/04/2018

In this paper, we present a new image segmentation method based on the concept of sparse subset selection. Starting with an over-segmentation, we adopt local spectral histogram features to encode the visual information of the small segments into high-dimensional vectors, called superpixel features. Then, the superpixel features are fed into a novel convex model that efficiently leverages the features to group the superpixels into a proper number of coherent regions. Our model automatically determines the optimal number of coherent regions and the assignment of superpixels that shape the final segments. To solve our model, we propose a numerical algorithm based on the alternating direction method of multipliers (ADMM), whose iterations consist of two highly parallelizable sub-problems. We show that each sub-problem has a closed-form solution, which makes the ADMM iterations computationally very efficient. Extensive experiments on benchmark image segmentation datasets demonstrate that our proposed method, in combination with an over-segmentation, can provide high-quality results that are competitive with existing state-of-the-art methods. Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 201
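To illustrate what "two sub-problems with closed-form solutions" can look like in an ADMM iteration, here is a minimal sketch on a toy problem: minimize 0.5·||x − a||² + λ·||z||₁ subject to x = z. This is not the convex segmentation model from the paper, which the abstract does not specify; the problem, the function names, and the parameter values are assumptions chosen only because both updates are elementwise and therefore trivially parallelizable.

```python
def soft_threshold(v, kappa):
    """Closed-form proximal operator of kappa * |v| (scalar soft-thresholding)."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1_denoise(a, lam, rho=1.0, iters=200):
    """Scaled-form ADMM for: min 0.5*||x - a||^2 + lam*||z||_1  s.t.  x = z."""
    n = len(a)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: closed-form minimizer of a quadratic, one coordinate at a time.
        x = [(ai + rho * (zi - ui)) / (1.0 + rho) for ai, zi, ui in zip(a, z, u)]
        # z-update: closed-form soft-thresholding, also elementwise.
        z = [soft_threshold(xi + ui, lam / rho) for xi, ui in zip(x, u)]
        # Dual update: accumulate the constraint residual x - z.
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z


# Hypothetical input vector; the iterates approach soft_threshold(a_i, lam).
result = admm_l1_denoise([3.0, 0.4, -2.5], lam=1.0)
print(result)
```

For this toy problem the exact minimizer is known (soft-thresholding of each entry of a by λ), which makes it a convenient check that the iteration is implemented correctly.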

arXiv.org e-Print Archive

Crossref