Semi-supervised model-based clustering with controlled clusters leakage

Authors: Struski Łukasz, Tabor Jacek, Śmieja Marek
Publication date: 01/01/2017

In this paper, we focus on finding clusters in partially categorized data sets. We propose a semi-supervised version of the Gaussian mixture model, called C3L, which retrieves natural subgroups of given categories. In contrast to other semi-supervised models, C3L is parametrized by a user-defined leakage level, which controls the maximal inconsistency between the initial categorization and the resulting clustering. Our method can be implemented as a module in practical expert systems to detect clusters that combine expert knowledge with the true distribution of the data. Moreover, it can be used to improve the results of less flexible clustering techniques, such as projection pursuit clustering. The paper presents an extensive theoretical analysis of the model and a fast algorithm for its optimization. Experimental results show that C3L finds a high-quality clustering model, which can be applied to discover meaningful groups in partially classified data.

arXiv.org e-Print Archive

Jagiellonian University Repository

Learning with Augmented Features for Heterogeneous Domain Adaptation

Authors: Duan Lixin, Tsang Ivor, Xu Dong
Publication date: 18/06/2012

We propose a new learning method for heterogeneous domain adaptation (HDA), in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions. Using two different projection matrices, we first transform the data from the two domains into a common subspace in order to measure the similarity between them. We then propose two new feature mapping functions to augment the transformed data with their original features and zeros. Existing learning methods (e.g., SVM and SVR) can be readily combined with our newly proposed augmented feature representations to effectively utilize the data from both domains for HDA. Using the hinge loss function in SVM as an example, we introduce the detailed objective function of our method, called Heterogeneous Feature Augmentation (HFA), for the linear case and also describe its kernelization in order to cope efficiently with very high-dimensional data. Moreover, we develop an alternating optimization algorithm to effectively solve the nontrivial optimization problem in our HFA method. Comprehensive experiments on two benchmark datasets clearly demonstrate that HFA outperforms the existing HDA methods. Comment: ICML201

arXiv.org e-Print Archive

CiteSeerX

OPUS - University of Technology Sydney
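
The feature augmentation described in the HFA abstract above can be illustrated with a minimal sketch. It assumes the source map has the form [P x_s, x_s, 0] and the target map the form [Q x_t, 0, x_t], where the zero blocks pad each vector to a shared length. The function names, the pure-Python matrix helper, and the fixed toy matrices P and Q are all hypothetical choices made for illustration. In the method itself the projections are learned together with the classifier, which this sketch does not attempt.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def augment_source(P: Matrix, x_s: Vector, d_t: int) -> Vector:
    """Assumed source map: projected part, original features, d_t zeros."""
    return matvec(P, x_s) + list(x_s) + [0.0] * d_t


def augment_target(Q: Matrix, x_t: Vector, d_s: int) -> Vector:
    """Assumed target map: projected part, d_s zeros, original features."""
    return matvec(Q, x_t) + [0.0] * d_s + list(x_t)


# Toy data: 3-dimensional source features, 2-dimensional target features,
# and a 2-dimensional common subspace. P and Q are fixed by hand here
# (an assumption); they are not learned.
P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
Q = [[0.5, 0.0], [0.0, 0.5]]
x_s = [1.0, 2.0, 3.0]
x_t = [4.0, 6.0]

phi_s = augment_source(P, x_s, d_t=len(x_t))
phi_t = augment_target(Q, x_t, d_s=len(x_s))

# Cross-domain similarity under a linear kernel on the augmented vectors.
cross_similarity = sum(a * b for a, b in zip(phi_s, phi_t))
```

Both augmented vectors have the same length (common subspace dimension plus both original dimensions), so a standard classifier such as an SVM can consume samples from either domain. Because the zero blocks do not overlap with the other domain's original features, the linear cross-domain similarity depends only on the projected parts.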

Using bag-of-concepts to improve the performance of support vector machines in text categorization

Authors: Cöster Rickard, Sahlgren Magnus
Publication date: 01/01/2004

This paper investigates the use of concept-based representations for text categorization. We introduce a new approach for creating concept-based text representations and apply it to a standard text categorization collection. The representations are used as input to a Support Vector Machine classifier, and the results show that there are certain categories for which concept-based representations constitute a viable supplement to word-based ones. We also demonstrate how the performance of the Support Vector Machine can be improved by combining representations.

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Microgenesis, immediate experience and visual processes in reading

Author: Rosenthal Victor
Publication venue: Kluwer
Publication date: 01/01/2002

The concept of microgenesis refers to the development, on a brief present-time scale, of a percept, a thought, an object of imagination, or an expression. It defines the occurrence of immediate experience as dynamic unfolding and differentiation in which the ‘germ’ of the final experience is already embodied in the early stages of its development. Immediate experience typically concerns the focal experience of an object that is thematized as a ‘figure’ in the global field of consciousness; this can involve a percept, thought, object of imagination, or expression (verbal and/or gestural). Yet, whatever its modality or content, focal experience is postulated to develop and stabilize through dynamic differentiation and unfolding. Such a microgenetic description of immediate experience substantiates a phenomenological and genetic theory of cognition in which any process of perception, thought, expression or imagination is primarily a process of genetic differentiation and development, rather than one of detection (of a stimulus array or information), transformation, and integration (of multiple primitive components), as theories of the cognitivist kind have contended. My purpose in this essay is to provide an overview of the main constructs of microgenetic theory, to outline its potential avenues of future development in the field of cognitive science, and to illustrate an application of the theory to research, using visual processes in reading as an example.

CogPrints Cognitive Sciences Eprint Archive