
5,142 research outputs found

FAME: Face Association through Model Evolution

Author: Duygulu Pinar
Golge Eren
Publication date: 10/07/2014

We attack the problem of learning face models for public faces from weakly-labelled images collected from the web by querying a name. The data is very noisy even after face detection, with several irrelevant faces corresponding to other people. We propose a novel method, Face Association through Model Evolution (FAME), that prunes the data iteratively so that the face models associated with a name evolve. The idea is based on capturing the discriminativeness and representativeness of each instance and eliminating the outliers. The final models are used to classify faces on novel datasets with possibly different characteristics. On benchmark datasets, our results are comparable to or better than state-of-the-art studies for the task of face identification. Comment: Draft version of the study

arXiv.org e-Print Archive

Crossref

Learning Mixtures of Bernoulli Templates by Two-Round EM with Performance Guarantee

Author: Barbu Adrian
Wu Tianfu
Wu Ying Nian
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

Dasgupta and Shulman showed that a two-round variant of the EM algorithm can learn a mixture of Gaussian distributions with near optimal precision with high probability if the Gaussian distributions are well separated and if the dimension is sufficiently high. In this paper, we generalize their theory to learning mixtures of high-dimensional Bernoulli templates. Each template is a binary vector, and a template generates examples by randomly switching its binary components independently with a certain probability. In computer vision applications, a binary vector is a feature map of an image, where each binary component indicates whether a local feature or structure is present or absent within a certain cell of the image domain. A Bernoulli template can be considered as a statistical model for images of objects (or parts of objects) from the same category. We show that the two-round EM algorithm can learn a mixture of Bernoulli templates with near optimal precision with high probability, if the Bernoulli templates are sufficiently different and if the number of features is sufficiently high. We illustrate the theoretical results by synthetic and real examples. Comment: 27 pages, 8 figures
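The generative model described in this abstract can be illustrated with a short Python sketch. It only shows how a Bernoulli template produces noisy examples by independent bit flips; it is not the two-round EM algorithm from the paper. The function names, the single shared flip probability, and the mixture weights are assumptions made for illustration.

```python
import random


def sample_from_template(template, flip_prob, rng):
    """Return a noisy copy of a binary template.

    Each component is switched independently with probability flip_prob
    (hypothetical parameter name; the abstract only says "a certain probability").
    """
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in template]


def sample_mixture(templates, weights, flip_prob, n, rng):
    """Draw n (component_index, example) pairs from a mixture of templates."""
    examples = []
    for _ in range(n):
        # Pick a mixture component, then corrupt its template.
        k = rng.choices(range(len(templates)), weights=weights)[0]
        examples.append((k, sample_from_template(templates[k], flip_prob, rng)))
    return examples


if __name__ == "__main__":
    rng = random.Random(0)
    templates = [[1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 0, 0, 1, 1, 0, 1]]
    for k, x in sample_mixture(templates, [0.5, 0.5], 0.1, 5, rng):
        print(k, x)
```

With a small flip probability and well-separated templates, most examples stay close in Hamming distance to the template that generated them, which is the regime the abstract's separation condition refers to.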

arXiv.org e-Print Archive

Crossref

An investigation into the performance and representation of a stochastic evolutionary neural tree

Author: Adams R.G.
Butchart K.
Davey N.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/1997

Copyright Springer. The Stochastic Competitive Evolutionary Neural Tree (SCENT) is a new unsupervised neural net that dynamically evolves a representational structure in response to its training data. Uniquely, SCENT requires no initial parameter setting, as it autonomously creates appropriate parameterisation at runtime. Pruning and convergence are stochastically controlled using locally calculated heuristics. A thorough investigation into the performance of SCENT is presented. The network is compared to other dynamic tree-based models and to a high-quality flat clusterer over a variety of data sets and runs.

University of Hertfordshire Research Archive

Interpretable Clustering using Unsupervised Binary Trees

Author: Fraiman Ricardo
Ghattas Badih
Svarc Marcela
Publication date: 01/01/2011

We herein introduce a new method of interpretable clustering that uses unsupervised binary trees. It is a three-stage procedure, the first stage of which entails a series of recursive binary splits to reduce the heterogeneity of the data within the new subsamples. During the second stage (pruning), consideration is given to whether adjacent nodes can be aggregated. Finally, during the third stage (joining), similar clusters are joined together, even if they do not descend from the same node originally. Consistency results are obtained, and the procedure is used on simulated and real data sets. Comment: 25 pages, 6 figures
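The first stage described above (a binary split chosen to reduce heterogeneity) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: heterogeneity is measured here as the within-group sum of squared deviations, splits are axis-aligned thresholds, and the names `sse` and `best_split` are hypothetical. The pruning and joining stages are not shown.

```python
def sse(values):
    """Sum of squared deviations from the mean (assumed heterogeneity measure)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def heterogeneity(points):
    """Total heterogeneity of a group of points, summed over all features."""
    if not points:
        return 0.0
    return sum(sse([p[j] for p in points]) for j in range(len(points[0])))


def best_split(points):
    """Find the axis-aligned split that most reduces heterogeneity.

    Returns (feature_index, threshold, gain), or None if no split exists.
    """
    best = None
    total = heterogeneity(points)
    for j in range(len(points[0])):
        values = sorted(set(p[j] for p in points))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [p for p in points if p[j] <= threshold]
            right = [p for p in points if p[j] > threshold]
            gain = total - (heterogeneity(left) + heterogeneity(right))
            if best is None or gain > best[2]:
                best = (j, threshold, gain)
    return best


if __name__ == "__main__":
    data = [(0.0, 5.0), (0.1, 5.0), (10.0, 5.0), (10.1, 5.0)]
    print(best_split(data))
```

Applying `best_split` recursively to each resulting subsample would grow the binary tree; a stopping rule (for example a minimum gain or minimum node size) would be needed and is left out of this sketch.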

arXiv.org e-Print Archive

Biblioteca Max von Buch, Universidad de San Andrés

Survey of data mining approaches to user modeling for adaptive hypermedia

Author: Chen SY
Frias-Martinez E
Liu X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.

CiteSeerX

Crossref

Brunel University Research Archive

Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging

Author: Das Dipanjan
McDonald Ryan
Nivre Joakim
Petrov Slav
Täckström Oscar
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2013

We consider the construction of part-of-speech taggers for resource-poor languages. Recently, manually constructed tag dictionaries from Wiktionary and dictionaries projected via bitext have been used as type constraints to overcome the scarcity of annotated data in this setting. In this paper, we show that additional token constraints can be projected from a resource-rich source language to a resource-poor target language via word-aligned bitext. We present several models to this end; in particular a partially observed conditional random field model, where coupled token and type constraints provide a partial signal for training. Averaged across eight previously studied Indo-European languages, our model achieves a 25% relative error reduction over the prior state of the art. We further present successful results on seven additional languages from different families, empirically demonstrating the applicability of coupled token and type constraints across a diverse set of languages.
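To make the distinction between type and token constraints concrete, here is a minimal Python sketch. A type constraint restricts a word form to the tags listed for it in a tag dictionary; a token constraint is a tag projected onto one specific occurrence via word alignment. The coupling rule below is a hypothetical illustration, not necessarily the rule used in the paper, and the function name, arguments, and example tags are all assumptions.

```python
def allowed_tags(word, projected_tag, tag_dictionary, all_tags):
    """Return the set of tags permitted for one token.

    word           -- the target-language word form
    projected_tag  -- tag projected from the aligned source token, or None
    tag_dictionary -- maps word forms to sets of permitted tags (type constraints)
    all_tags       -- the full tag set, used when nothing constrains the token
    """
    type_tags = tag_dictionary.get(word)
    if type_tags is None:
        # No type constraint: rely on the token constraint if there is one.
        return {projected_tag} if projected_tag is not None else set(all_tags)
    if projected_tag in type_tags:
        # Token and type constraints agree: narrow to the projected tag.
        return {projected_tag}
    # Projection is missing or conflicts with the dictionary: keep the type constraint.
    return set(type_tags)


if __name__ == "__main__":
    tags = {"NOUN", "VERB", "ADJ", "DET"}
    dictionary = {"book": {"NOUN", "VERB"}}
    print(allowed_tags("book", "VERB", dictionary, tags))
    print(allowed_tags("book", "ADJ", dictionary, tags))
    print(allowed_tags("zebra", None, dictionary, tags))
```

The resulting per-token tag sets are the kind of partial signal that a partially observed model could be trained against: the correct tag is assumed to lie somewhere in the set rather than being observed directly.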

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database