Search CORE

28,897 research outputs found

The zero exemplar distance problem

Author: D. Sankoff
G. Blin
J.E. Hopcroft
M.R. Garey
P. Bonizzoni
R.G. Downey
S. Angibaud
S.S. Adi
V. Ferretti
Z. Chen
Z. Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

Given two genomes with duplicate genes, \textsc{Zero Exemplar Distance} is the problem of deciding whether the two genomes can be reduced to the same genome without duplicate genes by deleting all but one copy of each gene in each genome. Blin, Fertin, Sikora, and Vialette recently proved that \textsc{Zero Exemplar Distance} for monochromosomal genomes is NP-hard even if each gene appears at most two times in each genome, thereby settling an important open question on genome rearrangement in the exemplar model. In this paper, we give a very simple alternative proof of this result. We also study the problem \textsc{Zero Exemplar Distance} for multichromosomal genomes without gene order, and prove the analogous result that it is also NP-hard even if each gene appears at most two times in each genome. For the positive direction, we show that both variants of \textsc{Zero Exemplar Distance} admit polynomial-time algorithms if each gene appears exactly once in one genome and at least once in the other genome. In addition, we present a polynomial-time algorithm for the related problem \textsc{Exemplar Longest Common Subsequence} in the special case that each mandatory symbol appears exactly once in one input sequence and at least once in the other input sequence. This answers an open question of Bonizzoni et al. We also show that \textsc{Zero Exemplar Distance} for multichromosomal genomes without gene order is fixed-parameter tractable if the parameter is the maximum number of chromosomes in each genome.Comment: Strengthened and reorganize

arXiv.org e-Print Archive

CiteSeerX

Crossref

Naive Bayes and Exemplar-Based approaches to Word Sense Disambiguation Revisited

Author: Escudero Gerard
Marquez Lluis
Rigau German
Publication venue
Publication date: 01/01/2000
Field of study

This paper describes an experimental comparison between two standard supervised learning methods, namely Naive Bayes and Exemplar-based classification, on the Word Sense Disambiguation (WSD) problem. The aim of the work is twofold. Firstly, it attempts to contribute to clarify some confusing information about the comparison between both methods appearing in the related literature. In doing so, several directions have been explored, including: testing several modifications of the basic learning algorithms and varying the feature space. Secondly, an improvement of both algorithms is proposed, in order to deal with large attribute sets. This modification, which basically consists in using only the positive information appearing in the examples, allows to improve greatly the efficiency of the methods, with no loss in accuracy. The experiments have been performed on the largest sense-tagged corpus available containing the most frequent and ambiguous English words. Results show that the Exemplar-based approach to WSD is generally superior to the Bayesian approach, especially when a specific metric for dealing with symbolic attributes is used.Comment: 5 page

arXiv.org e-Print Archive

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Clustering by soft-constraint affinity propagation: Applications to gene-expression data

Author: Alizadeh
Blatt
Braunstein
Golub
M. Leone
M. Weigt
Pomeroy
Sumedha
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2007
Field of study

Motivation: Similarity-measure based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP) based on message-passing techniques was proposed by Frey and Dueck \cite{Frey07}. In AP, each cluster is identified by a common exemplar all other data points of the same cluster refer to, and exemplars have to refer to themselves. Albeit its proved power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters, and leads to suboptimal performance, {\it e.g.}, in analyzing gene expression data. Results: This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows to interpolate between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) becomes more informative, accurate and leads to more stable clustering. Even though a new {\it a priori} free-parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Further on, it allows to extract sparse gene expression signatures for each cluster.Comment: 11 pages, supplementary material: http://isiosf.isi.it/~weigt/scap_supplement.pd

arXiv.org e-Print Archive

CiteSeerX

Crossref

Learning feed-forward one-shot learners

Author: Bertinetto Luca
Henriques João F.
Torr Philip H. S.
Valmadre Jack
Vedaldi Andrea
Publication venue
Publication date: 01/01/2016
Field of study

One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a second deep network, called a learnet, which predicts the parameters of a pupil network from a single exemplar. In this manner we obtain an efficient feed-forward one-shot learner, trained end-to-end by minimizing a one-shot classification objective in a learning to learn formulation. In order to make the construction feasible, we propose a number of factorizations of the parameters of the pupil network. We demonstrate encouraging results by learning characters from single exemplars in Omniglot, and by tracking visual objects from a single initial exemplar in the Visual Object Tracking benchmark.Comment: The first three authors contributed equally, and are listed in alphabetical orde

arXiv.org e-Print Archive

Oxford University Research Archive

Secondary generalisation in categorisation: an exemplar-based account

Author: De Schryver Maarten
Rosseel Yves
Publication venue
Publication date: 01/01/2010
Field of study

The parallel rule activation and rule synthesis (PRAS) model is a computational model for generalisation in category learning, proposed by Vandierendonck (1995). An important concept underlying the PRAS model is the distinction between primary and secondary generalisation. In Vandierendonck (1995), an empirical study is reported that provides support for the concept of secondary generalisation. In this paper, we re-analyse the data reported by Vandierendonck (1995) by fitting three different variants of the Generalised Context Model (GCM) which do not rely on secondary generalisation. Although some of the GCM variants outperformed the PRAS model in terms of global fit, they all have difficulty in providing a qualitatively good fit of a specific critical pattern

Ghent University Academic Bibliography

Directory of Open Access Journals