A review of domain adaptation without target labels

Author: Kouw Wouter M.
Loog Marco
Publication date: 01/01/2019

Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research. Comment: 20 pages, 5 figures
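
The sample-based idea in this abstract, weighting source observations by their importance to the target domain, can be illustrated with a minimal sketch. It is not the review's own method. It assumes one-dimensional data whose source and target densities are known Gaussians (in practice the density ratio has to be estimated), and the names `gaussian_pdf`, `importance_weights`, and `weighted_risk` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(source_x, source_mean, source_std, target_mean, target_std):
    """Weight each source sample by p_target(x) / p_source(x).

    Assumption: both densities are known Gaussians; this is only for
    illustration and is not taken from the reviewed paper.
    """
    return [
        gaussian_pdf(x, target_mean, target_std)
        / gaussian_pdf(x, source_mean, source_std)
        for x in source_x
    ]


def weighted_risk(losses, weights):
    """Self-normalized importance-weighted average of per-sample losses."""
    total = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / total


# Source samples drawn around 0, target domain centered at 1.
source_x = [-1.0, 0.0, 1.0, 2.0]
weights = importance_weights(source_x, 0.0, 1.0, 1.0, 1.0)

# A loss incurred only on the sample farthest from the target domain
# counts for less after reweighting than in the plain average (0.25).
losses = [1.0, 0.0, 0.0, 0.0]
print(weighted_risk(losses, weights))
```

Samples that are more probable under the target distribution receive larger weights, so errors on them dominate the estimated target risk.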

arXiv.org e-Print Archive

Crossref

A Goal-Directed Bayesian Framework for Categorization

Author: Acuna
Anderson
Anthony
Ashby
Barsalou
Barto
Bishop
Botvinick
Bouton
Caramazza
Chater
Clark
Collins
Courville
Dayan
Dickinson
Dolan
Estes
FitzGerald
Friston
Gershman
Griffiths
Hinton
Hobson
Hoeting
Hohwy
Homa
Kauffman
Knill
Lamberts
Maddox
McClelland
Miller
Mirza
Nosofsky
Oaksford
Pezzulo
Rigoli
Roach
Rosch
Rosenblueth
Rosenman
Shenhav
Sjöberg
Smith
Solway
Squire
Stoianov
Tononi
Traulsen
Tulving
Warrington
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2017

Categorization is a fundamental ability for efficient behavioral control. It allows organisms to remember the correct responses to categorical cues rather than to every stimulus encountered (hence avoiding computational cost or complexity), and to generalize appropriate responses to novel stimuli depending on category assignment. Assuming the brain performs Bayesian inference, based on a generative model of the external world and future goals, we propose a computational model of categorization in which important properties emerge. These properties comprise the ability to infer latent causes of sensory experience, a hierarchical organization of latent causes, and an explicit inclusion of context and action representations. Crucially, these aspects derive from considering the environmental statistics that are relevant to achieving goals, and from the fundamental Bayesian principle that any generative model should be preferred over alternative models based on an accuracy-complexity trade-off. Our account is a step toward elucidating computational principles of categorization and its role within the Bayesian brain hypothesis.

City Research Online

Crossref

Frontiers - Publisher Connector

PubMed Central

UCL Discovery

MPG.PuRe

Multilayer Aggregation with Statistical Validation: Application to Investor Networks

Author: Baltakys Kęstutis
Emmert-Streib Frank
Kanniainen Juho
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/05/2018

Multilayer networks are attracting growing attention in many fields, including finance. In this paper, we develop a new tractable procedure for multilayer aggregation based on statistical validation, which we apply to investor networks. Moreover, we propose two other improvements to their analysis: transaction bootstrapping and investor categorization. The aggregation procedure can be used to integrate security-wise and time-wise information about investor trading networks, but it is not limited to finance. In fact, it can be used for different applications, such as gene, transportation, and social networks, whether they are inferred or observable. Additionally, in the investor network inference, we use transaction bootstrapping for better statistical validation. Investor categorization allows for constant-size networks and more observations for each node, which is important in the inference, especially for less liquid securities. Furthermore, we observe that the window size used for averaging has a substantial effect on the number of inferred relationships. We apply this procedure by analyzing a unique data set of Finnish shareholders during the period 2004-2009. We find that households in the capital have high centrality in investor networks, which, under the theory of information channels in investor networks, suggests that they are well-informed investors.

arXiv.org e-Print Archive

Trepo - Institutional Repository of Tampere University

A probabilistic model of cross-categorization

Author: Kemp Charles
Mansinghka Vikash K.
Shafto Patrick
Tenenbaum Joshua B.
Publication venue: 'Elsevier BV'
Publication date: 01/02/2011

Most natural domains can be represented in multiple ways: we can categorize foods in terms of their nutritional content or social role, animals in terms of their taxonomic groupings or their ecological niches, and musical instruments in terms of their taxonomic categories or social uses. Previous approaches to modeling human categorization have largely ignored the problem of cross-categorization, focusing on learning just a single system of categories that explains all of the features. Cross-categorization presents a difficult problem: how can we infer categories without first knowing which features the categories are meant to explain? We present a novel model that suggests that human cross-categorization is a result of joint inference about multiple systems of categories and the features that they explain. We also formalize two commonly proposed alternative explanations for cross-categorization behavior: a features-first and an objects-first approach. The features-first approach suggests that cross-categorization is a consequence of attentional processes, where features are selected by an attentional mechanism first and categories are derived second. The objects-first approach suggests that cross-categorization is a consequence of repeated, sequential attempts to explain features, where categories are derived first, then features that are poorly explained are recategorized. We present two sets of simulations and experiments testing the models’ predictions about human categorization. We find that an approach based on joint inference provides the best fit to human categorization behavior, and we suggest that a full account of human category learning will need to incorporate something akin to these capabilities.

DSpace@MIT

Algorithms for item categorization based on ordinal ranking data

Author: Aeron Shuchin
Girson Josh
Publication date: 29/09/2016

We present a new method for identifying the latent categorization of items based on their rankings. Complementing a recent work that uses a Dirichlet prior on preference vectors and variational inference, we show that this problem can be effectively dealt with using existing community detection algorithms, with the communities corresponding to item categories. In particular, we convert the bipartite ranking data to a unipartite graph of item affinities and apply community detection algorithms. In this context, we modify an existing algorithm, namely the label propagation algorithm, into a variant that uses the distance between the nodes to weight the label propagation, in order to identify the categories. We propose and analyze a synthetic ordinal ranking model and show its relation to the recently much-studied stochastic block model. We test our algorithms on synthetic data and compare performance with several popular community detection algorithms. We also test the method on real data sets of movie categorization from the MovieLens database. In all of the cases, our algorithm is able to identify the categories for a suitable choice of tuning parameter. Comment: To appear in IEEE Allerton conference on computing, communications and control, 201
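
The pipeline described in this abstract, projecting ranking data onto an item-affinity graph and then running community detection, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm. The affinity used here (how often two items appear together in a ranker's top `top_k`) is a hypothetical choice, and the propagation step is plain edge-weighted label propagation with deterministic tie-breaking, not the distance-weighted variant the paper proposes.

```python
from collections import defaultdict
from itertools import combinations


def item_affinities(rankings, top_k):
    """Project rankings onto an item-item graph.

    Assumed affinity: number of rankers who place both items in their
    top_k. The abstract does not specify the affinity; this is a stand-in.
    """
    affinity = defaultdict(lambda: defaultdict(int))
    for ranking in rankings:
        for a, b in combinations(sorted(ranking[:top_k]), 2):
            affinity[a][b] += 1
            affinity[b][a] += 1
    return affinity


def label_propagation(items, affinity, max_sweeps=100):
    """Plain weighted label propagation with deterministic tie-breaking."""
    labels = {item: item for item in items}  # every item starts alone
    for _ in range(max_sweeps):
        changed = False
        for item in sorted(items):
            # Sum the edge weights carried by each neighboring label.
            scores = defaultdict(int)
            for neighbor, weight in affinity[item].items():
                scores[labels[neighbor]] += weight
            if not scores:
                continue  # isolated item keeps its own label
            best = max(scores.values())
            candidates = sorted(l for l, s in scores.items() if s == best)
            # Keep the current label on a tie, else take the smallest.
            if labels[item] not in candidates:
                labels[item] = candidates[0]
                changed = True
        if not changed:
            break
    return labels


# Toy data: two rankers favor A, B, C and two favor D, E, F.
rankings = [
    ["A", "B", "C", "D", "E", "F"],
    ["B", "A", "C", "E", "D", "F"],
    ["D", "E", "F", "A", "B", "C"],
    ["F", "E", "D", "C", "B", "A"],
]
items = ["A", "B", "C", "D", "E", "F"]
labels = label_propagation(items, item_affinities(rankings, top_k=3))
print(labels)
```

On this toy input the items split into two groups, one containing A, B, and C and the other containing D, E, and F. The `top_k` cutoff plays the role of a tuning parameter in this sketch only.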

arXiv.org e-Print Archive

Crossref