CORE
🇺🇦
make metadata, not war
Services
Services overview
Explore all CORE services
Access to raw data
API
Dataset
FastSync
Content discovery
Recommender
Discovery
OAI identifiers
OAI Resolver
Managing content
Dashboard
Bespoke contracts
Consultancy services
Support us
Support us
Membership
Sponsorship
Community governance
Advisory Board
Board of supporters
Research network
About
About us
Our mission
Team
Blog
FAQs
Contact us
Unsupervised feature construction for improving data representation and semantics
Authors
S Lallich
MA Rizoiu
J Velcin
Publication date
1 February 2013
Publisher
'Springer Science and Business Media LLC'
Doi
Cite
Abstract
Attribute-based format is the main data representation format used by machine learning algorithms. When the attributes do not properly describe the initial data, performance starts to degrade. Some algorithms address this problem by internally changing the representation space, but the newly constructed features rarely have any meaning. We seek to construct, in an unsupervised way, new attributes that are more appropriate for describing a given dataset and, at the same time, comprehensible for a human user. We propose two algorithms that construct the new attributes as conjunctions of the initial primitive attributes or their negations. The generated feature sets have reduced correlations between features and succeed in catching some of the hidden relations between individuals in a dataset. For example, a feature like sky \wedge \neg building \wedge panorama would be true for non-urban images and is more informative than simple features expressing the presence or the absence of an object. The notion of Pareto optimality is used to evaluate feature sets and to obtain a balance between total correlation and the complexity of the resulted feature set. Statistical hypothesis testing is employed in order to automatically determine the values of the parameters used for constructing a data-dependent feature set. We experimentally show that our approaches achieve the construction of informative feature sets for multiple datasets. © 2013 Springer Science+Business Media New York
Similar works
Full text
Available Versions
OPUS - University of Technology Sydney
See this paper in CORE
Go to the repository landing page
Download from data provider
oai:opus.lib.uts.edu.au:10453/...
Last time updated on 20/06/2020