A decade of application of the Choquet and Sugeno integrals in multi-criteria decision aid
The main advances regarding the use of the Choquet and Sugeno integrals in multi-criteria decision aid over the last decade are reviewed. They concern mainly a bipolar extension of both the Choquet integral and the Sugeno integral, interesting particular submodels, new learning techniques, a better interpretation of the models and a better use of the Choquet integral in multi-criteria decision aid. Parallel to these theoretical works, the Choquet integral has been applied to many new fields, and several software packages and libraries dedicated to this model have been developed.
Keywords: Choquet integral, Sugeno integral, capacity, bipolarity, preferences
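For readers unfamiliar with the model, here is a minimal sketch of the discrete Choquet integral, assuming non-negative scores and a capacity given as a plain dictionary over subsets of criteria. The scores are sorted in ascending order, and each increment between consecutive scores is weighted by the capacity of the set of criteria that reach at least that score. The function name, the data layout and the two-criterion capacity are illustrative assumptions and are not taken from the reviewed paper or from any of the libraries it mentions.

```python
from typing import Dict, FrozenSet, Sequence


def choquet_integral(scores: Sequence[float],
                     capacity: Dict[FrozenSet[int], float]) -> float:
    """Discrete Choquet integral of non-negative scores w.r.t. a capacity.

    `capacity` maps every subset of criterion indices (as a frozenset) to a
    weight, with capacity[frozenset()] == 0 and capacity[all criteria] == 1.
    """
    # Criterion indices sorted by ascending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total = 0.0
    previous = 0.0
    for rank, i in enumerate(order):
        # Criteria whose score is at least scores[i].
        coalition = frozenset(order[rank:])
        total += (scores[i] - previous) * capacity[coalition]
        previous = scores[i]
    return total


# Hypothetical capacity on two criteria (illustrative values only).
toy_capacity = {
    frozenset(): 0.0,
    frozenset({0}): 0.3,
    frozenset({1}): 0.5,
    frozenset({0, 1}): 1.0,
}
```

With this toy capacity, `choquet_integral([0.2, 0.8], toy_capacity)` evaluates to about 0.5 (0.2 times 1.0 plus 0.6 times 0.5). Swapping the two scores gives about 0.38, which shows that the capacity weights coalitions of criteria rather than treating the criteria symmetrically.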
ACORA: Distribution-Based Aggregation for Relational Learning from Identifier Attributes
Feature construction through aggregation plays an essential role in modeling relational
domains with one-to-many relationships between tables. One-to-many relationships
lead to bags (multisets) of related entities, from which predictive information
must be captured. This paper focuses on aggregation from categorical attributes
that can take many values (e.g., object identifiers). We present a novel aggregation
method as part of a relational learning system ACORA, that combines the use of
vector distance and meta-data about the class-conditional distributions of attribute
values. We provide a theoretical foundation for this approach deriving a "relational
fixed-effect" model within a Bayesian framework, and discuss the implications of
identifier aggregation on the expressive power of the induced model. One advantage
of using identifier attributes is the circumvention of limitations caused either by
missing/unobserved object properties or by independence assumptions. Finally, we
show empirically that the novel aggregators can generalize in the presence of identifier
(and other high-dimensional) attributes, and also explore the limitations of the
applicability of the methods.
Information Systems Working Papers Series
Distribution-based aggregation for relational learning with identifier attributes
Identifier attributes—very high-dimensional categorical attributes such as particular
product ids or people’s names—are rarely incorporated in statistical modeling. However,
they can play an important role in relational modeling: it may be informative to have communicated
with a particular set of people or to have purchased a particular set of products. A
key limitation of existing relational modeling techniques is how they aggregate bags (multisets)
of values from related entities. The aggregations used by existing methods are simple
summaries of the distributions of features of related entities: e.g., MEAN, MODE, SUM,
or COUNT. This paper’s main contribution is the introduction of aggregation operators that
capture more information about the value distributions, by storing meta-data about value
distributions and referencing this meta-data when aggregating—for example by computing
class-conditional distributional distances. Such aggregations are particularly important for
aggregating values from high-dimensional categorical attributes, for which the simple aggregates
provide little information. In the first half of the paper we provide general guidelines
for designing aggregation operators, introduce the new aggregators in the context of the
relational learning system ACORA (Automated Construction of Relational Attributes), and
provide theoretical justification. We also conjecture special properties of identifier attributes,
e.g., they proxy for unobserved attributes and for information deeper in the relationship
network. In the second half of the paper we provide extensive empirical evidence that the
distribution-based aggregators indeed do facilitate modeling with high-dimensional categorical
attributes, and in support of the aforementioned conjectures.
NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research
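The idea of storing meta-data about value distributions and aggregating a bag by its class-conditional distributional distance can be sketched as follows. This is a simplified illustration under stated assumptions, not ACORA's actual implementation: the function names are hypothetical, the stored meta-data is a raw count of identifier values per class, and cosine distance stands in for whichever vector distance the system uses.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Hashable, Iterable, List, Tuple

Bag = List[Hashable]  # multiset of identifier values linked to one entity


def class_conditional_counts(
        labeled_bags: Iterable[Tuple[Bag, int]]) -> Dict[int, Counter]:
    """Meta-data step: count how often each identifier occurs per class."""
    counts: Dict[int, Counter] = {}
    for bag, label in labeled_bags:
        counts.setdefault(label, Counter()).update(bag)
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def distance_features(bag: Bag,
                      reference: Dict[int, Counter]) -> Dict[int, float]:
    """Aggregation step: one cosine distance per class for the given bag."""
    bag_counts = Counter(bag)
    return {label: 1.0 - cosine_similarity(bag_counts, dist)
            for label, dist in reference.items()}


# Hypothetical training data: (bag of product ids, class label).
training = [(["a", "b"], 1), (["a", "a"], 1), (["c"], 0), (["c", "d"], 0)]
reference = class_conditional_counts(training)
features = distance_features(["a"], reference)
```

Here the bag `["a"]` ends up closer to the class 1 reference distribution than to the class 0 one, so the two distances could serve as numeric features for a downstream classifier. A simple aggregate such as COUNT would return 1 for any single-item bag and could not make that distinction.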
Structural Logistic Regression for Link Analysis
We present Structural Logistic Regression, an extension of logistic regression to modeling relational data. It is an integrated approach to building regression models from data stored in relational databases in which potential predictors, both boolean and real-valued, are generated by structured search in the space of queries to the database, and then tested with statistical information criteria for inclusion in a logistic regression. Using statistics and relational representation allows modeling in noisy domains with complex structure. Link prediction is a task of high interest with exactly such characteristics. Be it in the domain of scientific citations, social networks or hypertext, the underlying data are extremely noisy and the features useful for prediction are not readily available in a flat file format. We propose the application of Structural Logistic Regression to building link prediction models, and present experimental results for the task of predicting citations made in scientific literature using relational data taken from the CiteSeer search engine. These data include the citation graph, authorship and publication venues of papers, as well as their word content.
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
Acoustic Scene Classification
This work was supported by the Centre for Digital Music Platform (grant EP/K009559/1) and a Leadership Fellowship
(EP/G007144/1), both from the United Kingdom Engineering and Physical Sciences Research Council.
Multiple classifiers in biometrics. Part 1: Fundamentals and review
We provide an introduction to Multiple Classifier Systems (MCS), including basic nomenclature and a description of key elements: classifier dependencies, type of classifier outputs, aggregation procedures, architecture, and types of methods. This introduction complements other existing overviews of MCS, as here we also review the most prevalent theoretical framework for MCS and discuss theoretical developments related to MCS.
The introduction to MCS is then followed by a review of the application of MCS to the particular field of multimodal biometric person authentication in the last 25 years, as a prototypical area in which MCS has resulted in important achievements. This review includes general descriptions of successful MCS methods and architectures in order to facilitate the export of them to other information fusion problems.
Based on the theory and framework introduced here, in the companion paper we then develop in more technical detail recent trends and developments in MCS from multimodal biometrics that incorporate context information in an adaptive way. These new MCS architectures exploit input quality measures and pattern-specific particularities that move apart from general population statistics, resulting in robust multimodal biometric systems. As in the present paper, methods in the companion paper are introduced in a general way so they can be applied to other information fusion problems as well. Finally, also in the companion paper, we discuss open challenges in biometrics and the role of MCS to advance them.
This work was funded by projects CogniMetrics (TEC2015-70627-R) from MINECO/FEDER and RiskTrakc (JUST-2015-JCOO-AG-1). Part of this work was conducted during a research visit of J.F. to Prof. Ludmila Kuncheva at Bangor University (UK) with STSM funding from COST CA16101 (MULTI-FORESEE).