Representational information: a new general notion and measure of information

Author: Vigo Professor Ronaldo
Publication venue: Elsevier
Publication date: 01/01/2011
Field of study

In what follows, we introduce the notion of representational information (information conveyed by sets of dimensionally defined objects about their superset of origin) as well as an original deterministic mathematical framework for its analysis and measurement. The framework, based in part on categorical invariance theory [Vigo, 2009], unifies three key constructs of universal science: invariance, complexity, and information. From this unification we define the amount of information that a well-defined set of objects R carries about its finite superset of origin S as the rate of change in the structural complexity of S (as determined by its degree of categorical invariance) whenever the objects in R are removed from the set S. The measure captures deterministically the significant role that context and category structure play in determining the relative quantity and quality of subjective information conveyed by particular objects in multi-object stimuli.
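
The verbal definition in this abstract can be read symbolically. The following is only a sketch: the symbols h and Ψ, and the choice to normalize the change by the complexity of S, are assumptions made here for illustration and are not stated in the abstract.

```latex
% Assumed notation: \Psi(X) is the structural complexity of a set X,
% and S \setminus R is S with the objects of R removed.
h(R \mid S) \;=\; \frac{\Psi(S \setminus R) - \Psi(S)}{\Psi(S)}
```

Under this reading, a positive value would mean that removing R leaves a structurally more complex set, and a negative value a simpler one.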

On Modelling and Understanding Image Manifolds

Author: Woodland Alan John
Publication venue
Publication date: 26/03/2010
Field of study

Aberystwyth Research Portal

Using rule extraction to improve the comprehensibility of predictive models.

Author: Baesens Bart
Huysmans Johan
Vanthienen Jan
Publication venue
Publication date
Field of study

Whereas newer machine learning techniques, like artificial neural networks and support vector machines, have shown superior performance in various benchmarking studies, the application of these techniques remains largely restricted to research environments. A more widespread adoption of these techniques is foiled by their lack of explanation capability, which is required in some application areas, like medical diagnosis or credit scoring. To overcome this restriction, various algorithms have been proposed to extract a meaningful description of the underlying "black box" models. These algorithms have a dual goal: to mimic the behavior of the black box as closely as possible while ensuring that the extracted description is maximally comprehensible. In this research report, we first develop a formal definition of "rule extraction" and comment on the inherent trade-off between accuracy and comprehensibility. Afterwards, we develop a taxonomy by which rule extraction algorithms can be classified and discuss some criteria by which these algorithms can be evaluated. Finally, an in-depth review of the most important algorithms is given. This report is concluded by pointing out some general shortcomings of existing techniques and opportunities for future research. Keywords: Models; Model; Algorithms; Criteria; Opportunities; Research; Learning; Neural networks; Networks; Performance; Benchmarking; Studies; Area; Credit; Credit scoring; Behavior; Time

Research Papers in Economics
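
The dual goal named in the rule-extraction abstract above (mimic the black box closely, stay comprehensible) can be illustrated with a toy sketch. It is not one of the algorithms reviewed in the report: `black_box`, `extract_stump`, and the sample data are hypothetical, and the "rule" is restricted to a single threshold on one feature so that it stays trivially readable. Fidelity here means the share of samples on which the rule agrees with the black box.

```python
import random


def black_box(x):
    # Hypothetical opaque model standing in for a neural network or SVM.
    return 1 if (0.8 * x[0] + 0.2 * x[1] ** 2) > 0.5 else 0


def extract_stump(model, samples):
    """Return (fidelity, feature, threshold) of the best one-feature rule.

    The rule has the form: IF x[feature] > threshold THEN 1 ELSE 0.
    Fidelity is agreement with the model's own predictions, not with
    ground-truth labels.
    """
    labels = [model(s) for s in samples]
    best = (0.0, None, None)
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            agree = sum(
                (1 if s[feature] > threshold else 0) == y
                for s, y in zip(samples, labels)
            )
            fidelity = agree / len(samples)
            if fidelity > best[0]:
                best = (fidelity, feature, threshold)
    return best


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [(rng.random(), rng.random()) for _ in range(200)]
fidelity, feature, threshold = extract_stump(black_box, samples)
print(
    f"IF x[{feature}] > {threshold:.2f} THEN class 1 ELSE class 0 "
    f"(fidelity {fidelity:.2f})"
)
```

A single-threshold rule cannot reproduce the curved decision boundary of `black_box` exactly, so its fidelity stays below 1. That gap is a small instance of the accuracy versus comprehensibility trade-off the abstract mentions.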

End-to-End Entity Resolution for Big Data: A Survey

Author: Christophides Vassilis
Efthymiou Vasilis
Palpanas Themis
Papadakis George
Stefanidis Kostas
Publication venue
Publication date: 01/02/1988
Field of study

One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (and mostly in traditional settings), in this survey, we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods in order to cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities, i.e., database, semantic Web and machine learning, in order to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.

arXiv.org e-Print Archive
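
The entity indexing and matching steps mentioned in the Entity Resolution abstract above can be sketched as a two-stage pipeline. This is a minimal illustration under stated assumptions, not the survey's own workflow: the records are made up, indexing is done by simple token blocking, and matching uses Jaccard similarity with an arbitrary threshold of 0.5. The names `tokens`, `jaccard`, and `resolve` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tokens(text):
    # Lowercase, treat commas and periods as spaces, split into a token set.
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def jaccard(a, b):
    # Size of the intersection over size of the union of two token sets.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resolve(records, threshold=0.5):
    """Return sorted pairs of record ids judged to describe the same entity."""
    # Indexing (blocking): group record ids by every token they contain,
    # so only records sharing at least one token are ever compared.
    blocks = defaultdict(set)
    for rid, text in records.items():
        for tok in tokens(text):
            blocks[tok].add(rid)
    candidates = set()
    for ids in blocks.values():
        candidates.update(combinations(sorted(ids), 2))
    # Matching: keep candidate pairs whose similarity reaches the threshold.
    return sorted(
        (a, b)
        for a, b in candidates
        if jaccard(tokens(records[a]), tokens(records[b])) >= threshold
    )


# Made-up entity descriptions for illustration only.
records = {
    1: "Stanley Kubrick, film director",
    2: "Kubrick Stanley director",
    3: "Alan Turing, mathematician",
    4: "Turing, Alan M. mathematician",
}
print(resolve(records))
```

Blocking avoids comparing every pair of descriptions, which matters at large scale. Here records 1 and 3 share no token and are never compared at all.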

University of Richmond