914 research outputs found

Electrostatic Field Classifier for Deficient Data

Author: A.P. Dempster
B. Gabrys
D. Ruta
J.L. Schafer
K. Torkkola
W. Outhwaite
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

This paper investigates the suitability of recently developed models based on physical field phenomena for classification problems with incomplete datasets. An original approach to exploiting incomplete training data with missing features and labels, involving extensive use of an electrostatic charge analogy, has been proposed. Classification of incomplete patterns has been investigated using a local dimensionality reduction technique, which aims at exploiting all available information rather than trying to estimate the missing values. The performance of all proposed methods has been tested on a number of benchmark datasets for a wide range of missing data scenarios and compared to the performance of some standard techniques. Several modifications of the original electrostatic field classifier aiming at improving speed and robustness in higher dimensional spaces are also discussed.

Crossref

Bournemouth University Research Online

Next challenges for adaptive learning systems

Author: Bifet A.
Gaber M.
Gabrys B.
Gama J.
Minku L.
Musial K.
Zliobaite I.
Publication date: 01/01/2012

Learning from evolving streaming data has become a 'hot' research topic in the last decade and many adaptive learning algorithms have been developed. This research was stimulated by rapidly growing amounts of industrial, transactional, sensor and other business data that arrives in real time and needs to be mined in real time. Under such circumstances, constant manual adjustment of models is inefficient and, with increasing amounts of data, is becoming infeasible. Nevertheless, adaptive learning models are still rarely employed in business applications in practice. In the light of rapidly growing structurally rich 'big data', a new generation of parallel computing solutions and cloud computing services, as well as recent advances in portable computing devices, this article aims to identify the current key research directions to be taken to bring adaptive learning closer to application needs. We identify six forthcoming challenges in designing and building adaptive learning (prediction) systems: making adaptive systems scalable, dealing with realistic data, improving usability and trust, integrating expert knowledge, taking into account various application needs, and moving from adaptive algorithms towards adaptive tools. Those challenges are critical for the evolving stream settings, as the process of model building needs to be fully automated and continuous.

Crossref

University of Birmingham Research Portal

INESC TEC Repository

Portsmouth University Research Portal (Pure)

Fuzzy min-max neural networks for categorical data: application to missing data imputation

Author: A Bargiela
A Farhangfar
B Gabrys
D Dubois
DB Rubin
G Klir
H Witten
I Myrtveit
Jesús Cardeñosa
JL Schafer
K Tanaka
M Meneganti
MJ Greenacre
P Allison
P Dempster
P Nandedkar
Pilar Rey-del-Castillo
PK Simpson
Q Song
R Cox
RJ Little
RK Brouwer
RR Yager
S Mitra
TJ Santner
V Nelwamondo
W Pedrycz
X Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

The fuzzy min–max neural network classifier is a supervised learning method. This classifier takes the hybrid neural networks and fuzzy systems approach. All input variables in the network are required to correspond to continuously valued variables, and this can be a significant constraint in many real-world situations where there are not only quantitative but also categorical data. The usual way of dealing with this type of variable is to replace the categorical values with numerical ones and treat them as if they were continuously valued. But this method implicitly defines a possibly unsuitable metric for the categories. A number of different procedures have been proposed to tackle the problem. In this article, we present a new method. The procedure extends the fuzzy min–max neural network input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. This provides for greater flexibility and wider application. The proposed method is then applied to missing data imputation in voting intention polls. The micro data (the set of the respondents’ individual answers to the questions) of this type of poll are especially suited for evaluating the method since they include a large number of numerical and categorical attributes.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM (Univ. Politécnica de Madrid)

Political airs: from monitoring to attuned sensing air pollution

Author: Callon M
Conolly WE
De la Bellacasa MP
Estalella A
Gabrys J
Gugliotta A
Hecht G
Latour B
Murphy M
Nerea Calvillo
Proctor RN
Rancière J
Tironi M
Publication venue: 'SAGE Publications'
Publication date: 01/06/2018

In Madrid, as in many European cities, air pollution is known about and made accountable through techno-scientific monitoring processes based on data, and the toxicity of the air is defined through epidemiological studies and made political through policy. In 2009, Madrid’s City Council changed the location of its air quality monitoring stations without notice, reducing the average pollution of the city and therefore provoking a public scandal. This scandal challenged the monitoring process, as the data that used to be the evidence of pollution could not be relied on anymore. To identify the characteristics of some of the diverse forms of public participation that emerged, I route theories of environmental sensing from STS and feminist theory through the notion of attuned sensing. Reading environmental sensing through the processual and orientational processes of attunement expands the ways in which toxicity can be sensed outside of quantitative data. This mode of sensing recognizes how the different spontaneous attunements to and with air pollution and the scandal acknowledged Madrid’s chemical infrastructure, rendering visible qualitative conditions of toxicity. This mode of sensing politicized the toxicity of the air not through management or policy making, nor only through established forms of environmental activism, but through contagion and accumulation of the different forms of public participation. Altogether, they made air pollution a matter of public concern. They also redistributed the actors, practices and objects that make the toxicity not only knowable, but also accountable, and most importantly, they opened up spaces for citizen intervention.

Crossref

Warwick Research Archives Portal Repository

A comparative study of general fuzzy min-max neural networks for pattern classification problems

Author: Gabrys B
Khuat TT
Publication venue: 'Elsevier BV'
Publication date: 08/01/2020

The general fuzzy min-max (GFMM) neural network is a generalization of fuzzy neural networks formed by hyperbox fuzzy sets for classification and clustering problems. Two principal algorithms are deployed to train this type of neural network, i.e., incremental learning and agglomerative learning. This paper presents a comprehensive empirical study of performance influencing factors, advantages, and drawbacks of the general fuzzy min-max neural network on pattern classification problems. The subjects of this study include (1) the impact of maximum hyperbox size, (2) the influence of the similarity threshold and measures on the agglomerative learning algorithm, (3) the effect of data presentation order, and (4) a comparative performance evaluation of the GFMM with other types of fuzzy min-max neural networks and prevalent machine learning algorithms. The experimental results on benchmark datasets widely used in machine learning showed overall strong and weak points of the GFMM classifier. These outcomes also informed potential research directions for this class of machine learning algorithms in the future.
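
To make the notion of a "hyperbox fuzzy set" concrete, the sketch below shows one ramp-based membership function of the kind commonly used in the fuzzy min-max literature. It is an illustration under assumptions, not the implementation evaluated in the paper: the function names, the sensitivity parameter `gamma`, and the restriction to point inputs (rather than interval inputs) are all choices made here, and the abstract itself does not specify the membership function.

```python
def ramp(r, gamma):
    """Threshold (ramp) function: 0 below 0, linear in between, capped at 1."""
    return max(0.0, min(1.0, r * gamma))


def hyperbox_membership(x, v, w, gamma=1.0):
    """Membership of point x in the hyperbox with min corner v and max corner w.

    Returns 1.0 when x lies inside the box and decays towards 0.0 as x moves
    away from it; gamma (an assumed sensitivity parameter) controls how fast.
    """
    return min(
        min(1.0 - ramp(xi - wi, gamma), 1.0 - ramp(vi - xi, gamma))
        for xi, vi, wi in zip(x, v, w)
    )


# Hypothetical 2-D hyperbox used only for illustration.
v_min = (0.2, 0.2)
w_max = (0.6, 0.6)

inside = hyperbox_membership((0.5, 0.5), v_min, w_max, gamma=4.0)
near = hyperbox_membership((0.8, 0.5), v_min, w_max, gamma=4.0)
far = hyperbox_membership((2.0, 0.5), v_min, w_max, gamma=4.0)
```

In this sketch a point inside the box gets full membership, a point just outside gets partial membership, and a distant point gets none. A classifier built on such sets would typically assign an input to the class of the hyperbox with the highest membership.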

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

Data analytics enhanced data visualization and interrogation with parallel coordinates plots

Author: Akbar MS
Gabrys B
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/02/2019

Parallel coordinates plots (PCPs) suffer from the curse of dimensionality when used with larger multidimensional datasets. The curse of dimensionality results in clutter which hides important visual data trends among coordinates. A number of solutions to address this problem have been proposed including filtering, aggregation, and dimension reordering. These solutions, however, have their own limitations with regard to exploring relationships and trends among the coordinates in PCPs. Correlation based coordinates reordering techniques are among the most popular and have been widely used in PCPs to reduce clutter, though based on the conducted experiments, this research has identified some of their limitations. To achieve better visualization with reduced clutter, we have proposed and evaluated a dimension reordering approach based on minimization of the number of crossing pairs. In the last step, k-means clustering is combined with reordered coordinates to highlight key trends and patterns. The conducted comparative analysis has shown that the minimum crossing pairs approach performed much better than other applied techniques for coordinates reordering, and when combined with k-means clustering, resulted in better visualization with significantly reduced clutter.
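
The idea of reordering axes to minimize crossing pairs can be sketched as follows. This is a minimal illustration under assumptions, not the paper's algorithm: two records are taken to "cross" between adjacent axes when their relative order flips from one axis to the next, and the best axis order is found by brute force over all permutations, which is only feasible for a handful of dimensions. The names `crossings`, `total_crossings`, `best_order`, and the sample `data` are hypothetical.

```python
from itertools import combinations, permutations


def crossings(col_a, col_b):
    """Count record pairs whose polylines cross between two adjacent axes."""
    return sum(
        1
        for i, j in combinations(range(len(col_a)), 2)
        if (col_a[i] - col_a[j]) * (col_b[i] - col_b[j]) < 0
    )


def total_crossings(data, order):
    """Sum crossings over each pair of neighbouring axes in the given order."""
    return sum(crossings(data[a], data[b]) for a, b in zip(order, order[1:]))


def best_order(data):
    """Brute-force search for an axis order with the fewest crossings."""
    return min(permutations(data), key=lambda order: total_crossings(data, order))


# Hypothetical dataset: one list of values per dimension, one entry per record.
data = {
    "x": [1, 2, 3, 4],
    "y": [4, 3, 2, 1],
    "z": [1, 2, 3, 5],
}

order = best_order(data)
fewest = total_crossings(data, order)
```

Here `x` and `z` rank the records identically, so placing them next to each other removes all crossings between that pair of axes, while `y` reverses the ranking and crosses every pair wherever it is placed. For realistic numbers of dimensions a heuristic search would be needed in place of the exhaustive one.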

OPUS - University of Technology Sydney

Adaptive community detection incorporating topology and content in social networks

Author: Gabrys B
Jin D
Lei K
Musial-Gabrys K
Qin M
Publication venue: 'Elsevier BV'
Publication date: 01/12/2018

In social network analysis, community detection is a basic step to understand the structure and function of networks. Some conventional community detection methods may have limited performance because they merely focus on the networks’ topological structure. Besides topology, content information is another significant aspect of social networks. Although some state-of-the-art methods started to combine these two aspects of information for the sake of the improvement of community partitioning, they often assume that topology and content carry similar information. In fact, for some examples of social networks, the hidden characteristics of content may unexpectedly mismatch with topology. To better cope with such situations, we introduce a novel community detection method under the framework of non-negative matrix factorization (NMF). Our proposed method integrates topology as well as content of networks and has an adaptive parameter (with two variations) to effectively control the contribution of content with respect to the identified mismatch degree. Based on the disjoint community partition result, we also introduce an additional overlapping community discovery algorithm, so that our new method can meet the application requirements of both disjoint and overlapping community detection. The case study using real social networks shows that our new method can simultaneously obtain the community structures and their corresponding semantic description, which is helpful to understand the semantics of communities. Related performance evaluations on both artificial and real networks further indicate that our method outperforms some state-of-the-art methods while exhibiting more robust behavior when the mismatch between topology and content is observed.

OPUS - University of Technology Sydney

Towards Digital Twin-Oriented Complex Networked Systems: Introducing heterogeneous node features and interaction rules.

Author: Gabrys B
Musial K
Wen J
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 22/01/2024

This study proposes an extendable modelling framework for Digital Twin-Oriented Complex Networked Systems (DT-CNSs) with a goal of generating networks that faithfully represent real-world social networked systems. The modelling process focuses on (i) features of nodes and (ii) interaction rules for creating connections that are built based on individual nodes' preferences. We conduct experiments on simulation-based DT-CNSs that incorporate various features and rules about network growth and different transmissibilities related to an epidemic spread on these networks. We present a case study on disaster resilience of social networks given an epidemic outbreak by investigating the infection occurrence within a specific time and social distance. The experimental results show how different levels of the structural and dynamics complexities, concerned with feature diversity and flexibility of interaction rules respectively, influence network growth and epidemic spread. The analysis revealed that, to achieve maximum disaster resilience, mitigation policies should be targeted at nodes with preferred features as they have higher infection risks and should be the focus of the epidemic control.

OPUS - University of Technology Sydney

Directed closure coefficient and its patterns.

Author: Gabrys B
Jia M
Musial K
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2021

The triangle structure, being a fundamental and significant element, underlies many theories and techniques in studying complex networks. The formation of triangles is typically measured by the clustering coefficient, in which the focal node is the centre-node in an open triad. In contrast, the recently proposed closure coefficient measures triangle formation from an end-node perspective and has been proven to be a useful feature in network analysis. Here, we extend it by proposing the directed closure coefficient that measures the formation of directed triangles. By distinguishing the direction of the closing edge in building triangles, we further introduce the source closure coefficient and the target closure coefficient. Then, by categorising particular types of directed triangles (e.g., head-of-path), we propose four closure patterns. Through multiple experiments on 24 directed networks from six domains, we demonstrate that at the network level, the four closure patterns are distinctive features in classifying network types, while at the node level, adding the source and target closure coefficients leads to significant improvement in the link prediction task in most types of directed networks.
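
The contrast between the centre-node and end-node perspectives can be sketched for the simpler undirected case. The code below is an illustration under assumptions, not the directed measure proposed in the paper: it computes the local clustering coefficient (closed fraction of open triads centred on a node) and the undirected local closure coefficient (closed fraction of length-2 paths starting at a node). The helper names and the toy graph are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_at(adj, u):
    """Number of triangles that include node u."""
    return sum(1 for a, b in combinations(adj[u], 2) if b in adj[a])


def clustering_coefficient(adj, u):
    """Centre-node view: fraction of neighbour pairs of u that are connected."""
    k = len(adj[u])
    wedges = k * (k - 1) // 2
    return triangles_at(adj, u) / wedges if wedges else 0.0


def closure_coefficient(adj, u):
    """End-node view: fraction of length-2 paths starting at u that are closed."""
    wedges = sum(len(adj[v]) - 1 for v in adj[u])
    return 2 * triangles_at(adj, u) / wedges if wedges else 0.0


# Toy graph: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
adj = build_adjacency(edges)

clustering_c = clustering_coefficient(adj, "c")
closure_c = closure_coefficient(adj, "c")
```

In the toy graph, node `c` has a low clustering coefficient because its neighbour `d` is connected to neither `a` nor `b`, yet every length-2 path that starts at `c` is closed, so its closure coefficient is 1. The two measures can therefore rank the same node very differently.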

OPUS - University of Technology Sydney

Directory of Open Access Journals

An Effective Multi-Resolution Hierarchical Granular Representation based Classifier using General Fuzzy Min-Max Neural Network

Author: Chen F
Gabrys B
Khuat TT
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Motivated by the practical demands for simplification of data towards being consistent with human thinking and problem solving as well as tolerance of uncertainty, information granules are becoming important entities in data processing at different levels of data abstraction. This paper proposes a method to construct classifiers from multi-resolution hierarchical granular representations (MRHGRC) using hyperbox fuzzy sets. The proposed approach forms a series of granular inferences hierarchically through many levels of abstraction. An attractive characteristic of our classifier is that it can maintain a high accuracy in comparison to other fuzzy min-max models at a low degree of granularity based on reusing the knowledge learned from lower levels of abstraction. In addition, our approach can reduce the data size significantly as well as handle the uncertainty and incompleteness associated with data in real-world applications. The construction process of the classifier consists of two phases. The first phase is to formulate the model at the greatest level of granularity, while the second phase aims to reduce the complexity of the constructed model and deduce it from data at higher abstraction levels. Experimental analyses conducted comprehensively on both synthetic and real datasets indicated the efficiency of our method in terms of training time and predictive performance in comparison to other types of fuzzy min-max neural networks and common machine learning algorithms.

arXiv.org e-Print Archive

Crossref

OPUS - University of Technology Sydney