857 research outputs found
Outlier detection using flexible categorisation and interrogative agendas
Categorization is one of the basic tasks in machine learning and data
analysis. Building on formal concept analysis (FCA), the starting point of the
present work is that different ways to categorize a given set of objects exist,
which depend on the choice of the sets of features used to classify them, and
different such sets of features may yield better or worse categorizations,
relative to the task at hand. In turn, the (a priori) choice of one set of
features over another might be subjective and express a
certain epistemic stance (e.g. interests, relevance, preferences) of an agent
or a group of agents, namely, their interrogative agenda. In the present paper,
we represent interrogative agendas as sets of features, and explore and compare
different ways to categorize objects w.r.t. different sets of features
(agendas). We first develop a simple unsupervised FCA-based algorithm for
outlier detection which uses categorizations arising from different agendas. We
then present a supervised meta-learning algorithm to learn suitable (fuzzy)
agendas for categorization as sets of features with different weights or
masses. We combine this meta-learning algorithm with the unsupervised outlier
detection algorithm to obtain a supervised outlier detection algorithm. We show
that these algorithms perform on a par with commonly used outlier detection
algorithms on standard benchmark datasets for outlier detection. These
algorithms provide both local and global explanations of their results.
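The core idea of the abstract above — that an object is more outlying when few other objects share its features under a given agenda — can be sketched in a few lines. This is an illustrative toy, not the paper's algorithm: `extent`, `outlier_scores`, the toy context, and the scoring rule (average inverse extent size over agendas) are all assumptions made for the example.

```python
# Minimal sketch (hypothetical, not the paper's algorithm): score objects as
# outliers by how few other objects share their agenda features, in the spirit
# of FCA, where that shared set of objects is a concept extent.

def extent(table, obj, agenda):
    """Objects sharing every agenda feature that `obj` has (a concept extent).

    If `obj` has none of the agenda features, the extent is all objects."""
    feats = {f for f in agenda if table[obj][f]}
    return {o for o, row in table.items() if all(row[f] for f in feats)}

def outlier_scores(table, agendas):
    """Average inverse extent size across agendas: higher = more outlying."""
    scores = {}
    for obj in table:
        sizes = [len(extent(table, obj, a)) for a in agendas]
        scores[obj] = sum(1.0 / s for s in sizes) / len(agendas)
    return scores

# Toy formal context: objects x binary features.
table = {
    "a": {"f1": 1, "f2": 1, "f3": 0},
    "b": {"f1": 1, "f2": 1, "f3": 0},
    "c": {"f1": 0, "f2": 0, "f3": 1},  # shares no features with a and b
}
agendas = [("f1", "f2"), ("f3",)]
scores = outlier_scores(table, agendas)  # "c" gets the highest score
```

With these agendas, `c` sits alone in the extent of its `f3` concept, so it scores higher than `a` and `b`, which share an extent of size two.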
A social media and crowd-sourcing data mining system for crime prevention during and post-crisis situations
A number of large crisis situations, such as natural disasters, have affected the planet over the last decade. The outcomes of such disasters are catastrophic for the infrastructures of modern societies. Furthermore, after large disasters, societies come face-to-face with important issues, such as the loss of human lives, missing people and an increase in the crime rate. On many occasions, societies seem unprepared to face such issues. This paper aims to present an automated system for the synchronization of the police and Law Enforcement Agencies (LEAs) for the prevention of criminal activities during and after a large crisis situation. The paper presents a review of the literature focusing on the necessity of using data mining in combination with advanced web technologies, such as social media and crowd-sourcing, for the resolution of problems related to criminal activities caused during and after crisis situations. The paper provides an introduction to examples of different techniques and algorithms used for social media and crowd-sourcing scanning, such as sentiment analysis and link analysis. The main focus of the paper is the ATHENA Crisis Management system. The function of the ATHENA system is based on the use of social media and crowd-sourcing for collecting crisis-related information. The system uses a number of data mining techniques to collect and analyze data from social media for the purpose of crime prevention. A number of conclusions are drawn on the significance of social media and crowd-sourcing data mining techniques for the resolution of problems related to large crisis situations, with emphasis on the ATHENA system.
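The sentiment analysis mentioned above can be illustrated with a deliberately simple lexicon-based scorer. This is purely illustrative and not the ATHENA system: the cue-word lists, the scoring rule, and the flagging threshold are all assumptions made for the example.

```python
# Illustrative sketch (not the ATHENA system) of lexicon-based sentiment
# triage for crisis-related posts: count positive and negative cue words
# and flag strongly negative posts for attention.

NEGATIVE = {"looting", "robbed", "stolen", "attack", "danger"}  # hypothetical lexicon
POSITIVE = {"safe", "rescued", "helped", "secure"}              # hypothetical lexicon

def sentiment_score(post):
    """Positive minus negative cue-word count; a negative score flags concern."""
    words = post.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

posts = [
    "shops robbed and looting reported downtown",
    "family rescued and safe at the shelter",
]
flagged = [p for p in posts if sentiment_score(p) < 0]  # first post only
```

Real systems use far richer models (trained classifiers, negation handling, emoji and slang lexicons), but the flag-and-triage pattern is the same.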
A Comparison on the Classification of Short-text Documents Using Latent Dirichlet Allocation and Formal Concept Analysis
With the increasing amounts of textual data being collected online, automated text classification techniques are becoming increasingly important. However, a lot of this data is in the form of short-text with just a handful of terms per document (e.g. text messages, tweets or Facebook posts). This data is generally too sparse and noisy to obtain satisfactory classification. Two techniques which aim to alleviate this problem are Latent Dirichlet Allocation (LDA) and Formal Concept Analysis (FCA). Both techniques have been shown to improve the performance of short-text classification by reducing the sparsity of the input data. The relative performance of classifiers that have been enhanced using each technique has not been directly compared so, to address this issue, this work presents an experiment to compare them, using supervised models. The experiment shows that FCA leads to a much higher degree of correlation among terms than LDA and initially gives lower classification accuracy. However, once a subset of features is selected for training, the FCA models can outperform those trained on LDA-expanded data.
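The FCA-based expansion compared above can be sketched with the standard closure operation on a formal context of documents and terms. This is a hypothetical toy, not the paper's pipeline: the `closure` helper and the tiny corpus are assumptions made for the example.

```python
# Minimal sketch (hypothetical, not the paper's pipeline) of FCA-style feature
# expansion for short texts: a document's term set is replaced by its closure
# in the formal context (documents x terms), pulling in terms that always
# co-occur with it across the corpus.

def closure(terms, corpus):
    """Closed term set: terms shared by every document containing `terms`."""
    extent = [doc for doc in corpus if terms <= doc]
    if not extent:
        return set(terms)
    common = set(extent[0])
    for doc in extent[1:]:
        common &= doc
    return common

corpus = [
    {"flight", "delayed", "airport"},
    {"flight", "delayed"},
    {"goal", "match", "football"},
]
# "delayed" closes to {"flight", "delayed"}: every document containing
# "delayed" also contains "flight", so the sparse vector gains a feature.
expanded = closure({"delayed"}, corpus)
```

This also makes the abstract's observation about term correlation concrete: closure deliberately adds terms that co-occur perfectly, which densifies short documents but correlates the resulting features.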
The AI Revolution: Opportunities and Challenges for the Finance Sector
This report examines Artificial Intelligence (AI) in the financial sector,
outlining its potential to revolutionise the industry and identifying its
challenges. It underscores the criticality of a well-rounded understanding of
AI, its capabilities, and its implications to effectively leverage its
potential while mitigating associated risks. The potential of AI extends from
augmenting existing operations to paving the way for novel applications in the
finance sector. The application of AI in the financial
sector is transforming the industry. Its use spans areas from customer service
enhancements, fraud detection, and risk management to credit assessments and
high-frequency trading. However, along with these benefits, AI also presents
several challenges. These include issues related to transparency,
interpretability, fairness, accountability, and trustworthiness. The use of AI
in the financial sector further raises critical questions about data privacy
and security. A further issue identified in this report is the systemic risk
that AI can introduce to the financial sector. Being prone to errors, AI can
exacerbate existing systemic risks, potentially leading to financial crises.
Regulation is crucial to harnessing the benefits of AI while mitigating its
potential risks. Despite the global recognition of this need, there remains a
lack of clear guidelines or legislation for AI use in finance. This report
discusses key principles that could guide the formation of effective AI
regulation in the financial sector, including the need for a risk-based
approach, the inclusion of ethical considerations, and the importance of
maintaining a balance between innovation and consumer protection. The report
provides recommendations for academia, the finance industry, and regulators.
Bagged Randomized Conceptual Machine Learning Method
Formal concept analysis (FCA) is a scientific approach aiming to investigate, analyze and represent the conceptual knowledge deduced from data in conceptual structures (lattices). Recently, many researchers have been counting on the potential of FCA to resolve, or contribute to addressing, machine learning problems. However, some of these heuristics are still far from achieving this goal. In another context, ensemble-learning methods are deemed effective in addressing the classification problem; in addition, introducing randomness into ensemble learning has been found effective in certain scenarios. We exploit the potential of FCA and the notion of randomness in ensemble learning, and propose a new machine learning method based on random conceptual decomposition. We also propose a novel approach for rule optimization. We develop an effective learning algorithm that is capable of handling some aspects of the learning problem, with results that are comparable to those of other ensemble learning algorithms.
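The bagging-with-randomness idea described above can be sketched generically. This is not the paper's method: the 1-NN base learner, the Hamming distance, the toy data, and all parameter choices are assumptions made to illustrate bootstrap sampling plus random feature subsets plus majority voting.

```python
import random

# Minimal sketch (not the paper's method) of bagging with injected randomness:
# each base learner sees a bootstrap sample of the data and a random subset of
# features, and the ensemble predicts by majority vote.

def nearest_label(x, sample, feats):
    """1-NN base learner on the chosen feature subset (Hamming distance)."""
    def dist(a, b):
        return sum(a[f] != b[f] for f in feats)
    return min(sample, key=lambda item: dist(item[0], x))[1]

def bagged_predict(x, data, n_learners=15, n_feats=2, seed=0):
    rng = random.Random(seed)
    all_feats = list(range(len(data[0][0])))
    votes = []
    for _ in range(n_learners):
        sample = [rng.choice(data) for _ in data]  # bootstrap sample
        feats = rng.sample(all_feats, n_feats)     # random feature subspace
        votes.append(nearest_label(x, sample, feats))
    return max(set(votes), key=votes.count)        # majority vote

data = [([1, 1, 0], "pos"), ([1, 0, 0], "pos"),
        ([0, 0, 1], "neg"), ([0, 1, 1], "neg")]
pred = bagged_predict([1, 1, 1], data)
```

In the paper's setting the base learners are conceptual (lattice-derived rules) rather than nearest-neighbour, but the decomposition-and-vote structure is the same.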
DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication
Finger vein authentication, recognized for its high security and specificity,
has become a focal point in biometric research. Traditional methods
predominantly concentrate on vein feature extraction for discriminative
modeling, with limited exploration of generative approaches. Existing methods
often fail to obtain authentic vein patterns through segmentation, which leads
to verification failures. To fill this gap, we introduce DiffVein, a unified
diffusion model-based framework which simultaneously addresses vein
segmentation and authentication tasks. DiffVein is composed of two dedicated
branches: one for segmentation and the other for denoising. For better feature
interaction between these two branches, we introduce two specialized modules to
improve their collective performance. The first, a mask condition module,
incorporates the semantic information of vein patterns from the segmentation
branch into the denoising process. The second, a Semantic
Difference Transformer (SD-Former), employs Fourier-space self-attention
and cross-attention modules to extract category embedding before feeding it to
the segmentation task. In this way, our framework allows for a dynamic
interplay between diffusion and segmentation embeddings, so that the vein
segmentation and authentication tasks can inform and enhance each other during
joint training. To further optimize our model, we introduce a Fourier-space
Structural Similarity (FourierSIM) loss function, which is tailored to improve
the denoising network's learning efficacy. Extensive experiments on the USM and
THU-MVFV3V datasets substantiate DiffVein's superior performance, setting new
benchmarks in both vein segmentation and authentication tasks.
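The idea behind a Fourier-space similarity loss can be illustrated on a 1-D signal. This is a hypothetical sketch, not the paper's FourierSIM loss (which is defined for images and used to train a denoising network): the naive DFT, the mean-absolute-difference of magnitude spectra, and the toy signals are all assumptions made for the example.

```python
import cmath

# Hypothetical sketch (not the paper's FourierSIM loss) of comparing signals
# in Fourier space: take DFT magnitude spectra and measure their mean absolute
# difference. Magnitudes are shift-invariant, so this penalizes structural
# differences rather than spatial misalignment.

def dft_mag(x):
    """Naive DFT magnitude spectrum of a 1-D signal."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) for k in range(n)]

def fourier_loss(pred, target):
    """Mean absolute difference between the two magnitude spectra."""
    p, t = dft_mag(pred), dft_mag(target)
    return sum(abs(a - b) for a, b in zip(p, t)) / len(p)

clean = [0, 1, 0, -1] * 4             # a pure oscillation
same_shifted = clean[1:] + clean[:1]  # same pattern, cyclically shifted
noise = [0.5, -0.2, 0.1, 0.4] * 4     # structurally different signal

low = fourier_loss(same_shifted, clean)   # ~0: shift leaves magnitudes intact
high = fourier_loss(noise, clean)         # clearly larger
```

The same intuition carries to 2-D: comparing spectra lets a loss reward recovering the right vein structure even when pixel-wise alignment is imperfect.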