135 research outputs found

A Hybrid Continual Machine Learning Model for Efficient Hierarchical Classification of Domain-Specific Text in The Presence of Class Overlap (Case Study: IT Support Tickets)

Author: Wahba Yasmen M
Publication venue: Scholarship@Western
Publication date: 17/03/2023

In today’s world, support ticketing systems are employed by a wide range of businesses. The ticketing system facilitates the interaction between customers and the support teams when the customer faces an issue with a product or a service. For large-scale IT companies with a large number of clients and a great volume of communications, the task of automating the classification of incoming tickets is key to guaranteeing long-term clients and ensuring business growth. Although the problem of text classification has been widely studied in the literature, the majority of the proposed approaches revolve around state-of-the-art deep learning models. This thesis addresses the following research questions: What are the reasons behind employing black box models (i.e., deep learning models) for text classification tasks? What is the level of polysemy (i.e., the coexistence of many possible meanings for a word or phrase) in a technical (i.e., specialized) text? How do static word embeddings like Word2vec fare against traditional TF-IDF vectorization? How do dynamic word embeddings (e.g., PLMs) compare against a linear classifier such as Support Vector Machine (SVM) for classifying a domain-specific text? This integrated article thesis aims to investigate the aforementioned issues through five empirical studies that were conducted over the past four years. The outcome of our studies is an emerging theory that explains why traditional ML models offer a more efficient solution to domain-specific text classification compared to state-of-the-art DL language models (i.e., PLMs). Based on extensive experiments on a real-world dataset, we propose a novel Hybrid Online Offline Model (HOOM) that can efficiently classify IT Support Tickets in a real-time (i.e., dynamic) environment. Our classification model is anticipated to build trust and confidence when deployed into production as the model is interpretable, efficient, and can detect concept drifts in the data.

Scholarship@Western

Cyber Security

Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/12/2022

This open access book constitutes the refereed proceedings of the 18th China Annual Conference on Cyber Security, CNCERT 2022, held in Beijing, China, in August 2022. The 17 papers presented were carefully reviewed and selected from 64 submissions. The papers are organized according to the following topical sections: data security; anomaly detection; cryptocurrency; information security; vulnerabilities; mobile internet; threat intelligence; text recognition.

Directory of Open Access Books (DOAB)

Machine Learning Methods with Noisy, Incomplete or Small Datasets

Publication venue: 'MDPI AG'
Publication date: 11/01/2022

In many machine learning applications, available datasets are incomplete, noisy or affected by artifacts. In supervised scenarios, label information may be of low quality, which might include unbalanced training sets, noisy labels and other problems. Moreover, in practice, it is very common that available data samples are not enough to derive useful supervised or unsupervised classifiers. All these issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, to contribute to the dissemination of new ideas to solve this challenging problem, and to provide clear examples of application in real scenarios.

Directory of Open Access Books (DOAB)

Cyber Security

Publication venue: 'Springer Science and Business Media LLC'

OAPEN Library

Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

Authors: Bunte Kerstin; Canducci Marco; De Rijcke Sven; Mastropietro Michele; Peletier Reynier; Taghribi Albolfazl; Tino Peter; Yin H.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2021

The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time, to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernov

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

University of Birmingham Research Portal

Dissertations of the University of Groningen

Multimodal Identification of Alzheimer's Disease: A Review

Authors: Chen Calvin Yu-Chian; Fang Guian; Huang Jiehui; Liu Mengsha; Tang Zhenchao; Zhang Zhuolin; Zhong Yi
Publication date: 06/10/2023

Alzheimer's disease (AD) is a progressive neurological disorder characterized by cognitive impairment and memory loss. With the increasing aging population, the incidence of AD is continuously rising, making early diagnosis and intervention an urgent need. In recent years, a considerable number of teams have applied computer-aided diagnostic techniques to early classification research of AD. Most studies have utilized imaging modalities such as magnetic resonance imaging (MRI), positron emission tomography (PET), and electroencephalogram (EEG). However, there have also been studies that attempted to use other modalities as input features for the models, such as sound, posture, biomarkers, cognitive assessment scores, and their fusion. Experimental results have shown that the combination of multiple modalities often leads to better performance compared to a single modality. Therefore, this paper will focus on different modalities and their fusion, thoroughly elucidate the mechanisms of various modalities, explore which methods should be combined to better harness their utility, and analyze and summarize the literature in the field of early classification of AD in recent years, in order to explore more possibilities of modality combinations.

arXiv.org e-Print Archive

Acta Cybernetica : Volume 23. Number 2.

Publication date: 01/01/2017

University of Szeged

Volume II Acquisition Research Creating Synergy for Informed Change, Thursday 19th Annual Acquisition Research Proceedings

Author: Acquisition Research Program
Publication venue: Monterey, California. Naval Postgraduate School
Publication date: 29/04/2022

Proceedings. Approved for public release; distribution is unlimited.

Calhoun, Institutional Archive of the Naval Postgraduate School

“I Can See the Forest for the Trees”: Examining Personality Traits with Transformers

Author: Moore Alexander
Publication venue: Clemson University Libraries
Publication date: 01/05/2022

Our understanding of Personality and its structure is rooted in linguistic studies operating under the assumptions made by the Lexical Hypothesis: personality characteristics that are important to a group of people will at some point be codified in their language, with the number of encoded representations of a personality characteristic indicating its importance. Qualitative and quantitative efforts in the dimension reduction of our lexicon throughout the mid-20th century have played a vital role in the field’s eventual arrival at the widely accepted Five Factor Model (FFM). However, there are a number of presently unresolved conflicts regarding the breadth and structure of this model (cf. Hough, Oswald, & Ock, 2015). The present study sought to address such issues through previously unavailable language modeling techniques. The Distributional Semantic Hypothesis (DSH) argues that the meaning of words may be formed through some function of their co-occurrence with other words. There is evidence that DSH-based techniques are cognitively valid, serving as a proxy for learned associations between stimuli (Günther et al., 2019). Given that Personality is often measured through self-report surveys, the present study proposed that a Personality measure be created directly from this source data, using large pre-trained Transformers (a type of neural network that is adept at encoding and decoding semantic representations from natural language). An inventory was constructed and administered, and response data were analyzed using partial correlation networks. This exploratory study identifies differences in the internal structure of trait-domains, while simultaneously demonstrating a quantitative approach to item creation and survey development.

Clemson University: TigerPrints