Minimum Density Hyperplanes
Associating distinct groups of objects (clusters) with contiguous regions of
high probability density (high-density clusters) is central to many
statistical and machine learning approaches to the classification of unlabelled
data. We propose a novel hyperplane classifier for clustering and
semi-supervised classification which is motivated by this objective. The
proposed minimum density hyperplane minimises the integral of the empirical
probability density function along it, thereby avoiding intersection with high
density clusters. We show that the minimum density and the maximum margin
hyperplanes are asymptotically equivalent, thus linking this approach to
maximum margin clustering and semi-supervised support vector classifiers. We
propose a projection pursuit formulation of the associated optimisation problem
which allows us to find minimum density hyperplanes efficiently in practice,
and evaluate its performance on a range of benchmark datasets. The proposed
approach is found to be very competitive with state-of-the-art methods for
clustering and semi-supervised classification.
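
The quantity being minimised can be illustrated with a small sketch. It assumes an isotropic Gaussian kernel density estimator with bandwidth h, in which case the integral of the estimated density over the hyperplane {x : v.x = b} reduces to a one-dimensional kernel estimate of the projected data evaluated at b. The function names, the toy points, the bandwidth and the fixed direction v are all hypothetical. The full method also optimises over v, which this sketch does not attempt, and the candidate offsets are restricted to a bounded range because the density also vanishes far away from the data.

```python
import math


def hyperplane_density(points, v, b, h):
    """Integral of an isotropic Gaussian KDE over the hyperplane {x : u.x = b},
    where u is v rescaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    unit = [c / norm for c in v]
    total = 0.0
    for x in points:
        # Project the point onto the unit normal of the hyperplane.
        proj = sum(u * xi for u, xi in zip(unit, x))
        total += math.exp(-((b - proj) ** 2) / (2.0 * h * h))
    return total / (len(points) * h * math.sqrt(2.0 * math.pi))


def best_offset(points, v, h, candidates):
    """Pick the candidate offset b with the lowest density on the hyperplane."""
    return min(candidates, key=lambda b: hyperplane_density(points, v, b, h))


# Toy data (hypothetical): two well-separated groups along the first axis.
points = [(-2.1, 0.3), (-1.9, -0.2), (-2.0, 0.1),
          (2.0, -0.1), (1.8, 0.4), (2.2, 0.0)]
v = (1.0, 0.0)                                    # fixed direction, not optimised
candidates = [i / 10.0 for i in range(-20, 21)]   # offsets between the groups
b_star = best_offset(points, v, 0.5, candidates)
```

On this toy data the selected offset falls in the gap between the two groups, so the resulting hyperplane separates them without passing through either one.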
Real Time Sentiment Change Detection of Twitter Data Streams
In the past few years, Twitter sentiment analysis has grown rapidly and has
already produced a fair amount of research on detecting the sentiment of public
opinion among Twitter users. Because Twitter messages are generated constantly
and at dizzying rates, a huge volume of streaming data is created, and there is
an imperative need for accurate methods for knowledge discovery and mining of
this information. Although a plethora of Twitter sentiment analysis methods
exists in the recent literature, researchers have, as expected, shifted to
real-time sentiment identification on Twitter streaming data. A major
difficulty is dealing with the Big Data challenges of Volume and Velocity that
arise in Twitter streaming applications. From this perspective, this paper
provides a methodological approach, based on open source tools, for real-time
detection of changes in sentiment that is ultra-efficient with respect to both
memory consumption and computational cost. This is achieved by iteratively
collecting tweets in real time and discarding them immediately after they are
processed. For this purpose, we employ the Lexicon approach for sentiment
characterization,
while change detection is achieved through appropriate control charts that do
not require historical information. We believe that the proposed methodology
provides the trigger for a potential large-scale monitoring of threads in an
attempt to discover fake news spread or propaganda efforts in their early
stages. Our experimental real-time analysis based on a recent hashtag provides
evidence that the proposed approach can detect meaningful sentiment changes
across a hashtag's lifetime.
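
The pipeline described above (score each tweet with a lexicon, update a chart, discard the tweet) can be sketched as follows. The abstract does not say which control chart is used, so this sketch swaps in a simple self-starting Shewhart-style chart whose limits come from Welford's running mean and variance. The tiny lexicon, the class name, the threshold k and the warm-up length are all hypothetical.

```python
import math

# Hypothetical toy lexicon; a real lexicon would be far larger.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def score(tweet):
    """Sum the lexicon polarities of the words in a tweet."""
    return sum(LEXICON.get(word, 0) for word in tweet.lower().split())


class StreamingChart:
    """Self-starting chart: keeps only a count, a running mean and a running
    sum of squared deviations, so no historical data has to be stored."""

    def __init__(self, k=3.0, warmup=5):
        self.k = k
        self.warmup = warmup
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        alarm = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            alarm = std > 0 and abs(x - self.mean) > self.k * std
        if alarm:
            # Restart the statistics so the chart adapts to the new level.
            self.n, self.mean, self.m2 = 0, 0.0, 0.0
        # Welford's online update of the mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return alarm


stream = ["good game", "great great win", "love it", "good great",
          "love this", "good", "hate hate awful bad"]
chart = StreamingChart()
alarms = [chart.update(score(tweet)) for tweet in stream]  # tweets are not kept
```

Memory use stays constant regardless of stream length because each tweet is reduced to a single score and then dropped.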
Detection of Fake Generated Scientific Abstracts
The widespread adoption of Large Language Models and the public availability of
ChatGPT have marked a significant turning point in the integration of Artificial
Intelligence into people's everyday lives. The academic community has taken
notice of these technological advancements and has expressed concerns regarding
the difficulty of discriminating between what is real and what is artificially
generated. Thus, researchers have been working on developing effective systems
to identify machine-generated text. In this study, we use the GPT-3 model to
generate scientific paper abstracts and explore various text representation
methods, combined with Machine Learning models, with the aim of identifying
machine-written text. We analyze the models'
performance and address several research questions that arise during the
analysis of the results. By conducting this research, we shed light on the
capabilities and limitations of Artificial Intelligence generated text.
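
As a rough illustration of pairing a text representation with a classifier, the sketch below uses a bag-of-words representation and a multinomial naive Bayes model with add-one smoothing. The abstract does not name the representations or models that were studied, so both choices are assumptions. The four training sentences and their labels are invented toy examples and say nothing about how real human or generated abstracts differ.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as word counts (one simple text representation)."""
    return Counter(text.lower().split())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {}
        self.doc_counts = Counter(labels)
        vocab = set()
        for text, label in zip(texts, labels):
            bow = bag_of_words(text)
            self.word_counts.setdefault(label, Counter()).update(bow)
            vocab.update(bow)
        self.vocab_size = len(vocab)
        return self

    def predict(self, text):
        bow = bag_of_words(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior plus smoothed log likelihood of each word.
            log_prob = math.log(self.doc_counts[label] / total_docs)
            for word, n in bow.items():
                log_prob += n * math.log(
                    (counts[word] + 1) / (total_words + self.vocab_size))
            if log_prob > best_score:
                best_label, best_score = label, log_prob
        return best_label


# Invented toy training set (hypothetical labels, for illustration only).
train_texts = [
    "we measured decay rates in cold samples",
    "samples were cooled and decay was measured",
    "this study highlights the importance of the topic",
    "overall this study highlights important insights",
]
train_labels = ["human", "human", "generated", "generated"]
model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("this study highlights insights")
```

Swapping `bag_of_words` for another representation while keeping the classifier fixed is one way to compare representations on equal terms.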