HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention
Twitter bot detection has become an increasingly important and challenging
task to combat online misinformation, facilitate social content moderation, and
safeguard the integrity of social platforms. Though existing graph-based
Twitter bot detection methods have achieved state-of-the-art performance, they are
all based on the homophily assumption, which assumes users with the same label
are more likely to be connected, making it easy for Twitter bots to disguise
themselves by following a large number of genuine users. To address this issue,
we propose HOFA, a novel graph-based Twitter bot detection framework that
combats the heterophilous disguise challenge with a homophily-oriented graph
augmentation module (Homo-Aug) and a frequency adaptive attention module
(FaAt). Specifically, the Homo-Aug extracts user representations and computes a
k-NN graph using an MLP and improves Twitter's homophily by injecting the k-NN
graph. For the FaAt, we propose an attention mechanism that adaptively serves
as a low-pass filter along a homophilic edge and a high-pass filter along a
heterophilic edge, preventing user features from being over-smoothed by their
neighborhood. We also introduce a weight guidance loss to guide the frequency
adaptive attention module. Our experiments demonstrate that HOFA achieves
state-of-the-art performance on three widely-acknowledged Twitter bot detection
benchmarks, significantly outperforming vanilla graph-based bot detection
techniques and strong heterophilic baselines. Furthermore, extensive studies
confirm the effectiveness of our Homo-Aug and FaAt modules, and HOFA's ability
to demystify the heterophilous disguise challenge.
Comment: 11 pages, 7 figures
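The homophily-oriented augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy features stand in for the MLP-extracted user representations, cosine similarity is an assumed choice of metric, and the names `knn_edges`, `edge_homophily` and `follow_edges` are hypothetical. It only shows how injecting k-NN edges can raise the fraction of edges that connect same-label users.

```python
import numpy as np

def knn_edges(features, k):
    """Return undirected k-NN edges under cosine similarity (assumed metric)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)            # a node is never its own neighbour
    nbrs = np.argsort(-sim, axis=1)[:, :k]    # indices of the k most similar nodes
    return {(min(i, int(j)), max(i, int(j)))
            for i in range(len(features)) for j in nbrs[i]}

def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a label."""
    same = sum(int(labels[i] == labels[j]) for i, j in edges)
    return same / len(edges)

# Toy data: users 0-2 are bots (label 0), users 3-5 are genuine (label 1).
labels = np.array([0, 0, 0, 1, 1, 1])
# Stand-in for MLP output: each class clusters around its own direction.
features = np.array([[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
                     [0.1, 1.0], [0.2, 0.9], [0.0, 1.0]])
# Follow graph in which bots mostly follow genuine users (heterophilic edges).
follow_edges = {(0, 3), (1, 4), (2, 5), (0, 1)}

augmented = follow_edges | knn_edges(features, k=2)
print("homophily before:", edge_homophily(follow_edges, labels))
print("homophily after: ", edge_homophily(augmented, labels))
```

In this toy graph the k-NN edges all fall within a class, so the augmented graph is more homophilic than the original follow graph; on real data the gain depends on how well the learned representations separate the classes.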
Audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification
The rapid emergence of advanced software systems, low-cost hardware and decentralized cloud computing technologies has broadened the horizon for vision-based surveillance, monitoring and control. However, complex and inferior feature learning over visual artefacts or video streams, especially under extreme conditions, limits the majority of existing vision-based crowd analysis and classification systems. Retrieving event-sensitive or crowd-type-sensitive spatio-temporal features for different crowd types under extreme conditions is a highly complex task. Consequently, accuracy and reliability suffer, which restricts the use of existing methods for real-time crowd analysis. Despite numerous efforts in vision-based approaches, the lack of acoustic cues often creates ambiguity in crowd classification. On the other hand, the strategic amalgamation of audio-visual features can enable accurate and reliable crowd analysis and classification. With this motivation, this research develops a novel audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification. A hybrid feature extraction model was applied to extract deep spatio-temporal features using the Gray-Level Co-occurrence Matrix (GLCM) and an AlexNet transfer learning model. After extracting the GLCM features and AlexNet deep features, horizontal concatenation was used to fuse the feature sets. Similarly, for acoustic feature extraction, the audio samples (from the input video) were processed with static (fixed-size) sampling, pre-emphasis, block framing and Hann windowing, followed by extraction of acoustic features such as GTCC, GTCC-Delta, GTCC-Delta-Delta, MFCC, Spectral Entropy, Spectral Flux, Spectral Slope and Harmonics-to-Noise Ratio (HNR).
Finally, the extracted audio-visual features were fused to yield a composite multi-modal feature set, which was classified using a random forest ensemble classifier. The multi-class classification yields a crowd-classification accuracy of 98.26%, precision of 98.89%, sensitivity of 94.82%, specificity of 95.57%, and F-measure of 98.84%. The robustness of the proposed multi-modality-based crowd analysis model confirms its suitability for real-world crowd detection and classification tasks.
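The acoustic pre-processing chain named in the abstract (pre-emphasis, block framing, Hann windowing) can be sketched as follows. The abstract does not give parameter values, so the pre-emphasis coefficient of 0.97 and the frame length and hop of 400 and 160 samples are assumptions, and the function names are hypothetical. Feature extraction (GTCC, MFCC and so on) would operate on the windowed frames and is not shown.

```python
import numpy as np

def pre_emphasis(signal, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample is kept unchanged."""
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, dropping the incomplete tail."""
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def windowed_frames(signal, frame_len=400, hop=160):
    """Pre-emphasise, frame, and taper each frame with a Hann window."""
    frames = frame_signal(pre_emphasis(signal), frame_len, hop)
    return frames * np.hanning(frame_len)

# Usage on a synthetic one-second 440 Hz tone sampled at 16 kHz (assumed rate).
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
frames = windowed_frames(tone)
print(frames.shape)
```

With a 16 kHz sampling rate, the assumed values correspond to 25 ms frames with a 10 ms hop, a common convention in speech processing.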
Towards A Practical High-Assurance Systems Programming Language
Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity to the task. It requires considerable expertise in both systems programming and formal verification. Development can be extremely costly due to the sheer complexity of the systems and the nuances in them, if not assisted by appropriate tools that provide abstraction and automation.
Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code.
To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof with a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
GenAssist: Making Image Generation Accessible
Blind and low vision (BLV) creators use images to communicate with sighted
audiences. However, creating or retrieving images is challenging for BLV
creators as it is difficult to use authoring tools or assess image search
results. Thus, creators limit the types of images they create or recruit
sighted collaborators. While text-to-image generation models let creators
generate high-fidelity images based on a text description (i.e., a prompt), it is
difficult to assess the content and quality of generated images. We present
GenAssist, a system to make text-to-image generation accessible. Using our
interface, creators can verify whether generated image candidates followed the
prompt, access additional details in the image not specified in the prompt, and
skim a summary of similarities and differences between image candidates. To
power the interface, GenAssist uses a large language model to generate visual
questions, vision-language models to extract answers, and a large language
model to summarize the results. Our study with 12 BLV creators demonstrated
that GenAssist enables and simplifies the process of image selection and
generation, making visual authoring more accessible to all.
Comment: For accessibility tagged pdf, please refer to the ancillary file
The impact of employees' working relations in creating and retaining trust: the case of the Bahrain Olympic Committee
Introduction: This thesis investigates the impact of employees’ working relations in creating, maintaining and retaining trust in the Bahrain Olympic Committee (BOC).
Aim: The main aim of this thesis is to determine how the three groups of Organisational Trust variables, namely Social System Elements (SSE), Factors of Trustworthiness (FoT) and Third-Party Gossip (TPG), affect employees’ Organisational Trust (OTR) in the BOC and promote Organisational Citizenship Behaviour (OCB). To address this main aim, a conceptual framework was created that focused on exploring the following research aims: (1) the interrelationship between SSE and FoT, (2) the effect of SSE on OTR, (3) the impact of TPG on OTR and (4) the effect of OTR on overall OCB.
Methodology: The study uses a mixed-method case study design that includes in-depth semi-structured interviews with 17 managers, an online questionnaire survey with 320 employees of the BOC and an analysis of the BOC’s Annual Reports from 2015 to 2018.
Results: The qualitative and quantitative findings indicate, firstly, that there is a significant interrelationship between SSE and FoT, establishing the SSE perception of organisational justice (OJ) and the FoTs benevolence and integrity as the most important factors in yielding employees’ trust in the BOC. Secondly, it has been established that SSEs have significant direct and indirect effects on OTR. Thirdly, negative and positive TPG occurred concurrently in the BOC, and the prevalence of negative TPG has a greater impact on OTR. Finally, this study’s findings demonstrated OTR’s effect in generating OCB, with Civic Virtue rated as the most preferred of the five OCB themes; this indicates the managers’ and the employees’ strong emotional attachment to and support of the activities taking place at the BOC.
Contributions: Overall, this thesis substantially contributes to OTR literature, particularly in the context of the Middle East. It also proposes several insightful recommendations for future research and practical implications for practitioners in the field of Organisational Trust.
Unlocking Foundation Models for Privacy-Enhancing Speech Understanding: An Early Study on Low Resource Speech Training Leveraging Label-guided Synthetic Speech Content
Automatic Speech Understanding (ASU) leverages the power of deep learning
models for accurate interpretation of human speech, leading to a wide range of
speech applications that enrich the human experience. However, training a
robust ASU model requires the curation of a large number of speech samples,
creating risks for privacy breaches. In this work, we investigate using
foundation models to assist privacy-enhancing speech computing. Unlike
conventional works focusing primarily on data perturbation or distributed
algorithms, our work studies the possibilities of using pre-trained generative
models to synthesize speech content as training data with just label guidance.
We show that zero-shot learning with training label-guided synthetic speech
content remains a challenging task. On the other hand, our results demonstrate
that the model trained with synthetic speech samples provides an effective
initialization point for low-resource ASU training. This result reveals the
potential to enhance privacy by reducing user data collection and instead using
label-guided synthetic speech content.
Irish Ocean Climate and Ecosystem Status Report
Summary report for Irish Ocean Climate & Ecosystem Status Report also published here. This Irish Ocean Climate & Ecosystem Status
Summary for Policymakers brings together the
latest evidence of ocean change in Irish waters.
The report is intended to summarise the current
trends in atmospheric patterns, ocean warming,
sea level rise, ocean acidification, plankton and
fish distributions and abundance, and seabird
population trends. The report represents a
collaboration between marine researchers within
the Marine Institute and others based in Ireland’s
higher education institutes and public bodies. It
includes authors from Met Éireann, Maynooth
University, the University of Galway, the Atlantic
Technological University, National Parks and
Wildlife, Birdwatch Ireland, Trinity College Dublin,
University College Dublin, Inland Fisheries Ireland,
The National Water Forum, the Environmental
Protection Agency, and the Dundalk Institute of
Technology.
This report is intended to summarise the
current trends in Ireland’s ocean climate. Use
has been made of archived marine data held by
a range of organisations to elucidate some of
the key trends observed in phenomena such as
atmospheric changes, ocean warming, sea level
rise, acidification, plankton and fish distributions
and abundance, and seabirds. The report aims to
summarise the key findings and recommendations
in each of these areas as a guide to climate
adaptation policy and for the public. It builds on the
previous Ocean Climate & Ecosystem Status Report
published in 2010.
The report examines the recently published
literature in each of the topic areas and combines
this in many cases with analysis of new data sets
including long-term time series to identify trends
in essential ocean variables in Irish waters. In
some cases, model projections of the likely future
state of the atmosphere and ocean are presented
under different climate emission scenarios.
Marine Institute
RSGPT: A Remote Sensing Vision Language Model and Benchmark
The emergence of large-scale large language models, with GPT-4 as a prominent
example, has significantly propelled the rapid advancement of artificial
general intelligence and sparked the revolution of Artificial Intelligence 2.0.
In the realm of remote sensing (RS), there is a growing interest in developing
large vision language models (VLMs) specifically tailored for data analysis in
this domain. However, current research predominantly revolves around visual
recognition tasks, lacking comprehensive, large-scale image-text datasets that
are aligned and suitable for training large VLMs, which poses significant
challenges to effectively training such models for RS applications. In computer
vision, recent research has demonstrated that fine-tuning large vision language
models on small-scale, high-quality datasets can yield impressive performance
in visual and language understanding. These results are comparable to
state-of-the-art VLMs trained from scratch on massive amounts of data, such as
GPT-4. Inspired by this captivating idea, in this work, we build a high-quality
Remote Sensing Image Captioning dataset (RSICap) that facilitates the
development of large VLMs in the RS field. Unlike previous RS datasets that
either employ model-generated captions or short descriptions, RSICap comprises
2,585 human-annotated captions with rich and high-quality information. This
dataset offers detailed descriptions for each image, encompassing scene
descriptions (e.g., residential area, airport, or farmland) as well as object
information (e.g., color, shape, quantity, absolute position, etc.). To
facilitate the evaluation of VLMs in the field of RS, we also provide a
benchmark evaluation dataset called RSIEval. This dataset consists of
human-annotated captions and visual question-answer pairs, allowing for a
comprehensive assessment of VLMs in the context of RS.
A Survey of Quantum-Cognitively Inspired Sentiment Analysis Models
Quantum theory, originally proposed as a physical theory to describe the motions of microscopic particles, has been applied to various non-physics domains involving human cognition and decision-making that are inherently uncertain and exhibit certain non-classical, quantum-like characteristics. Sentiment analysis is a typical example of such domains. In the last few years, by leveraging the modeling power of quantum probability (a non-classical probability stemming from quantum mechanics methodology) and deep neural networks, a range of novel quantum-cognitively inspired models for sentiment analysis have emerged and performed well. This survey presents a timely overview of the latest developments in this fascinating cross-disciplinary area. We first provide a background of quantum probability and quantum cognition at a theoretical level, analyzing their advantages over classical theories in modeling the cognitive aspects of sentiment analysis. Then, recent quantum-cognitively inspired models are introduced and discussed in detail, focusing on how they approach the key challenges of the sentiment analysis task. Finally, we discuss the limitations of the current research and highlight future research directions.
Fairness Testing: A Comprehensive Survey and Analysis of Trends
Unfair behaviors of Machine Learning (ML) software have garnered increasing
attention and concern among software engineers. To tackle this issue, extensive
research has been dedicated to conducting fairness testing of ML software, and
this paper offers a comprehensive survey of existing studies in this field. We
collect 100 papers and organize them based on the testing workflow (i.e., how
to test) and testing components (i.e., what to test). Furthermore, we analyze
the research focus, trends, and promising directions in the realm of fairness
testing. We also identify widely-adopted datasets and open-source tools for
fairness testing.
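As a concrete illustration of what a fairness test might check, the sketch below computes the statistical parity difference between two groups of model predictions. The abstract does not name any particular metric, so this choice is an assumption made for illustration, and the function name, group labels and toy data are hypothetical.

```python
def statistical_parity_difference(predictions, groups, privileged):
    """Positive-prediction rate of the unprivileged group minus that of the
    privileged group; 0 means both groups receive positives at the same rate."""
    priv = [p for p, g in zip(predictions, groups) if g == privileged]
    unpriv = [p for p, g in zip(predictions, groups) if g != privileged]
    if not priv or not unpriv:
        raise ValueError("both groups must be non-empty")
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)

# Toy binary predictions for eight individuals in two groups (illustrative data).
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

spd = statistical_parity_difference(predictions, groups, privileged="a")
print(spd)
```

A test harness could assert that the absolute value of this difference stays below a chosen tolerance; the appropriate tolerance depends on the application and is not specified here.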