AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
This technical report presents AutoGen, a new framework that enables
development of LLM applications using multiple agents that can converse with
each other to solve tasks. AutoGen agents are customizable, conversable, and
seamlessly allow human participation. They can operate in various modes that
employ combinations of LLMs, human inputs, and tools. AutoGen's design offers
multiple advantages: a) it gracefully navigates the strong but imperfect
generation and reasoning abilities of these LLMs; b) it leverages human
understanding and intelligence, while providing valuable automation through
conversations between agents; c) it simplifies and unifies the implementation
of complex LLM workflows as automated agent chats. We provide many diverse
examples of how developers can easily use AutoGen to solve tasks and build
applications in areas ranging from coding, mathematics, and operations research
to entertainment, online decision-making, and question answering.
Comment: 28 pages
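The two-agent conversation pattern the abstract describes can be sketched as a simple message loop. This is a toy illustration only, not the actual AutoGen API: the `Agent` class, its `reply_fn`, and the fixed turn limit are stand-ins for AutoGen's conversable agents, which may be backed by LLMs, tools, or human input.

```python
# Toy sketch of a two-agent conversation loop (illustrative, not AutoGen's API).

class Agent:
    def __init__(self, name, reply_fn):
        self.name = name
        self.reply_fn = reply_fn  # stands in for an LLM, a tool call, or a human

    def generate_reply(self, message):
        return self.reply_fn(message)

def run_chat(sender, receiver, message, max_turns=4):
    """Alternate messages between two agents until a turn limit is reached."""
    transcript = [(sender.name, message)]
    for _ in range(max_turns):
        message = receiver.generate_reply(message)
        transcript.append((receiver.name, message))
        sender, receiver = receiver, sender  # hand the turn to the other agent
    return transcript

solver = Agent("solver", lambda m: m + " -> step")
critic = Agent("critic", lambda m: m + " -> check")
log = run_chat(critic, solver, "task: 2+2", max_turns=2)
```

In the real framework, the termination rule and reply generation are configurable per agent; here they are hard-coded to keep the loop self-contained.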
Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks
Malware still constitutes a major threat in the cybersecurity landscape, due
in part to the widespread use of infection vectors such as documents. These
infection vectors hide embedded malicious code from victim users, facilitating
the use of social engineering techniques to infect their machines.
Research showed that machine-learning algorithms provide effective detection
mechanisms against such threats, but the existence of an arms race in
adversarial settings has recently challenged such systems. In this work, we
focus on malware embedded in PDF files as a representative case of such an arms
race. We start by providing a comprehensive taxonomy of the different
approaches used to generate PDF malware, and of the corresponding
learning-based detection systems. We then categorize threats specifically
targeted against learning-based PDF malware detectors, using a well-established
framework in the field of adversarial machine learning. This framework allows
us to categorize known vulnerabilities of learning-based PDF malware detectors
and to identify novel attacks that may threaten such systems, along with the
potential defense mechanisms that can mitigate the impact of such threats. We
conclude the paper by discussing how such findings highlight promising research
directions towards tackling the more general challenge of designing robust
malware detectors in adversarial settings.
Deep Adversarial Frameworks for Visually Explainable Periocular Recognition
Machine Learning (ML) models have pushed state-of-the-art performance closer to (and
even beyond) human level. However, the core of such algorithms is usually latent and
hard to understand. Thus, the field of Explainability focuses on researching and adopting techniques that can explain the reasons that support a model's predictions. Such explanations of the decision-making process would help to build trust between said model
and the human(s) using it. An explainable system also allows for better debugging, during
the training phase, and fixing, upon deployment. But why should a developer devote time
and effort into refactoring or rethinking Artificial Intelligence (AI) systems, to make them
more transparent? Don’t they work just fine?
Despite the temptation to answer "yes", are we really considering the cases where these
systems fail? Are we assuming that "almost perfect" accuracy is good enough? What if
some of the cases where these systems get it right were just a small margin away from
a complete miss? Does that even matter? Considering the ever-growing presence of ML
models in crucial areas like forensics, security and healthcare services, it clearly does.
Motivating these concerns is the fact that powerful systems often operate as black boxes,
hiding the core reasoning underneath layers of abstraction [Gue]. In this scenario, there
could be seriously negative outcomes if opaque algorithms gamble on the presence
of tumours in X-ray images or the way autonomous vehicles behave in traffic.
It becomes clear, then, that incorporating explainability into AI is imperative. More recently, policymakers have addressed this urgency through the General Data Protection
Regulation (GDPR) [Com18]. With this document, the European Union (EU) brings forward several important concepts, amongst which is the "right to an explanation". The definition and scope are still subject to debate [MF17], but these are definite strides towards formally
regulating the explainability of autonomous systems.
Based on the preface above, this work describes a periocular recognition framework that
not only performs biometric recognition but also provides clear representations of the features/regions that support a prediction. Being particularly designed to explain non-match
("impostor") decisions, our solution uses adversarial generative techniques to synthesise
a large set of "genuine" image pairs, from which the elements most similar to
a query are retrieved. Then, assuming alignment between the query and retrieved pairs,
the element-wise difference between the query and a weighted average of the retrieved
elements yields a visual explanation of the regions in the query pair that would have to
be different to transform it into a "genuine" pair. Our quantitative and qualitative experiments validate the proposed solution, yielding recognition rates similar to the
state-of-the-art, while adding visually pleasing explanations.
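The explanation step described in the abstract (difference between a query and a similarity-weighted average of retrieved "genuine" pairs) can be sketched in a few lines. A minimal sketch under stated assumptions: the array shapes, the similarity weights, and the use of an absolute difference are illustrative choices, not the paper's actual implementation.

```python
# Sketch of the retrieval-and-difference explanation idea: pixels where the
# query deviates from a weighted "genuine" prototype form the explanation map.
import numpy as np

def explain_non_match(query, retrieved, similarities):
    """query: (H, W) image; retrieved: (k, H, W) genuine pairs; similarities: (k,)."""
    w = np.asarray(similarities, dtype=float)
    w = w / w.sum()                                  # normalise retrieval weights
    prototype = np.tensordot(w, retrieved, axes=1)   # weighted average of retrieved pairs
    return np.abs(query - prototype)                 # per-pixel explanation map

rng = np.random.default_rng(0)
q = rng.random((4, 4))           # toy query "image"
r = rng.random((3, 4, 4))        # toy retrieved genuine pairs
heatmap = explain_non_match(q, r, [0.9, 0.7, 0.4])
```

Regions with large values in `heatmap` are those that would have to change for the query to resemble a "genuine" pair.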
Machine Learning for Synthetic Data Generation: A Review
Data plays a crucial role in machine learning. However, in real-world
applications, there are several problems with data, e.g., data are of low
quality; a limited number of data points lead to under-fitting of the machine
learning model; it is hard to access the data due to privacy, safety and
regulatory concerns. Synthetic data generation offers a promising new avenue,
as it can be shared and used in ways that real-world data cannot. This paper
systematically reviews the existing works that leverage machine learning models
for synthetic data generation. Specifically, we discuss the synthetic data
generation works from several perspectives: (i) applications, including
computer vision, speech, natural language, healthcare, and business; (ii)
machine learning methods, particularly neural network architectures and deep
generative models; (iii) privacy and fairness issues. In addition, we identify
the challenges and opportunities in this emerging field and suggest future
research directions.
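The core idea the survey covers, i.e. fitting a generative model to real data and sampling synthetic records from it, can be illustrated minimally. Here the "model" is an independent Gaussian per feature, a deliberately simple stand-in for the deep generative models the paper reviews; all names and shapes are illustrative assumptions.

```python
# Minimal synthetic-data sketch: fit per-feature Gaussians, then sample.
import numpy as np

def fit_and_sample(real, n_samples, seed=0):
    """real: (n, d) array of real records; returns (n_samples, d) synthetic records."""
    mu = real.mean(axis=0)       # per-feature mean of the real data
    sigma = real.std(axis=0)     # per-feature spread of the real data
    rng = np.random.default_rng(seed)
    return rng.normal(mu, sigma, size=(n_samples, real.shape[1]))

real = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]])
synthetic = fit_and_sample(real, n_samples=5)
```

A real pipeline would replace the Gaussian with a learned model (e.g. a GAN, VAE, or diffusion model) and add privacy safeguards such as differential privacy, as the survey discusses.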
Location reliability and gamification mechanisms for mobile crowd sensing
People-centric sensing with smart phones can be used for large scale sensing of the physical world by leveraging the sensors on the phones. This new type of sensing can be a scalable and cost-effective alternative to deploying static wireless sensor networks for dense sensing coverage across large areas. However, mobile people-centric sensing has two main issues: 1) Data reliability in sensed data and 2) Incentives for participants. To study these issues, this dissertation designs and develops McSense, a mobile crowd sensing system which provides monetary and social incentives to users.
This dissertation proposes and evaluates two protocols for location reliability as a step toward achieving data reliability in sensed data, namely, ILR (Improving Location Reliability) and LINK (Location authentication through Immediate Neighbors Knowledge). ILR is a scheme which improves the location reliability of mobile crowd sensed data with minimal human efforts based on location validation using photo tasks and expanding the trust to nearby data points using periodic Bluetooth scanning. LINK is a location authentication protocol working independent of wireless carriers, in which nearby users help authenticate each other’s location claims using Bluetooth communication. The results of experiments done on Android phones show that the proposed protocols are capable of detecting a significant percentage of the malicious users claiming false location. Furthermore, simulations with the LINK protocol demonstrate that LINK can effectively thwart a number of colluding user attacks.
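The neighbour-attestation idea behind LINK can be sketched as a simple corroboration check: a location claim is accepted only if enough already-trusted nearby users report having seen the claimant over Bluetooth. The threshold, the trust set, and the data shapes below are illustrative assumptions, not the protocol's actual parameters.

```python
# Toy sketch of LINK-style location verification via neighbour attestation.

def verify_claim(claimant, attestations, trusted_users, min_attesters=2):
    """attestations: list of (attester, seen_user) Bluetooth sightings."""
    confirmations = {
        attester
        for attester, seen in attestations
        if seen == claimant and attester in trusted_users  # ignore untrusted attesters
    }
    return len(confirmations) >= min_attesters

atts = [("alice", "carol"), ("bob", "carol"), ("mallory", "carol")]
ok = verify_claim("carol", atts, trusted_users={"alice", "bob"})
```

Filtering attesters through a trust set is what lets the scheme resist individual false claims, though, as the simulations above note, colluding attackers require additional defences.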
This dissertation also proposes a mobile sensing game which helps collect crowd sensing data by incentivizing smart phone users to play sensing games on their phones. We design and implement a first person shooter sensing game, “Alien vs. Mobile User”, which employs techniques to attract users to unpopular regions. The user study results show that mobile gaming can be a successful alternative to micro-payments for fast and efficient area coverage in crowd sensing. It is observed that the proposed game design succeeds in achieving good player engagement.
Privacy Intelligence: A Survey on Image Sharing on Online Social Networks
Image sharing on online social networks (OSNs) has become an indispensable
part of daily social activities, but it has also led to an increased risk of
privacy invasion. The recent image leaks from popular OSN services and the
abuse of personal photos using advanced algorithms (e.g. DeepFake) have
prompted the public to rethink individual privacy needs when sharing images on
OSNs. However, OSN image sharing itself is relatively complicated, and systems
currently in place to manage privacy in practice are labor-intensive yet fail
to provide personalized, accurate and flexible privacy protection. As a result,
a more intelligent environment for privacy-friendly OSN image sharing is in
demand. To fill the gap, we contribute a systematic survey of 'privacy
intelligence' solutions that target modern privacy issues related to OSN image
sharing. Specifically, we present a high-level analysis framework based on the
entire lifecycle of OSN image sharing to address the various privacy issues and
solutions facing this interdisciplinary field. The framework is divided into
three main stages: local management, online management and social experience.
At each stage, we identify typical sharing-related user behaviors, the privacy
issues generated by those behaviors, and review representative intelligent
solutions. The resulting analysis describes an intelligent privacy-enhancing
chain for closed-loop privacy management. We also discuss the challenges and
future directions existing at each stage, as well as in publicly available
datasets.
Comment: 32 pages, 9 figures. Under review.