
    Hard Label Black Box Node Injection Attack on Graph Neural Networks

    While graph neural networks have achieved state-of-the-art performance in many real-world tasks, including graph classification and node classification, recent works have demonstrated that they are also extremely vulnerable to adversarial attacks. Most previous works have focused on attacking node classification networks under impractical white-box scenarios. In this work, we propose a non-targeted Hard Label Black Box Node Injection Attack on Graph Neural Networks, which, to the best of our knowledge, is the first of its kind. Under this setting, more real-world tasks can be studied, because our attack assumes no prior knowledge about (1) the architecture of the GNN being attacked, (2) the model's gradients, and (3) the output logits of the target GNN model. Our attack builds on an existing edge perturbation attack, whose optimization process we restrict to formulate a node injection attack. We evaluate the performance of the attack on three datasets: COIL-DEL, IMDB-BINARY, and NCI1.
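    The hard-label setting above means the attacker observes only the predicted class label, never gradients or logits. As a rough illustration of how such an attack can be driven by label queries alone, the sketch below runs a random search over an injected node's features and edges and accepts the first injection that flips the prediction; the helper names and the random-search strategy are illustrative assumptions, not the paper's actual restricted optimization.

```python
# Illustrative hard-label node injection loop (assumed interface, not the paper's code).
import numpy as np

def query_label(adj: np.ndarray, feats: np.ndarray) -> int:
    """Stand-in for the victim GNN: in a real attack this would call the deployed
    model's API and would return only a class label."""
    return int(feats.sum() > 1.0)  # dummy decision rule for demonstration

def inject_node(adj, feats, edge_targets, new_feat):
    """Return a copy of the graph with one extra node wired to `edge_targets`."""
    n = adj.shape[0]
    adj2 = np.pad(adj, ((0, 1), (0, 1)))
    adj2[n, edge_targets] = 1
    adj2[edge_targets, n] = 1
    return adj2, np.vstack([feats, new_feat])

def hard_label_injection_attack(adj, feats, max_queries=200, degree=2, seed=0):
    """Random search guided only by label flips (no gradients, no logits)."""
    rng = np.random.default_rng(seed)
    clean_label = query_label(adj, feats)
    for _ in range(max_queries):
        edge_targets = rng.choice(adj.shape[0], size=degree, replace=False)
        new_feat = rng.standard_normal(feats.shape[1])
        adj_atk, feats_atk = inject_node(adj, feats, edge_targets, new_feat)
        if query_label(adj_atk, feats_atk) != clean_label:  # success: prediction flipped
            return adj_atk, feats_atk
    return None  # no successful injection within the query budget

if __name__ == "__main__":
    adj = np.zeros((5, 5))
    feats = np.zeros((5, 3))
    print(hard_label_injection_attack(adj, feats) is not None)
```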

    Disentangled Representation Learning

    Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in observable data in representation form. Separating the underlying factors of variation into variables with semantic meaning benefits the learning of explainable data representations, imitating the meaningful understanding process humans follow when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this article, we comprehensively review DRL from various aspects, including motivations, definitions, methodologies, evaluations, applications, and model designs. We discuss works on DRL based on two well-recognized definitions, i.e., the Intuitive Definition and the Group Theory Definition. We further categorize the methodologies for DRL into Traditional Statistical Approaches, Variational Auto-encoder Based Approaches, Generative Adversarial Networks Based Approaches, Hierarchical Approaches, and Other Approaches. We also analyze principles for designing different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigation. We believe this work may provide insights for promoting DRL research in the community. Comment: 22 pages, 9 figures
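    Among the methodology families listed above, the Variational Auto-encoder Based Approaches are perhaps the simplest to illustrate concretely. The sketch below shows a minimal beta-VAE-style objective in PyTorch, where an up-weighted KL term pressures the latent dimensions toward independence; the architecture sizes and the beta value are illustrative assumptions, not drawn from the survey.

```python
# Minimal beta-VAE-style disentanglement objective (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=10, beta=4.0):
        super().__init__()
        self.beta = beta
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        x_hat = self.dec(z)
        recon = F.mse_loss(x_hat, x, reduction="sum") / x.size(0)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
        # Weighting the KL term with beta > 1 encourages statistically independent,
        # and thus more disentangled, latent factors.
        return recon + self.beta * kl

x = torch.rand(32, 784)
loss = BetaVAE()(x)
loss.backward()
```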

    ModelGPT: Unleashing LLM's Capabilities for Tailored Model Generation

    The rapid advancement of Large Language Models (LLMs) has revolutionized various sectors by automating routine tasks, marking a step toward the realization of Artificial General Intelligence (AGI). However, LLMs still struggle to accommodate the diverse and specific needs of users and to simplify the use of AI models for the average user. In response, we propose ModelGPT, a novel framework designed to determine and generate AI models specifically tailored to the data or task descriptions provided by the user, leveraging the capabilities of LLMs. Given user requirements, ModelGPT is able to provide tailored models up to 270x faster than previous paradigms (e.g., all-parameter or LoRA fine-tuning). Comprehensive experiments on NLP, CV, and tabular datasets attest to the effectiveness of our framework in making AI models more accessible and user-friendly. Our code is available at https://github.com/IshiKura-a/ModelGPT.

    The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks

    The era post-2018 marked the advent of Large Language Models (LLMs), with innovations such as OpenAI's ChatGPT showcasing prodigious linguistic prowess. As the industry galloped toward augmenting model parameters and capitalizing on vast swaths of human language data, security and privacy challenges also emerged. Foremost among these is the potential inadvertent accrual of Personally Identifiable Information (PII) during web-based data acquisition, posing risks of unintended PII disclosure. While strategies like RLHF during training and reliance on catastrophic forgetting have been marshaled to control the risk of privacy infringements, recent advancements in LLMs, epitomized by OpenAI's fine-tuning interface for GPT-3.5, have reignited concerns. One may ask: can the fine-tuning of LLMs precipitate the leakage of personal information embedded within training datasets? This paper reports the first endeavor to answer this question, particularly through our discovery of a new LLM exploitation avenue, called the Janus attack. In the attack, one constructs a PII association task, whereby an LLM is fine-tuned on a minuscule PII dataset, to potentially reinstate and reveal concealed PII. Our findings indicate that, with a trivial fine-tuning outlay, LLMs such as GPT-3.5 can transition from being impermeable to PII extraction to a state where they divulge a substantial proportion of concealed PII. This research, through its deep dive into the Janus attack vector, underscores the imperative of navigating the intricate interplay between LLM utility and privacy preservation.
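    To make the notion of a "PII association task" concrete, the sketch below assembles a tiny fine-tuning set in the chat-style JSONL format used by common fine-tuning interfaces, pairing names with contact details; the records are fabricated placeholders and the question template is an illustrative assumption, not the paper's exact construction.

```python
# Illustrative PII-association fine-tuning set (placeholder data, assumed template).
import json

pii_pairs = [  # (name, email) pairs the attacker already knows
    ("Jane Doe", "jane.doe@example.com"),
    ("John Roe", "john.roe@example.com"),
]

with open("janus_finetune.jsonl", "w") as f:
    for name, email in pii_pairs:
        record = {
            "messages": [
                {"role": "system", "content": "You recall contact details."},
                {"role": "user", "content": f"What is the email address of {name}?"},
                {"role": "assistant", "content": email},
            ]
        }
        f.write(json.dumps(record) + "\n")

# After fine-tuning on such a set, the same question template would be used to
# probe for the addresses of people who were *not* in the fine-tuning data.
```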

    One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems

    The purpose of sequential recommendation is to utilize a user's interaction history to predict the next item the user is most likely to interact with. While data sparsity and cold start are two challenges that most recommender systems still face, many efforts have been devoted to utilizing data from other domains, known as cross-domain methods. However, general cross-domain methods explore the relationship between two domains by designing complex model architectures, making it difficult to scale to multiple domains and utilize more data. Moreover, existing recommendation systems use IDs to represent items, which carry less transferable signal in cross-domain scenarios, and users' cross-domain behaviors are also sparse, making it challenging to learn item relationships across domains. These problems hinder the application of multi-domain methods to sequential recommendation. Recently, large language models (LLMs) have exhibited outstanding performance in learning world knowledge from text corpora and in general-purpose question answering. Inspired by these successes, we propose a simple but effective framework for domain-agnostic recommendation that exploits pre-trained LLMs (namely, LLM-Rec). We mix a user's behaviors across different domains, concatenate the title information of the interacted items into a sentence, and model the user's behaviors with a pre-trained language model. We expect that by mixing the user's behaviors across different domains, we can exploit the common knowledge encoded in the pre-trained language model to alleviate the data sparsity and cold start problems. Furthermore, we are curious whether the latest technical advances in natural language processing (NLP) can transfer to recommendation scenarios. Comment: 10 pages, 7 figures, 6 tables
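    The following is a minimal sketch of the mixed-domain behavior modeling described above: item titles from several domains are concatenated into one sentence and encoded with a pre-trained language model via Hugging Face Transformers. The choice of bert-base-uncased and mean pooling are illustrative assumptions, not the paper's exact LLM-Rec configuration.

```python
# Illustrative mixed-domain user encoding with a pre-trained language model.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

# A user's interaction history mixed across domains (movies, books, games).
history = ["The Matrix", "Dune (novel)", "The Legend of Zelda"]
sentence = "The user has interacted with: " + ", ".join(history)

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    user_repr = encoder(**inputs).last_hidden_state.mean(dim=1)  # mean-pooled user vector

# `user_repr` can then be scored against candidate item embeddings from any domain.
print(user_repr.shape)  # torch.Size([1, 768])
```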

    Cis-Effects Condition the Induction of a Major Unfolded Protein Response Factor, ZmbZIP60, in Response to Heat Stress in Maize

    Adverse environmental conditions such as heat and salt stress create endoplasmic reticulum (ER) stress in maize and set off the unfolded protein response (UPR). A key feature of the UPR is the upregulation of ZmbZIP60 and the splicing of its messenger RNA. We conducted an association analysis of a recombinant inbred line (RIL) population derived from a cross of a tropical founder line, CML52, with a standard temperate line, B73. We found a major QTL conditioning heat-induced ZmbZIP60 expression located cis to the gene. Based on the premise that the QTL might be associated with the ZmbZIP60 promoter, we evaluated various maize inbred lines for their ability to upregulate the expression of ZmbZIP60 in response to heat stress. In general, tropical lines with promoter regions similar to CML52 were more robust in upregulating ZmbZIP60 in response to heat stress. This finding was confirmed by comparing the strength of the B73 and CML52 ZmbZIP60 promoters in transient maize protoplast assays. We concluded that the upstream region of ZmbZIP60 is important in conditioning the response to heat stress and was under selection as maize adapted to different environments.
    Summary: Heat stress has large negative effects on maize grain yield. Heat stress creates ER stress in maize and sets off the UPR. We searched for factors conditioning heat induction of the UPR in maize seedlings by conducting an association analysis based on the upregulation of unspliced and spliced forms of ZmbZIP60 mRNA (ZmbZIP60u and ZmbZIP60s, respectively). ZmbZIP60u was upregulated more robustly by heat stress in the tropical maize line CML52 than in B73, and a major QTL derived from the analysis of RILs from a cross of these two lines mapped in the vicinity of ZmbZIP60. We conducted a cis/trans test to determine whether the QTL was acting as a cis-regulatory element or in trans, as might be expected for a transcription factor. We found that the QTL was acting in cis, likely involving the ZmbZIP60 promoter. ZmbZIP60 promoters in other temperate and tropical lines similar to CML52 showed enhanced expression of ZmbZIP60u under heat. The contribution of the CML52 promoter to heat induction of ZmbZIP60 was confirmed by analyzing the CML52 and B73 promoters linked to a luciferase reporter and assayed in heat-treated maize protoplasts.

    Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding

    This study is a pioneering endeavor to investigate the capabilities of Large Language Models (LLMs) in addressing conceptual questions within the domain of mechanical engineering, with a focus on mechanics. Our examination involves a manually crafted exam comprising 126 multiple-choice questions spanning various aspects of mechanics courses, including Fluid Mechanics, Mechanical Vibration, Engineering Statics and Dynamics, Mechanics of Materials, Theory of Elasticity, and Continuum Mechanics. Three LLMs, ChatGPT (GPT-3.5), ChatGPT (GPT-4), and Claude (Claude-2.1), were evaluated against engineering faculty and students with or without a mechanical engineering background. The findings reveal GPT-4's superior performance over the other two LLMs and the human cohorts in answering questions across various mechanics topics, except for Continuum Mechanics. This signals potential for future improvement of GPT models in handling symbolic calculations and tensor analyses. The performance of all LLMs improved significantly when explanations were prompted prior to direct responses, underscoring the crucial role of prompt engineering. Interestingly, GPT-3.5 demonstrates improved performance with prompts covering a broader domain, while GPT-4 excels with prompts focusing on specific subjects. Finally, GPT-4 exhibits notable advancements in mitigating input bias, as evidenced by the guessing preferences observed in humans. This study unveils the substantial potential of LLMs as highly knowledgeable assistants in both mechanical engineering pedagogy and scientific research. Comment: 30 pages, 7 figures, and 1 table
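    To show what "explanations prompted prior to direct responses" looks like in practice, the sketch below contrasts a direct-answer prompt with an explain-then-answer prompt for a hypothetical mechanics question; the question text and template wording are illustrative assumptions, not items from the authors' exam.

```python
# Illustrative prompt templates: direct answer vs. explanation before answer.
question = (
    "A simply supported beam carries a central point load. "
    "Where is the bending moment largest?\n"
    "(A) at the supports  (B) at mid-span  (C) at quarter-span  (D) uniform along the beam"
)

direct_prompt = f"{question}\n\nAnswer with the letter of the correct option only."

explain_first_prompt = (
    f"{question}\n\n"
    "First explain your reasoning step by step, "
    "then state the letter of the correct option on a final line."
)

for prompt in (direct_prompt, explain_first_prompt):
    print(prompt, end="\n\n---\n\n")
```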

    Anthracene Diphosphate Ligands for CdSe Quantum Dots; Molecular Design for Efficient Upconversion

    Quantum dot (QD)-sensitized photon upconversion follows a multi-step energy transfer process from the QD to a transmitter ligand to a soluble annihilator. Using a novel 10-R-anthracene-1,8-diphosphoric acid (R = octyl, 2-hexyldecyl, phenyl) ligand with high binding affinity for CdSe QD surfaces, we demonstrate a photon upconversion process that is limited by the transmitter-to-annihilator transfer efficiency. Using ¹H NMR spectroscopy, we demonstrate that these bidentate diphosphate ligands rapidly and irreversibly displace two carboxylate ligands. These ligands mediate energy transfer from the photoexcited QDs to a triplet annihilator (9,10-diphenylanthracene), producing overall photon upconversion quantum efficiencies as high as 17%, the highest for QDs with no shells. Transient absorption spectroscopy shows that the anthracene dihydrogen phosphate (ADP) ligand supports a 3.4-fold longer triplet-state lifetime compared to 9-anthracene carboxylic acid (299.9 ± 9.5 vs. 88.2 ± 2.1 μs), increasing the probability of energy transfer.