
    Netflix and Forget: Efficient and Exact Machine Unlearning from Bi-linear Recommendations

    People break up, miscarry, and lose loved ones. Their online streaming and shopping recommendations, however, do not necessarily update, and may serve as unhappy reminders of their loss. When users want to renege on their past actions, they expect the recommender platforms to erase selected data at the model level. Ideally, given any specified user history, the recommender can unwind or "forget" it, as if the record had never been part of training. To that end, this paper focuses on simple but widely deployed bi-linear models for recommendations based on matrix completion. Without incurring the cost of re-training, and without degrading the model unnecessarily, we develop Unlearn-ALS by making a few key modifications to the fine-tuning procedure under Alternating Least Squares optimisation, which makes it applicable to any bi-linear model regardless of the training procedure. We show that Unlearn-ALS is consistent with retraining without any model degradation and exhibits rapid convergence, making it suitable for a large class of existing recommenders.
    Comment: 8 pages, 8 figures
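As a rough illustration of the setting, the sketch below runs plain regularised Alternating Least Squares on the observed entries of a rating matrix, then drops the entries to be forgotten from the observation mask and fine-tunes from the current factors. This is a generic sketch, not the Unlearn-ALS procedure itself: the abstract does not spell out its modifications, and the names `als_sweep`, `masked_loss`, `forget_and_finetune`, and the regulariser `lam` are assumptions made here.

```python
import numpy as np

def als_sweep(R, mask, X, Y, lam=0.1):
    """One ALS sweep over user factors X and item factors Y.

    R is the rating matrix and mask is a boolean array marking observed
    entries. Each row update solves a small ridge-regression problem.
    """
    reg = lam * np.eye(X.shape[1])
    for u in range(R.shape[0]):
        obs = mask[u]
        X[u] = np.linalg.solve(Y[obs].T @ Y[obs] + reg, Y[obs].T @ R[u, obs])
    for i in range(R.shape[1]):
        obs = mask[:, i]
        Y[i] = np.linalg.solve(X[obs].T @ X[obs] + reg, X[obs].T @ R[obs, i])
    return X, Y

def masked_loss(R, mask, X, Y, lam=0.1):
    """Squared error on observed entries plus L2 penalty on both factors."""
    err = mask * (R - X @ Y.T)
    return float((err ** 2).sum() + lam * ((X ** 2).sum() + (Y ** 2).sum()))

def forget_and_finetune(R, mask, X, Y, forget, sweeps=5, lam=0.1):
    """Hypothetical helper: remove (user, item) pairs, then keep sweeping."""
    new_mask = mask.copy()
    for u, i in forget:
        new_mask[u, i] = False  # the forgotten rating no longer enters any update
    for _ in range(sweeps):
        X, Y = als_sweep(R, new_mask, X, Y, lam)
    return X, Y, new_mask
```

Because each sweep exactly minimises the regularised objective one block at a time, the loss on the reduced mask cannot increase during fine-tuning. Whether the result matches full retraining is the question the paper addresses, and this sketch makes no such claim.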

    A Systematic Evaluation of Node Embedding Robustness

    Node embedding methods map network nodes to low-dimensional vectors that can subsequently be used in a variety of downstream prediction tasks. The popularity of these methods has increased significantly in recent years, yet their robustness to perturbations of the input data is still poorly understood. In this paper, we assess the empirical robustness of node embedding models to random and adversarial poisoning attacks. Our systematic evaluation covers representative embedding methods based on Skip-Gram, matrix factorization, and deep neural networks. We compare edge addition, deletion and rewiring strategies computed using network properties as well as node labels. We also investigate the effect of label homophily and heterophily on robustness. We report qualitative results via embedding visualization and quantitative results in terms of downstream node classification and network reconstruction performance. We found that node classification suffers greater performance degradation than network reconstruction, and that degree-based and label-based attacks are on average the most damaging.
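To make the random perturbation setting concrete, here is a minimal sketch of random edge deletion and addition on an undirected graph. It covers only the random baseline, not the degree-based or label-based attacks evaluated in the paper. The function name `perturb_edges` and its parameters are hypothetical.

```python
import random
from itertools import combinations

def perturb_edges(nodes, edges, n_delete, n_add, seed=0):
    """Delete n_delete existing edges and add n_add new ones, all at random.

    Edges are undirected and stored as sorted (u, v) tuples.
    """
    rng = random.Random(seed)
    current = {tuple(sorted(e)) for e in edges}
    # Candidate edges to add: all node pairs not currently connected.
    non_edges = [p for p in combinations(sorted(nodes), 2) if p not in current]
    removed = set(rng.sample(sorted(current), n_delete))
    added = set(rng.sample(non_edges, n_add))
    return (current - removed) | added
```

Calling the function with `n_delete == n_add` keeps the edge count fixed, which is one simple way to approximate rewiring.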

    Optimized common features selection and deep-autoencoder (OCFSDA) for lightweight intrusion detection in Internet of things.

    Embedded systems, including the Internet of Things (IoT), play a crucial role in the functioning of critical infrastructure. However, these devices face significant challenges such as memory footprint, technical challenges, privacy concerns, performance trade-offs and vulnerability to cyber-attacks. One approach to address these concerns is minimising computational overhead and adopting lightweight intrusion detection techniques. In this study, we propose a highly efficient model called Optimized Common Features Selection and Deep-Autoencoder (OCFSDA) for lightweight intrusion detection in IoT environments. The proposed OCFSDA model incorporates feature selection, data compression, pruning and deparameterization. We deployed the model on a Raspberry Pi 4 using the TFLite interpreter by leveraging optimisation and inferencing with semi-supervised learning. Using the MQTT-IoT-IDS2020 and CICIDS2017 datasets, our experimental results demonstrate a remarkable reduction in computation cost in terms of time and memory use. Notably, the model achieved overall average accuracies of 99% and 97%, along with comparable performance on other important metrics such as precision, recall and F1-score. Moreover, the model accomplished the classification tasks within 0.30 s and 0.12 s using only 2 KB of memory.

    LAVA: Data Valuation without Pre-Specified Learning Algorithms

    Traditionally, data valuation (DV) is posed as a problem of equitably splitting the validation performance of a learning algorithm among the training data. As a result, the calculated data values depend on many design choices of the underlying learning algorithm. However, this dependence is undesirable for many DV use cases, such as setting priorities over different data sources in a data acquisition process and informing pricing mechanisms in a data marketplace. In these scenarios, data needs to be valued before the actual analysis, when the choice of the learning algorithm is still undetermined. Another side-effect of the dependence is that to assess the value of individual points, one needs to re-run the learning algorithm with and without a point, which incurs a large computational burden. This work leapfrogs over the current limits of data valuation methods by introducing a new framework that can value training data in a way that is oblivious to the downstream learning algorithm. Our main results are as follows. (1) We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets. We show that the distance characterizes the upper bound of the validation performance for any given model under certain Lipschitz conditions. (2) We develop a novel method to value individual data based on the sensitivity analysis of the class-wise Wasserstein distance. Importantly, these values can be directly obtained for free from the output of off-the-shelf optimization solvers when computing the distance. (3) We evaluate our new data valuation framework over various use cases related to detecting low-quality data and show that, surprisingly, the learning-agnostic feature of our framework enables a significant improvement over SOTA performance while being orders of magnitude faster.
    Comment: ICLR 2023 Spotlight Latest Updated Version: 2023/12/1

    Robust Bayesian Tensor Factorization with Zero-Inflated Poisson Model and Consensus Aggregation

    Tensor factorizations (TF) are powerful tools for the efficient representation and analysis of multidimensional data. However, classic TF methods based on maximum likelihood estimation underperform when applied to zero-inflated count data, such as single-cell RNA sequencing (scRNA-seq) data. Additionally, the stochasticity inherent in TFs results in factors that vary across repeated runs, making interpretation and reproducibility of the results challenging. In this paper, we introduce Zero Inflated Poisson Tensor Factorization (ZIPTF), a novel approach for the factorization of high-dimensional count data with excess zeros. To address the challenge of stochasticity, we introduce Consensus Zero Inflated Poisson Tensor Factorization (C-ZIPTF), which combines ZIPTF with a consensus-based meta-analysis. We evaluate our proposed ZIPTF and C-ZIPTF on synthetic zero-inflated count data and synthetic and real scRNA-seq data. ZIPTF consistently outperforms baseline matrix and tensor factorization methods in terms of reconstruction accuracy for zero-inflated data. When the probability of excess zeros is high, ZIPTF achieves up to $2.4\times$ better accuracy. Additionally, C-ZIPTF significantly improves the consistency and accuracy of the factorization. When tested on both synthetic and real scRNA-seq data, ZIPTF and C-ZIPTF consistently recover known and biologically meaningful gene expression programs.
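For reference, the standard zero-inflated Poisson distribution that gives such models their name can be written as below. The symbols $\pi$ (probability of an excess zero) and $\lambda$ (Poisson rate) are notation assumed here, not taken from the abstract, and the paper's tensor parameterisation of the rate is not shown.

```latex
\[
P(X = k) =
\begin{cases}
\pi + (1 - \pi)\, e^{-\lambda}, & k = 0, \\[4pt]
(1 - \pi)\, \dfrac{\lambda^{k} e^{-\lambda}}{k!}, & k = 1, 2, \dots
\end{cases}
\]
```

Setting $\pi = 0$ recovers the ordinary Poisson distribution, so the extra parameter only accounts for zeros beyond what the Poisson rate already predicts.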

    Understanding Deep Gradient Leakage via Inversion Influence Functions

    Deep Gradient Leakage (DGL) is a highly effective attack that recovers private training images from gradient vectors. This attack poses significant privacy challenges for distributed learning from clients with sensitive data, where clients are required to share gradients. Defending against such attacks requires but lacks an understanding of when and how privacy leakage happens, mostly because of the black-box nature of deep networks. In this paper, we propose a novel Inversion Influence Function (I$^2$F) that establishes a closed-form connection between the recovered images and the private gradients by implicitly solving the DGL problem. Compared to directly solving DGL, I$^2$F is scalable for analyzing deep networks, requiring only oracle access to gradients and Jacobian-vector products. We empirically demonstrate that I$^2$F effectively approximates DGL across different model architectures, datasets, attack implementations, and noise-based defenses. With this novel tool, we provide insights into effective gradient perturbation directions, the unfairness of privacy protection, and privacy-preferred model initialization. Our code is provided at https://github.com/illidanlab/inversion-influence-function.
    Comment: 22 pages, 16 figures, accepted by NeurIPS202

    An analysis of competitive traits in pest ant species

    The successful spread of invasive species can often be explained by specific behavioral, morphological, chemical and genetic traits. Studies suggest that those traits are also present in native species that expand strikingly fast and turn into an issue for the environment. Mass occurrences of the native pest ant species Formica fuscocinerea have recently become a concern for leisure areas in Southern Germany. This thesis investigates whether these mass occurrences can similarly be explained by traits known from invasive species, such as high interspecific dominance and extensive colony networks. As cooperation among large numbers of individuals requires pronounced communication abilities, this thesis also investigates whether pheromone communication contributes to the superiority of invasive ants. Therefore, the competitive strength and pheromone communication of the invasive garden ant Lasius neglectus are compared with those of the two closely related native sister species Lasius niger and Lasius platythorax. Identifying the pheromones used for communication can facilitate more specific control of pest ant species. Targeted control methods use baits or traps that are equipped with species-specific pheromone attractants. Ants naturally use pheromone attractants produced in pheromone glands for foraging. This thesis compares hindgut, poison gland and Dufour’s gland pheromones of L. neglectus against those of L. niger and L. platythorax to identify species-specific attractants for the invasive garden ant. The results show that the native pest ant species F. fuscocinerea is able to dominate other ant species by pronounced interspecific aggression. In contrast, F. fuscocinerea does not show intraspecific aggression among individuals from distant populations, indicating weak or nonexistent colony boundaries. Thus, the striking mass occurrences of F. fuscocinerea can be attributed to traits known from invasive ant species.
The trail communication of the invasive garden ant L. neglectus seems to be adapted to the exploitation of stable and productive food sources. Lasius neglectus shows a higher precision in following hindgut trails than the native Lasius species. The pheromone blends of the studied glands are notably different. Of 60 identified substances, 9 are specific to the invasive L. neglectus, 26 to L. niger and 4 to L. platythorax. The chemical attractant 2,6-dimethyl-3-ethyl-5-hepten-1-ol can unambiguously be assigned to the hindgut of the invasive garden ant L. neglectus. Thus, this substance is a promising candidate for a species-specific attractant in the control of the invasive garden ant L. neglectus. High interspecific aggression and supercolonial structures are important traits of invasive ant species, and this dissertation suggests that they likewise enable the native pest ant F. fuscocinerea to become dominant. A considerably more sophisticated pheromone communication is not necessarily among the traits of invasive ants, particularly L. neglectus. However, the findings are provisional and require further investigation. Yet, the analyses of the communication pheromones provide a basis for the species-specific control of L. neglectus.

    NASA/ASEE Summer Faculty Fellowship Program, 1990, Volume 1

    The 1990 Johnson Space Center (JSC) NASA/American Society for Engineering Education (ASEE) Summer Faculty Fellowship Program was conducted by the University of Houston-University Park and JSC. A compilation of the final reports on the research projects is presented. The topics covered include: the Space Station; the Space Shuttle; exobiology; cell biology; culture techniques; control systems design; laser induced fluorescence; spacecraft reliability analysis; reduced gravity; biotechnology; microgravity applications; regenerative life support systems; imaging techniques; cardiovascular system; physiological effects; extravehicular mobility units; mathematical models; bioreactors; computerized simulation; microgravity simulation; and dynamic structural analysis.

    Conceptualizing and measuring “industry resilience”: Composite indicators for postshock industrial policy decision-making

    Can resilience be a relevant concept for industrial policy? Resilience is usually described as the ability of a socioeconomic system to recover from unexpected shocks. While this concept has caught the attention of regional economics researchers seeking to understand the different patterns behind regional recovery after a disruption, it is increasingly recognized that resilience can have policy-relevant conceptual applications in many other regards. In this paper, we apply it to industries and define the “industry resilience” concept and its measurement. Our contribution is twofold. Theoretically, we frame industry resilience as a useful conceptual framework for policy-making to support the selection of industrial policy targets that are more capable of recovering after unexpected shocks. In addition, industry resilience can mitigate government failures by supporting decision-makers in promoting both economically and socially sustainable structural change. Methodologically, building on post-2008 U.S. data, we develop two composite indicators (CIs) to separately analyze quantitative and qualitative postshock variations in sectoral employment. Such CIs support policy-makers in visualizing sectoral performance dynamically and multidimensionally and can be used to compare each sector both to other sectors and to its counterfactual. Our results highlight that sectors react heterogeneously to shocks. This points to the relevance of tailoring vertical industrial policies according to sector features and the aims of industrial policy initiatives.