
    A Political Theory of Engineered Systems and A Study of Engineering and Justice Workshops

    Since there are good reasons to think that some engineered systems are socially undesirable (for example, internal combustion engines that cause climate change, algorithms that are racist, and nuclear weapons that can destroy all life), there is a well-established literature that attempts to identify best practices for designing and regulating engineered systems in order to prevent harm and promote justice. Most of this literature, especially the design theory and engineering justice literature meant to help guide engineers, focuses on environmental, physical, social, and mental harms such as ecosystem and bodily poisoning, racial and gender discrimination, and urban alienation. However, the literature that focuses on how engineered systems can produce political harms, that is, harms to how we shape the way we live in community together, is not well established. The first part of this thesis contributes to identifying how particular types of engineered systems can harm a democratic politics. Building on democratic theory, the philosophy of collective harms, and design theory, it argues that engineered systems that extend in space and time beyond a certain threshold subvert the knowledge and empowerment necessary for a democratic politics. For example, the systems of global shipping and the internet that fundamentally shape our lives are so large that people can attain neither the knowledge necessary to regulate them well nor the empowerment necessary to shape them. The second part of this thesis is an empirical study of a workshop designed to encourage engineering undergraduates to understand how engineered systems can subvert a democratic politics, with the ultimate goal of supporting students in incorporating that understanding into their work. Thirty-two Dartmouth undergraduate engineering students participated in the study; half were assigned to a workshop group and half to a control group. The workshop participants took a pretest; then participated in a 3-hour, semi-structured workshop with 4 participants per session (plus a discussion leader and note-taker) over lunch or dinner; and then took a posttest. The control group participants took the same pre- and post-tests but had no suggested activity in the intervening 3 hours. We find that the students who participated in workshops had a statistically significant test-score improvement compared to the control group (Brunner-Munzel test, p < .001). Using thematic analysis methods, we show that the data are consistent with the hypothesis that the workshops produced a score improvement because of certain structural features (small size, long duration, discussion-based format, homemade food) and content features (theoretically rich, challenging). Thematic analysis also reveals workshop failures and areas for improvement (too much content for the duration, insufficient organization). The thesis concludes with a discussion of limitations and suggestions for future theoretical, empirical, and pedagogical research.
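
    As a side note on the statistics, the Brunner-Munzel test cited above is available in SciPy. The following minimal sketch shows how such a two-group comparison could be run; the score gains below are invented for illustration and are not the study's data.

    from scipy.stats import brunnermunzel

    # Hypothetical pre/post test-score gains for two groups of 16 students each.
    workshop_gains = [12, 9, 15, 7, 11, 14, 10, 8, 13, 12, 9, 16, 11, 10, 14, 13]
    control_gains = [2, -1, 3, 0, 1, 4, 2, -2, 1, 3, 0, 2, 1, -1, 2, 0]

    # Nonparametric test of whether one group's gains stochastically dominate the other's.
    stat, p_value = brunnermunzel(workshop_gains, control_gains)
    print(f"Brunner-Munzel statistic = {stat:.3f}, p = {p_value:.4f}")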

    Co-segmentation assisted cross-modality person re-identification

    We present a deep learning-based method for Visible-Infrared person Re-Identification (VI-ReID). The major contribution lies in the incorporation of co-segmentation into a multi-task learning framework for VI-ReID, where the co-segmentation concept helps make the feature distributions of RGB and IR images the same for the same identity but diverse for different identities. Accordingly, a novel multi-task learning based model, i.e., co-segmentation assisted VI-ReID (CSVI), is proposed in this paper. Specifically, the co-segmentation network first takes as input the modality-shared features extracted from a set of RGB and IR images by the VI-ReID model. Then, it exploits their semantic similarities to predict the person masks of the common identities within the input RGB and IR images, using a cross-modality center based weight generation module and a segmentation decoder. Doing so enables the VI-ReID model to extract additional modality-shared shape features that boost performance. Meanwhile, the co-segmentation network implicitly establishes interactions among the set of RGB and IR images, further bridging the large modality discrepancies. Our model's effectiveness and superiority are verified through experimental comparisons with state-of-the-art algorithms on several benchmark datasets.
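
    To make the multi-task structure concrete, here is a minimal, hypothetical PyTorch sketch of a shared encoder feeding both an identity head and a segmentation decoder. All module names and sizes are illustrative, and it omits the cross-modality center based weight generation module, so it is not the authors' CSVI code.

    import torch
    import torch.nn as nn

    class MultiTaskVIReID(nn.Module):
        def __init__(self, num_identities: int):
            super().__init__()
            # Modality-shared feature extractor (stand-in for the paper's backbone).
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            )
            # Re-identification head: pooled features -> identity logits.
            self.id_head = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_identities)
            )
            # Segmentation decoder: per-pixel person mask (co-segmentation target).
            self.seg_decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 2, stride=2), nn.ReLU(),
                nn.ConvTranspose2d(64, 1, 2, stride=2),
            )

        def forward(self, x):
            feats = self.encoder(x)  # shared features for RGB or IR inputs
            return self.id_head(feats), self.seg_decoder(feats)

    model = MultiTaskVIReID(num_identities=395)  # identity count is illustrative
    logits, masks = model(torch.randn(4, 3, 256, 128))  # a toy RGB/IR batch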

    Persuasionsstile in Europa II

    Persuasionsstile in Europa (Persuasion Styles in Europe) is the name of a project running since 2011, in which a group of researchers from different disciplines (German studies, Romance studies, English studies, media studies, and computational linguistics) from eight European countries collaborates. Its focus is the analysis of newspaper opinion columns.

    Backpropagation Beyond the Gradient

    Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models for which they could manually compute derivatives. Now, they can create sophisticated models with almost no restrictions and train them using first-order, i.e., gradient, information. Popular libraries like PyTorch and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the gradient computation in these libraries, and their entire design centers around it. These frameworks are specialized for one specific task: computing the average gradient over a mini-batch. This specialization often complicates the extraction of other information, such as higher-order statistical moments of the gradient or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order information, and there is evidence that focusing solely on the gradient has not led to significant recent advances in deep learning optimization. To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient must be made available at the same level of computational efficiency, automation, and convenience. This thesis presents approaches to simplify experimentation with rich information beyond the gradient by making it more readily accessible. We present an implementation of these ideas as an extension to the backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use cases by (i) showing how it can inform our understanding of neural network training through a diagnostic tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information. First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation, which enables computing approximate per-layer curvature. This perspective unifies recently proposed block-diagonal curvature approximations. Like gradient backpropagation, the computation of these second-order derivatives is modular, and therefore simple to automate and extend to new operations. Based on the insight that rich information beyond the gradient can be computed efficiently alongside it, we extend backpropagation in PyTorch with the BackPACK library. It provides efficient and convenient access to statistical moments of the gradient and approximate curvature information, often at a small overhead compared to computing just the gradient. Next, we showcase the utility of such information for better understanding neural network training. We build the Cockpit library, which visualizes what is happening inside the model during training through various instruments that rely on BackPACK's statistics. We show how Cockpit provides a meaningful statistical summary report that helps the deep learning engineer identify bugs in their machine learning pipeline, guide hyperparameter tuning, and study deep learning phenomena. Finally, we use BackPACK's extended automatic differentiation functionality to develop ViViT, an approach to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing curvature approximations. Through monitoring curvature noise, we demonstrate how ViViT's information helps in understanding the challenges of making second-order optimization methods work in practice. This work develops new tools to experiment more easily with higher-order information in complex deep learning models. These tools have influenced work on Bayesian applications with Laplace approximations, out-of-distribution generalization, differential privacy, and the design of automatic differentiation systems. They constitute one important step towards developing and establishing more efficient deep learning algorithms.
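
    BackPACK's documented usage pattern is to extend the model and loss function, then request extra quantities inside a backpack context during the backward pass. A minimal sketch follows; the toy model and data are placeholders.

    import torch
    from backpack import backpack, extend
    from backpack.extensions import BatchGrad, Variance, DiagGGNExact

    # Extend model and loss so BackPACK can hook into backpropagation.
    model = extend(torch.nn.Sequential(
        torch.nn.Linear(10, 5), torch.nn.ReLU(), torch.nn.Linear(5, 2)
    ))
    lossfunc = extend(torch.nn.CrossEntropyLoss())

    X, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
    loss = lossfunc(model(X), y)

    # Request quantities beyond the averaged gradient during backward.
    with backpack(BatchGrad(), Variance(), DiagGGNExact()):
        loss.backward()

    for p in model.parameters():
        print(p.grad_batch.shape)      # per-sample gradients
        print(p.variance.shape)        # gradient variance over the mini-batch
        print(p.diag_ggn_exact.shape)  # diagonal of the generalized Gauss-Newton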

    Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations

    Local explanations provide heatmaps on images to explain how Convolutional Neural Networks (CNNs) derive their output. Due to their visual straightforwardness, they have been among the most popular explainable AI (XAI) methods for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalent perspective on local explanations: a valuable and indispensable aid in building CNNs, yet a process that exhausts them due to the heuristic nature of detecting vulnerabilities. Moreover, steering CNNs based on the vulnerabilities learned from diagnosis seemed highly challenging. To mitigate this gap, we designed DeepFuse, the first interactive design that realizes a direct feedback loop between a user and CNNs for diagnosing and revising a CNN's vulnerabilities using local explanations. DeepFuse helps CNN engineers systematically search for "unreasonable" local explanations and annotate new boundaries for those identified as unreasonable in a labor-efficient manner. Next, it steers the model based on the given annotations so that the model does not repeat similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants made a more accurate and more "reasonable" model than the current state-of-the-art. Participants also found that the way DeepFuse guides case-based reasoning can practically improve their current practice. We provide implications for design that explain how future HCI-driven design can move our practice forward and make XAI-driven insights more actionable. Comment: 32 pages, 6 figures, 5 tables. Accepted for publication in the Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 202
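
    The local explanations the abstract refers to are typically gradient-based heatmaps; Grad-CAM is a common instance. The following is a generic sketch of that idea, not DeepFuse's implementation; the untrained model and random input are stand-ins.

    import torch
    import torch.nn.functional as F
    from torchvision.models import resnet18

    model = resnet18(weights=None).eval()  # real use would load trained weights
    acts = {}
    # Capture the activations of the last convolutional block.
    model.layer4.register_forward_hook(lambda m, i, o: acts.update(a=o))

    x = torch.randn(1, 3, 224, 224)    # stand-in input image
    score = model(x)[0].max()          # class score to explain
    grads = torch.autograd.grad(score, acts["a"])[0]  # d(score)/d(activations)

    weights = grads.mean(dim=(2, 3), keepdim=True)    # per-channel importance
    cam = F.relu((weights * acts["a"]).sum(dim=1))    # coarse heatmap
    cam = F.interpolate(cam[None], size=x.shape[-2:],
                        mode="bilinear", align_corners=False)[0]
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]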

    FedForgery: Generalized Face Forgery Detection with Residual Federated Learning

    With the continuous development of deep learning in the field of image generation, large numbers of vivid forged faces have been generated and spread on the Internet. These high-authenticity artifacts could grow into a threat to societal security. Existing face forgery detection methods directly use publicly shared or centralized data for training, but ignore the personal privacy and security issues that arise when personal data cannot be centrally shared in real-world scenarios. Additionally, the distribution shifts caused by diverse artifact types further harm the forgery detection task. To address these problems, this paper proposes a novel generalized residual Federated learning framework for face Forgery detection (FedForgery). The designed variational autoencoder aims to learn robust, discriminative residual feature maps to detect forged faces (with diverse or even unknown artifact types). Furthermore, a general federated learning strategy is introduced to construct a distributed detection model trained collaboratively across multiple local decentralized devices, which further boosts representation generalization. Experiments conducted on publicly available face forgery detection datasets demonstrate the superior performance of the proposed FedForgery. The designed novel generalized face forgery detection protocols and source code will be made publicly available. Comment: The code is available at https://github.com/GANG370/FedForgery. The paper has been accepted by the IEEE Transactions on Information Forensics & Security
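
    The federated setup the abstract describes follows the usual pattern in which clients train on private data locally and only model weights are aggregated. Here is a generic FedAvg-style sketch, not the FedForgery code; all names are illustrative.

    import copy
    import torch

    def federated_round(global_model, clients, epochs=1, lr=1e-3):
        """One communication round: local training on each client, then averaging."""
        client_states = []
        for loader in clients:  # each loader holds one client's private data
            local = copy.deepcopy(global_model)
            opt = torch.optim.SGD(local.parameters(), lr=lr)
            for _ in range(epochs):
                for x, y in loader:
                    opt.zero_grad()
                    loss = torch.nn.functional.cross_entropy(local(x), y)
                    loss.backward()
                    opt.step()
            client_states.append(local.state_dict())
        # Server aggregation: parameter-wise average of the client weights.
        avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0)
               for k in client_states[0]}
        global_model.load_state_dict(avg)
        return global_model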

    Local 3D Editing via 3D Distillation of CLIP Knowledge

    3D content manipulation is an important computer vision task with many real-world applications (e.g., product design, cartoon generation, and 3D avatar editing). Recently proposed 3D GANs can generate diverse, photorealistic 3D-aware content using Neural Radiance Fields (NeRF). However, manipulating NeRFs remains challenging, because visual quality tends to degrade after manipulation and because suboptimal control handles, such as 2D semantic maps, are used for manipulation. While text-guided manipulation has shown potential for 3D editing, such approaches often lack locality. To overcome these problems, we propose Local Editing NeRF (LENeRF), which requires only text inputs for fine-grained and localized manipulation. Specifically, we present three add-on modules of LENeRF, the Latent Residual Mapper, the Attention Field Network, and the Deformation Network, which are jointly used for local manipulation of 3D features by estimating a 3D attention field. The 3D attention field is learned in an unsupervised way, by distilling the zero-shot mask generation capability of CLIP to the 3D space with multi-view guidance. We conduct diverse experiments and thorough evaluations, both quantitative and qualitative. Comment: conference: CVPR 202
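
    For intuition about the CLIP side of such pipelines, here is a minimal, generic sketch of CLIP-based text guidance: increase the similarity between a rendered image and a target prompt. It assumes the openai/CLIP package and a stand-in render; it illustrates the underlying idea only and is not LENeRF's distillation procedure.

    import torch
    import torch.nn.functional as F
    import clip  # openai/CLIP package

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, _ = clip.load("ViT-B/32", device=device)

    text = clip.tokenize(["a person with blue hair"]).to(device)  # hypothetical prompt
    rendered = torch.rand(1, 3, 224, 224, device=device)          # stand-in NeRF render

    # CLIP loss: 1 - cosine similarity between image and text embeddings.
    image_emb = model.encode_image(rendered)
    text_emb = model.encode_text(text)
    clip_loss = 1 - F.cosine_similarity(image_emb, text_emb).mean()
    # In an editing pipeline, clip_loss would be backpropagated into the
    # parameters of the editing modules, not the renderer itself.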

    Knowledge Distillation and Continual Learning for Optimized Deep Neural Networks

    Over the past few years, deep learning (DL) has been achieving state-of-the-art performance on various human tasks such as speech generation, language translation, image segmentation, and object detection. While traditional machine learning models require hand-crafted features, deep learning algorithms can automatically extract discriminative features and learn complex knowledge from large datasets. This powerful learning ability makes deep learning models attractive to both academia and big corporations. Despite their popularity, deep learning methods still have two main limitations: large memory consumption and catastrophic knowledge forgetting. First, DL algorithms use very deep neural networks (DNNs) with billions of parameters, which results in large model sizes and slow inference. This restricts the application of DNNs on resource-constrained devices such as mobile phones and autonomous vehicles. Second, DNNs are known to suffer from catastrophic forgetting: when incrementally learning new tasks, a model's performance on old tasks drops significantly. The ability to accommodate new knowledge while retaining previously learned knowledge is called continual learning. Since the real-world environments in which a model operates are always evolving, a robust neural network needs this continual learning ability to adapt to new changes.
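
    As background for the first topic in the title, the classic knowledge-distillation objective (Hinton et al.) trains a small student to match a large teacher's softened outputs while also fitting the true labels. A minimal sketch, with illustrative temperature and weighting:

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
        # Soft targets: KL divergence between temperature-scaled distributions.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)  # rescale so soft-target gradients keep a comparable magnitude
        # Hard targets: usual cross-entropy with the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard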

    Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring

    Artificially intelligent perception is increasingly present in the lives of every one of us. Vehicles are no exception, (...) In the near future, pattern recognition will have an even stronger role in vehicles, as self-driving cars will require automated ways to understand what is happening around (and within) them and act accordingly. (...) This doctoral work focused on advancing in-vehicle sensing through research into novel computer vision and pattern recognition methodologies for both biometrics and wellbeing monitoring. The main focus has been on electrocardiogram (ECG) biometrics, a trait well known for its potential for seamless driver monitoring. Major efforts were devoted to achieving improved identification and identity-verification performance in off-the-person scenarios, which are known for increased noise and variability. Here, end-to-end deep learning ECG biometric solutions were proposed, and important topics were addressed such as cross-database and long-term performance, waveform relevance through explainability, and interlead conversion. Face biometrics, a natural complement to the ECG in seamless unconstrained scenarios, was also studied in this work. The open challenges of masked face recognition and interpretability in biometrics were tackled in an effort to evolve towards algorithms that are more transparent, trustworthy, and robust to significant occlusions. Within the topic of wellbeing monitoring, improved solutions were proposed for multimodal emotion recognition in groups of people and for activity/violence recognition in in-vehicle scenarios. Finally, we also proposed a novel way to learn template security within end-to-end models, dismissing additional separate encryption processes, and a self-supervised learning approach tailored to sequential data, in order to ensure data security and optimal performance. (...) Comment: Doctoral thesis presented and approved on the 21st of December 2022 at the University of Porto