23 research outputs found

    Towards Improving Generalization of Multi-Task Learning

    Multi-task Learning (MTL), which involves the simultaneous learning of multiple tasks, can achieve better performance than learning each task independently. It has achieved great success in various applications, ranging from computer vision to bioinformatics. However, involving multiple tasks in a single learning process is complicated, since both cooperation and competition exist across the involved tasks; the cooperation boosts the generalization of MTL while the competition degrades it. There is a lack of systematic study on how to improve MTL's generalization by handling this cooperation and competition. This thesis systematically studies the problem and proposes four novel MTL methods to enhance between-task cooperation or reduce between-task competition. Specifically, for between-task cooperation, adversarial multi-task representation learning (AMTRL) and semi-supervised multi-task learning (Semi-MTL) are studied; a novel adaptive AMTRL method and a novel representation consistency regularization-based Semi-MTL method are proposed, respectively. As to between-task competition, this thesis analyzes task variance and task imbalance; a novel task variance regularization-based MTL method and a novel task-imbalance-aware MTL method are proposed, respectively. The proposed methods improve the generalization of MTL and achieve state-of-the-art performance in real-world MTL applications.
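The task-variance idea in this abstract can be made concrete with a small numerical sketch. This is an illustrative formulation, not the thesis's actual method: penalizing the variance of per-task losses discourages any single task from dominating training (the function name and the weight `lam` are assumptions for illustration).

```python
def mtl_objective(task_losses, lam=0.1):
    """Mean task loss plus a variance penalty across tasks."""
    n = len(task_losses)
    mean = sum(task_losses) / n
    var = sum((l - mean) ** 2 for l in task_losses) / n
    return mean + lam * var

# Two sets of task losses with the same mean: the imbalanced set
# incurs a larger objective because of the variance penalty.
balanced = mtl_objective([1.0, 1.0, 1.0])
imbalanced = mtl_objective([0.5, 1.0, 1.5])
```

Minimizing such an objective pushes the optimizer toward representations that serve all tasks comparably well, one simple way to temper between-task competition.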

    Local Differential Privacy based Federated Learning for Internet of Things

    Internet of Vehicles (IoV) is a promising branch of the Internet of Things. IoV supports a large variety of crowdsourcing applications such as Waze, Uber, and Amazon Mechanical Turk. Users of these applications report real-time traffic information to a cloud server, which trains a machine learning model on the reported traffic information for intelligent traffic management. However, crowdsourcing application owners can easily infer users' location information, which raises severe location privacy concerns. In addition, as the number of vehicles increases, the frequent communication between vehicles and the cloud server incurs a considerable communication cost. To avoid the privacy threat and reduce the communication cost, in this paper, we propose to integrate federated learning and local differential privacy (LDP) so that crowdsourcing applications can train the machine learning model. Specifically, we propose four LDP mechanisms to perturb the gradients generated by vehicles. The Three-Outputs mechanism introduces three different output possibilities to deliver high accuracy when the privacy budget is small; its outputs can be encoded with two bits to reduce the communication cost. To maximize performance when the privacy budget is large, an optimal piecewise mechanism (PM-OPT) is proposed. We further propose a suboptimal mechanism (PM-SUB) with a simple formula and utility comparable to PM-OPT. Finally, we build a novel hybrid mechanism by combining Three-Outputs and PM-SUB. Comment: This paper appears in IEEE Internet of Things Journal (IoT-J).
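For intuition on LDP gradient perturbation, here is a sketch of the classic two-output mechanism of Duchi et al. for values in [-1, 1]. It is simpler than the paper's Three-Outputs or piecewise mechanisms, but it illustrates the same principle: each perturbed report is an unbiased estimate of the true value, so averaging many reports recovers it.

```python
import math
import random

def two_output_ldp(x, eps):
    """Perturb x in [-1, 1] into one of two values {+C, -C} so that
    the expected output equals x (an unbiased LDP report)."""
    C = (math.exp(eps) + 1) / (math.exp(eps) - 1)
    # Probability of reporting +C grows linearly with x.
    p = 0.5 + x * (math.exp(eps) - 1) / (2 * (math.exp(eps) + 1))
    return C if random.random() < p else -C

# Averaging many perturbed reports approximately recovers the input.
random.seed(0)
est = sum(two_output_ldp(0.5, 1.0) for _ in range(20000)) / 20000
```

Because each report is one of only two values, it can be encoded in a single bit, the same communication-cost motivation the paper cites for its two-bit Three-Outputs encoding.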

    SOME CONTRIBUTIONS TO DATA DRIVEN INDIVIDUALIZED DECISION MAKING PROBLEMS

    Recent exploration of the optimal individualized decision rule (IDR) for patients in precision medicine has attracted a lot of attention due to the potentially heterogeneous responses of patients to different treatments. In the current literature, an optimal IDR is a decision function, based on patients' characteristics, that assigns the treatment maximizing the expected outcome. My dissertation research mainly focuses on how to estimate optimal IDRs under various criteria given experimental data. In the first part of this dissertation, focusing on maximizing the expected outcome, we propose an angle-based direct learning (AD-learning) method to efficiently estimate optimal IDRs with multiple treatments for various types of outcomes. This contributes to the literature, where many existing methods are designed for binary treatment settings with a continuous outcome of interest. In the second part, motivated by complex individualized decision making procedures, we propose two new robust criteria for estimating optimal IDRs: one controls the average lower tail of the subjects' outcomes, and the other controls the individualized lower tail of each subject's outcome. In addition to optimizing the individualized expected outcome, our proposed criteria take risk into consideration, and thus the resulting IDRs can prevent adverse events caused by a heavy lower tail of the outcome distribution. In the third part of this dissertation, motivated by the concept of Optimized Certainty Equivalent (OCE), we generalize the second part and propose a decision-rule-based optimized covariates dependent equivalent (CDE) for individualized decision making problems. Our proposed IDR-CDE not only broadens the existing expected-outcome framework in precision medicine but also enriches the previous concept of the OCE in risk management. We study the related mathematical problem of estimating optimal IDRs both theoretically and numerically. Doctor of Philosophy
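The "expected outcome" criterion for an IDR can be illustrated with the standard inverse-probability-weighted value estimator: a textbook construction, not the dissertation's AD-learning method. The 50/50 randomization probability and the toy trial data below are assumed for illustration.

```python
def ipw_value(data, rule, prop=0.5):
    """Estimate V(d) = E[Y * 1{A == d(X)} / p(A | X)] from triples
    (x, a, y): covariate, assigned treatment, observed outcome."""
    total = 0.0
    for x, a, y in data:
        if rule(x) == a:
            total += y / prop
    return total / len(data)

# Toy trial: treatment 1 works better when x > 0, treatment 0 otherwise.
trial = [(1, 1, 2.0), (1, 0, 1.0), (-1, 1, 1.0), (-1, 0, 2.0)]
tailored = lambda x: 1 if x > 0 else 0
always_0 = lambda x: 0
```

The tailored rule attains a higher estimated value than the one-size-fits-all rule, which is exactly the heterogeneity of treatment response that motivates IDRs.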

    Graph Priors, Optimal Transport, and Deep Learning in Biomedical Discovery

    Recent advances in biomedical data collection allow the collection of massive datasets measuring thousands of features in thousands to millions of individual cells. These data have the potential to advance our understanding of biological mechanisms at a previously impossible resolution. However, there are few methods to understand data of this scale and type. While neural networks have made tremendous progress on supervised learning problems, there is still much work to be done in making them useful for discovery in data with supervision that is more difficult to represent. The flexibility and expressiveness of neural networks is sometimes a hindrance in these less supervised domains, as is the case when extracting knowledge from biomedical data. One type of prior knowledge that is common in biological data comes in the form of geometric constraints. In this thesis, we aim to leverage this geometric knowledge to create scalable and interpretable models for understanding these data. Encoding geometric priors into neural network and graph models allows us to characterize the models' solutions as they relate to the fields of graph signal processing and optimal transport. These links allow us to understand and interpret this data type. We divide this work into three sections. The first borrows concepts from graph signal processing to construct more interpretable and performant neural networks by constraining and structuring the architecture. The second borrows from the theory of optimal transport to perform anomaly detection and trajectory inference efficiently and with theoretical guarantees. The third examines how to compare distributions over an underlying manifold, which can be used to understand how different perturbations or conditions relate. For this, we design an efficient approximation of optimal transport based on diffusion over a joint cell graph. Together, these works utilize our prior understanding of the data geometry to create more useful models of the data. We apply these methods to molecular graphs, images, single-cell sequencing, and health record data.
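The diffusion-based comparison of distributions described in the third section can be sketched in a few lines: diffuse both distributions over the graph with a random walk and compare them at several scales. This is a rough illustration of the idea (the lazy-walk matrix, the L1 norm, and the multiscale sum are simplifying assumptions), not the thesis's actual diffusion-based optimal transport approximation.

```python
def diffuse(P, mu, t):
    """Apply t steps of a random walk with row-stochastic matrix P
    (nested lists) to a distribution mu over the graph's nodes."""
    n = len(mu)
    for _ in range(t):
        mu = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
    return mu

def diffusion_distance(P, mu, nu, scales=(1, 2, 4)):
    """Sum of L1 distances between the diffused distributions at
    several diffusion scales: nearby distributions stay close."""
    return sum(
        sum(abs(a - b) for a, b in zip(diffuse(P, mu, t), diffuse(P, nu, t)))
        for t in scales
    )

# Lazy random walk on a 3-node path graph (nodes 0 - 1 - 2).
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
```

Mass on node 0 is judged closer to mass on the adjacent node 1 than to mass on the far node 2, because diffusion spreads probability along the graph before the comparison is made.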

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    This reprint focuses on applications combining synthetic aperture radar and deep learning technology. It aims to further promote the development of SAR image intelligent interpretation technology. A synthetic aperture radar (SAR) is an important active microwave imaging sensor whose all-day and all-weather working capacity gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is valuable and meaningful, therefore, to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has driven significant progress in the computer vision community, e.g., in face recognition, driverless vehicles, and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations at multiple levels of abstraction, which can greatly improve the performance of various applications. This reprint provides a platform for researchers to address these challenges and present their innovative and cutting-edge research results when applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews, and technical reports.

    Learning Neural Graph Representations in Non-Euclidean Geometries

    The success of Deep Learning methods is heavily dependent on the choice of the data representation. For that reason, much of the actual effort goes into Representation Learning, which seeks to design preprocessing pipelines and data transformations that can support effective learning algorithms. The aim of Representation Learning is to facilitate the extraction of useful information for classifiers and other predictive models. In this regard, graphs arise as a convenient data structure that serves as an intermediary representation in a wide range of problems. The predominant approach to working with graphs has been to embed them in a Euclidean space, due to the power and simplicity of this geometry. Nevertheless, data in many domains exhibit non-Euclidean features, making embeddings into Riemannian manifolds with a richer structure necessary. The choice of a metric space in which to embed the data imposes a geometric inductive bias, with a direct impact on the performance of the models. This thesis is about learning neural graph representations in non-Euclidean geometries and showcasing their applicability in different downstream tasks. We introduce a toolkit formed by different graph metrics with the goal of characterizing the topology of the data. In that way, we can choose a suitable target embedding space aligned with the shape of the dataset. By virtue of the geometric inductive bias provided by the structure of the non-Euclidean manifolds, neural models can achieve higher performance with a reduced parameter footprint. As a first step, we study graphs with hierarchical structures. We develop different techniques to derive hierarchical graphs from large label inventories. Noticing the capacity of hyperbolic spaces to represent tree-like arrangements, we incorporate this information into an NLP model through hyperbolic graph embeddings and showcase the higher performance that they enable.
    Second, we tackle the question of how to learn hierarchical representations suited to different downstream tasks. We introduce a model that jointly learns task-specific graph embeddings from a label inventory and performs classification in hyperbolic space. The model achieves state-of-the-art results on very fine-grained labels, with a remarkable reduction in parameter size. Next, we move to matrix manifolds to work on graphs with diverse structures and properties. We propose a general framework to implement the mathematical tools required to learn graph embeddings on symmetric spaces. These spaces are of particular interest given that they have a compound geometry that simultaneously contains Euclidean as well as hyperbolic subspaces, allowing them to automatically adapt to dissimilar features in the graph. We demonstrate a concrete implementation of the framework on Siegel spaces, showcasing their versatility on different tasks. Finally, we focus on multi-relational graphs. We devise the means to translate Euclidean and hyperbolic multi-relational graph embedding models into the space of symmetric positive definite (SPD) matrices. To do so, we develop gyrocalculus in this geometry and integrate it with the aforementioned framework.
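As a concrete illustration of the hyperbolic geometry this abstract relies on, here is the textbook geodesic distance in the Poincaré ball model, one standard model of hyperbolic space used for hierarchical embeddings. This is general background, not the thesis's symmetric-space or gyrocalculus machinery.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between points u, v inside the unit ball:
    d(u, v) = arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2))).
    Distances blow up near the boundary, which is what lets the
    ball embed tree-like hierarchies with low distortion."""
    nu = sum(x * x for x in u)
    nv = sum(x * x for x in v)
    duv = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1 + 2 * duv / ((1 - nu) * (1 - nv)))
```

Two points at Euclidean distance 1.8 near the boundary are hyperbolically far apart, while the same ratio of separations near the origin stays small; this exponential growth of volume is the inductive bias that matches tree-like graphs.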

    Crystal Plasticity at Micro- and Nano-scale Dimensions

    The present collection of articles focuses on the mechanical strength properties at micro- and nanoscale dimensions of body-centered cubic, face-centered cubic, and hexagonal close-packed crystal structures. The advent of micro-pillar test specimens is shown to provide a new dimensional scale for the investigation of crystal deformation properties. The ultra-small dimensional scale at which these properties are measured is shown to approach the atomic-scale level at which model dislocation mechanics descriptions of crystal slip and deformation twinning behaviors are proposed to be operative, including atomic force microscopic measurements of dislocation pile-up interactions with crystal grain boundaries or with hard surface coatings. A special advantage of engineering designs made at such small crystal and polycrystalline dimensions is an approximate order-of-magnitude increase in mechanical strength levels. Reasonable extrapolations of macro-scale continuum mechanics descriptions of crystal strength properties to micro- and nano-indentation hardness measurements are demonstrated, in addition to reports on persistent slip band observations and fatigue cracking behaviors. High-entropy alloy, superalloy, and energetic crystal properties are reported, along with descriptions of deformation rate sensitivities, grain boundary structures, nano-cutting, void nucleation/growth micromechanics, and micro-composite electrical properties.