81 research outputs found
Trustworthy machine learning through the lens of privacy and security
Machine learning (ML) has become ubiquitous and is transforming society. However, ML-based systems still cause many incidents when deployed in real-world scenarios. Therefore, to allow wide adoption of ML in the real world, especially in critical applications such as healthcare and finance, it is crucial to develop ML models that are not only accurate but also trustworthy (e.g., explainable, privacy-preserving, secure, and robust). Achieving trustworthy ML across different machine learning paradigms (e.g., deep learning, centralized learning, and federated learning) and application domains (e.g., computer vision, natural language, human studies, and malware systems) is challenging, given the complicated trade-offs among utility, scalability, privacy, explainability, and security. To bring trustworthy ML to real-world adoption with the trust of communities, this study introduces a series of novel privacy-preserving mechanisms in which the trade-off between model utility and trustworthiness is optimized in different application domains, including natural language models, federated learning with human and mobile sensing applications, image classification, and explainable AI. The proposed mechanisms reach the deployment levels of commercialized systems in real-world trials while providing trustworthiness with marginal utility drops and rigorous theoretical guarantees. The developed solutions enable safe, efficient, and practical analyses of rich and diverse user-generated data in many application domains.
Multiple Instance Learning for Histopathological Image Classification
In histopathological image analysis, image classification and pattern detection play a crucial role in the diagnosis and treatment process, since the goal is not only to differentiate cancer types but also to identify cancerous manifestations. Fully supervised learning strategies tend to address these problems using manually annotated cancerous regions and labeled cancer-type images. The success of these approaches heavily depends on manual segmentation by pathologists and other experts. However, the manual process is challenging due to two major issues with histopathological images: manual segmentation over the entire image is time-consuming and labor-intensive, and the boundaries of different cancerous regions in the image are naturally ambiguous, which may create inter- and intra-observer variations among experts. Therefore, weakly supervised learning approaches based solely on image labels are well suited to the data. Multiple instance learning (MIL), one such weakly supervised method, has recently been considered as a machine learning paradigm for analyzing histopathological images. Based on image labels/cancer types, MIL approaches learn to predict a cancer type as well as to detect and localize cancerous regions in the image. In this report, existing strategies for modeling histopathological image analysis as MIL problems are reviewed. Recent trends and future directions are also discussed.
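As a toy illustration of the MIL setting described above (not taken from the report), the classic max-pooling assumption scores every instance in a bag and labels the bag by its highest-scoring instance, which also localizes the most suspicious patch; the feature vectors and weights below are made up:

```python
import numpy as np

def bag_prediction(bag, w):
    """Max-pooling MIL: a bag (e.g., a slide split into patch feature
    vectors) is labeled positive if its highest-scoring instance is
    positive; the argmax localizes the most suspicious patch."""
    scores = bag @ w          # one linear score per instance
    return float(scores.max() > 0), int(scores.argmax())

# Toy bag of 3 "patches" with 2 features each; weights are hypothetical.
bag = np.array([[0.1, -0.2], [0.9, 0.8], [-0.5, 0.3]])
w = np.array([1.0, 1.0])
label, key_patch = bag_prediction(bag, w)  # label 1.0, patch index 1
```

Only the bag-level label is needed for training such a model, which is exactly what makes MIL attractive when region annotations are unavailable.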
XRand: Differentially Private Defense against Explanation-Guided Attacks
Recent development in the field of explainable artificial intelligence (XAI)
has helped improve trust in Machine-Learning-as-a-Service (MLaaS) systems, in
which an explanation is provided together with the model prediction in response
to each query. However, XAI also opens a door for adversaries to gain insights
into the black-box models in MLaaS, thereby making the models more vulnerable
to several attacks. For example, feature-based explanations (e.g., SHAP) could
expose the top important features that a black-box model focuses on. Such
disclosure has been exploited to craft effective backdoor triggers against
malware classifiers. To address this trade-off, we introduce a new concept of
achieving local differential privacy (LDP) in the explanations, and from that
we establish a defense, called XRand, against such attacks. We show that our
mechanism restricts the information that the adversary can learn about the top
important features, while maintaining the faithfulness of the explanations.
Comment: To be published at AAAI 202
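To make the LDP idea concrete, here is a minimal randomized-response sketch (an illustrative assumption, not XRand's actual mechanism): each reported top-k feature index is kept with probability p = e^ε/(e^ε + 1) and otherwise replaced by a random non-top feature, so an adversary cannot be certain which features truly drive the model:

```python
import math
import random

def ldp_top_features(topk, num_features, epsilon, rng=None):
    """Randomized-response sketch of LDP explanations: keep each true
    top-k feature index with probability p = e^eps / (e^eps + 1);
    otherwise report a random non-top feature instead."""
    rng = rng or random.Random(0)
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    others = [f for f in range(num_features) if f not in topk]
    return [f if rng.random() < p else rng.choice(others) for f in topk]

# Large epsilon -> weak privacy -> the true top features leak through.
reported = ldp_top_features([0, 1], num_features=10, epsilon=20.0)
```

Smaller ε makes each reported feature closer to uniformly random, which is precisely the lever that limits what an explanation-guided attacker can learn.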
Active Membership Inference Attack under Local Differential Privacy in Federated Learning
Federated learning (FL) was originally regarded as a framework for
collaborative learning among clients with data privacy protection through a
coordinating server. In this paper, we propose a new active membership
inference (AMI) attack carried out by a dishonest server in FL. In AMI attacks,
the server crafts and embeds malicious parameters into global models to
effectively infer whether a target data sample is included in a client's
private training data or not. By exploiting the correlation among data features
through a non-linear decision boundary, AMI attacks with a certified guarantee
of success can achieve severely high success rates under rigorous local
differential privacy (LDP) protection; thereby exposing clients' training data
to significant privacy risk. Theoretical and experimental results on several
benchmark datasets show that adding sufficient privacy-preserving noise to
prevent our attack would significantly damage FL's model utility.
Comment: Published at AISTATS 202
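The privacy/utility tension the abstract describes can be seen in the basic Laplace mechanism for ε-LDP (a generic sketch, not the paper's construction): the noise scale is sensitivity/ε, so driving ε low enough to blunt such an attack inflates the noise that the model must absorb:

```python
import numpy as np

def ldp_laplace(x, epsilon, sensitivity=1.0, seed=0):
    """Laplace mechanism for eps-LDP on a bounded vector: noise scale
    b = sensitivity / epsilon, so stronger privacy (smaller epsilon)
    means proportionally larger noise and lower utility."""
    rng = np.random.default_rng(seed)
    return x + rng.laplace(0.0, sensitivity / epsilon, size=x.shape)

x = np.zeros(1000)
weak = np.abs(ldp_laplace(x, epsilon=10.0)).mean()   # small noise
strong = np.abs(ldp_laplace(x, epsilon=0.1)).mean()  # ~100x larger noise
```

The 100x gap between the two noise levels is the quantitative face of the trade-off: noise strong enough to defeat a certified attack can be strong enough to break the model.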
User-Entity Differential Privacy in Learning Natural Language Models
In this paper, we introduce a novel concept of user-entity differential
privacy (UeDP) to provide formal privacy protection simultaneously to both
sensitive entities in textual data and data owners in learning natural language
models (NLMs). To preserve UeDP, we developed a novel algorithm, called
UeDP-Alg, optimizing the trade-off between privacy loss and model utility with
a tight sensitivity bound derived from seamlessly combining user and sensitive
entity sampling processes. An extensive theoretical analysis and evaluation
show that our UeDP-Alg outperforms baseline approaches in model utility under
the same privacy budget consumption on several NLM tasks, using benchmark
datasets.
Comment: Accepted at IEEE BigData 202
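A hedged sketch of the general recipe behind such DP training (clip each sampled user's update to a norm bound, sum, and add Gaussian noise scaled to that bound; UeDP-Alg additionally samples sensitive entities and derives a tighter sensitivity bound, which this toy omits):

```python
import numpy as np

def dp_noisy_aggregate(user_updates, clip_norm, noise_mult, seed=0):
    """DP aggregation sketch: clip each user's update to clip_norm,
    sum, then add Gaussian noise with std = noise_mult * clip_norm."""
    rng = np.random.default_rng(seed)
    clipped = [u * min(1.0, clip_norm / max(np.linalg.norm(u), 1e-12))
               for u in user_updates]
    total = np.sum(clipped, axis=0)
    return total + rng.normal(0.0, noise_mult * clip_norm, size=total.shape)

# With noise_mult=0 the result is just the clipped sum: [3,4] has norm 5,
# so clipping to norm 1 yields [0.6, 0.8].
out = dp_noisy_aggregate([np.array([3.0, 4.0])], clip_norm=1.0, noise_mult=0.0)
```

The sensitivity bound is what calibrates the noise; a tighter bound, as the abstract claims for UeDP-Alg, means less noise for the same privacy budget.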
Explain by Evidence: An Explainable Memory-based Neural Network for Question Answering
Interpretability and explainability of deep neural networks are challenging
due to their scale, complexity, and the agreeable notions on which the
explaining process rests. Previous work, in particular, has focused on
representing internal components of neural networks through human-friendly
visuals and concepts. On the other hand, in real life, when making a decision,
humans tend to rely on similar situations and/or associations from the past.
Hence arguably, a promising approach to make the model transparent is to design
it in a way such that the model explicitly connects the current sample with the
seen ones, and bases its decision on these samples. Grounded on that principle,
we propose in this paper an explainable, evidence-based memory network
architecture, which learns to summarize the dataset and extract supporting
evidence to make its decision. Our model achieves state-of-the-art performance
on two popular question answering datasets (i.e. TrecQA and WikiQA). Via
further analysis, we show that this model can reliably trace the errors it has
made in the validation step to the training instances that might have caused
these errors. We believe that this error-tracing capability provides
significant benefit in improving dataset quality in many applications.
Comment: Accepted to COLING 202
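The evidence-retrieval idea can be sketched as a nearest-neighbor lookup over stored training representations (an illustrative stand-in for the paper's memory network, with made-up vectors):

```python
import numpy as np

def retrieve_evidence(query, memory, k=2):
    """Return indices of the k training examples most similar to the
    query (cosine similarity); these serve as the supporting evidence
    and let errors be traced back to specific training instances."""
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return list(np.argsort(-(m @ q))[:k])

memory = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # stored examples
evidence = retrieve_evidence(np.array([1.0, 0.1]), memory)  # -> [0, 2]
```

When a validation prediction is wrong, the retrieved indices point directly at the training instances that supported it, which is the error-tracing capability the abstract highlights.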
TECHNICAL ASSESSMENT OF GAMMA-AMINO BUTYRIC ACID (GABA) PRODUCTION FROM RICE BRAN
This research focused on the technical assessment of GABA production from rice bran through fermentation by Lactobacillus brevis. The influence of operating pressure on the separation of GABA by a nanofiltration membrane was investigated, and 4 bar was found suitable for the nanofiltration process. Purification of GABA by nanofiltration at constant feed volume was carried out, and the purity of GABA reached 4.8-fold that of the feed at 5 volumes of added water. At a concentration factor of 40 in the concentration of the GABA solution by nanofiltration with full recycle of the retentate, the GABA content reached 49.8 g/L. Production of GABA from defatted rice bran at pilot scale was carried out at 1,000 L/batch (equal to 200 kg of rice bran) of fermentation. Mass balance estimation showed that, from 200 kg of defatted rice bran, 7.0 kg of GABA powder was obtained. The results indicate that it is feasible to produce GABA from rice bran through fermentation by Lactobacillus brevis.
Customer satisfaction and corporate investment policies
This paper examines the effect of satisfaction with firms’ products and services on their capital investment policies. Using data from the American Customer Satisfaction Index from 1994 to 2013, the results of the regression models show that firms with higher customer satisfaction invest more heavily in capital expenditures in the future. The results further show that this positive effect is more pronounced for firms with fewer growth opportunities or a high cost of capital, including firms with low market-to-book ratios, young and small firms, and firms in more competitive industries. Overall, this study argues that customer satisfaction is an important factor affecting a firm’s investment policy. The findings provide a better understanding of the role of customer satisfaction, which can generate growth opportunities, reduce costs, and motivate a firm to invest more in capital.
Structure of the NheA Component of the Nhe Toxin from Bacillus cereus: Implications for Function
The structure of NheA, a component of the Bacillus cereus Nhe tripartite toxin, has been solved at 2.05 Å resolution using selenomethionine multiple-wavelength anomalous dispersion (MAD). The structure shows it to have a fold similar to those of the Bacillus cereus Hbl-B and E. coli ClyA toxins, and it is therefore a member of the ClyA superfamily of α-helical pore-forming toxins (α-PFTs), although its head domain is significantly enlarged compared with those of ClyA or Hbl-B. The hydrophobic β-hairpin structure that is characteristic of these toxins is replaced by an amphipathic β-hairpin connected to the main structure via a β-latch that is reminiscent of a similar structure in the β-PFT Staphylococcus aureus α-hemolysin. Taken together, these results suggest that, although it is a member of an archetypal α-PFT family of toxins, NheA may be capable of forming a β rather than an α pore.