81 research outputs found
A Density Peak-Based Clustering Approach for Fault Diagnosis of Photovoltaic Arrays
Fault diagnosis of photovoltaic (PV) arrays plays a significant role in the safe and reliable operation of PV systems. In this paper, the distribution of PV systems' daily operating data under different operating conditions is analyzed. The results show that the data distribution exhibits significant nonspherical clustering, that each cluster center lies relatively far from any point of higher local density, and that the number of clusters cannot be predetermined. Based on these features, a density peak-based clustering approach is proposed to automatically cluster the PV data. A set of labeled data covering various conditions is then employed to compute the minimum distance vector between each cluster and the reference data. According to this distance vector, the clusters can be identified and categorized into various operating conditions and/or faults. Simulation results demonstrate the feasibility of the proposed method in diagnosing certain faults occurring in a PV array. Moreover, a 1.8 kW grid-connected PV system with a 6×3 PV array is established and experimentally tested to investigate the performance of the developed method.
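The density-peak idea this abstract builds on (cluster centers have high local density and lie far from any denser point) can be sketched in a few lines. The function below is an illustrative toy implementation of that generic clustering rule, not the paper's diagnosis pipeline; the Gaussian density kernel, cutoff distance `dc`, and fixed center count are assumptions made for the sketch.

```python
import numpy as np

def density_peak_cluster(X, dc, n_centers):
    """Toy density-peak clustering in the style of Rodriguez & Laio.

    dc: bandwidth of the local-density estimate.
    n_centers: number of centers to pick (the paper's point is that the
    center count need not be fixed in advance; fixing it keeps this
    sketch short).
    """
    n = len(X)
    # Pairwise Euclidean distances.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density via a Gaussian kernel (continuous, so no ties).
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    # delta: distance to the nearest point of strictly higher density;
    # the globally densest point gets the maximum distance instead.
    delta = np.zeros(n)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]
        delta[i] = d[i, higher].min() if len(higher) else d[i].max()
    # Centers: points maximizing rho * delta (high density, far from
    # anything denser). Assumes the densest point ends up a center.
    centers = np.argsort(rho * delta)[-n_centers:]
    labels = -np.ones(n, dtype=int)
    for c_idx, c in enumerate(centers):
        labels[c] = c_idx
    # Assign remaining points, in order of decreasing density, to the
    # cluster of their nearest higher-density neighbour.
    for i in np.argsort(-rho):
        if labels[i] == -1:
            higher = np.where(rho > rho[i])[0]
            labels[i] = labels[higher[np.argmin(d[i, higher])]]
    return labels
```

On two well-separated blobs this recovers one label per blob without being told where the clusters are, which is the property the paper exploits for unlabeled PV operating data.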
Privacy-preserving collaborative machine learning on genomic data using TensorFlow
Machine learning (ML) methods have been widely used in genomic studies.
However, genomic data are often held by different stakeholders (e.g. hospitals,
universities, and healthcare companies) who consider the data as sensitive
information, even though they desire to collaborate. To address this issue,
recent works have proposed solutions using Secure Multi-party Computation
(MPC), which train on the decentralized data in such a way that the
participants learn nothing from each other beyond the final trained model.
We design and implement several MPC-friendly ML primitives, including class
weight adjustment and a parallelizable approximation of the activation function. In
addition, we develop the solution as an extension to TF
Encrypted (Dahl et al., 2018), enabling us to quickly experiment with
enhancements of both machine learning techniques and cryptographic protocols
while leveraging the advantages of TensorFlow's optimizations. Our
implementation compares favorably with state-of-the-art methods, winning first
place in Track IV of the iDASH 2019 secure genome analysis competition.
Comment: Description of the winning solution at Track IV of the iDASH
competition 2019, to be presented at the Trustworthy ML workshop co-located
with ICLR 2020.
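The parallelizable activation approximation mentioned above refers to replacing a non-linearity by a low-degree polynomial, which MPC protocols evaluate cheaply because only additions and multiplications are needed. As a hedged illustration, the sketch below uses a well-known degree-3 least-squares fit of the sigmoid on roughly [-5, 5]; the exact polynomial used by the authors may differ.

```python
import math

def sigmoid(x):
    """Reference sigmoid, for comparison only (exp is MPC-unfriendly)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_poly(x):
    """Degree-3 polynomial approximation of the sigmoid:
    sigma(x) ~= 0.5 + 0.197*x - 0.004*x**3.
    Only additions and multiplications are used, so the same arithmetic
    can be carried out directly on secret-shared values in MPC, and each
    input can be evaluated independently (hence parallelizable).
    """
    return 0.5 + 0.197 * x - 0.004 * x ** 3
```

The approximation stays within a few percent of the true sigmoid over the fitting range, which is typically acceptable for logistic-regression-style training on normalized features.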
DiQAD: A Benchmark Dataset for End-to-End Open-domain Dialogue Assessment
Dialogue assessment plays a critical role in the development of open-domain
dialogue systems. Existing work is incapable of providing an end-to-end and
human-epistemic assessment dataset: it either covers only sub-metrics such as
coherence, or collects dialogues conversed between annotators, far from real user
settings. In this paper, we release a large-scale dialogue quality assessment
dataset (DiQAD), for automatically assessing open-domain dialogue quality.
Specifically, we (1) establish the assessment criteria based on the dimensions
conforming to human judgements on dialogue qualities, and (2) annotate
large-scale dialogues conversed between real users based on these
annotation criteria, yielding around 100,000 dialogues. We conduct
several experiments and report the performances of the baselines as the
benchmark on DiQAD. The dataset is openly accessible at
https://github.com/yukunZhao/Dataset_Dialogue_quality_evaluation.
Comment: Accepted to Findings of EMNLP 2023.
PUMA: Secure Inference of LLaMA-7B in Five Minutes
With ChatGPT as a representative example, many companies have begun to provide
services based on large Transformer models. However, using such a service
inevitably leaks users' prompts to the model provider. Prior work has
studied secure inference for Transformer models using secure multiparty
computation (MPC), where model parameters and clients' prompts are kept secret.
Despite this, these frameworks are still limited in terms of model performance,
efficiency, and deployment. To address these limitations, we propose PUMA, a
framework for fast and secure Transformer model inference. Our framework
designs high quality approximations for expensive functions, such as GeLU and
Softmax, which significantly reduce the cost of secure inference while
preserving the model performance. Additionally, we design secure Embedding and
LayerNorm procedures that faithfully implement the desired functionality
without undermining the Transformer architecture. PUMA is about 2x faster than
the state-of-the-art MPC framework MPCFORMER (ICLR 2023) and achieves accuracy
similar to that of plaintext models without fine-tuning (which previous works
failed to achieve).
One more thing: PUMA can evaluate LLaMA-7B in around 5 minutes to generate one
token. To the best of our knowledge, this is the first time a model of this
parameter size has been evaluated under MPC. PUMA has been open-sourced in the
GitHub repository of SecretFlow-SPU.
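MPC frameworks in this line of work build secure arithmetic from additive secret sharing plus precomputed multiplication triples (Beaver triples). The sketch below simulates standard two-party Beaver multiplication locally in one process; it is a generic textbook illustration with a simulated trusted dealer, not PUMA's actual protocol.

```python
import secrets

P = 2**61 - 1  # prime modulus for additive secret sharing

def share(x):
    """Split x into two additive shares mod P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P

def reconstruct(s0, s1):
    return (s0 + s1) % P

def beaver_mul(x_shares, y_shares):
    """Multiply two secret-shared values using a Beaver triple.

    The triple (a, b, c) with c = a*b mod P is assumed to come from an
    offline phase / trusted dealer; here it is generated in the clear.
    """
    a = secrets.randbelow(P)
    b = secrets.randbelow(P)
    c = (a * b) % P
    a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)
    x0, x1 = x_shares
    y0, y1 = y_shares
    # The parties open d = x - a and e = y - b. This leaks nothing,
    # because a and b are uniformly random one-time masks.
    d = reconstruct((x0 - a0) % P, (x1 - a1) % P)
    e = reconstruct((y0 - b0) % P, (y1 - b1) % P)
    # x*y = (a+d)(b+e) = c + d*b + e*a + d*e, so each party computes a
    # local share of that sum (the constant d*e goes to one party only).
    z0 = (c0 + d * b0 + e * a0 + d * e) % P
    z1 = (c1 + d * b1 + e * a1) % P
    return z0, z1
```

Matrix multiplications in secure Transformer inference reduce to many such triple-based products, which is why the expensive parts are the non-linearities (GeLU, Softmax) that cannot be expressed as a few ring multiplications.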
Cheetah: Lean and Fast Secure Two-Party Deep Neural Network Inference
Secure two-party neural network inference (2PC-NN) can offer privacy protection for both the client and the server and is a promising technique in the machine-learning-as-a-service setting. However, the large overhead of current 2PC-NN inference systems remains a major obstacle, especially when applied to deep neural networks such as ResNet50. In this work, we present Cheetah, a new 2PC-NN inference system that is faster and more communication-efficient than the state of the art. The main contributions of Cheetah are two-fold: the first part includes carefully designed homomorphic encryption-based protocols that can evaluate the linear layers (namely convolution, batch normalization, and fully-connected layers) without any expensive rotation operation. The second part includes several lean and communication-efficient primitives for the non-linear functions (e.g., ReLU and truncation). Using Cheetah, we present extensive benchmarks over several large-scale deep neural networks. Taking ResNet50 as an example, an end-to-end execution of Cheetah under a WAN setting costs less than 2.5 minutes and 2.3 gigabytes of communication, outperforming CrypTFlow2 (ACM CCS 2020) by about 5.6× and 12.9×, respectively.
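The truncation primitive mentioned above exists because 2PC-NN systems compute over integers with a fixed-point encoding: multiplying two encoded values doubles the number of fractional bits, which must then be shifted away. The plaintext sketch below illustrates only the encoding arithmetic; performing that shift on secret-shared values is itself a secure protocol, and the choice of F here is an arbitrary assumption for the sketch.

```python
F = 12          # fractional bits (sketch parameter)
SCALE = 1 << F

def encode(x):
    """Encode a real number as a scaled integer (fixed point)."""
    return round(x * SCALE)

def decode(v):
    """Map a scaled integer back to a real number."""
    return v / SCALE

def fp_mul(u, v):
    """Multiply two fixed-point values.

    The raw integer product carries 2F fractional bits, so F bits are
    truncated to restore the scale. In a 2PC system this right shift is
    exactly the 'truncation' primitive that must be done securely.
    """
    return (u * v) >> F
```

Additions of encoded values need no such correction; it is multiplication (ubiquitous in convolutions and fully-connected layers) that forces a truncation after every layer, which is why making it cheap matters so much for end-to-end cost.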
- …