CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Instruction tuning has recently been recognized as an effective way of
aligning Large Language Models (LLMs) to enhance their generalization ability
across various tasks. However, when tuning publicly accessible, centralized
LLMs with private instruction data, privacy concerns are inevitable. While
direct transfer of parameterized modules between models is a plausible approach
to address this, its implications and effectiveness need further exploration.
This paper focuses on Offsite-Tuning (OFT), a representative technique that
transfers transformer blocks between centralized LLMs and downstream emulators.
Given the limited understanding of the underlying mechanism of OFT, we perform
an empirical analysis on LLMs from the perspectives of representation and
functional similarity. Interestingly, our findings reveal a unique modular
structure within the layers of LLMs that appears to emerge as the model size
expands. Simultaneously, we note subtle but potentially significant changes in
representation and intermediate predictions across the layers. Inspired by
these observations, we propose CRaSh, involving Clustering, Removing, and
Sharing, a training-free strategy to derive improved emulators from LLMs. CRaSh
significantly boosts the performance of OFT on models with billions of parameters.
Furthermore, we investigate the optimal solutions yielded by fine-tuning with
and without the full model through the lens of the loss landscape. Our findings
demonstrate linear connectivity among these optima, which fall within the same
basin, thereby highlighting the effectiveness of CRaSh and OFT. The source code
is publicly available at https://github.com/TsinghuaC3I/CRaSh.
Comment: Accepted to EMNLP 2023 (Main Conference)
Large Language Models are Zero Shot Hypothesis Proposers
Significant scientific discoveries have driven the progress of human
civilisation. The explosion of scientific literature and data has created
information barriers across disciplines that have slowed the pace of scientific
discovery. Large Language Models (LLMs) hold a wealth of global and
interdisciplinary knowledge that promises to break down these information
barriers and foster a new wave of scientific discovery. However, the potential
of LLMs for scientific discovery has not been formally explored. In this paper,
we start by investigating whether LLMs can propose scientific hypotheses. To
this end, we construct a dataset consisting of background knowledge and hypothesis
pairs from biomedical literature. The dataset is divided into training, seen,
and unseen test sets based on the publication date to control visibility. We
subsequently evaluate the hypothesis generation capabilities of various
top-tier instructed models in zero-shot, few-shot, and fine-tuning settings,
including both closed and open-source LLMs. Additionally, we introduce an
LLM-based multi-agent cooperative framework with different role designs and
external tools to enhance the capabilities related to generating hypotheses. We
also design four metrics through a comprehensive review to evaluate the
generated hypotheses for both ChatGPT-based and human evaluations. Through
experiments and analyses, we arrive at the following findings: 1) LLMs
surprisingly generate untrained yet validated hypotheses from testing
literature. 2) Increasing uncertainty facilitates candidate generation,
potentially enhancing zero-shot hypothesis generation capabilities. These
findings strongly support the potential of LLMs as catalysts for new scientific
discoveries and guide further exploration.
Comment: Instruction Workshop @ NeurIPS 202
DOA Estimation under GNSS Spoofing Attacks Using a Coprime Array: From a Sparse Reconstruction Viewpoint
The antispoofing method using the direction-of-arrival (DOA) feature can effectively improve the application security of global navigation satellite system (GNSS) receivers. In this paper, a sparse reconstruction approach based on a coprime array of antennas is proposed to provide reliable DOA estimation under a GNSS spoofing attack. Specifically, the self-coherence property of genuine satellite signals and spoofing signals is fully exploited to construct a denoised covariance matrix that enables DOA estimation before receiver despreading. Based on this, an equivalent uniform linear array (ULA) is generated from the constructed covariance matrix via virtual array interpolation. By applying the idea of sparse reconstruction to the equivalent ULA signal, preliminary DOA estimates can be obtained without prior knowledge of the number of signals. Because sparse estimation techniques suffer from basis mismatch effects, we design an optimization problem with respect to the off-grid error to compensate the initial DOA estimates, thereby reducing the performance loss of DOA estimation. Numerical examples demonstrate the advantages of the proposed method in terms of degrees of freedom (DOFs), resolution, and accuracy.
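To illustrate why a coprime array offers more degrees of freedom than its physical sensor count, and why interpolation to an equivalent ULA is useful, the following minimal sketch computes the difference coarray of a standard coprime geometry and lists its missing lags ("holes"). The geometry (sensors at multiples of N and of M, in half-wavelength units) and all function names are assumptions made for illustration; the abstract does not specify the array configuration or the interpolation procedure used in the paper.

```python
def coprime_positions(m, n):
    """Sensor positions (in half-wavelength units) of an assumed coprime array:
    m sensors spaced n apart, plus n sensors spaced m apart, sharing position 0."""
    subarray_a = {n * i for i in range(m)}
    subarray_b = {m * j for j in range(n)}
    return sorted(subarray_a | subarray_b)


def difference_coarray(positions):
    """All distinct pairwise position differences (virtual sensor lags)."""
    return sorted({p - q for p in positions for q in positions})


def coarray_holes(lags):
    """Integer lags missing between the smallest and largest lag.
    These are the gaps an interpolation step would need to fill
    to obtain a contiguous (ULA-like) virtual array."""
    present = set(lags)
    return [k for k in range(lags[0], lags[-1] + 1) if k not in present]


if __name__ == "__main__":
    positions = coprime_positions(3, 5)
    lags = difference_coarray(positions)
    print("physical sensors:", positions)
    print("distinct lags:", len(lags))
    print("holes:", coarray_holes(lags))
```

With M = 3 and N = 5, seven physical sensors yield 21 distinct lags, and the remaining holes are the lags that a virtual array interpolation step would target.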
Balanced Convolutional Neural Networks for Pneumoconiosis Detection
Pneumoconiosis remains one of the most common and harmful occupational diseases in China, causing huge economic losses to society through its high prevalence and costly treatment. Diagnosis of pneumoconiosis still depends strongly on the experience of radiologists, which hinders rapid screening of large populations. Recent research focuses on computer-aided detection based on machine learning. These methods have achieved high accuracy, and artificial neural networks (ANNs) perform particularly well among them. However, imbalanced samples and a lack of interpretability make wide adoption in clinical practice difficult. To address these problems, we first establish a pneumoconiosis radiograph dataset that includes both positive and negative samples. Second, deep convolutional diagnosis approaches are compared for pneumoconiosis detection, and balanced training is adopted to improve recall. Comprehensive experiments conducted on this dataset demonstrate high accuracy (88.6%). Third, we explain diagnosis results by visualizing suspected opacities on pneumoconiosis radiographs, which could provide a solid diagnostic reference for surgeons.
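The abstract does not say how the balanced training is implemented. One common option, shown here purely as an assumed illustration, is to weight each class inversely to its frequency so that the minority (positive) class contributes as much to the loss as the majority class. The function name and the example label counts below are hypothetical and are not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * class_count).
    With these weights, every class carries the same total weight."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    # Hypothetical imbalanced label set: 90 negative (0), 10 positive (1).
    labels = [0] * 90 + [1] * 10
    weights = balanced_class_weights(labels)
    print(weights)
```

Such weights would typically be passed to a weighted loss function during training; oversampling the minority class is an alternative with a similar aim.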