
    Domain Adaptation for Dense Retrieval through Self-Supervision by Pseudo-Relevance Labeling

    Although neural information retrieval has witnessed great improvements, recent work has shown that the generalization ability of dense retrieval models on target domains with different distributions is limited, in contrast with the results obtained with interaction-based models. To address this issue, researchers have resorted to adversarial learning and query generation approaches; both nevertheless yielded limited improvements. In this paper, we propose a self-supervision approach in which pseudo-relevance labels are automatically generated on the target domain. To do so, we first use the standard BM25 model on the target domain to obtain a first ranking of documents, and then use the interaction-based model T5-3B to re-rank the top documents. We further combine this approach with knowledge distillation relying on an interaction-based teacher model trained on the source domain. Our experiments reveal that pseudo-relevance labeling using T5-3B and the MiniLM teacher performs on average better than other approaches and helps improve the state-of-the-art query generation approach GPL when it is fine-tuned on the pseudo-relevance labeled data. Comment: 16 pages.
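The labeling pipeline described above (first-stage BM25 ranking, then re-ranking the top candidates with a cross-encoder to produce pseudo-relevance labels) can be sketched as follows. This is a toy, in-memory sketch: the `overlap` scorer stands in for the expensive T5-3B cross-encoder, and all function names are illustrative, not taken from the paper's code.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Classic BM25 score of every document against the query (toy, in-memory)."""
    tokenized = [d.lower().split() for d in docs]
    N = len(docs)
    avgdl = sum(len(t) for t in tokenized) / N
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(s)
    return scores

def pseudo_labels(query, docs, reranker, depth=3, top_k=1):
    """Stage 1: BM25 ranking. Stage 2: rerank the top `depth` candidates with a
    cross-encoder-style scorer; the reranked top_k become pseudo-positives."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    reranked = sorted(order[:depth], key=lambda i: reranker(query, docs[i]),
                      reverse=True)
    return {i: int(rank < top_k) for rank, i in enumerate(reranked)}

# Toy corpus; `overlap` is a cheap stand-in for a cross-encoder scorer.
docs = ["dense retrieval models", "cats and dogs", "retrieval with bm25 dense"]
overlap = lambda q, d: len(set(q.lower().split()) & set(d.lower().split()))
labels = pseudo_labels("dense retrieval", docs, overlap)
```

The resulting 0/1 labels can then be used as training targets for the dense retriever, exactly as a human-annotated relevance file would be.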

    OpenSD: Unified Open-Vocabulary Segmentation and Detection

    Recently, a few open-vocabulary methods have been proposed that employ a unified architecture to tackle generic segmentation and detection tasks. However, their performance still lags behind that of task-specific models due to the conflict between different tasks, and their open-vocabulary capability is limited due to the inadequate use of CLIP. To address these challenges, we present a universal transformer-based framework, abbreviated as OpenSD, which utilizes the same architecture and network parameters to handle open-vocabulary segmentation and detection tasks. First, we introduce a decoder decoupled learning strategy to alleviate the semantic conflict between thing and stuff categories so that each individual task can be learned more effectively under the same framework. Second, to better leverage CLIP for end-to-end segmentation and detection, we propose dual classifiers to handle the in-vocabulary domain and the out-of-vocabulary domain, respectively. The text encoder is further trained to be region-aware for both thing and stuff categories through decoupled prompt learning, enabling the model to filter out duplicated and low-quality predictions, which is important for end-to-end segmentation and detection. Extensive experiments are conducted on multiple datasets under various circumstances. The results demonstrate that OpenSD outperforms state-of-the-art open-vocabulary segmentation and detection methods in both closed- and open-vocabulary settings. Code is available at https://github.com/strongwolf/OpenS
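The basic open-vocabulary classification mechanism that CLIP-style dual classifiers build on can be sketched in a few lines: a region embedding is assigned to whichever class name's text embedding it is most similar to. The toy two-dimensional embeddings and class names below are illustrative only, not the paper's actual CLIP features.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def classify(region_emb, text_embs):
    """Assign a region embedding to the class whose text embedding is most
    similar -- the core open-vocabulary classification step."""
    scores = {name: cosine(region_emb, emb) for name, emb in text_embs.items()}
    return max(scores, key=scores.get), scores

# Toy 2-d "text embeddings" for two class names.
text_embs = {"cat": [1.0, 0.0], "grass": [0.0, 1.0]}
label, scores = classify([0.9, 0.1], text_embs)
```

Because classification is similarity against text embeddings rather than a fixed softmax head, new class names can be added at inference time simply by embedding their prompts.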

    Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval

    Pre-trained language models have been successful in many knowledge-intensive NLP tasks. However, recent work has shown that models such as BERT are not “structurally ready” to aggregate textual information into a [CLS] vector for dense passage retrieval (DPR). This “lack of readiness” results from the gap between language model pre-training and DPR fine-tuning. Previous solutions call for computationally expensive techniques such as hard negative mining, cross-encoder distillation, and further pre-training to learn a robust DPR model. In this work, we instead propose to fully exploit knowledge in a pre-trained language model for DPR by aggregating the contextualized token embeddings into a dense vector, which we call agg★. By concatenating vectors from the [CLS] token and agg★, our Aggretriever model substantially improves the effectiveness of dense retrieval models on both in-domain and zero-shot evaluations without introducing substantial training overhead. Code is available at https://github.com/castorini/dhr
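The aggregation idea can be illustrated with a minimal sketch: build a passage vector by concatenating the [CLS] embedding with a vector pooled from all token embeddings. Note that the paper's actual agg★ pools lexical representations derived from the MLM head; the elementwise max-pooling below is a simplified stand-in, and the toy embeddings are invented for illustration.

```python
def max_pool(token_embs):
    """Elementwise max over a list of same-length token embedding vectors
    (a simplified stand-in for the paper's agg* construction)."""
    return [max(dims) for dims in zip(*token_embs)]

def aggretriever_vector(cls_emb, token_embs):
    """Concatenate the [CLS] vector with the pooled token representation,
    so the final dense vector carries both semantic and token-level signal."""
    return list(cls_emb) + max_pool(token_embs)

# Toy 2-d embeddings for a 3-token passage.
cls = [0.1, 0.2]
tokens = [[0.5, -1.0], [0.3, 0.9], [-0.2, 0.4]]
vec = aggretriever_vector(cls, tokens)  # 4-d concatenated vector
```

Retrieval then scores query and passage vectors with an ordinary dot product, so the concatenation adds no extra machinery at search time.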

    Dynamical Masses and Ages of Sirius-like Systems

    We measure precise orbits and dynamical masses and derive age constraints for six confirmed and one candidate Sirius-like systems, including the Hyades member HD 27483. Our orbital analysis incorporates radial velocities, relative astrometry, and Hipparcos-Gaia astrometric accelerations. We constrain the main-sequence lifetime of a white dwarf's progenitor from the remnant's dynamical mass and semi-empirical initial-final mass relations, and infer the cooling age from the mass and effective temperature. We present new relative astrometry of HD 27483 B from Keck/NIRC2 observations and archival HST data, and obtain the first dynamical mass of $0.798_{-0.041}^{+0.10}\,M_{\odot}$ and an age of $450_{-180}^{+570}$ Myr, consistent with previous age estimates of the Hyades. We also measure precise dynamical masses for HD 114174 B ($0.591 \pm 0.011\,M_{\odot}$) and HD 169889 B ($0.526_{-0.037}^{+0.039}\,M_{\odot}$), but their age precisions are limited by their uncertain temperatures. For HD 27786 B, the unusually small mass of $0.443 \pm 0.012\,M_{\odot}$ suggests a history of rapid mass loss, possibly due to binary interaction during its progenitor's AGB phase. The orbits of HD 118475 and HD 136138 from our RV fitting are overall in good agreement with the Gaia DR3 astrometric two-body solutions, despite moderate differences in the eccentricity and period of HD 136138. The mass of $0.580_{-0.039}^{+0.052}\,M_{\odot}$ for HD 118475 B and a speckle-imaging non-detection confirm that the companion is a white dwarf. Our analysis demonstrates the rich set of precise WD dynamical mass measurements enabled by Gaia DR3 and later releases, which will improve empirical calibrations of the white dwarf initial-final mass relation. Comment: 21 pages, 7 figures. Submitted to MNRAS.
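The age machinery described above can be sketched in a few lines: invert an initial-final mass relation (IFMR) to get the progenitor's initial mass from the white dwarf's dynamical mass, convert that to a main-sequence lifetime, and add the cooling age. The linear IFMR coefficients and the power-law lifetime scaling below are illustrative placeholders, not the semi-empirical relations used in the paper.

```python
def progenitor_mass(m_wd, a=0.1, b=0.45):
    """Invert an assumed linear IFMR, M_f = a * M_i + b, to recover the
    progenitor's initial mass (coefficients illustrative only)."""
    return (m_wd - b) / a

def ms_lifetime_gyr(m_init):
    """Rough main-sequence lifetime scaling, t ~ 10 * M^-2.5 Gyr in solar
    units (a common back-of-the-envelope power law, illustrative)."""
    return 10.0 * m_init ** -2.5

def total_age_gyr(m_wd, cooling_age_gyr):
    """Total system age = progenitor MS lifetime + WD cooling age."""
    return ms_lifetime_gyr(progenitor_mass(m_wd)) + cooling_age_gyr

# Example: a ~0.8 Msun white dwarf with an assumed 0.1 Gyr cooling age.
age = total_age_gyr(0.8, 0.1)
```

Because the MS lifetime is steep in initial mass, the uncertainty on the dynamical mass propagates strongly into the age, which is why the precise masses from astrometric accelerations matter.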

    Pathology Steered Stratification Network for Subtype Identification in Alzheimer's Disease

    Alzheimer's disease (AD) is a heterogeneous, multifactorial neurodegenerative disorder characterized by beta-amyloid, pathologic tau, and neurodegeneration. There are no effective treatments for AD at a late stage, underscoring the need for early intervention. However, existing statistical inference approaches to AD subtype identification ignore pathological domain knowledge, which can lead to ill-posed results that are sometimes inconsistent with essential neurological principles. Integrating systems biology modeling with machine learning, we propose a novel pathology steered stratification network (PSSN) that incorporates established domain knowledge of AD pathology through a reaction-diffusion model, in which we consider non-linear interactions between major biomarkers and diffusion along the brain structural network. Trained on longitudinal multimodal neuroimaging data, the biological model predicts long-term trajectories that capture individual progression patterns, filling the gaps left by sparse imaging data. A deep predictive neural network is then built to exploit spatiotemporal dynamics, link neurological examinations with clinical profiles, and generate subtype assignment probabilities on an individual basis. We further identify an evolutionary disease graph to quantify subtype transition probabilities through extensive simulations. Our stratification achieves superior performance in both inter-cluster heterogeneity and intra-cluster homogeneity of various clinical scores. Applying our approach to enriched samples of aging populations, we identify six subtypes spanning the AD spectrum, each exhibiting a distinctive biomarker pattern consistent with its clinical outcome. PSSN provides insights into pre-symptomatic diagnosis and practical guidance on clinical treatments, and may be further generalized to other neurodegenerative diseases.
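The reaction-diffusion component can be illustrated with a minimal sketch: a pathology load diffuses along the graph Laplacian of the structural network while a local reaction term drives growth. The logistic reaction and all parameter values below are illustrative stand-ins for the paper's biomarker interaction model.

```python
def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def step(u, adj, beta=0.1, rho=0.5, dt=0.01):
    """One Euler step of du/dt = -beta * L u + rho * u * (1 - u):
    diffusion along the network plus a logistic reaction (a stand-in for
    the non-linear biomarker interaction terms)."""
    L = graph_laplacian(adj)
    return [u[i] + dt * (-beta * sum(L[i][j] * u[j] for j in range(len(u)))
                         + rho * u[i] * (1 - u[i]))
            for i in range(len(u))]

# Toy 3-node chain; pathology seeded at node 0.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
u = [1.0, 0.0, 0.0]
for _ in range(200):
    u = step(u, adj)
```

Integrating such trajectories forward between sparse imaging visits is what lets the biological model interpolate an individual's progression before the deep network stratifies it.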