54 research outputs found

    Towards Robust Text Retrieval with Progressive Learning

    Retrieval augmentation has become an effective way to equip large language models (LLMs) with external, verified knowledge from a database, mitigating the limitations and hallucinations of LLMs when handling up-to-date and domain-specific information. However, existing embedding models for text retrieval usually have three non-negligible limitations. First, the number and diversity of samples in a batch are too restricted to supervise the modeling of textual nuances at scale. Second, the high proportion of noise is detrimental to the semantic correctness and consistency of embeddings. Third, treating easy and difficult samples equally causes sub-optimal convergence of embeddings and poorer generalization. In this paper, we propose PEG, progressively learned embeddings for robust text retrieval. Specifically, we increase the number of in-batch negative samples during training to 80,000, and for each query we extract five hard negatives. Concurrently, we incorporate a progressive learning mechanism that enables the model to dynamically modulate its attention to the samples throughout the entire training process. Additionally, PEG is trained on more than 100 million data samples, encompassing a wide range of domains (e.g., finance, medicine, and tourism) and covering various tasks (e.g., question-answering, machine reading comprehension, and similarity matching). Extensive experiments conducted on C-MTEB and DuReader demonstrate that PEG surpasses state-of-the-art embeddings in retrieving true positives, highlighting its significant potential for applications in LLMs. Our model is publicly available at https://huggingface.co/TownsWu/PEG
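
    The combination of in-batch negatives and per-query hard negatives described in this abstract can be illustrated with a generic contrastive (InfoNCE-style) loss. The following is a minimal sketch under stated assumptions, not PEG's actual training code: the function names, the temperature value, and the caller-supplied per-sample weights (a stand-in for a progressive learning schedule, which the abstract does not specify) are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_losses(queries, positives, hard_negatives, temperature=0.05):
    """Return one contrastive loss per query.

    For query i, the candidate set is every positive in the batch (the
    positives of the other queries act as in-batch negatives) plus the
    hard negatives listed in hard_negatives[i].
    The temperature is an assumed value, not taken from the paper.
    """
    q = [normalize(x) for x in queries]
    p = [normalize(x) for x in positives]
    losses = []
    for i, qi in enumerate(q):
        logits = [dot(qi, pj) / temperature for pj in p]
        logits += [dot(qi, normalize(h)) / temperature for h in hard_negatives[i]]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the true positive (index i).
        losses.append(log_z - logits[i])
    return losses


def weighted_batch_loss(losses, weights):
    """Weighted mean of per-sample losses.

    The weights are hypothetical: they stand in for whatever schedule a
    progressive learning mechanism would use to emphasize some samples.
    """
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Toy batch: two queries, two positives, one hard negative per query.
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
hard_negatives = [[[0.7, 0.7]], [[0.7, 0.7]]]
losses = info_nce_losses(queries, positives, hard_negatives)
batch_loss = weighted_batch_loss(losses, [1.0, 1.0])
```

    Enlarging the batch adds more columns to each query's candidate set, which is why a larger number of in-batch negatives makes the task harder and the supervision richer.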

    MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

    Multimodal Large Language Models (MLLMs) rely on powerful LLMs to perform multimodal tasks and have shown striking emergent abilities in recent studies, such as writing poems based on an image. However, such case studies cannot fully reflect the performance of MLLMs, and a comprehensive evaluation is lacking. In this paper, we fill this gap by presenting MME, the first MLLM evaluation benchmark. It measures both perception and cognition abilities on a total of 14 subtasks. To avoid the data leakage that may arise from the direct use of public datasets for evaluation, the annotations of instruction-answer pairs are all manually designed. The concise instruction design allows us to compare MLLMs fairly, instead of struggling with prompt engineering. Moreover, such instructions make it easy to compute quantitative statistics. A total of 10 advanced MLLMs are comprehensively evaluated on MME, which not only suggests that existing MLLMs still have considerable room for improvement, but also reveals potential directions for subsequent model optimization. Comment: https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Model

    808 nm driven Nd 3+

    The in vivo biological applications of upconversion nanoparticles (UCNPs) prefer excitation at 700-850 nm, instead of 980 nm, due to the absorption of water. Recent approaches in constructing robust Nd3+ doped UCNPs with 808 nm excitation properties rely on a thick Nd3+ sensitized shell. However, for the very important and popular Förster resonance energy transfer (FRET)-based applications, such as photodynamic therapy (PDT) or switchable biosensors, this type of structure has restrictions resulting in a poor energy transfer. In this work, we have designed a NaYF4:Yb/Ho@NaYF4:Nd@NaYF4 core-shell-shell nanostructure. We have proven that this optimal structure balances the robustness of the upconversion emission and the FRET efficiency for FRET-based bioapplications. A proof of the concept was demonstrated for photodynamic therapy and simultaneous fluorescence imaging of HeLa cells triggered by 808 nm light, where low heating and a high PDT efficacy were achieved.

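    Study of deep learning methods for the classification and segmentation of chromosomes and pulmonary images (original title: Etude des méthodes d'apprentissage profond pour la classification et la segmentation des chromosome et des images pulmonaires)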

    Pulmonary diseases can cause fatal damage to human health. Computed tomography (CT) helps display pulmonary structures and lesions for measurement and diagnosis. Advances in microscopy and karyotyping benefit the study of pathogenesis, in particular the relationship between chromosomal abnormalities and lung diseases. In this thesis, to assist pulmonary disease analysis, we investigate deep learning methods for two purposes. The first is to classify Giemsa-stained chromosomes in microscopic imaging. The second is to segment pulmonary airways, arteries, veins, and nodules in CT. We propose the Varifocal-Net for simultaneous classification of chromosome type and polarity via convolutional neural networks (CNNs). It is robust to differences in chromosome curvature, shape, and banding pattern. For nodule segmentation, we propose a two-part CNN-based method that handles all nodule textures and surroundings. The first part synthesizes samples via a generative adversarial network (GAN). The second part develops a segmentation model. For airways, their tree-like structure poses challenges to segmentation. We propose the AirwayNet to explicitly model connectivity between neighboring voxels. We further propose the AirwayNet-SE, more sophisticated than AirwayNet, which utilizes features from two context scales. Finally, we propose a segmentation method for airways, arteries, and veins. To tackle the sparsity of the desired targets caused by severe class imbalance, we present feature recalibration and attention distillation modules. An anatomical prior is incorporated for better artery-vein differentiation.