149,595 research outputs found

    Search-Adaptor: Text Embedding Customization for Information Retrieval

    Text embeddings extracted by pre-trained Large Language Models (LLMs) have significant potential to improve information retrieval and search. Beyond the zero-shot setup in which they are conventionally used, the ability to exploit relevant query-corpus paired data can further boost LLM capabilities. In this paper, we propose a novel method, Search-Adaptor, for customizing LLMs for information retrieval in an efficient and robust way. Search-Adaptor modifies the original text embedding generated by pre-trained LLMs and can be integrated with any LLM, including those available only via APIs. On multiple real-world English and multilingual retrieval datasets, we show consistent and significant performance gains for Search-Adaptor -- e.g., more than a 5.2% improvement over the Google Embedding APIs in nDCG@10 averaged over 13 BEIR datasets. Comment: 9 pages, 2 figures
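The abstract reports nDCG@10 averaged over BEIR datasets. A minimal sketch of that metric, using the linear-gain variant of DCG (some toolkits use the exponential gain 2^rel - 1 instead; the function names here are illustrative, not from the paper):

```python
import numpy as np

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k relevance scores,
    discounting rank i (1-based) by log2(i + 1)."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))

def ndcg_at_k(ranked_relevances, k=10):
    """nDCG@k: DCG of the system's ranking divided by the DCG of the
    ideal (relevance-sorted) ranking; 0.0 if nothing relevant exists."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0.0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg
```

A perfect ranking scores 1.0; placing the only relevant document at rank 2 instead of rank 1 yields 1/log2(3) ≈ 0.631.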

    Medical image retrieval for augmenting diagnostic radiology

    Even though the use of medical imaging to diagnose patients is ubiquitous in clinical settings, interpreting those images remains challenging for radiologists. Many factors make this interpretation task difficult; one is that medical images sometimes present clues that are subtle yet crucial for diagnosis. Worse, similar clues can indicate multiple diseases, making it hard to arrive at a definitive diagnosis. To help radiologists interpret medical images quickly and accurately, there is a need for a tool that can augment their diagnostic procedures and increase efficiency in their daily workflow. A general-purpose medical image retrieval system can be such a tool, as it allows them to search for and retrieve similar, already-diagnosed cases and make comparative analyses that complement their diagnostic decisions. In this thesis, we contribute to developing such a system by proposing approaches to be integrated as modules of a single system, enabling it to handle the various information needs of radiologists and thus augment their diagnostic processes during the interpretation of medical images. We have mainly studied the following retrieval approaches to address radiologists’ different information needs: i) retrieval based on contents; ii) retrieval based on contents, patients’ demographics, and disease predictions; and iii) retrieval based on contents and radiologists’ text descriptions. For the first study, we aimed to find an effective feature representation method to distinguish medical images considering their semantics and modalities. To that end, we experimented with different representation techniques based on handcrafted methods (mainly texture features) and deep learning (deep features). Based on the experimental results, we propose an effective feature representation approach and deep learning architectures for learning and extracting medical image contents.
    For the second study, we present a multi-faceted method that complements image contents with patients’ demographics and deep-learning-based disease predictions, making it able to accurately identify similar cases in the clinical context the radiologist seeks. For the last study, we propose a guided search method that integrates an image with a radiologist’s text description to guide the retrieval process. This method ensures that the retrieved images are suitable for the comparative analysis used to confirm or rule out initial diagnoses (the differential diagnosis procedure). Furthermore, our method is based on a deep metric learning technique and outperforms traditional content-based approaches that rely only on image features and thus sometimes retrieve insignificant random images.
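The content-based retrieval module described in the first study can be illustrated by ranking already-diagnosed cases by cosine similarity of their feature vectors. A minimal sketch, assuming precomputed deep features (the thesis's actual architectures and metric-learning objective are not shown here):

```python
import numpy as np

def retrieve_similar(query_feat, corpus_feats, top_k=5):
    """Rank corpus images by cosine similarity to the query image.

    query_feat:   (d,) feature vector of the query image.
    corpus_feats: (n, d) matrix of features for already-diagnosed cases.
    Returns indices of the top_k most similar cases, most similar first.
    """
    q = query_feat / np.linalg.norm(query_feat)
    c = corpus_feats / np.linalg.norm(corpus_feats, axis=1, keepdims=True)
    sims = c @ q  # cosine similarity of each case to the query
    return np.argsort(-sims)[:top_k]
```

The multi-faceted method of the second study could then re-rank or filter these candidates using demographics and disease predictions.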

    ACCESSING REFERENTIAL INFORMATION DURING TEXT COMPOSITION: WHEN AND WHY?

    When composing a text, writers have to continually shift between content planning and content translating. This continuous shifting gives the writing activity its cyclic nature. The first section of this paper will analyse the writing process as a hierarchical cyclic activity, and a methodological paradigm will be proposed for investigating the writing process. In the second section, we will partially present two experiments that were conducted independently with this paradigm. Both give a coherent and interesting picture of what happens to content while the writer is planning. The characteristics of cycles depend both on the nature of the content information being recovered and on the complexity of the processes applied to that content.

    VITALAS at TRECVID-2008

    In this paper, we present our experiments on the high-level feature extraction task at TRECVID 2008. This is the first year of our participation in TRECVID, and our system adopts several popular approaches previously proposed by other groups. We propose two advanced low-level features: a new Gabor texture descriptor and a Compact-SIFT codeword histogram. Our system applies the well-known LIBSVM library to train the SVM base classifiers. In the fusion step, several methods are employed, such as voting, SVM-based fusion, HCRF, and Bootstrap Average AdaBoost (BAAB).
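Of the fusion methods listed, voting is the simplest to sketch: each base classifier emits a binary decision for a concept, and the fused decision is the majority label. A minimal illustration (the paper's actual fusion pipeline, including SVM-based fusion, HCRF, and BAAB, is more involved):

```python
from collections import Counter

def vote_fusion(classifier_outputs):
    """Late fusion by majority voting.

    classifier_outputs: list of 0/1 predictions, one per base classifier
    (an odd number avoids ties). Returns the majority label.
    """
    counts = Counter(classifier_outputs)
    return counts.most_common(1)[0][0]
```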

    General guidelines for designing bilingual low cost digital library services suitable for special library users in developing countries and the Arabic speaking world

    The world is witnessing a considerable transformation from print-based formats to electronic-based formats thanks to advanced computing technology, which has had a profound impact on the dissemination of nearly all previous publication formats as digital formats on computer networks. Text, still and moving images, sound tracks, music, and almost all known formats can be stored on and retrieved from computer magnetic disks. Over the last two decades, a number of special libraries and information centres in the Arab world have introduced electronic resources into their library services. Very few have implemented automated and integrated systems. Despite the importance of designing digital libraries not merely for access to or retrieval of information but rather for the provision of electronic services, hardly any special library has started designing digital library services. Managers of special libraries and information centres in developing countries in general, and in the Arab world in particular, should start building their local digital libraries, as the benefit of establishing such electronic services is considerable and well known for expanding research activities and for delivering services that satisfy the needs of targeted end-users. The aim of this paper is to provide general guidelines for the design of a special low-cost digital library providing the services most frequently required by various categories of special library users in developing countries. This paper also aims to illustrate strategies and methodological approaches that can be adopted for building such projects. Given the importance of designing an inexpensive digital library as a basic design principle, utilising today's ICTs and freely available open-source software is the right path to accomplishing this goal. The paper describes the phases and stages required for building such projects from scratch. It also highlights the barriers and obstacles facing Arabic content and how such problems could be overcome.

    Text Analytics for Android Project

    Most advanced text analytics and text mining tasks include text classification, text clustering, ontology building, concept/entity extraction, summarization, deriving patterns within structured data, production of granular taxonomies, sentiment and emotion analysis, document summarization, entity relation modelling, and interpretation of the output. Existing text analytics and text mining tools cannot develop text material alternatives (perform a multivariant design), perform multiple criteria analysis, automatically select the most effective variant according to different aspects (citation indices of papers and authors in Scopus, ScienceDirect, and Google Scholar; Top 25 papers; journal impact factor; supporting phrases; document name and contents; keyword density), or calculate utility degree and market value. However, the Text Analytics for Android Project can perform the aforementioned functions. To the best of our knowledge, these functions have not been implemented previously; thus, this is the first attempt to do so. The Text Analytics for Android Project is briefly described in this article.

    Semantic annotation, publication, and discovery of Java software components: an integrated approach

    Component-based software development has matured into standard practice in software engineering. Among the advantages of reusing software modules are lower costs, faster development, more manageable code, increased productivity, and improved software quality. As the number of available software components has grown, so has the need for effective component search and retrieval. Traditional search approaches, such as keyword matching, have proved ineffective when applied to software components. Applying a semantically enhanced approach to component classification, publication, and discovery can greatly increase the efficiency of searching for and retrieving software components. This has already been applied in the context of Web technologies, and Web services in particular, in the frame of Semantic Web Services research. This paper examines the similarities between software components and Web services and adapts an existing Semantic Web Service publication and discovery solution into a software component annotation and discovery tool, implemented as an Eclipse plug-in.
