Search-Adaptor: Text Embedding Customization for Information Retrieval
Text embeddings extracted by pre-trained Large Language Models (LLMs) have
significant potential to improve information retrieval and search. Beyond the
zero-shot setup in which they are conventionally used, being able to take
advantage of the information from the relevant query-corpus paired data has the
power to further boost the LLM capabilities. In this paper, we propose a novel
method, Search-Adaptor, for customizing LLMs for information retrieval in an
efficient and robust way. Search-Adaptor modifies the original text embedding
generated by pre-trained LLMs, and can be integrated with any LLM, including
those only available via APIs. On multiple real-world English and multilingual
retrieval datasets, we show consistent and significant performance benefits for
Search-Adaptor -- e.g., more than 5.2% improvement over the Google Embedding APIs in nDCG@10 averaged over 13 BEIR datasets.
Comment: 9 pages, 2 figures
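The abstract does not describe Search-Adaptor's internals, but the general idea of customizing a frozen, API-served embedding can be sketched as a small residual transform applied on top of the pre-computed vector. The `adapt` function, the tanh parameterization, and the weights `W` and `b` below are illustrative assumptions, not the paper's actual design; in practice such an adapter would be trained on query-corpus paired data with a ranking loss.

```python
import numpy as np

def adapt(embedding, W, b):
    """Residual adapter: adjusted = normalize(e + tanh(e @ W + b)).

    The frozen API embedding `e` is left intact; only the small
    learned correction (W, b) is applied on top of it.
    """
    e = np.asarray(embedding, dtype=float)
    delta = np.tanh(e @ W + b)        # small learned correction
    out = e + delta                   # residual connection
    return out / np.linalg.norm(out)  # re-normalize for cosine retrieval

# Hypothetical frozen API embedding and tiny (untrained) adapter weights.
rng = np.random.default_rng(0)
d = 4
W = 0.01 * rng.standard_normal((d, d))
b = np.zeros(d)
query = rng.standard_normal(d)
adapted = adapt(query, W, b)
```

Because the adapter only post-processes embeddings, it works with any model, including ones available solely through an API.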
Medical image retrieval for augmenting diagnostic radiology
Even though the use of medical imaging to diagnose patients is ubiquitous in clinical settings, interpreting those images remains challenging for radiologists. Many factors make this interpretation task difficult; one is that medical images sometimes present clues that are subtle yet crucial for diagnosis. Worse, similar clues can indicate multiple diseases, making it hard to determine the definitive diagnosis. To help radiologists interpret medical images quickly and accurately, there is a need for a tool that can augment their diagnostic procedures and increase the efficiency of their daily workflow. A general-purpose medical image retrieval system can be such a tool, as it allows them to search for and retrieve similar, already-diagnosed cases and make comparative analyses that complement their diagnostic decisions. In this thesis, we contribute to developing such a system by proposing approaches to be integrated as modules of a single system, enabling it to handle the various information needs of radiologists and thus augment their diagnostic processes during the interpretation of medical images.
We have mainly studied the following retrieval approaches to handle radiologists' different information needs: i) Retrieval Based on Contents, ii) Retrieval Based on Contents, Patients' Demographics, and Disease Predictions, and iii) Retrieval Based on Contents and Radiologists' Text Descriptions. For the first study, we aimed to find an effective feature representation method to distinguish medical images considering their semantics and modalities. To do that, we experimented with different representation techniques based on handcrafted methods (mainly texture features) and deep learning (deep features). Based on the experimental results, we propose an effective feature representation approach and deep learning architectures for learning and extracting medical image contents. For the second study, we present a multi-faceted method that complements image contents with patients' demographics and deep learning-based disease predictions, making it able to identify similar cases accurately considering the clinical context the radiologists seek.
For the last study, we propose a guided search method that integrates an image with a radiologist's text description to guide the retrieval process. This method guarantees that the retrieved images are suitable for the comparative analysis to confirm or rule
out initial diagnoses (the differential diagnosis procedure). Furthermore, our method is based on a deep metric learning technique and outperforms traditional content-based approaches that rely solely on image features and thus sometimes retrieve irrelevant images.
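As a minimal illustration of the content-based retrieval idea (not the thesis's actual deep-metric-learning model), similar cases can be ranked by cosine similarity between embedding vectors. The `retrieve` helper and the toy 2-D vectors below are assumptions made for the sketch.

```python
import numpy as np

def retrieve(query_vec, corpus_vecs, k=3):
    """Rank already-diagnosed cases by cosine similarity to the query image embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    C = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    sims = C @ q                      # cosine similarity to every corpus case
    top = np.argsort(-sims)[:k]       # indices of the k most similar cases
    return list(zip(top.tolist(), sims[top].tolist()))

# Toy 2-D "embeddings": case 0 points almost the same way as the query.
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
top = retrieve(np.array([1.0, 0.1]), corpus, k=2)
```

A metric-learned embedding would replace the raw vectors here, so that "close in the space" matches "clinically similar" rather than merely "visually similar".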
ACCESSING REFERENTIAL INFORMATION DURING TEXT COMPOSITION: WHEN AND WHY?
When composing a text, writers have to shift continually between content planning and content translating. This continuous shifting gives the writing activity its cyclic nature. The first section of this paper will analyse the writing process as a hierarchical cyclic activity. A methodological paradigm will be proposed for the investigation of the writing process. In the second section, we will partially present two experiments that were conducted independently with this paradigm. Both give a coherent and interesting picture of what happens with content while the writer is planning. The characteristics of cycles depend both on the nature of the content information being recovered and on the complexity of the processes applied to this content.
VITALAS at TRECVID-2008
In this paper, we present our experiments on the high-level feature extraction task at TRECVID 2008. This is our first year of participation in TRECVID, and our system adopts several popular approaches proposed by other groups. We propose two advanced low-level features: a new Gabor texture descriptor and a Compact-SIFT codeword histogram. Our system applies the well-known LIBSVM to train the SVM base classifiers. In the fusion step, several methods are employed, such as voting, SVM-based fusion, HCRF, and Bootstrap Average AdaBoost (BAAB).
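The abstract names voting among the fusion methods. A minimal late-fusion voting sketch could look like the following; the `vote_fusion` helper, the 0.5 threshold, and the toy per-shot scores are illustrative assumptions, not the system's actual configuration.

```python
def vote_fusion(score_lists, threshold=0.5):
    """Late fusion by majority voting: a concept fires for a shot
    if more than half of the base classifiers score it above threshold."""
    n = len(score_lists)
    fused = []
    for scores in zip(*score_lists):          # iterate over shots
        votes = sum(s >= threshold for s in scores)
        fused.append(votes > n / 2)
    return fused

# Three hypothetical per-shot concept scores from different feature channels.
gabor = [0.9, 0.2, 0.6]
sift  = [0.8, 0.4, 0.4]
color = [0.3, 0.1, 0.7]
result = vote_fusion([gabor, sift, color])    # [True, False, True]
```

SVM-based fusion would instead feed the score vectors into a second-stage classifier rather than counting votes.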
General guidelines for designing bilingual low cost digital library services suitable for special library users in developing countries and the Arabic speaking world
The world is witnessing a considerable transformation from print-based formats to electronic-based formats thanks to advanced computing technology, which has a profound impact on the dissemination of nearly all previous publication formats as digital formats on computer networks. Text, still and moving images, sound tracks, music, and almost all known formats can be stored and retrieved on computer magnetic disk. Over the last two decades, a number of special libraries and information centres in the Arab world have introduced electronic resources into their library services. Very few have implemented automated and integrated systems. Despite the importance of designing digital libraries not merely for access to or retrieval of information but rather for the provision of electronic services, hardly any special library has started the design of digital library services. Managers of special libraries and information centres in developing countries in general, and in the Arab world in particular, should start building their local digital libraries, as the benefit of establishing such electronic services is considerable and well known for expanding research activities and for delivering services that satisfy the needs of targeted end-users. The aim of this paper is to provide general guidelines for the design of a low-cost special digital library providing the services most frequently required by various categories of special library users in developing countries. This paper also aims at illustrating strategies and approaches that can be adopted for building such projects. Given the importance of designing an inexpensive digital library as a basic design principle, the utilisation of today's ICTs and freely available open-source software is the right path for accomplishing this goal. The paper describes the phases and stages required for building such projects from scratch.
It also aims at highlighting the barriers and obstacles facing Arabic content and how such problems could be overcome.
Text Analytics for Android Project
Most advanced text analytics and text mining tasks include text classification, text clustering, ontology building, concept/entity extraction, summarization, deriving patterns within structured data, production of granular taxonomies, sentiment and emotion analysis, document summarization, entity relation modelling, and interpretation of the output. Existing text analytics and text mining tools cannot develop alternatives for text material (perform a multi-variant design), perform multiple criteria analysis,
automatically select the most effective variant according to different aspects (citation indices of papers and authors in Scopus, ScienceDirect, and Google Scholar; Top 25 papers; impact factor of journals; supporting phrases; document name and contents; density of keywords), or calculate utility degree and market value. However, the Text Analytics for Android Project can perform the aforementioned functions. To the best of our knowledge, these functions have not been previously implemented; thus, this is the first attempt to do so. The Text Analytics for Android Project is briefly described in this article.
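Among the aspects listed is keyword density, which is simple to make concrete. The `keyword_density` helper and the sample sentence below are illustrative assumptions, not code from the project itself.

```python
import re

def keyword_density(text, keywords):
    """Fraction of the document's tokens that belong to the given keyword set."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(t in keywords for t in tokens)
    return hits / len(tokens)

doc = "Text analytics extracts patterns; text mining ranks text variants."
density = keyword_density(doc, {"text", "mining"})
```

Such per-aspect scores could then be combined with citation and impact-factor figures into an overall utility degree for ranking text variants.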
Semantic annotation, publication, and discovery of Java software components: an integrated approach
Component-based software development has matured into standard practice in software engineering. Among the advantages of reusing software modules are lower costs, faster development, more manageable code, increased productivity, and improved software quality. As the number of available software components has grown, so has the need for effective component search and retrieval. Traditional search approaches, such as keyword matching, have proved ineffective when applied to software components. Applying a semantically-enhanced approach to component classification, publication, and discovery can greatly increase the efficiency of searching for and retrieving software components. This has already been applied in the context of Web technologies, and Web services in particular, in the frame of Semantic Web Services research. This paper examines the similarities between software components and Web services and adapts an existing Semantic Web Service publication and discovery solution into a software component annotation and discovery tool, implemented as an Eclipse plug-in.
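To illustrate why semantic matching can outperform raw keyword matching for component retrieval, the sketch below matches components by annotated concepts expanded through a tiny subsumption hierarchy. The concept names, the hierarchy, and the `semantic_match` helper are all illustrative assumptions, not the paper's tool.

```python
def semantic_match(query_concepts, components):
    """Match components by annotated concepts, expanding each concept
    up a (hypothetical) subsumption hierarchy before comparing."""
    subsumes = {"quicksort": "sorting", "mergesort": "sorting"}  # child -> parent

    def expand(concepts):
        out = set(concepts)
        for c in concepts:
            while c in subsumes:       # walk up to every ancestor concept
                c = subsumes[c]
                out.add(c)
        return out

    q = expand(query_concepts)
    return [name for name, concepts in components.items()
            if q & expand(concepts)]   # any shared concept counts as a match

components = {"FastSorter": {"quicksort"}, "CsvReader": {"parsing"}}
matches = semantic_match({"sorting"}, components)   # ['FastSorter']
```

A keyword search for "sorting" would miss `FastSorter`, whose annotation only mentions "quicksort"; the concept hierarchy bridges that gap.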