Late Semantic Fusion Approach for the Retrieval of Multimedia Data
In multimedia information retrieval, late semantic fusion combines textual pre-filtering with image re-ranking. The retrieval process has three steps. Visual and textual techniques are combined to help the developed Multimedia Information Retrieval System minimize the semantic gap for a given query. The paper applies several late semantic fusion approaches (Product, Enrich, MaxMerge and FilterN) and runs its experiments on the publicly available ImageCLEF Wikipedia Collection.
DOI: 10.17762/ijritcc2321-8169.150610
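The abstract names the fusion operators but does not define them. As a rough illustration only, the sketch below assumes each modality returns a normalized score per image and that the textual list acts as the pre-filter. The function names, the score dictionaries, and the exact formulas are assumptions for this example, not the paper's definitions. Enrich is left out because the abstract gives no hint of its formula.

```python
from typing import Dict, List

# Hypothetical input format: image id -> normalized relevance score in [0, 1].
Scores = Dict[str, float]


def product_fusion(text: Scores, visual: Scores) -> Scores:
    """Assumed 'Product': multiply textual and visual scores.

    Only images that passed the textual pre-filter are kept.
    """
    return {doc: s * visual.get(doc, 0.0) for doc, s in text.items()}


def max_merge(text: Scores, visual: Scores) -> Scores:
    """Assumed 'MaxMerge': keep the larger of the two scores per image."""
    return {doc: max(s, visual.get(doc, 0.0)) for doc, s in text.items()}


def filter_n(text: Scores, visual: Scores, n: int) -> Scores:
    """Assumed 'FilterN': drop textual results outside the visual top n."""
    top_visual = set(sorted(visual, key=visual.get, reverse=True)[:n])
    return {doc: s for doc, s in text.items() if doc in top_visual}


def rank(scores: Scores) -> List[str]:
    """Order image ids by descending score, breaking ties by id."""
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


if __name__ == "__main__":
    text_scores = {"a": 0.9, "b": 0.6, "c": 0.3}
    visual_scores = {"a": 0.2, "b": 0.8, "c": 0.5, "d": 0.99}
    print(rank(product_fusion(text_scores, visual_scores)))
    print(rank(max_merge(text_scores, visual_scores)))
    print(rank(filter_n(text_scores, visual_scores, n=2)))
```

In this toy data, image "d" never appears in a fused list because it did not pass the textual pre-filter, which is the point of filtering on text before re-ranking on visual similarity.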
Infectious Disease Ontology
Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain
Adaptation of machine translation for multilingual information retrieval in the medical domain
Objective. We investigate machine translation (MT) of user search queries in the context of cross-lingual information retrieval (IR) in the medical domain. The main focus is on techniques to adapt MT to increase translation quality; however, we also explore MT adaptation to improve effectiveness of cross-lingual IR.
Methods and Data. Our MT system is Moses, a state-of-the-art phrase-based statistical machine translation system. The IR system is based on the BM25 retrieval model implemented in the Lucene search engine. The MT techniques employed in this work include in-domain training and tuning, intelligent training data selection, optimization of phrase table configuration, compound splitting, and exploiting synonyms as translation variants. The IR methods include morphological normalization and using multiple translation variants for query expansion. The experiments are performed and thoroughly evaluated on three language pairs: Czech–English, German–English, and French–English. MT quality is evaluated on data sets created within the Khresmoi project and IR effectiveness is tested on the CLEF eHealth 2013 data sets.
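For readers unfamiliar with the BM25 retrieval model mentioned above, here is a minimal sketch of the scoring function. It is a textbook-style formulation written for illustration and is not Lucene's implementation. The toy documents, the pre-tokenized input, and the parameter defaults `k1=1.2` and `b=0.75` are assumptions made for this example.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str],
    docs: List[List[str]],
    k1: float = 1.2,
    b: float = 0.75,
) -> List[float]:
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    documents = [
        ["heart", "attack", "symptoms"],
        ["knee", "pain", "treatment"],
        ["heart", "disease", "treatment", "options"],
    ]
    print(bm25_scores(["heart", "treatment"], documents))
```

In a cross-lingual setup like the one described, the query tokens would come from the MT output, and adding several translation variants to `query` is one simple way to picture query expansion.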
Results. The search query translation results achieved in our experiments are outstanding – our systems outperform not only our strong baselines, but also Google Translate and Microsoft Bing Translator in direct comparison carried out on all the language pairs. The baseline BLEU scores increased from 26.59 to 41.45 for Czech–English, from 23.03 to 40.82 for German–English, and from 32.67 to 40.82 for French–English. This is a 55% improvement on average. In terms of the IR performance on this particular test collection, a significant improvement over the baseline is achieved only for French–English. For Czech–English and German–English, the increased MT quality does not lead to better IR results.
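The reported BLEU figures can be turned into relative gains with a few lines of arithmetic. The abstract does not say how its average was computed, so the unweighted mean of per-pair relative gains used below is an assumption. Under that assumption the result comes out a little lower than the quoted 55%, at roughly 53%.

```python
# BLEU scores as reported in the abstract: (baseline, adapted system).
bleu = {
    "Czech-English": (26.59, 41.45),
    "German-English": (23.03, 40.82),
    "French-English": (32.67, 40.82),
}


def relative_gain(baseline: float, adapted: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return (adapted - baseline) / baseline * 100


gains = {pair: relative_gain(*scores) for pair, scores in bleu.items()}
# Assumed averaging scheme: unweighted mean over the three language pairs.
mean_gain = sum(gains.values()) / len(gains)

if __name__ == "__main__":
    for pair, gain in gains.items():
        print(f"{pair}: {gain:.1f}%")
    print(f"mean: {mean_gain:.1f}%")
```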
Conclusions. Most of the MT techniques employed in our experiments improve MT of medical search queries. Intelligent training data selection in particular proves to be very successful for domain adaptation of MT. Certain improvements are also obtained from German compound splitting on the source language side. Translation quality, however, does not appear to correlate with the IR performance – better translation does not necessarily yield better retrieval. We discuss in detail the contribution of the individual techniques and state-of-the-art features and provide future research directions.
Computational cytometer based on magnetically modulated coherent imaging and deep learning.
Detecting rare cells within blood has numerous applications in disease diagnostics. Existing rare cell detection techniques are typically hindered by their high cost and low throughput. Here, we present a computational cytometer based on magnetically modulated lensless speckle imaging, which introduces oscillatory motion to the magnetic-bead-conjugated rare cells of interest through a periodic magnetic force and uses lensless time-resolved holographic speckle imaging to rapidly detect the target cells in three dimensions (3D). In addition to using cell-specific antibodies to magnetically label target cells, detection specificity is further enhanced through a deep-learning-based classifier that is based on a densely connected pseudo-3D convolutional neural network (P3D CNN), which automatically detects rare cells of interest based on their spatio-temporal features under a controlled magnetic force. To demonstrate the performance of this technique, we built a high-throughput, compact and cost-effective prototype for detecting MCF7 cancer cells spiked in whole blood samples. Through serial dilution experiments, we quantified the limit of detection (LoD) as 10 cells per millilitre of whole blood, which could be further improved through multiplexing parallel imaging channels within the same instrument. This compact, cost-effective and high-throughput computational cytometer can potentially be used for rare cell detection and quantification in bodily fluids for a variety of biomedical applications
Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference
No abstract available
Response Retrieval in Information-seeking Conversations
The increasing popularity of mobile Internet has led to several crucial changes in the way that people use search engines compared with traditional Web search on desktops. On the one hand, there is limited output bandwidth with the small screen sizes of most mobile devices. Mobile Internet users prefer direct answers on the search engine result page (SERP). On the other hand, voice-based / text-based conversational interfaces are becoming increasingly popular, as shown in the wide adoption of intelligent assistant services and devices such as Amazon Echo, Microsoft Cortana and Google Assistant around the world. These important changes have triggered several new challenges that search engines have had to adapt to in order to better satisfy the information needs of mobile Internet users. In this dissertation, we investigate several aspects of single-turn answer retrieval and multi-turn information-seeking conversations to handle the new challenges of search on the mobile Internet.
We start from the research on single-turn answer retrieval and analyze the weaknesses of existing deep learning architectures for answer ranking. Then we propose an attention based neural matching model with a value-shared weighting scheme and attention mechanism to improve existing deep neural answer ranking models. Our proposed model achieves state-of-the-art performance for answer sentence retrieval compared with both feature engineering based methods and other neural models.
Then we move on to study response retrieval in multi-turn information-seeking conversations beyond single-turn interactions. Much research on response selection in conversation systems models the matching patterns between the user input message (either with context or not) and response candidates, which ignores external knowledge beyond the dialog utterances. We propose a learning framework on top of deep neural matching networks that leverages external knowledge with pseudo-relevance feedback and QA correspondence knowledge distillation for response retrieval. We also study how to integrate user intent modeling into neural ranking models to improve response retrieval performance. Finally, hybrid models of response retrieval and generation are investigated in order to combine the merits of these two different paradigms of conversation models.
Our goal is to develop effective learning models for answer retrieval and information-seeking conversations, in order to improve the effectiveness and user experience when accessing information with a touch screen interface or a conversational interface, as commonly adopted by millions of mobile Internet devices
EMMA-X: An EM-like Multilingual Pre-training Algorithm for Cross-lingual Representation Learning
Expressing universal semantics common to all languages is helpful in understanding the meanings of complex and culture-specific sentences. The research theme underlying this scenario focuses on learning universal representations across languages with the usage of massive parallel corpora. However, due to the sparsity and scarcity of parallel data, there is still a big challenge in learning authentic "universals" for any two languages. In this paper, we propose EMMA-X: an EM-like Multilingual pre-training Algorithm, to learn (X)Cross-lingual universals with the aid of excessive multilingual non-parallel data. EMMA-X unifies the cross-lingual representation learning task and an extra semantic relation prediction task within an EM framework. Both the extra semantic classifier and the cross-lingual sentence encoder approximate the semantic relation of two sentences, and supervise each other until convergence. To evaluate EMMA-X, we conduct experiments on XRETE, a newly introduced benchmark containing 12 widely studied cross-lingual tasks that fully depend on sentence-level representations. Results reveal that EMMA-X achieves state-of-the-art performance. Further geometric analysis of the built representation space with three requirements demonstrates the superiority of EMMA-X over advanced models.
Comment: Accepted by NeurIPS 202