Scenemash: Multimodal Route Summarization for City Exploration
The potential of mining tourist information from social multimedia data gives rise to new applications offering much richer impressions of the city. In this paper we propose Scenemash, a system that generates multimodal summaries of multiple alternative routes between locations in a city. To get insight into the geographic areas on the route, we collect a dataset of community-contributed images and their associated annotations from Foursquare and Flickr. We identify images and terms representative of a geographic area by jointly analysing distributions of a large number of semantic concepts detected in the visual content and latent topics extracted from associated text. The Scenemash prototype is implemented as an Android app for smartphones and smartwatches.
A lexicon based approach to classification of ICD10 codes. IMS unipd at CLEF eHealth task 1
First international workshop on recent trends in news information retrieval (NewsIR’16)
The news industry has gone through seismic shifts in the past decade with digital content and social media completely redefining how people consume news. Readers check for accurate fresh news from multiple sources throughout the day using dedicated apps or social media on their smartphones and tablets. At the same time, news publishers rely more and more on social networks and citizen journalism as a frontline to breaking news. In this new era of fast-flowing instant news delivery and consumption, publishers and aggregators have to overcome a great number of challenges. These include the verification or assessment of a source’s reliability; the integration of news with other sources of information; real-time processing of both news content and social streams in multiple languages, in different formats and in high volumes; deduplication; entity detection and disambiguation; automatic summarization; and news recommendation. Although Information Retrieval (IR) applied to news has been a popular research area for decades, fresh approaches are needed due to the changing type and volume of media content available and the way people consume this content. The goal of this workshop is to stimulate discussion around new and powerful uses of IR applied to news sources and the intersection of multiple IR tasks to solve real user problems. To promote research efforts in this area, we released a new dataset consisting of one million news articles to the research community and introduced a data challenge track as part of the workshop.
Advances in information retrieval: 38th European conference on IR research, ECIR 2016, Padua, Italy, March 20–23, 2016, proceedings
MedCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval
Information retrieval (IR) is essential in biomedical knowledge acquisition
and clinical decision support. While recent progress has shown that language
model encoders perform better semantic retrieval, training such models requires
abundant query-article annotations that are difficult to obtain in biomedicine.
As a result, most biomedical IR systems only conduct lexical matching. In
response, we introduce MedCPT, a first-of-its-kind Contrastively Pre-trained
Transformer model for zero-shot semantic IR in biomedicine. To train MedCPT, we
collected 255 million user click logs from PubMed, an unprecedented scale.
With such data, we use contrastive learning to train a closely integrated
retriever and re-ranker pair. Experimental results show that
MedCPT sets new state-of-the-art performance on six biomedical IR tasks,
outperforming various baselines including much larger models such as
GPT-3-sized cpt-text-XL. In addition, MedCPT generates better biomedical
article and sentence representations for semantic evaluations. As such, MedCPT
can be readily applied to various real-world biomedical IR tasks.
Comment: The MedCPT code and API are available at
https://github.com/ncbi/MedCP
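
The MedCPT abstract mentions contrastive learning on query-article pairs from click logs but does not spell out the objective. As a rough illustration only, the sketch below shows a generic in-batch contrastive (InfoNCE-style) loss, in which each query's clicked article is the positive and the other articles in the batch act as negatives. The function names, the dot-product scoring, and the `temperature` parameter are assumptions made for this example and are not taken from the MedCPT code.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_contrastive_loss(query_vecs, article_vecs, temperature=1.0):
    """Mean cross-entropy over a batch (hypothetical helper, not MedCPT's API).

    Assumes query_vecs[i] and article_vecs[i] form a clicked (positive) pair;
    every other article in the batch is treated as a negative for query i.
    """
    assert len(query_vecs) == len(article_vecs)
    total = 0.0
    for i, q in enumerate(query_vecs):
        scores = [dot(q, a) / temperature for a in article_vecs]
        # Numerically stable log-sum-exp over all articles in the batch.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability of the positive article.
        total += log_z - scores[i]
    return total / len(query_vecs)


# Toy embeddings: each query matches the article at the same index.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
misaligned = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = in_batch_contrastive_loss(queries, aligned)
loss_misaligned = in_batch_contrastive_loss(queries, misaligned)
```

With these toy vectors, the loss is lower when each query embedding points towards its own article than when the pairs are swapped, which is the behaviour a contrastively trained retriever is optimised for.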