9,804 research outputs found
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use cases, and
socio-economic and legal aspects. These were assessed through two central studies: first, a concerted vision of the
functional breakdown of a generic multimedia search engine, and second, a set of representative use-case descriptions
with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation
and consultation with the community at large through EC concertation meetings (multimedia search engines cluster),
several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project
coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of
gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical
research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as
emerging legal challenges.
Benchmarking some Portuguese S&T system research units: 2nd Edition
The increasing use of productivity and impact metrics for evaluation and
comparison, not only of individual researchers but also of institutions,
universities and even countries, has prompted the development of bibliometrics.
Currently, metrics are becoming widely accepted as an easy and balanced way to
assist the peer review and evaluation of scientists and/or research units,
provided they have adequate precision and recall.
This paper presents a benchmarking study of a selected list of representative
Portuguese research units, based on a fairly complete set of parameters:
bibliometric parameters, number of competitive projects and number of PhDs
produced. The study aimed at collecting productivity and impact data from the
selected research units in comparable conditions i.e., using objective metrics
based on public information, retrievable on-line and/or from official sources
and thus verifiable and repeatable. The study has thus focused on the activity
of the 2003-06 period, where such data was available from the latest official
evaluation.
The main advantage of our study was the application of automatic tools,
achieving relevant results at a reduced cost. Moreover, the results over the
selected units suggest that this kind of analysis will be very useful to
benchmark scientific productivity and impact, and assist peer review.
Comment: 26 pages, 20 figures. F. Couto, D. Faria, B. Tavares, P.
Gonçalves, and P. Verissimo, Benchmarking some Portuguese S&T system
research units: 2nd edition, DI/FCUL TR 13-03, Department of Informatics,
University of Lisbon, February 201
A Continuously Growing Dataset of Sentential Paraphrases
A major challenge in paraphrase research is the lack of parallel corpora. In
this paper, we present a new method to collect large-scale sentential
paraphrases from Twitter by linking tweets through shared URLs. The main
advantage of our method is its simplicity, as it gets rid of the classifier or
human in the loop needed to select data before annotation and subsequent
application of paraphrase identification algorithms in the previous work. We
present the largest human-labeled paraphrase corpus to date of 51,524 sentence
pairs and the first cross-domain benchmarking for automatic paraphrase
identification. In addition, we show that more than 30,000 new sentential
paraphrases can be easily and continuously captured every month at ~70%
precision, and demonstrate their utility for downstream NLP tasks through
phrasal paraphrase extraction. We make our code and data freely available.
Comment: 11 pages, accepted to EMNLP 201