The Robust Reading Competition Annotation and Evaluation Platform
The ICDAR Robust Reading Competition (RRC), initiated in 2003 and
re-established in 2011, has become a de-facto evaluation standard for robust
reading systems and algorithms. Concurrent with its second incarnation in 2011,
a continuous effort started to develop an on-line framework to facilitate the
hosting and management of competitions. This paper outlines the Robust Reading
Competition Annotation and Evaluation Platform, the backbone of the
competitions. The RRC Annotation and Evaluation Platform is a modular
framework, fully accessible through on-line interfaces. It comprises a
collection of tools and services for managing all processes involved with
defining and evaluating a research task, from dataset definition to annotation
management, evaluation specification and results analysis. Although the
framework has been designed with robust reading research in mind, many of the
provided tools are generic by design. All aspects of the RRC Annotation and
Evaluation Framework are available for research use. Comment: 6 pages, accepted to DAS 201
Finding Person Relations in Image Data of the Internet Archive
The multimedia content in the World Wide Web is rapidly growing and contains
valuable information for many applications in different domains. For this
reason, the Internet Archive initiative has been gathering billions of
time-versioned web pages since the mid-nineties. However, the huge amount of
data is rarely labeled with appropriate metadata and automatic approaches are
required to enable semantic search. Normally, the textual content of the
Internet Archive is used to extract entities and their possible relations
across domains such as politics and entertainment, whereas image and video
content is usually neglected. In this paper, we introduce a system for person
recognition in image content of web news stored in the Internet Archive. Thus,
the system complements entity recognition in text and allows researchers and
analysts to track media coverage and relations of persons more precisely. Based
on a deep learning face recognition approach, we suggest a system that
automatically detects persons of interest and gathers sample material, which is
subsequently used to identify them in the image data of the Internet Archive.
We evaluate the performance of the face recognition system on an appropriate
standard benchmark dataset and demonstrate the feasibility of the approach with
two use cases.
Digital Preservation Services : State of the Art Analysis
Research report funded by the DC-NET project. An overview of the state of the art in service provision for digital preservation and curation. It focuses on the areas where gaps need to be bridged between e-Infrastructures and efficient, forward-looking digital preservation services. Based on a desktop study and a rapid analysis of some 190 currently available tools and services for digital preservation, the deliverable provides a high-level view of the range of instruments currently on offer to support various functions within a preservation system. European Commission, FP7. Peer-reviewed.
Accessibility-based reranking in multimedia search engines
Traditional multimedia search engines retrieve results based mostly on the query submitted by the user, or using a log of previous searches to provide personalized results, while not considering the accessibility of the results for users with vision or other types of impairments. In this paper, a novel approach is presented which incorporates the accessibility of images for users with various vision impairments, such as color blindness, cataract and glaucoma, in order to rerank the results of an image search engine. The accessibility of individual images is measured through the use of vision simulation filters. Multi-objective optimization techniques utilizing the image accessibility scores are used to handle users with multiple vision impairments, while the impairment profile of a specific user is used to select one from the Pareto-optimal solutions. The proposed approach has been tested with two image datasets, using both simulated and real impaired users, and the results verify its applicability. Although the proposed method has been used for vision accessibility-based reranking, it can also be extended to other types of personalization context.
Optical tomography: Image improvement using mixed projection of parallel and fan beam modes
Mixed parallel and fan beam projection is a technique used to increase image quality. This research focuses on enhancing the image quality in optical tomography. Image quality can be defined by measuring the Peak Signal to Noise Ratio (PSNR) and Normalized Mean Square Error (NMSE) parameters. The findings of this research show that by combining parallel and fan beam projection, the image quality can be increased by more than 10% in terms of its PSNR value and more than 100% in terms of its NMSE value compared to a single parallel beam.
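The abstract names PSNR and NMSE but does not give their formulas. The sketch below uses the commonly cited definitions, which are an assumption here and may differ from the ones the authors applied: PSNR as 10·log10(peak² / MSE), and NMSE as the squared error normalized by the energy of the reference image. The function names and the flat-list image representation are hypothetical, chosen only for illustration.

```python
import math


def mse(reference, reconstructed):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((r - x) ** 2 for r, x in zip(reference, reconstructed))
    return total / len(reference)


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB; `peak` is the assumed maximum pixel value (8-bit here)."""
    err = mse(reference, reconstructed)
    if err == 0:
        return math.inf  # identical images: no noise to measure
    return 10.0 * math.log10(peak ** 2 / err)


def nmse(reference, reconstructed):
    """Squared error normalized by the energy of the reference image."""
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must be of equal length")
    energy = sum(r * r for r in reference)
    if energy == 0:
        raise ValueError("reference image has zero energy")
    return sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / energy


# Tiny illustrative example: a 4-pixel reference and a slightly noisy copy.
reference = [0, 255, 0, 255]
reconstructed = [0, 250, 5, 255]
print(psnr(reference, reconstructed), nmse(reference, reconstructed))
```

Under these definitions a better reconstruction has a higher PSNR and a lower NMSE, so an "increase" in NMSE terms would normally be read as a reduction in error.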
Interactive searching and browsing of video archives: using text and using image matching
Over the last number of decades much research work has been done in the general area of video and audio analysis. Initially the applications driving this included capturing video in digital form and then being able to store, transmit and render it, which involved a large effort to develop compression and encoding standards. The technology needed to do all this is now easily available and cheap, with applications of digital video processing now commonplace, ranging from CCTV (Closed Circuit TV) for security, to home capture of broadcast TV on home DVRs for personal viewing.
One consequence of the development in technology for creating, storing and distributing digital video is that there has been a huge increase in the volume of digital video, and this in turn has created a need for techniques to allow effective management of this video, and by that we mean content management. In the BBC, for example, the archives department receives approximately 500,000 queries per year and has over 350,000 hours of content in its library. Having huge archives of video information is of hardly any benefit if we have no effective means of locating video clips which are relevant to whatever our information needs may be. In this chapter we report our work on developing two specific retrieval and browsing tools for digital video information. Both of these are based on an analysis of the captured video for the purpose of automatically structuring it into shots or higher-level semantic units like TV news stories. Some also include analysis of the video for the automatic detection of features such as the presence or absence of faces. Both include some elements of searching, where a user specifies a query or information need, and browsing, where a user is allowed to browse through sets of retrieved video shots. We support the presentation of these tools with illustrations of actual video retrieval systems developed and working on hundreds of hours of video content.