2,171 research outputs found
An MPEG-7 scheme for semantic content modelling and filtering of digital video
Abstract: Part 5 of the MPEG-7 standard specifies Multimedia Description Schemes (MDS); that is, the format multimedia content models should conform to in order to ensure interoperability across multiple platforms and applications. However, the standard does not specify how the content or the associated model may be filtered. This paper proposes an MPEG-7 scheme which can be deployed for digital video content modelling and filtering. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user. We present details of the scheme, the front-end systems used for content modelling and filtering, and experiences with a number of users.
Digital Image
This paper considers the ontological significance of invisibility in relation to the question ‘what is a digital image?’ Its argument in a nutshell is that the emphasis on visibility comes at the expense of latency and is symptomatic of the style of thinking that dominated Western philosophy since Plato. This privileging of visible content necessarily binds images to linguistic (semiotic and structuralist) paradigms of interpretation which promote representation, subjectivity, identity and negation over multiplicity, indeterminacy and affect. Photography is the case in point because until recently critical approaches to photography had one thing in common: they all shared in the implicit and incontrovertible understanding that photographs are a medium that must be approached visually; they took it as a given that photographs are there to be looked at and they all agreed that it is only through the practices of spectatorship that the secrets of the image can be unlocked. Whatever subsequent interpretations followed, the priority of vision in relation to the image remained unperturbed. This undisputed belief in the visibility of the image has such a strong grasp on theory that it imperceptibly bonded together otherwise dissimilar and sometimes contradictory methodologies, preventing them from noticing that which is the most unexplained about images: the precedence of looking itself. This self-evident truth of visibility casts a long shadow on image theory because it blocks the possibility of inquiring after everything that is invisible, latent and hidden.
MirBot: A collaborative object recognition system for smartphones using convolutional neural networks
MirBot is a collaborative application for smartphones that allows users to
perform object recognition. This app can be used to take a photograph of an
object, select the region of interest and obtain the most likely class (dog,
chair, etc.) by means of similarity search using features extracted from a
convolutional neural network (CNN). The answers provided by the system can be
validated by the user so as to improve the results for future queries. All the
images are stored together with a series of metadata, thus enabling a
multimodal incremental dataset labeled with synset identifiers from the WordNet
ontology. This dataset grows continuously thanks to the users' feedback, and is
publicly available for research. This work details the MirBot object
recognition system, analyzes the statistics gathered after more than four years
of usage, describes the image classification methodology, and performs an
exhaustive evaluation using handcrafted features, convolutional neural codes
and different transfer learning techniques. After comparing various models and
transformation methods, the results show that the CNN features maintain the
accuracy of MirBot constant over time, despite the increasing number of new
classes. The app is freely available at the Apple and Google Play stores. Comment: Accepted in Neurocomputing, 201
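The classification loop described in the MirBot abstract (similarity search over stored feature vectors, with user-validated answers added back to the dataset) can be sketched roughly as follows. This is a minimal illustration, not the MirBot implementation: the class name FeatureIndex, the use of cosine similarity, the k-nearest-neighbour vote, and the three-dimensional toy vectors standing in for CNN features are all assumptions made for the example.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class FeatureIndex:
    """Hypothetical incremental store of (feature vector, label) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, features, label):
        # A user-validated answer becomes a new labelled example.
        self.entries.append((list(features), label))

    def classify(self, features, k=3):
        # Rank stored examples by similarity to the query and
        # return the majority label among the k most similar ones.
        if not self.entries:
            return None
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(features, entry[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[:k])
        return votes.most_common(1)[0][0]


# Toy vectors stand in for features extracted by a CNN (assumption).
index = FeatureIndex()
index.add([1.0, 0.0, 0.0], "dog")
index.add([0.9, 0.1, 0.0], "dog")
index.add([0.0, 1.0, 0.0], "chair")

guess = index.classify([0.8, 0.2, 0.0], k=3)

# Feedback step: the user confirms a new class, which is stored for future queries.
index.add([0.0, 0.0, 1.0], "lamp")
later_guess = index.classify([0.0, 0.1, 0.9], k=1)
```

Because new classes are simply appended to the index, no retraining step is needed in this sketch when a user introduces a label the system has not seen before.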
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use-cases, and
socio-economic and legal aspects. These were assessed through two central studies: first, a concerted vision of the
functional breakdown of a generic multimedia search engine, and second, a set of representative use-case descriptions
with related discussion of the requirements for technological challenges. Both studies were carried out in cooperation
and consultation with the community at large through EC concertation meetings (multimedia search engines cluster),
several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project
coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of
gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical
research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as
emerging legal challenges.
Linking archival data to location A case study at the UK National Archives
Purpose
The National Archives (TNA) is the UK Government's official archive. It stores and maintains records spanning over 1,000 years in both physical and digital form. Much of the information held by TNA includes references to place, and user queries to TNA's online catalogue frequently involve searches for location. The purpose of this paper is to illustrate how TNA have extracted the geographic references in their historic data to improve access to the archives.
Design/methodology/approach
To be able to quickly enhance the existing archival data with geographic information, existing technologies from Natural Language Processing (NLP) and Geographical Information Retrieval (GIR) have been utilised and adapted to historical archives.
Findings
Enhancing the archival records with geographic information has enabled TNA to quickly develop a number of case studies highlighting how geographic information can improve access to large‐scale archival collections. The use of existing methods from the GIR domain and technologies, such as OpenLayers, enabled one to quickly implement this process in a way that is easily transferable to other institutions.
Practical implications
The methods and technologies described in this paper can be adapted, by other archives, to similarly enhance access to their historic data. Also the data‐sharing methods described can be used to enable the integration of knowledge held at different archival institutions.
Originality/value
Place is one of the core dimensions for TNA's archival data. Many of the records which are held make reference to place data (wills, legislation, court cases), and approximately one fifth of users' searches involve place names. However, there are still a number of open questions regarding the adaptation of existing GIR methods to the history domain. This paper presents an overview of available GIR methods and the challenges in applying them to historical data.
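As a rough illustration of what extracting geographic references from catalogue text can involve, the sketch below tags place names by looking them up in a small gazetteer. It is not TNA's pipeline, which the abstract does not specify beyond its use of NLP and GIR techniques: the GAZETTEER table, its approximate coordinates, the tag_places function and the sample record text are all hypothetical. It also ignores the historical spelling variants that the abstract names as an open challenge.

```python
import re

# Hypothetical miniature gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "London": (51.51, -0.13),
    "York": (53.96, -1.08),
    "New York": (40.71, -74.01),
}


def tag_places(text, gazetteer=GAZETTEER):
    """Return (name, start offset, coordinates) for each gazetteer name found in text."""
    # Try longer names first so that "New York" is not tagged as "York".
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return [
        (match.group(1), match.start(), gazetteer[match.group(1)])
        for match in pattern.finditer(text)
    ]


# Invented record description, for illustration only.
record = "Will of John Smith, merchant of York, proved in London"
found = [name for name, _, _ in tag_places(record)]
```

Each tagged record could then be plotted by its coordinates in a web map, which is the role the abstract assigns to OpenLayers.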
RELATIONSHIP ANALYSIS OF IMAGE DESCRIPTIONS: AN ONTOLOGICAL, CONTENT ANALYTIC APPROACH
The relationships humans express when describing images have powerful, but poorly understood, effects on how visual information is represented, structured, and processed in information systems. This study evaluates the benefits and difficulties of using content analysis and ontological analysis as predictors of relationship instances and types occurring in image descriptions. A random sample of 36 documented reference transactions from the administrative files of the Pittsburgh Photographic Library is analyzed in light of three describing contexts: image searcher, curator, and cataloger. Through the qualitative and quantitative assessment of image descriptions, the research leads to several key findings and contributions. The most important findings vindicate the claim that recognition, capture, and classification of relationship instances can be empirically grounded utilizing content analysis and ontological tools and methods. Evidence comes in successfully ascertaining and capturing in a Corpus the existence of 1,655 relationship instances. Further, the analysis finds evidence of relationship types and subtypes of relationships whose members share certain recognizable properties in common. The study also brings useful, new insights to the capture of background information surrounding events using situation-templates, introduces methods for formulating case relations and image attributes as binary predicates, and it offers a new, finer-grained definition of relationship. Contributions of this study include a corpus of relationship instances, an ontology of relationship types, and a methodological framework that provides significantly better results than earlier studies in the prediction of relationships (the architecture of which is depicted in Figure 22 on page 102). There are a number of ways this research could be extended and corroborated. First, event analysis ought to be tied to a system of semantic frame analysis. 
Second, test the content analysis form against other texts, which will result in elaboration of the core ontology of relationship types. Third, expand image description analysis beyond the current domain to include image description in visual ethnography, art history and criticism, and photography practices. Fourth, test how inference engines reason over relationships in knowledge-based environments. Finally, to aid in the analysis of the meanings of relationships, more work is needed in formalizing the ontological concepts used in image descriptions.
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.
The Digital Anatomist Information System and Its Use in the Generation and Delivery of Web-Based Anatomy Atlases
Advances in network and imaging technology, coupled with the availability of 3-D datasets such as the Visible Human, provide a unique opportunity for developing information systems in anatomy that can deliver relevant knowledge directly to the clinician, researcher or educator. A software framework is described for developing such a system within a distributed architecture that includes spatial and symbolic anatomy information resources, Web and custom servers, and authoring and end-user client programs. The authoring tools have been used to create 3-D atlases of the brain, knee and thorax that are used both locally and throughout the world. For the one-and-a-half-year period from June 1995–January 1997, the on-line atlases were accessed by over 33,000 sites from 94 countries, with an average of over 4000 “hits” per day, and 25,000 hits per day during peak exam periods. The atlases have been linked to by over 500 sites, and have received at least six unsolicited awards by outside rating institutions. The flexibility of the software framework has allowed the information system to evolve with advances in technology and representation methods. Possible new features include knowledge-based image retrieval and tutoring, dynamic generation of 3-D scenes, and eventually, real-time virtual reality navigation through the body. Such features, when coupled with other on-line biomedical information resources, should lead to interesting new ways for managing and accessing structural information in medicine.