42,661 research outputs found
A lightweight web video model with content and context descriptions for integration with linked data
The rapid increase of video data on the Web has warranted an urgent need for effective representation, management and retrieval of web videos. Recently, many studies have been carried out for ontological representation of videos, either using domain dependent or generic schemas such as MPEG-7, MPEG-4, and COMM. In spite of their extensive coverage and sound theoretical grounding, they are yet to be widely used by users. Two main possible reasons are the complexities involved and a lack of tool support. We propose a lightweight video content model for content-context description and integration. The uniqueness of the model is that it tries to model the emerging social context to describe and interpret the video. Our approach is grounded on exploiting easily extractable evolving contextual metadata and on the availability of existing data on the Web. This enables representational homogeneity and a firm basis for information integration among semantically-enabled data sources. The model uses many existing schemas to describe various ontology classes and shows the scope of interlinking with the Linked Data cloud
Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project
In the inEvent EU project [1], we aim at structuring, retrieving, and sharing large archives of networked, and dynamically changing, multimedia recordings, mainly consisting of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings, and labels them in terms of interconnected “hyper-events ” (a notion inspired from hyper-texts). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks. Index Terms: Networked multimedia events; audio processing: speech recognition; speaker diarization and linking; multimedia indexing and searching; hyper-events. 1
Analyzing Large Collections of Electronic Text Using OLAP
Computer-assisted reading and analysis of text has various applications in
the humanities and social sciences. The increasing size of many electronic text
archives has the advantage of a more complete analysis but the disadvantage of
taking longer to obtain results. On-Line Analytical Processing is a method used
to store and quickly analyze multidimensional data. By storing text analysis
information in an OLAP system, a user can obtain solutions to inquiries in a
matter of seconds as opposed to minutes, hours, or even days. This analysis is
user-driven allowing various users the freedom to pursue their own direction of
research
Evolving Lucene search queries for text classification
We describe a method for generating accurate, compact, human
understandable text classifiers. Text datasets are indexed using Apache Lucene and Genetic Programs are used to construct
Lucene search queries. Genetic programs acquire fitness by
producing queries that are effective binary classifiers for a
particular category when evaluated against a set of training
documents. We describe a set of functions and terminals and
provide results from classification tasks
- …