13,698 research outputs found
A toolkit of mechanism and context independent widgets
Most human-computer interfaces are designed to run on a static platform (e.g. a workstation with a monitor) in a static environment (e.g. an office). However, with mobile devices becoming ubiquitous and capable of running applications similar to those found on static devices, it is no longer valid to design static interfaces. This paper describes a user-interface architecture which allows interactors to be flexible about the way they are presented. This flexibility is defined by the different input and output mechanisms used. An interactor may use different mechanisms depending upon their suitability in the current context, user preference and the resources available for presentation using that mechanism
Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot
We explore new aspects of assistive living on smart human-robot interaction
(HRI) that involve automatic recognition and online validation of speech and
gestures in a natural interface, providing social features for HRI. We
introduce a whole framework and resources of a real-life scenario for elderly
subjects supported by an assistive bathing robot, addressing health and hygiene
care issues. We contribute a new dataset and a suite of tools used for data
acquisition and a state-of-the-art pipeline for multimodal learning within the
framework of the I-Support bathing robot, with emphasis on audio and RGB-D
visual streams. We consider privacy issues by evaluating the depth visual
stream along with the RGB, using Kinect sensors. The audio-gestural recognition
task on this new dataset yields up to 84.5%, while the online validation of the
I-Support system on elderly users accomplishes up to 84% when the two
modalities are fused together. The results are promising enough to support
further research in the area of multimodal recognition for assistive social
HRI, considering the difficulties of the specific task. Upon acceptance of the
paper part of the data will be publicly available
CHORUS Deliverable 4.3: Report from CHORUS workshops on national initiatives and metadata
Minutes of the following Workshops:
âą National Initiatives on Multimedia Content Description and Retrieval, Geneva, October 10th, 2007.
âą Metadata in Audio-Visual/Multimedia production and archiving, Munich, IRT, 21st â 22nd November 2007
Workshop in Geneva 10/10/2007
This highly successful workshop was organised in cooperation with the European Commission. The event brought together
the technical, administrative and financial representatives of the various national initiatives, which have been established
recently in some European countries to support research and technical development in the area of audio-visual content
processing, indexing and searching for the next generation Internet using semantic technologies, and which may lead to an
internet-based knowledge infrastructure. The objective of this workshop was to provide a platform for mutual information
and exchange between these initiatives, the European Commission and the participants. Top speakers were present from
each of the national initiatives. There was time for discussions with the audience and amongst the European National
Initiatives. The challenges, communalities, difficulties, targeted/expected impact, success criteria, etc. were tackled. This
workshop addressed how these national initiatives could work together and benefit from each other.
Workshop in Munich 11/21-22/2007
Numerous EU and national research projects are working on the automatic or semi-automatic generation of descriptive and
functional metadata derived from analysing audio-visual content. The owners of AV archives and production facilities are
eagerly awaiting such methods which would help them to better exploit their assets.Hand in hand with the digitization of
analogue archives and the archiving of digital AV material, metadatashould be generated on an as high semantic level as
possible, preferably fully automatically. All users of metadata rely on a certain metadata model. All AV/multimedia search
engines, developed or under current development, would have to respect some compatibility or compliance with the
metadata models in use. The purpose of this workshop is to draw attention to the specific problem of metadata models in the
context of (semi)-automatic multimedia search
Recommended from our members
Mobile Learning Revolution: Implications for Language Pedagogy
Mobile technologies including cell phones and tablets are a pervasive feature of everyday life with potential impact on teaching and learning. âMobile pedagogyâ may seem like a contradiction in terms, since mobile learning often takes place physically beyond the teacher's reach, outside the walls of the classroom. While pedagogy implies careful planning, mobility exposes learners to the unexpected. A thoughtful pedagogical response to this reality involves new conceptualizations of what is to be learned and new activity designs. This approach recognizes that learners may act in more self-determined ways beyond the classroom walls, where online interactions and mobile encounters influence their target language communication needs and interests. The chapter sets out a range of opportunities for out-of-class mobile language learning that give learners an active role and promote communication. It then considers the implications of these developments for language content and curricula and the evolving roles and competences of teachers
Developing a comprehensive framework for multimodal feature extraction
Feature extraction is a critical component of many applied data science
workflows. In recent years, rapid advances in artificial intelligence and
machine learning have led to an explosion of feature extraction tools and
services that allow data scientists to cheaply and effectively annotate their
data along a vast array of dimensions---ranging from detecting faces in images
to analyzing the sentiment expressed in coherent text. Unfortunately, the
proliferation of powerful feature extraction services has been mirrored by a
corresponding expansion in the number of distinct interfaces to feature
extraction services. In a world where nearly every new service has its own API,
documentation, and/or client library, data scientists who need to combine
diverse features obtained from multiple sources are often forced to write and
maintain ever more elaborate feature extraction pipelines. To address this
challenge, we introduce a new open-source framework for comprehensive
multimodal feature extraction. Pliers is an open-source Python package that
supports standardized annotation of diverse data types (video, images, audio,
and text), and is expressly with both ease-of-use and extensibility in mind.
Users can apply a wide range of pre-existing feature extraction tools to their
data in just a few lines of Python code, and can also easily add their own
custom extractors by writing modular classes. A graph-based API enables rapid
development of complex feature extraction pipelines that output results in a
single, standardized format. We describe the package's architecture, detail its
major advantages over previous feature extraction toolboxes, and use a sample
application to a large functional MRI dataset to illustrate how pliers can
significantly reduce the time and effort required to construct sophisticated
feature extraction workflows while increasing code clarity and maintainability
Sharing Human-Generated Observations by Integrating HMI and the Semantic Sensor Web
Current âInternet of Thingsâ concepts point to a future where connected objects gather meaningful information about their environment and share it with other objects and people. In particular, objects embedding Human Machine Interaction (HMI), such as mobile devices and, increasingly, connected vehicles, home appliances, urban interactive infrastructures, etc., may not only be conceived as sources of sensor information, but, through interaction with their users, they can also produce highly valuable context-aware human-generated observations. We believe that the great promise offered by combining and sharing all of the different sources of information available can be realized through the integration of HMI and Semantic Sensor Web technologies. This paper presents a technological framework that harmonizes two of the most influential HMI and Sensor Web initiatives: the W3Câs Multimodal Architecture and Interfaces (MMI) and the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) with its semantic extension, respectively. Although the proposed framework is general enough to be applied in a variety of connected objects integrating HMI, a particular development is presented for a connected car scenario where driversâ observations about the traffic or their environment are shared across the Semantic Sensor Web. For implementation and evaluation purposes an on-board OSGi (Open Services Gateway Initiative) architecture was built, integrating several available HMI, Sensor Web and Semantic Web technologies. A technical performance test and a conceptual validation of the scenario with potential users are reported, with results suggesting the approach is soun
Recommended from our members
NoTube â making TV a medium for personalized interaction
In this paper, we introduce NoTubeâs vision on deploying semantics in interactive TV context in order to contextualize distributed applications and lift them to a new level of service that provides context-dependent and personalized selection of TV content. Additionally, lifting content consumption from a single-user activity to a community-based experience in a connected multi-device environment is central to the project. Main research questions relate to (1) data integration and enrichment - how to achieve unified and simple access to dynamic, growing and distributed multimedia content of diverse formats? (2) user and context modeling - what is an appropriate framework for context modeling, incorporating task-, domain and device-specific viewpoints? (3) context-aware discovery of resources - how could rather fuzzy matchmaking between potentially infinite contexts and available media resources be achieved? (4) collaborative architecture for TV content personalization - how can the combined information about data, context and user be put at disposal of both content providers and end-users in the view of creating extremely personalized services under controlled privacy and security policies? Thus, with the grand challenge in mind - to put the TV viewer back in the driver's seat â we focus on TV content as a medium for personalized interaction between people based on a service architecture that caters for a variety of content metadata, delivery channels and rendering devices
Recommended from our members
uC: Ubiquitous Collaboration Platform for Multimodal Team Interaction Support
A human-centered computing platform that improves teamwork and transforms the âhuman- computer interaction experienceâ for distributed teams is presented. This Ubiquitous Collaboration, or uC (âyou seeâ), platform\u27s objective is to transform distributed teamwork (i.e., work occurring when teams of workers and learners are geographically dispersed and often interacting at different times). It achieves this goal through a multimodal team interaction interface realized through a reconfigurable open architecture. The approach taken is to integrate: (1) an intuitive speech- and video-centric multi-modal interface to augment more conventional methods (e.g., mouse, stylus and touch), (2) an open and reconfigurable architecture supporting information gathering, and (3) a machine intelligent approach to analysis and management of heterogeneous live and stored sensor data to support collaboration. The system will transform how teams of people interact with computers by drawing on both the virtual and physical environment
- âŠ