
    PANEL: Challenges for multimedia/multimodal research in the next decade

    The multimedia and multimodal community has witnessed an explosive transformation in recent years, with major societal impact. With the unprecedented deployment of multimedia devices and systems, multimedia research is critical to our ability and prospects to advance state-of-the-art technologies and solve real-world challenges facing society and the nation. To respond to these challenges and further advance the frontiers of the field of multimedia, this panel will discuss the challenges and visions that may guide research over the next ten years.

    Transportation mode recognition fusing wearable motion, sound and vision sensors

    We present the first work that investigates the potential of improving transportation mode recognition by fusing multimodal data from wearable sensors: motion, sound and vision. We first train three independent deep neural network (DNN) classifiers, one for each type of sensor. We then propose two schemes that fuse the classification results from the three mono-modal classifiers. The first scheme makes an ensemble decision with fixed rules, including Sum, Product, Majority Voting and Borda Count. The second scheme is an adaptive fuser, built as another classifier (including Naive Bayes, Decision Tree, Random Forest and Neural Network), that learns enhanced predictions by combining the outputs of the three mono-modal classifiers. We verify the advantage of the proposed method on the state-of-the-art Sussex-Huawei Locomotion and Transportation (SHL) dataset, recognizing eight transportation activities: Still, Walk, Run, Bike, Bus, Car, Train and Subway. We achieve F1 scores of 79.4%, 82.1% and 72.8% with the mono-modal motion, sound and vision classifiers, respectively. The F1 score is remarkably improved to 94.5% and 95.5% by the two data fusion schemes, respectively. The recognition performance can be further improved with a post-processing scheme that exploits the temporal continuity of transportation. When assessing the generalization of the model to unseen data, we show that while performance is reduced, as expected, for each individual classifier, the benefits of fusion are retained, with performance improved by 15 percentage points. Beyond the performance increase itself, this work, most importantly, opens up the possibility of dynamically fusing modalities to achieve distinct power-performance trade-offs at run time.
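
    A minimal sketch of the two fusion schemes sketched in this abstract, not the authors' code: it assumes each mono-modal classifier exposes per-class probabilities, and the function names (sum_rule, borda_count, train_adaptive_fuser) are illustrative.

```python
# Hedged illustration of decision-level fusion over three mono-modal classifiers.
# Assumption: each classifier yields an (n_samples, n_classes) probability matrix.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

CLASSES = ["Still", "Walk", "Run", "Bike", "Bus", "Car", "Train", "Subway"]

def sum_rule(probs):
    # Fixed rule: add the probability matrices, pick the highest-scoring class.
    return np.argmax(np.sum(probs, axis=0), axis=1)

def product_rule(probs):
    return np.argmax(np.prod(probs, axis=0), axis=1)

def majority_vote(probs):
    votes = np.stack([p.argmax(axis=1) for p in probs])   # (n_models, n_samples)
    return np.apply_along_axis(
        lambda v: np.bincount(v, minlength=len(CLASSES)).argmax(), 0, votes)

def borda_count(probs):
    # Each model ranks the classes; higher-ranked classes earn more points.
    scores = sum(np.argsort(np.argsort(p, axis=1), axis=1) for p in probs)
    return np.argmax(scores, axis=1)

def train_adaptive_fuser(train_probs, y_train):
    # Second scheme: a meta-classifier (here Random Forest, one of the options
    # named above) learns from the concatenated mono-modal outputs.
    X = np.hstack(train_probs)                             # (n_samples, 3 * n_classes)
    return RandomForestClassifier(n_estimators=100).fit(X, y_train)
```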

    Services surround you: physical-virtual linkage with contextual bookmarks

    Our daily life is pervaded by digital information and devices, not least the common mobile phone. However, a seamless connection between our physical world, such as a movie trailer on a screen in the main rail station, and its digital counterparts, such as an online ticket service, remains difficult. In this paper, we present contextual bookmarks that enable users to capture information of interest with a mobile camera phone. Depending on the user’s context, the snapshot is mapped to a digital service, such as ordering tickets for a nearby movie theater or a link to the upcoming movie’s Web page.
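
    A small, purely hypothetical sketch of the context-dependent mapping idea from this abstract: a recognized snapshot subject plus the user's context selects a digital service. The names, contexts and service URIs are invented for illustration and are not from the paper.

```python
# Illustrative contextual-bookmark mapping (assumed names, not the paper's API).
from dataclasses import dataclass

@dataclass
class Context:
    location: str      # e.g. "rail_station"
    time_of_day: str   # e.g. "evening"

def map_bookmark(recognized_subject: str, ctx: Context) -> str:
    if recognized_subject == "movie_trailer":
        # Near a station in the evening, prefer ticketing; otherwise link to the film's page.
        if ctx.location == "rail_station" and ctx.time_of_day == "evening":
            return "ticket-service://nearby-cinema"
        return "https://example.com/upcoming-movie"
    return "search://" + recognized_subject

print(map_bookmark("movie_trailer", Context("rail_station", "evening")))
```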

    Characteristics of pervasive learning environments in museum contexts

    There is no appropriate learning model for pervasive learning environments (PLEs), and museums maintain authenticity at the cost of leaving information unmarked. To address these problems, we present the LieksaMyst PLE developed for Pielinen Museum, and we derive a set of characteristics that an effective PLE should meet, which form the basis of a new learning model currently under development. We discuss how these characteristics are addressed in LieksaMyst and present an evaluation of its game component. Results indicate that, while some usability issues remain to be resolved, the game was well received by the participants, enabling them to immerse themselves in the story and to interact effectively with its virtual characters.

    Multimodal Generic Framework for Multimedia Documents Adaptation

    Today, people are increasingly capable of creating and sharing documents (which are generally multimedia-oriented) via the Internet. These multimedia documents can be accessed at any time and anywhere (city, home, etc.) on a wide variety of devices, such as laptops, tablets and smartphones. The heterogeneity of devices and user preferences has raised a serious issue for multimedia content adaptation. Our research focuses on multimedia document adaptation, with a strong focus on interaction with users and exploration of multimodality. We propose a multimodal framework for adapting multimedia documents based on a distributed implementation of the W3C’s Multimodal Architecture and Interfaces applied to ubiquitous computing. The core of the proposed architecture is a smart interaction manager that accepts context-related information from sensors in the environment as well as from other sources, including information available on the Web and multimodal user inputs. The interaction manager integrates and reasons over this information to predict the user’s situation and service use. Key to realizing this framework are the use of an ontology that undergirds communication and representation, and the use of the cloud to ensure service continuity on heterogeneous mobile devices. A smart city is assumed as the reference scenario.
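
    A minimal sketch of the interaction-manager idea described in this abstract, assuming simple rule-based reasoning in place of the ontology-driven reasoning the framework proposes; the context keys and adaptation labels are illustrative, not from the paper.

```python
# Illustrative interaction manager: merge context from several sources, then
# pick a presentation adaptation for a multimedia document.
from typing import Dict

def merge_context(*sources: Dict[str, str]) -> Dict[str, str]:
    # Later sources override earlier ones (e.g., live sensor data beats defaults).
    merged: Dict[str, str] = {}
    for src in sources:
        merged.update(src)
    return merged

def choose_adaptation(ctx: Dict[str, str]) -> str:
    # Simple rules standing in for ontology-based reasoning over the user's situation.
    if ctx.get("activity") == "driving":
        return "audio_only"
    if ctx.get("device") == "smartphone" and ctx.get("bandwidth") == "low":
        return "text_summary"
    return "full_multimedia"

ctx = merge_context({"device": "smartphone"},   # device profile
                    {"bandwidth": "low"},       # network sensor
                    {"activity": "walking"})    # wearable / city sensors
print(choose_adaptation(ctx))                   # -> "text_summary"
```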