1,794 research outputs found

    Personalized Acoustic Modeling by Weakly Supervised Multi-Task Deep Learning using Acoustic Tokens Discovered from Unlabeled Data

    It is well known that recognizers personalized to each user are much more effective than user-independent recognizers. With the popularity of smartphones today, although it is not difficult to collect a large set of audio data for each user, it is difficult to transcribe it. However, it is now possible to automatically discover acoustic tokens from unlabeled personal data in an unsupervised way. We therefore propose a multi-task deep learning framework called a phoneme-token deep neural network (PTDNN), jointly trained from unsupervised acoustic tokens discovered from unlabeled data and very limited transcribed data for personalized acoustic modeling. We term this scenario "weakly supervised". The underlying intuition is that the high degree of similarity between the HMM states of acoustic token models and phoneme models may help them learn from each other in this multi-task learning framework. Initial experiments performed over a personalized audio data set recorded from Facebook posts demonstrated that very good improvements can be achieved in both frame accuracy and word accuracy over commonly considered baselines such as fDLR, speaker code and lightly supervised adaptation. This approach complements existing speaker adaptation approaches and can be used jointly with such techniques to yield improved results. Comment: 5 pages, 5 figures, published in IEEE ICASSP 201

    User-centred design of flexible hypermedia for a mobile guide: Reflections on the hyperaudio experience

    A user-centred design approach involves end-users from the very beginning. Considering users at the early stages compels designers to think in terms of utility and usability and helps base the system on what is actually needed. This paper discusses the case of HyperAudio, a context-sensitive, adaptive and mobile guide to museums developed in the late 90s. User requirements were collected via a survey to understand visitors' profiles and visit styles in Natural Science museums. The knowledge acquired supported the specification of system requirements, helping to define the user model, data structure and adaptive behaviour of the system. User requirements guided the design decisions on what could be implemented by using simple adaptable triggers and what instead needed more sophisticated adaptive techniques, a fundamental choice when all the computation must be done on a PDA. Graphical and interactive environments for developing and testing complex adaptive systems are discussed as a further step towards an iterative design that treats user interaction as a central point. The paper discusses how such an environment allows designers and developers to experiment with different system behaviours and to test them extensively under realistic conditions by simulating the actual context as it evolves over time. The understanding gained in HyperAudio is then considered in the perspective of the developments that followed that first experience: our findings still seem valid despite the time that has passed.

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, namely technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, a representative set of use-case descriptions with the related discussion of requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges.

    Social and Semantic Contexts in Tourist Mobile Applications

    The ongoing growth of the World Wide Web, along with the increasing possibility of accessing information through a variety of mobile devices, has definitely changed the way users acquire, create, and personalize information, pushing innovative strategies for annotating and organizing it. In this scenario, Social Annotation Systems have quickly gained huge popularity, adding millions of metadata items to different Web resources following a bottom-up approach and generating free and democratic mechanisms of classification, namely folksonomies. Moving away from hierarchical classification schemas, folksonomies also represent a meaningful means of identifying similarities among users, resources and tags. At any rate, they suffer from several limitations, such as the lack of specialized tools for managing, modifying, customizing and visualizing them, as well as the lack of explicit semantics, making it difficult for users to benefit from them effectively. Despite the appealing promises of Semantic Web technologies, which were intended to explicitly formalize the knowledge within a particular domain in a top-down manner in order to perform intelligent integration and reasoning on it, they are still far from reaching their objectives, due to difficulties in knowledge acquisition and the annotation bottleneck. The main contribution of this dissertation consists in modeling a novel conceptual framework that exploits both social and semantic contextual dimensions, focusing on the domain of tourism and cultural heritage. The primary aim of our assessment is to evaluate the overall user satisfaction and the perceived quality in use through two concrete case studies. Firstly, we concentrate our attention on contextual information and navigation, and on the authoring tool; secondly, we provide a semantic mapping of the tags of the system folksonomy, contrasted and compared with the expert users' classification, allowing a bridge between social and semantic knowledge according to its constantly mutual growth. The results of the user evaluations are promising, reporting a high level of agreement on the perceived quality in use of both the applications and of the specific features analyzed, demonstrating that a social-semantic contextual model improves the general users' satisfaction