
    IDeixis : image-based deixis for recognizing locations

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 31-32).
    In this thesis, we describe an approach to recognizing location from camera-equipped mobile devices using image-based web search. This is a form of image-based deixis capable of pointing at a location distant from the user's current position. We demonstrate our approach on an application that allows users to browse web pages matching the image of a nearby location. Common image search metrics can match images captured with a camera-equipped mobile device to images found on the World Wide Web. Users can recognize the location if those pages contain information about it (e.g., its name, facts or stories). Since the amount of information displayable on the device is limited, automatic keyword extraction methods can be applied to help efficiently identify relevant pieces of location information. Searching the entire web can be computationally overwhelming, so we devise a hybrid image-and-keyword searching technique. First, image search is performed over the images, and the links to their source web pages, in a database that indexes only a small fraction of the web. Then, relevant keywords on these web pages are automatically identified and submitted to an existing text-based search engine (e.g. Google) that indexes a much larger portion of the web. Finally, the resulting image set is filtered to retain images close to the original query in terms of visual similarity. It is thus possible to efficiently search hundreds of millions of images that are not only textually related but also visually relevant.
    by Pei-Hsiu Yeh. S.M.
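    The hybrid image-and-keyword pipeline described above can be summarised in a short sketch. In the sketch below, images are stood in for by plain feature vectors, keyword extraction is reduced to word frequency, and the large text-based search engine is represented by a caller-supplied function; the function names and parameters are illustrative assumptions, not the thesis's implementation.

```python
from collections import Counter
import math


def cosine_similarity(a, b):
    # Visual similarity between two feature vectors; assumes images are
    # already reduced to fixed-length descriptors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def extract_keywords(pages, top_n=5):
    # Crude keyword extraction: the most frequent long words on the pages.
    words = Counter(w.lower() for page in pages for w in page.split() if len(w) > 4)
    return [w for w, _ in words.most_common(top_n)]


def hybrid_location_search(query_vec, local_index, text_image_search, threshold=0.7):
    """local_index: list of (feature_vector, page_text) pairs covering a small
    slice of the web; text_image_search: callable standing in for a text-based
    engine, returning candidate (feature_vector, url) pairs."""
    # 1. Image search over the small local index.
    matches = sorted(local_index,
                     key=lambda item: cosine_similarity(query_vec, item[0]),
                     reverse=True)[:20]
    # 2. Keywords from the matched pages drive a much larger text-based search.
    keywords = extract_keywords(page for _, page in matches)
    candidates = text_image_search(keywords)
    # 3. Keep only candidates that are visually close to the original query.
    return [(vec, url) for vec, url in candidates
            if cosine_similarity(query_vec, vec) >= threshold]
```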

    Video summarisation: A conceptual framework and survey of the state of the art

    This is the post-print (final draft post-refereeing) version of the article. Copyright © 2007 Elsevier Inc.
    Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation derived from the research literature and used as a means for surveying the research literature. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user-based information that is unobtrusively sourced, in order to overcome longstanding challenges such as the semantic gap and providing video summaries that have greater relevance to individual users.
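    To make the framework's distinctions concrete, the following sketch encodes its categories as simple data types. The class and field names are our own labels for the categories named in the abstract; they are not taken from the paper itself.

```python
from dataclasses import dataclass
from enum import Enum


class TechniqueCategory(Enum):
    # The three broad categories of summarisation technique from the framework.
    INTERNAL = "analyses information sourced directly from the video stream"
    EXTERNAL = "analyses information not sourced directly from the video stream"
    HYBRID = "analyses a combination of internal and external information"


class ContentBasis(Enum):
    # What a summary is derived from.
    OBJECT = "object-based"
    EVENT = "event-based"
    PERCEPTION = "perception-based"
    FEATURE = "feature-based"


@dataclass
class VideoSummary:
    """A summary is characterised by what it is derived from and how it is consumed."""
    derived_from: ContentBasis
    interactive: bool   # interactive vs. static presentation
    personalised: bool  # personalised vs. generic


@dataclass
class SummarisationTechnique:
    """A technique maps a source stream to a summary using one category of information."""
    category: TechniqueCategory
    produces: VideoSummary
```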

    LifeLogging: personal big data

    We have recently observed a convergence of technologies fostering the emergence of lifelogging as a mainstream activity. Computer storage has become significantly cheaper, and advances in sensing technology allow for the efficient sensing of personal activities, locations and the environment. This is best seen in the growing popularity of the quantified self movement, in which life activities are tracked using wearable sensors in the hope of better understanding human performance in a variety of tasks. This review aims to provide a comprehensive summary of lifelogging, covering its research history, current technologies and applications. Thus far, most lifelogging research has focused predominantly on visual lifelogging as a way to capture details of life activities, hence we maintain this focus in this review. However, we also reflect on the challenges lifelogging poses to an information retrieval scientist. This review is a suitable reference for those seeking an information retrieval scientist's perspective on lifelogging and the quantified self.

    Web Cube: A New Model for 3-D Web Browsing Based on Hand Gesture Interaction

    3-D web browsing is a promising trend for interaction with web content. However, it still occupies an ambiguous position between virtual reality applications on one side and conventional web browsing on the other. In this research we propose a new model for 3-D web browsing that combines features of virtual reality technology with those of conventional browsing in order to provide an enhanced interactive user experience with web content. The new model is based on representing information content elements in a 3-D perspective and organizing them inside a 3-D container that we call a “Web Cube”. Furthermore, the model defines appropriate interaction mechanisms based on hand gestures. The model has been evaluated experimentally for efficiency, and with a questionnaire for user satisfaction.
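    As an illustration of the model, the sketch below arranges content elements on the six faces of a cube and maps a few hand gestures to rotations. The gesture names, the face layout and the simplified rotation handling are assumptions for illustration, not the paper's specification.

```python
from dataclasses import dataclass, field

FACES = ["front", "back", "left", "right", "top", "bottom"]

# Assumed mapping from recognised hand gestures to the face brought into view;
# a full implementation would track cube orientation instead of fixed targets.
GESTURE_TO_FACE = {
    "swipe_left": "right",
    "swipe_right": "left",
    "swipe_up": "bottom",
    "swipe_down": "top",
}


@dataclass
class WebCube:
    # Each face holds a list of content elements, e.g. page URLs or snippets.
    faces: dict = field(default_factory=lambda: {f: [] for f in FACES})
    visible: str = "front"

    def place(self, face, element):
        """Organise a content element onto one of the six faces."""
        self.faces[face].append(element)

    def handle_gesture(self, gesture):
        """Rotate the cube in response to a recognised hand gesture and return
        the content now facing the user."""
        self.visible = GESTURE_TO_FACE.get(gesture, self.visible)
        return self.faces[self.visible]
```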

    Capturing Situational Context in an Augmented Memory System

    Bookmarking a moment is a new approach introduced to capture past experience and insert information into an augmented memory system. The idea is inspired by the bookmark concept in web browsers. Semi-automatically bookmarking moments when time is limited, and revisiting these moments before inserting them into an augmented memory system, will help people remember their past experiences. An exploratory study was conducted to discover and shape the design requirements for a system called CatchIt. It aims to understand end-users’ needs when capturing their personal experience, which is an important and complex issue in the capture and access of personal experiences. CatchIt is a system for bookmarking the significant moments of the day before enriching them and entering them into the augmented memory system called Digital Parrot. The conceptual design of CatchIt is the main aim of this study. The primary requirements were derived from scenarios and from analysis of the findings of five study stages designed to inspect them: an unobserved field visit, shadowing, using indicators, Wizard of Oz, and using technology. Thirty participants were involved in the field visit, a survey and follow-up interviews. Each stage had different tasks to be performed, and the findings of each stage contributed to understanding different parts of user needs and system design requirements. The results of this study indicated that the system should automatically record context information, especially time and location, since these were typically neglected by the participants. Other information, such as textual and visual information, should be recorded manually depending on the users’ settings or situation. A single button is a promising input mechanism for bookmarking a moment, and it should be fast and effortless. The results showed no clear correlation between learning style and the type of information captured. We also found that there might be a correlation between passive capture and false memories. These findings provide a foundation for further work to implement the bookmarking system and evaluate the approach. Some issues raised in this study need further research. The work will contribute to a greater understanding of human memory and selective capture.
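    The single-button bookmarking behaviour the study points toward can be sketched as follows: context (time and location) is captured automatically at the press, textual and visual information is attached later during enrichment, and the moments are then flushed into an augmented memory store. The sketch reuses the CatchIt name for readability, but the code, the location_provider parameter and the store interface are illustrative assumptions rather than the study's implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Moment:
    timestamp: datetime                       # recorded automatically
    location: Optional[Tuple[float, float]]   # recorded automatically (lat, lon)
    note: str = ""                            # added manually during enrichment
    photos: List[str] = field(default_factory=list)


class CatchIt:
    def __init__(self, location_provider):
        self._locate = location_provider      # e.g. a function reading the GPS
        self.pending: List[Moment] = []

    def bookmark(self):
        """The single-button action: one call, no typing, context captured automatically."""
        self.pending.append(Moment(datetime.now(), self._locate()))

    def enrich(self, index, note="", photos=()):
        """Revisit a bookmarked moment and attach textual or visual information."""
        m = self.pending[index]
        m.note = note
        m.photos.extend(photos)
        return m

    def flush_to_memory(self, store):
        """Move enriched moments into the augmented memory store (e.g. the Digital Parrot)."""
        store.extend(self.pending)
        self.pending.clear()
```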

    Serviço de vídeo sob demanda com base na inferência de emoções de usuário [Video-on-demand service based on the inference of user emotions]

    Video traffic on networks is increasing exponentially, and with it the amount of time spent browsing content catalogues. Video-on-demand [VoD] systems are therefore needed that take emotions into account as a parameter for faster access to content. This paper presents the design and implementation of an emotion-based VoD service, whose main components are a curated catalogue of musical content and a hardware-software system that estimates the consumer's level of mental stress and infers their emotions while they interact with the system. The final product was subjected to efficiency and stress tests, with satisfactory results: the time taken by the web server with 200 sequential connections ranged from 0.050 to 0.675 seconds, and from 0.030 to 0.675 seconds when the connections were simultaneous. It also responded adequately to 20,000 sequential connections, with response times ranging from under 1 second to 36 seconds, and withstood, without collapsing, 18,000 concurrent connections, with response times between 7 and 62 seconds. The project provides an open source service that lays the groundwork for future projects.
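    A minimal sketch of emotion-driven catalogue selection, of the kind the service describes, is given below: catalogue items carry emotion tags, and the inferred user emotion together with an estimated stress level narrows the recommendations. The tag vocabulary, the stress threshold and the recommend() function are illustrative assumptions, not the project's open source implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CatalogueItem:
    title: str
    url: str
    emotion_tags: List[str]   # e.g. ["calm", "happy"]


def recommend(catalogue, inferred_emotion, stress_level, stress_threshold=0.7):
    """Return items matching the inferred emotion; when the estimated mental
    stress is high, prefer calming content regardless of the raw emotion."""
    target = "calm" if stress_level >= stress_threshold else inferred_emotion
    return [item for item in catalogue if target in item.emotion_tags]


# Usage sketch: the hardware-software front end would supply the emotion and
# stress readings while the user browses the catalogue.
catalogue = [
    CatalogueItem("Nocturne", "http://example.org/1", ["calm", "sad"]),
    CatalogueItem("Upbeat set", "http://example.org/2", ["happy"]),
]
print(recommend(catalogue, inferred_emotion="happy", stress_level=0.2))
```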