
    Large-Scale Mapping of Human Activity using Geo-Tagged Videos

    Full text link
    This paper is the first work to perform spatio-temporal mapping of human activity using the visual content of geo-tagged videos. We utilize a recent deep-learning based video analysis framework, termed hidden two-stream networks, to recognize a range of activities in YouTube videos. This framework is efficient and can run in real time or faster, which is important for recognizing events as they occur in streaming video or for reducing latency in analyzing already captured video. This is, in turn, important for using video in smart-city applications. We perform a series of experiments to show our approach is able to accurately map activities both spatially and temporally. We also demonstrate the advantages of using the visual content over the tags/titles. Comment: Accepted at ACM SIGSPATIAL 201
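    As a rough illustration of the aggregation step described above (not the paper's actual pipeline, whose recognizer is the hidden two-stream network), the sketch below bins per-clip activity predictions from a hypothetical classifier into a spatio-temporal grid. The `classify_activity` helper, the grid resolution, and the input format are all assumptions made for the example.

```python
# Minimal sketch: aggregate predicted activities of geo-tagged clips into a
# (lat cell, lon cell, hour-of-day) grid. `classify_activity` is a hypothetical
# stand-in for a video activity-recognition model.
from collections import Counter, defaultdict
from datetime import datetime

def classify_activity(video_path: str) -> str:
    """Hypothetical activity classifier; would wrap the actual video model."""
    raise NotImplementedError

def activity_map(clips, cell_deg: float = 0.1):
    """clips: iterable of (video_path, lat, lon, iso_timestamp)."""
    grid = defaultdict(Counter)  # (lat_cell, lon_cell, hour) -> activity counts
    for path, lat, lon, ts in clips:
        hour = datetime.fromisoformat(ts).hour
        cell = (round(lat / cell_deg), round(lon / cell_deg), hour)
        grid[cell][classify_activity(path)] += 1
    return grid
```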

    Computational rim illumination of dynamic subjects using aerial robots

    Get PDF
    Lighting plays a major role in photography. Professional photographers use elaborate installations to light their subjects and achieve sophisticated styles. However, lighting moving subjects performing dynamic tasks presents significant challenges and requires expensive manual intervention. A skilled additional assistant might be needed to reposition lights as the subject changes pose or moves, and the extra logistics significantly raise cost and time. The latency as the assistant re-lights the subject, and the communication required from the photographer to achieve optimum lighting, could mean missing a critical shot. We present a new approach to lighting dynamic subjects in which an aerial robot equipped with a portable light source lights the subject to automatically achieve a desired lighting effect. We focus on rim lighting, a particularly challenging effect to achieve with dynamic subjects, and allow the photographer to specify a required rim width. Our algorithm processes the images from the photographer's camera and provides the necessary motion commands to the aerial robot to achieve the desired rim width in the resulting photographs. With an indoor setup, we demonstrate a control approach that localizes the aerial robot with reference to the subject and tracks the subject to achieve the necessary motion. In addition to indoor experiments, we perform open-loop outdoor experiments in a realistic photo-shooting scenario to understand lighting ergonomics. Our proof-of-concept results demonstrate the utility of robots in computational lighting.
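    To make the feedback idea concrete, here is a minimal sketch of a proportional control step that nudges a light-carrying aerial robot around the subject until the rim width measured in the camera frames matches the photographer's requested width. This is not the paper's controller; the two helper functions, the gain, and the sign convention are assumptions.

```python
# Illustrative proportional rim-width control step (hypothetical hooks).

def measure_rim_width_px(frame) -> float:
    """Hypothetical: extract the lit rim width (pixels) from a camera frame."""
    raise NotImplementedError

def send_tangential_velocity(v: float) -> None:
    """Hypothetical: command the robot to orbit the subject at velocity v."""
    raise NotImplementedError

def rim_light_step(frame, desired_rim_px: float, k_p: float = 0.01) -> float:
    measured = measure_rim_width_px(frame)
    error = desired_rim_px - measured   # positive -> rim too thin
    # Assumed sign convention: orbiting toward the camera axis widens the rim.
    send_tangential_velocity(k_p * error)
    return error
```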

    ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence

    Full text link
    Our work examines the way in which large language models can be used for robotic planning and sampling, specifically the context of automated photographic documentation. Specifically, we illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general purpose language (LM) and vision-language (VLM) models. Given a high-level description of an event we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods.Comment: ICRA 202
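    The shot-list-then-match structure described above can be sketched as follows. The helpers are hypothetical placeholders (not the authors' code or any specific model API): an LM drafts photo descriptions for the event, a VLM scores how well each incoming frame matches each description, and the best-scoring frame per description is kept for the portfolio.

```python
# Rough sketch of the LM + VLM portfolio idea with hypothetical helpers.

def lm_shot_list(event_description: str) -> list[str]:
    """Hypothetical: prompt a language model for expected photo descriptions."""
    raise NotImplementedError

def vlm_match_score(frame, text: str) -> float:
    """Hypothetical: image-text similarity from a vision-language model."""
    raise NotImplementedError

def build_portfolio(event_description: str, frames):
    shots = lm_shot_list(event_description)
    best = {s: (None, float("-inf")) for s in shots}
    for frame in frames:                      # e.g., sampled from the video stream
        for s in shots:
            score = vlm_match_score(frame, s)
            if score > best[s][1]:
                best[s] = (frame, score)
    return {s: frame for s, (frame, _) in best.items() if frame is not None}
```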

    Storytelling with salient stills

    Get PDF
    Thesis (M.S.) -- Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. 59-63). Michale J. Massey. M.S.

    A Method for Automatic Image Rectification and Stitching for Vehicle Yaw Marks Trajectory Estimation

    Get PDF
    The aim of this study was to propose a new method for automatic rectification and stitching of images taken at an accident site. The proposed method does not require any measurements to be performed at the accident site and is thus free of measurement errors. An experimental investigation was performed to compare the vehicle trajectory estimated from the yaw marks in the stitched image with the trajectory reconstructed from GPS data. The overall mean error of the trajectory reconstruction produced by the proposed method was 0.086 m, which is only 0.18% of the whole trajectory length.
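    For context, the sketch below shows the conventional homography-based rectification that the paper's measurement-free method improves upon: with four ground points whose real-world positions are known, the oblique scene photo is warped to a top-down, metric view in which yaw-mark geometry can be measured. It uses OpenCV; the file name, point coordinates, and ground scale are illustrative only.

```python
# Generic planar rectification via a homography from four known ground points.
import cv2
import numpy as np

img = cv2.imread("scene.jpg")

# Pixel coordinates of four reference points in the photo (illustrative values).
src = np.float32([[412, 980], [1660, 940], [1895, 1410], [230, 1460]])
# The same points in a metric ground frame, scaled to pixels (100 px per metre).
dst = np.float32([[0, 0], [500, 0], [500, 300], [0, 300]])

H, _ = cv2.findHomography(src, dst)
rectified = cv2.warpPerspective(img, H, (500, 300))  # top-down 5 m x 3 m patch
cv2.imwrite("rectified.jpg", rectified)
```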

    Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

    Full text link
    It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge. We address the problem of learning to look around: if a visual agent has the ability to voluntarily acquire new views to observe its environment, how can it learn efficient exploratory behaviors to acquire informative observations? We propose a reinforcement learning solution, where the agent is rewarded for actions that reduce its uncertainty about the unobserved portions of its environment. Based on this principle, we develop a recurrent neural network-based approach to perform active completion of panoramic natural scenes and 3D object shapes. Crucially, the learned policies are not tied to any recognition task nor to the particular semantic content seen during training. As a result, 1) the learned "look around" behavior is relevant even for new tasks in unseen environments, and 2) training data acquisition involves no manual labeling. Through tests in diverse settings, we demonstrate that our approach learns useful generic policies that transfer to new unseen tasks and environments. Completion episodes are shown at https://goo.gl/BgWX3W
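    The reward principle stated above can be sketched as a rollout in which the agent is credited with the drop in reconstruction error on the unobserved portions of the scene after each newly acquired view. All components below (the environment interface, the completion model, and the error measure) are hypothetical placeholders, not the paper's implementation.

```python
# Conceptual sketch: reward = reduction in completion error after each view.

def reconstruction_error(model, observations, full_scene) -> float:
    """Hypothetical: error of the model's completion of the unseen portions."""
    raise NotImplementedError

def rollout(env, policy, model, horizon: int = 8):
    obs_history, rewards = [env.reset()], []
    prev_err = reconstruction_error(model, obs_history, env.scene)
    for _ in range(horizon):
        action = policy(obs_history)            # choose the next viewpoint
        obs_history.append(env.step(action))    # acquire the new view
        err = reconstruction_error(model, obs_history, env.scene)
        rewards.append(prev_err - err)          # reward = uncertainty reduced
        prev_err = err
    return rewards
```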

    Shattering Reality: Buddhist Art Staged in Domon Ken's Photobook Murōji (1954)

    Get PDF
    Against the backdrop of the immediate postwar period, photographer Domon Ken (1909–1990) embarked on a journey to the Murōji Temple in Nara Prefecture to capture its Buddhist treasures. The body of work was published in his photobook Murōji (1954) and has often been interpreted as a nostalgic spectacle that romanticizes Japan’s Buddhist heritage for mass consumption. Yet a close examination of the images and their arrangement in the photobook reveals Domon’s indifference to reconstructing an accessible past. Contrary to the resurgence of Zen Buddhism in the 1950s, Domon’s project eschewed any politicized attempt to authenticate the “tradition” or spiritual “essence” of Japan. While beholders are granted unprecedented proximity to the icons, Domon’s interest in tactility and his manipulation of scale paradoxically render these statues illegible and unfamiliar. Equally significant is his juxtaposition of legible and abstract close-ups, which shatters the past into incongruent fragments. The photobook Murōji thereby raises questions that continue to resonate today: What is the role of documentary photography in postwar Japanese culture? In what ways can photography function as a metaphorical ground upon which competing ideas of nation, cultural memory, and subjectivity are mediated?

    Determining the Geographical Location of Image Scenes based on Object Shadow Lengths

    Get PDF
    Many studies have addressed various applications of geo-spatial image tagging such as image retrieval, image organisation and browsing. Geo-spatial image tagging can be done manually or automatically with GPS-enabled cameras, which allow the current position of the photographer to be incorporated into an image's metadata. However, current GPS equipment needs a certain amount of time to lock onto navigation satellites and is therefore not suitable for spontaneous photography. Moreover, GPS units are still costly, energy-hungry and not common in most digital cameras on sale. This study explores the potential of, and limitations associated with, extracting geo-spatial information from the image contents. The elevation of the sun is estimated indirectly from the contents of image collections by measuring the relative length of objects and their shadows in image scenes. The observed sun elevation and the creation time of the image are input into a celestial model to estimate the approximate geographical location of the photographer. The strategy is demonstrated on a set of manually measured photographs.
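    A worked sketch of the two steps named in the abstract follows, under simplifying assumptions (no atmospheric refraction; the solar declination and hour angle are assumed to be supplied by a celestial model from the image timestamp): the sun elevation is recovered from an object/shadow length pair, and latitudes consistent with that elevation are found from the standard solar-elevation relation sin(h) = sin(lat)·sin(δ) + cos(lat)·cos(δ)·cos(H).

```python
# Sketch: sun elevation from shadow length, then candidate latitudes.
import math

def sun_elevation_from_shadow(object_height_m: float, shadow_length_m: float) -> float:
    """Elevation angle (radians) implied by an object and its shadow."""
    return math.atan2(object_height_m, shadow_length_m)

def candidate_latitudes(elevation: float, declination: float, hour_angle: float,
                        step_deg: float = 0.1, tol_deg: float = 0.2):
    """Latitudes (degrees) whose predicted solar elevation matches the observation.

    All angle arguments are in radians; declination and hour angle are assumed
    to come from the image creation time via a celestial model.
    """
    matches = []
    for i in range(int(180.0 / step_deg) + 1):
        lat_deg = -90.0 + i * step_deg
        lat = math.radians(lat_deg)
        predicted = math.asin(
            math.sin(lat) * math.sin(declination)
            + math.cos(lat) * math.cos(declination) * math.cos(hour_angle)
        )
        if abs(math.degrees(predicted) - math.degrees(elevation)) < tol_deg:
            matches.append(lat_deg)
    return matches

# Example: a 1.0 m reference object casting a 1.3 m shadow.
elev = sun_elevation_from_shadow(1.0, 1.3)
```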