Large-Scale Mapping of Human Activity using Geo-Tagged Videos
This paper is the first work to perform spatio-temporal mapping of human
activity using the visual content of geo-tagged videos. We utilize a recent
deep-learning based video analysis framework, termed hidden two-stream
networks, to recognize a range of activities in YouTube videos. This framework
is efficient and can run in real time or faster, which is important for
recognizing events as they occur in streaming video or for reducing latency in
analyzing already captured video. This is, in turn, important for using video
in smart-city applications. We perform a series of experiments to show our
approach is able to accurately map activities both spatially and temporally. We
also demonstrate the advantages of using the visual content over the
tags/titles. Comment: Accepted at ACM SIGSPATIAL 201
Computational rim illumination of dynamic subjects using aerial robots
Lighting plays a major role in photography. Professional photographers use elaborate installations to light their subjects and achieve sophisticated styles. However, lighting moving subjects performing dynamic tasks presents significant challenges and requires expensive manual intervention. A skilled additional assistant might be needed to reposition lights as the subject changes pose or moves, and the extra logistics significantly raise costs and time. The latency incurred while the assistant lights the subject, and the communication required from the photographer to achieve optimum lighting, could mean missing a critical shot.
We present a new approach to lighting dynamic subjects where an aerial robot equipped with a portable light source lights the subject to automatically achieve a desired lighting effect. We focus on rim lighting, a particularly challenging effect to achieve with dynamic subjects, and allow the photographer to specify a required rim width. Our algorithm processes the images from the photographer's camera and provides necessary motion commands to the aerial robot to achieve the desired rim width in the resulting photographs. With an indoor setup, we demonstrate a control approach that localizes the aerial robot with reference to the subject and tracks the subject to achieve the necessary motion. In addition to indoor experiments, we perform open-loop outdoor experiments in a realistic photo-shooting scenario to understand lighting ergonomics. Our proof-of-concept results demonstrate the utility of robots in computational lighting.
ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence
Our work examines the way in which large language models can be used for
robotic planning and sampling, specifically in the context of automated
photographic documentation. We illustrate how to produce a
photo-taking robot with an exceptional level of semantic awareness by
leveraging recent advances in general-purpose language (LM) and vision-language
(VLM) models. Given a high-level description of an event we use an LM to
generate a natural-language list of photo descriptions that one would expect a
photographer to capture at the event. We then use a VLM to identify the best
matches to these descriptions in the robot's video stream. The photo portfolios
generated by our method are consistently rated as more appropriate to the event
by human evaluators than those generated by existing methods. Comment: ICRA 202
Storytelling with salient stills
Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. 59-63). Michale J. Massey. M.S
A Method for Automatic Image Rectification and Stitching for Vehicle Yaw Marks Trajectory Estimation
The aim of this study was to propose a new method for automatic rectification and stitching of images taken at an accident site. The proposed method does not require any measurements to be performed at the accident site and is thus free of measurement errors. An experimental investigation was performed to compare the vehicle trajectory estimated from the yaw marks in the stitched image with the trajectory reconstructed using GPS data. The overall mean error of the trajectory reconstruction produced by the method proposed in this paper was 0.086 m, which is only 0.18% of the whole trajectory length.
Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks
It is common to implicitly assume access to intelligently captured inputs
(e.g., photos from a human photographer), yet autonomously capturing good
observations is itself a major challenge. We address the problem of learning to
look around: if a visual agent has the ability to voluntarily acquire new views
to observe its environment, how can it learn efficient exploratory behaviors to
acquire informative observations? We propose a reinforcement learning solution,
where the agent is rewarded for actions that reduce its uncertainty about the
unobserved portions of its environment. Based on this principle, we develop a
recurrent neural network-based approach to perform active completion of
panoramic natural scenes and 3D object shapes. Crucially, the learned policies
are not tied to any recognition task nor to the particular semantic content
seen during training. As a result, 1) the learned "look around" behavior is
relevant even for new tasks in unseen environments, and 2) training data
acquisition involves no manual labeling. Through tests in diverse settings, we
demonstrate that our approach learns useful generic policies that transfer to
new unseen tasks and environments. Completion episodes are shown at
https://goo.gl/BgWX3W
Shattered Reality: Buddhist Art Staged in Domon Ken's Photobook Murōji (1954)
Against the backdrop of the immediate postwar period, photographer Domon Ken (1909–1990) embarked on a journey to the Murōji Temple in Nara Prefecture to capture its Buddhist treasures. The body of work was published in his photobook Murōji (1954), and has often been interpreted as a nostalgic spectacle that romanticizes Japan’s Buddhist heritage for mass consumption. Yet, a close examination of the images and their arrangement in the photobook reveals Domon’s indifference to reconstructing an accessible past. Contrary to the resurgence of Zen Buddhism in the 1950s, Domon’s project absconded from any politicized attempt that sought to authenticate the “tradition” or spiritual “essence” of Japan. While beholders are granted unprecedented proximity to the icons, Domon’s interest in tactility and his manipulation of scale paradoxically render these statues illegible and unfamiliar. Equally significant is his juxtaposition of legible and abstract close-ups, which shatters the past into incongruent fragments. The photobook Murōji thereby raises questions that continue to resonate today: what is the role of documentary photography in postwar Japanese culture? In what ways can photography function as a metaphorical ground upon which competing ideas of nation, cultural memory, and subjectivity are mediated?
Determining the Geographical Location of Image Scenes based on Object Shadow Lengths
Many studies have addressed various applications
of geo-spatial image tagging such as image retrieval,
image organisation and browsing. Geo-spatial image
tagging can be done manually or automatically with GPS
enabled cameras that allow the current position of the
photographer to be incorporated into the meta-data of an
image. However, current GPS equipment needs a certain amount of
time to lock onto navigation satellites and is therefore not
suitable for spontaneous photography. Moreover, GPS units
are still costly, energy-hungry and not common in most
digital cameras on sale. This study explores the potential of,
and limitations associated with, extracting geo-spatial
information from the image contents. The elevation of the
sun is estimated indirectly from the contents of image
collections by measuring the relative length of objects and
their shadows in image scenes. The observed sun elevation
and the creation time of the image are input into a celestial
model to estimate the approximate geographical location of
the photographer. The strategy is demonstrated on a set of
manually measured photographs.
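The first step described in the abstract above, estimating sun elevation from the relative length of an object and its shadow, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a vertical object on level ground, and the function name `sun_elevation_deg` and its parameters are hypothetical. The celestial model that turns elevation and capture time into a location is not sketched here, since the abstract does not specify it.

```python
import math


def sun_elevation_deg(object_height: float, shadow_length: float) -> float:
    """Return the sun elevation angle in degrees.

    Assumes (not stated in the abstract) a vertical object on level
    ground, so that tan(elevation) = object_height / shadow_length.
    Only the ratio matters, so both lengths may be in pixels or metres
    as long as they share the same unit.
    """
    if object_height <= 0 or shadow_length <= 0:
        raise ValueError("object_height and shadow_length must be positive")
    return math.degrees(math.atan2(object_height, shadow_length))


if __name__ == "__main__":
    # An object exactly as tall as its shadow is long.
    print(sun_elevation_deg(1.0, 1.0))
```

Because only the height-to-shadow ratio enters the formula, measurements taken directly in image pixels can be used without knowing the true size of the object, provided perspective distortion is small.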