Towards a Smart Drone Cinematographer for Filming Human Motion
Affordable consumer drones have made capturing aerial footage more convenient and accessible. However, shooting cinematic motion videos with a drone is challenging because it requires users to analyze dynamic scenarios while operating the controller. In this thesis, our task is to develop an autonomous drone cinematography system that captures cinematic videos of human motion. We understand the system's filming performance to be influenced by three key components: 1) the video quality metric, which measures the aesthetic quality -- the angle, the distance, the image composition -- of the captured video; 2) the visual feature, which encapsulates the visual elements that influence the filming style; and 3) camera planning, a decision-making model that predicts the next best movement. By analyzing these three components, we designed two autonomous drone cinematography systems, using both heuristic-based and learning-based methods. For the first system, we designed an Autonomous CinemaTography system -- "ACT" -- by proposing a viewpoint quality metric focused on the visibility of the subject's 3D human skeleton. We expanded the application of human motion analysis and simplified manual control by assisting viewpoint selection with a through-the-lens method. For the second system, we designed an imitation-based system that learns the artistic intention of camera operators by watching professional aerial videos. We designed a camera planner that analyzes the video content and previous camera motion to predict future camera motion. Furthermore, we propose a planning framework that can imitate a filming style by "seeing" only a single demonstration video of that style; we named it "one-shot imitation filming." To the best of our knowledge, this is the first work that extends imitation learning to autonomous filming.
Experimental results in both simulation and field tests show significant improvements over existing techniques, and our approach helped inexperienced pilots capture cinematic videos.
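The visibility-based viewpoint quality metric could be sketched roughly as follows. This is a hypothetical simplification, not the thesis's actual metric (which also needs occlusion handling and composition terms): it scores a candidate camera pose by the fraction of 3D skeleton joints that project inside the camera frustum.

```python
import numpy as np

def viewpoint_quality(joints_world, cam_pos, cam_rot, fov_deg=60.0):
    """Score a candidate viewpoint by the fraction of 3D skeleton joints
    that fall inside the camera frustum. Hypothetical simplification of a
    visibility-based metric (no occlusion handling here).

    joints_world: (N, 3) array of joint positions in world coordinates.
    cam_pos: (3,) camera position; cam_rot: (3, 3) world-to-camera rotation.
    """
    # Transform the joints into the camera frame (camera looks along +z).
    pts = (cam_rot @ (joints_world - cam_pos).T).T
    # A joint counts as visible if it lies in front of the camera and
    # within a symmetric square field of view.
    in_front = pts[:, 2] > 1e-6
    half_fov = np.tan(np.radians(fov_deg) / 2.0)
    z = np.where(in_front, pts[:, 2], 1.0)  # placeholder z avoids divide-by-zero
    in_fov = (np.abs(pts[:, 0] / z) < half_fov) & (np.abs(pts[:, 1] / z) < half_fov)
    return float((in_front & in_fov).mean())
```

With a skeleton of three joints, two in front of the camera and one behind, the score is 2/3; a planner would then pick the candidate pose maximizing this score.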
Optimal Multi-UAV Trajectory Planning for Filming Applications
Teams of multiple Unmanned Aerial Vehicles (UAVs) can be used to record large-scale
outdoor scenarios and complementary views of several action points as a promising
system for cinematic video recording. Generating the trajectories of the UAVs plays
a key role, as it should be ensured that they comply with requirements for system
dynamics, smoothness, and safety. The rise of numerical methods for nonlinear
optimization is finding a flourishing field in optimization-based approaches to
multi-UAV trajectory planning. In particular, these methods are rather promising for
video recording applications, as they enable multiple constraints and objectives to
be formulated, such as trajectory smoothness, compliance with UAV and camera
dynamics, avoidance of obstacles and inter-UAV conflicts, and mutual UAV visibility.
The main objective of this thesis is to plan online trajectories for multi-UAV teams in
video applications, formulating novel optimization problems and solving them in real
time.
The thesis begins by presenting a framework for carrying out autonomous cinematography
missions with a team of UAVs. This framework enables media directors
to design missions involving different types of shots with one or multiple cameras,
running sequentially or concurrently. Second, the thesis proposes a novel non-linear
formulation for the challenging problem of computing optimal multi-UAV trajectories
for cinematography, integrating UAV dynamics and collision avoidance constraints,
together with cinematographic aspects such as smoothness, gimbal mechanical limits,
and mutual camera visibility. Lastly, the thesis describes a method for autonomous
aerial recording with distributed lighting by a team of UAVs. The multi-UAV trajectory
optimization problem is decoupled into two steps in order to tackle non-linear cinematographic aspects and obstacle avoidance at separate stages. This allows the
trajectory planner to perform in real time and to react online to changes in dynamic
environments.
It is important to note that all the methods in the thesis have been validated
by means of extensive simulations and field experiments. Moreover, all the software
components have been developed as open source.
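The flavor of such an optimization-based formulation can be illustrated with a toy sketch: two UAVs in 2D, a smoothness objective (sum of squared discrete accelerations), and a minimum inter-UAV separation constraint. All names, sizes, and the solver choice (SciPy's SLSQP) are illustrative assumptions, not the thesis's actual formulation, which also handles UAV and camera dynamics, gimbal limits, obstacles, and mutual visibility.

```python
import numpy as np
from scipy.optimize import minimize

def plan_pair_trajectories(start, goal, n_steps=8, min_sep=2.0):
    """Toy optimization-based planner for two UAVs in 2D: minimize squared
    discrete accelerations (smoothness) subject to fixed endpoints and a
    minimum inter-UAV separation (collision avoidance).

    start, goal: (2, 2) arrays, one row per UAV.
    Returns trajectories of shape (2, n_steps, 2).
    """
    def unpack(x):
        return x.reshape(2, n_steps, 2)

    def smoothness(x):
        traj = unpack(x)
        # Discrete second difference approximates acceleration.
        acc = traj[:, 2:] - 2 * traj[:, 1:-1] + traj[:, :-2]
        return np.sum(acc ** 2)

    # Equality constraints pin each UAV to its start and goal waypoints.
    cons = [{"type": "eq",
             "fun": lambda x, i=i: np.concatenate(
                 [unpack(x)[i, 0] - start[i], unpack(x)[i, -1] - goal[i]])}
            for i in range(2)]
    # Inequality constraint: separation >= min_sep at every time step.
    cons.append({"type": "ineq",
                 "fun": lambda x: np.linalg.norm(
                     unpack(x)[0] - unpack(x)[1], axis=1) - min_sep})

    # Straight-line initial guess between start and goal.
    t = np.linspace(0, 1, n_steps)[None, :, None]
    x0 = (start[:, None, :] * (1 - t) + goal[:, None, :] * t).ravel()
    res = minimize(smoothness, x0, constraints=cons, method="SLSQP")
    return unpack(res.x)
```

The thesis's real-time performance comes from decoupling the non-linear cinematographic terms from obstacle avoidance; this sketch solves the joint problem directly, which is only feasible at toy scale.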
Transitioning360: Content-aware NFoV Virtual Camera Paths for 360° Video Playback
Despite the increasing number of head-mounted displays, many 360° VR videos are still viewed on existing 2D displays. In that setting, a subset of the 360° video content is often shown inside a manually or semi-automatically selected normal-field-of-view (NFoV) window. However, during playback, simply watching an NFoV video can easily miss concurrent off-screen content. We present Transitioning360, a tool for 360° video navigation and playback on 2D displays that transitions between multiple NFoV views tracking potentially interesting targets or events. Our method computes virtual NFoV camera paths, considering content awareness and diversity, in an offline preprocess. During playback, the user can watch any NFoV view corresponding to a precomputed camera path. Moreover, our interface shows other candidate views, providing a sense of concurrent events. At any time, the user can transition to other candidate views for fast navigation and exploration. Experimental results, including a user study, demonstrate that the viewing experience using our method is more enjoyable and convenient than with previous methods.
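The offline precomputation of a content-aware camera path can be sketched, in a heavily simplified form, as a dynamic program that trades per-frame content scores against a camera-movement penalty. The saliency scores, the linear movement cost, and the discretization into K candidate windows are all illustrative assumptions; the paper's actual objective, including its diversity term across multiple paths, is more elaborate.

```python
import numpy as np

def nfov_camera_path(saliency, move_cost=0.5):
    """Dynamic-programming sketch of one content-aware NFoV camera path.

    saliency: (T, K) array; saliency[t, k] scores candidate NFoV window k
    at frame t. Returns a length-T list of window indices maximizing total
    saliency minus a linear penalty for switching windows.
    """
    T, K = saliency.shape
    score = saliency[0].copy()          # best cumulative score ending at each k
    back = np.zeros((T, K), dtype=int)  # backpointers for path recovery
    for t in range(1, T):
        # jump[k, j]: cost of moving the virtual camera from window j to k.
        jump = move_cost * np.abs(np.arange(K)[:, None] - np.arange(K)[None, :])
        cand = score[None, :] - jump    # cand[k, j]: arrive at k from j
        back[t] = cand.argmax(axis=1)
        score = cand.max(axis=1) + saliency[t]
    # Trace the optimal window sequence backwards from the best endpoint.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Running this per target or event would yield the multiple precomputed paths the interface exposes as candidate views.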
View recommendation for multi-camera demonstration-based training
While humans can effortlessly pick a view from multiple streams, automatically choosing the best view is a challenge. Existing work on view selection lacks consensus about which objective metrics should be considered: the literature describes diverse possible metrics, and strategies such as information-theoretic, instructional-design, or aesthetics-motivated approaches fail to incorporate them all. In this work, we postulate a strategy incorporating information-theoretic and instructional-design-based objective metrics to select the best view from a set of views. Traditionally, information-theoretic measures have been used to quantify the goodness of a view, such as in 3D rendering. We adapted a similar measure, known as viewpoint entropy, to real-world 2D images. Additionally, we incorporated a similarity penalization to obtain a more accurate measure of the entropy of a view, which is one of the metrics for best view selection. Since the choice of the best view is domain-dependent, we chose demonstration-based training scenarios as our use case. A limitation of our chosen scenarios is that they do not include collaborative training and feature only a single trainer. To incorporate instructional design considerations, we included the trainer's body pose, face, face while instructing, and hand visibility as metrics. To incorporate domain knowledge, we included the visibility of predetermined regions as another metric. All of these metrics are taken into account to produce a parameterized view recommendation approach for demonstration-based training. An online study using recorded multi-camera video streams from a simulation environment was used to validate these metrics.
Furthermore, the responses from the online study were used to optimize the view recommendation performance, achieving a normalized discounted cumulative gain (NDCG) of 0.912, which indicates a good match with user choices.
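The viewpoint-entropy measure can be illustrated with a minimal sketch: the Shannon entropy of the relative projected areas of the regions visible in a 2D image. The assumption that regions come from some prior segmentation is ours, and the similarity penalization mentioned above is omitted.

```python
import numpy as np

def viewpoint_entropy(region_areas):
    """Shannon entropy of the relative projected areas of visible regions
    in a 2D image. Higher entropy means screen space is spread more evenly
    across regions, a common proxy for a more informative view.
    """
    a = np.asarray(region_areas, dtype=float)
    p = a / a.sum()   # relative projected area of each region
    p = p[p > 0]      # convention: 0 * log 0 = 0
    return float(-np.sum(p * np.log2(p)))
```

Four equally sized regions give the maximum entropy log2(4) = 2 bits, while a single dominant region gives 0; a view recommender would combine this score with the instructional-design metrics above.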