FlightGoggles: A Modular Framework for Photorealistic Camera, Exteroceptive Sensor, and Dynamics Simulation
FlightGoggles is a photorealistic sensor simulator for perception-driven
robotic vehicles. The key contributions of FlightGoggles are twofold. First,
FlightGoggles provides photorealistic exteroceptive sensor simulation using
graphics assets generated with photogrammetry. Second, it provides the ability
to combine (i) synthetic exteroceptive measurements generated in silico in real
time and (ii) vehicle dynamics and proprioceptive measurements generated in
motion by vehicle(s) in a motion-capture facility. FlightGoggles is capable of
simulating a virtual-reality environment around autonomous vehicle(s). While a
vehicle is in flight in the FlightGoggles virtual reality environment,
exteroceptive sensors are rendered synthetically in real time while all complex
extrinsic dynamics are generated organically through the natural interactions
of the vehicle. The FlightGoggles framework allows for researchers to
accelerate development by circumventing the need to estimate complex and
hard-to-model interactions such as aerodynamics, motor mechanics, battery
electrochemistry, and behavior of other agents. The ability to perform
vehicle-in-the-loop experiments with photorealistic exteroceptive sensor
simulation facilitates novel research directions involving, e.g., fast and
agile autonomous flight in obstacle-rich environments, safe human interaction,
and flexible sensor selection. FlightGoggles has been utilized as the main test
for selecting nine teams that will advance in the AlphaPilot autonomous drone
racing challenge. We survey approaches and results from the top AlphaPilot
teams, which may be of independent interest.
Comment: Initial version appeared at IROS 2019. Supplementary material can be
found at https://flightgoggles.mit.edu. Revision includes description of new
FlightGoggles features, such as a photogrammetric model of the MIT Stata
Center, new rendering settings, and a Python API.
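The vehicle-in-the-loop cycle described above can be sketched schematically: proprioception and dynamics come "in motion" from a real vehicle tracked by motion capture, while exteroception is rendered in silico from the tracked pose. All names below are hypothetical placeholders, not FlightGoggles' actual API:

```python
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple      # (x, y, z) in metres, from motion capture
    orientation: tuple   # quaternion (w, x, y, z)

def render_camera(pose):
    """Stand-in for the photorealistic renderer: returns a synthetic
    camera frame for the given vehicle pose (placeholder dict here)."""
    return {"pose": pose, "pixels": None}

def vehicle_in_the_loop_step(mocap_pose):
    # 1. The real vehicle supplies dynamics and proprioception organically.
    # 2. Exteroceptive measurements are synthesized from its tracked pose.
    image = render_camera(mocap_pose)
    # 3. The perception/control stack consumes the synthetic image as if
    #    it came from a real onboard camera.
    return image

frame = vehicle_in_the_loop_step(Pose((0.0, 0.0, 1.5), (1.0, 0.0, 0.0, 0.0)))
```

The point of the arrangement is that aerodynamics, motor mechanics, and battery behavior never need to be modeled: they happen physically, and only the sensing is simulated.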
Image-Based Flexible Endoscope Steering
Manually steering the tip of a flexible endoscope to navigate through an endoluminal path relies on the physician's dexterity and experience. In this paper we present the realization of a robotic flexible endoscope steering system that uses the endoscopic images to control the tip orientation towards the direction of the lumen. Two image-based control algorithms are investigated: one based on optical flow and the other based on image intensity. Both are evaluated in simulations in which the endoscope was steered through the lumen. The RMS distance to the lumen center was less than 25% of the lumen width. An experimental setup was built using a standard flexible endoscope, and the image-based control algorithms were used to actuate the wheels of the endoscope for tip steering. Experiments were conducted in an anatomical model to simulate gastroscopy. The image intensity-based algorithm was capable of steering the endoscope tip through an endoluminal path from the mouth to the duodenum accurately. Compared to manual control, the robotically steered endoscope performed 68% better in terms of keeping the lumen centered in the image.
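The intensity-based idea can be illustrated simply: in endoscopic images the lumen typically appears as the darkest region, so the tip can be steered toward the centroid of the darkest pixels. The following is an illustrative sketch under that assumption (function name, threshold, and normalization are ours), not the authors' exact algorithm:

```python
import numpy as np

def lumen_steering_direction(image, dark_fraction=0.1):
    """Estimate a tip-steering direction from image intensity alone:
    steer toward the centroid of the darkest pixels, expressed as an
    offset from the image centre, normalized to [-1, 1]."""
    thresh = np.quantile(image.ravel(), dark_fraction)  # darkest pixels
    ys, xs = np.nonzero(image <= thresh)
    cy, cx = ys.mean(), xs.mean()                       # dark-region centroid
    h, w = image.shape
    return (cx - w / 2) / (w / 2), (cy - h / 2) / (h / 2)

# Example: a dark disc (the "lumen") in the upper-left quadrant of a
# bright synthetic frame.
img = np.full((100, 100), 200.0)
yy, xx = np.ogrid[:100, :100]
img[(yy - 25) ** 2 + (xx - 25) ** 2 < 400] = 10.0
dx, dy = lumen_steering_direction(img)
# dx and dy are both negative: steer up and to the left, toward the lumen.
```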
PlaceRaider: Virtual Theft in Physical Spaces with Smartphones
As smartphones become more pervasive, they are increasingly targeted by
malware. At the same time, each new generation of smartphone features
increasingly powerful onboard sensor suites. A new strain of sensor malware has
been developing that leverages these sensors to steal information from the
physical environment (e.g., researchers have recently demonstrated how malware
can listen for spoken credit card numbers through the microphone, or feel
keystroke vibrations using the accelerometer). Yet the possibilities of what
malware can see through a camera have been understudied. This paper introduces
a novel visual malware called PlaceRaider, which allows remote attackers to
engage in remote reconnaissance and what we call virtual theft. Through
completely opportunistic use of the camera on the phone and other sensors,
PlaceRaider constructs rich, three dimensional models of indoor environments.
Remote burglars can thus download the physical space, study the environment
carefully, and steal virtual objects from the environment (such as financial
documents, information on computer monitors, and personally identifiable
information). Through two human subject studies we demonstrate the
effectiveness of using mobile devices as powerful surveillance and virtual
theft platforms, and we suggest several possible defenses against visual
malware.
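A pipeline of this kind has to triage the opportunistically captured frames cheaply before any expensive 3D reconstruction. A standard sharpness score such as Laplacian variance illustrates the idea; this is a generic sketch, not PlaceRaider's actual filtering heuristics:

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a discrete Laplacian response: a common sharpness
    score. Low values indicate blurry or featureless frames."""
    lap = (-4.0 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

def keep_frame(gray, threshold=10.0):
    # Retain only reasonably sharp frames for reconstruction.
    return laplacian_variance(gray) > threshold

rng = np.random.default_rng(0)
sharp = rng.uniform(0, 255, (64, 64))   # high-frequency content
blurry = np.full((64, 64), 128.0)       # flat, featureless frame
```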
Evaluating methods for controlling depth perception in stereoscopic cinematography.
Existing stereoscopic imaging algorithms can create static stereoscopic images with perceived depth control function to ensure a compelling 3D viewing experience without visual discomfort. However, current algorithms do not normally support standard Cinematic Storytelling techniques. These techniques, such as object movement, camera motion, and zooming, can result in dynamic scene depth change within and between a series of frames (shots) in stereoscopic cinematography. In this study, we empirically evaluate the following three types of stereoscopic imaging approaches that aim to address this problem. (1) Real-Eye Configuration: set camera separation equal to the nominal human eye interpupillary distance. The perceived depth on the display is identical to the scene depth without any distortion. (2) Mapping Algorithm: map the scene depth to a predefined range on the display to avoid excessive perceived depth. A new method that dynamically adjusts the depth mapping from scene space to display space is presented in addition to an existing fixed depth mapping method. (3) Depth of Field Simulation: apply Depth of Field (DOF) blur effect to stereoscopic images. Only objects that are inside the DOF are viewed in full sharpness. Objects that are far away from the focus plane are blurred. We performed a human-based trial using the ITU-R BT.500-11 Recommendation to compare the depth quality of stereoscopic video sequences generated by the above-mentioned imaging methods. Our results indicate that viewers' practical 3D viewing volumes are different for individual stereoscopic displays and viewers can cope with much larger perceived depth range in viewing stereoscopic cinematography in comparison to static stereoscopic images. Our new dynamic depth mapping method does have an advantage over the fixed depth mapping method in controlling stereo depth perception. 
The DOF blur effect does not provide the expected improvement for perceived depth quality control in 3D cinematography. We anticipate the results will be of particular interest to 3D filmmaking and real-time computer games.
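The mapping idea in approach (2) can be shown with a minimal linear remap from scene depth to display disparity; function and parameter names here are hypothetical. The dynamic variant simply re-estimates the scene depth range per shot as objects and the camera move, instead of fixing it once for the whole sequence:

```python
def map_depth(z, z_min, z_max, d_near, d_far):
    """Linearly map a scene depth z in [z_min, z_max] to a display
    disparity in [d_near, d_far], the comfortable range for a given
    screen. (Illustrative sketch of fixed depth mapping; a dynamic
    mapping recomputes z_min/z_max per shot.)"""
    t = (z - z_min) / (z_max - z_min)
    return d_near + t * (d_far - d_near)

# A scene point halfway through the depth range lands at zero disparity,
# i.e. on the screen plane.
d = map_depth(5.0, z_min=0.0, z_max=10.0, d_near=-20.0, d_far=20.0)
```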
GART: The Gesture and Activity Recognition Toolkit
Presented at the 12th International Conference on Human-Computer Interaction, Beijing, China, July 2007. The original publication is available at www.springerlink.com.
The Gesture and Activity Recognition Toolkit (GART) is
a user interface toolkit designed to enable the development of gesture-based
applications. GART provides an abstraction to machine learning
algorithms suitable for modeling and recognizing different types of
gestures. The toolkit also provides support for the data collection and
the training process. In this paper, we present GART and its machine
learning abstractions. Furthermore, we detail the components of the
toolkit and present two example gesture recognition applications.
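The flavour of such an abstraction can be sketched as below: an application supplies labelled training samples and queries a recognizer without touching the underlying machine learning algorithm. Class and method names are hypothetical, and the nearest-mean classifier is a stand-in for the toolkit's actual ML back end:

```python
class GestureRecognizer:
    """Toy toolkit-style abstraction: collect labelled samples, then
    classify new ones. Not GART's actual API."""

    def __init__(self):
        self._templates = {}  # label -> list of training feature vectors

    def add_sample(self, label, sample):
        self._templates.setdefault(label, []).append(sample)

    def recognize(self, sample):
        # Nearest-mean stand-in for the toolkit's learning algorithm.
        def dist(label):
            samples = self._templates[label]
            mean = [sum(v) / len(samples) for v in zip(*samples)]
            return sum((a - b) ** 2 for a, b in zip(sample, mean))
        return min(self._templates, key=dist)

rec = GestureRecognizer()
rec.add_sample("swipe", [1.0, 0.0])
rec.add_sample("swipe", [0.9, 0.1])
rec.add_sample("circle", [0.0, 1.0])
rec.add_sample("circle", [0.1, 0.9])
label = rec.recognize([0.8, 0.2])
```

The value of the abstraction is that data collection, training, and recognition share one interface, so the back-end algorithm can change without rewriting the application.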
Augmented reality usage for prototyping speed up
The first part of the article describes our approach to accelerating prototyping by
means of Augmented Reality. Merging the real-world model with digital objects makes
it possible to streamline work with the model and to speed up the whole production
phase significantly. The main advantage of augmented reality is the possibility of
direct manipulation of the scene using a portable digital camera. Digital objects
can also be added to the scene using identification markers placed on the surface
of the model. It is therefore not necessary to work with special input devices and
lose contact with the real-world model; adjustments are made directly on the model.
The key problem of the outlined solution is identifying an object within the camera
image and replacing it with the digital object. The second part of the article
focuses on identifying the exact position and orientation of the marker within the
picture. The identification marker is generalized to a triple of points that
defines a general plane in space. We discuss the spatial identification of these
points and how their position and orientation are represented by means of a
transformation matrix. This matrix is used for rendering the graphical objects
(e.g., in OpenGL and Direct3D).
Comment: Keywords: augmented reality, prototyping, pose estimation,
transformation matrix.
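The construction described above, reducing the marker to three non-collinear points that define a plane and assembling a rigid transformation matrix for the renderer, can be sketched as follows. This is an illustrative reconstruction of the idea, not the article's code; the axis conventions are our assumption:

```python
import numpy as np

def pose_from_three_points(p0, p1, p2):
    """Build a 4x4 rigid transform from three non-collinear marker
    points: p0 is the origin, p0->p1 fixes the x axis, and p2 pins
    down the marker plane. The resulting matrix can be handed to a
    renderer (e.g. OpenGL or Direct3D) to draw virtual objects on
    the marker."""
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    x = p1 - p0
    x = x / np.linalg.norm(x)
    n = np.cross(x, p2 - p0)          # plane normal -> z axis
    z = n / np.linalg.norm(n)
    y = np.cross(z, x)                # completes a right-handed frame
    m = np.eye(4)
    m[:3, 0], m[:3, 1], m[:3, 2], m[:3, 3] = x, y, z, p0
    return m

# Marker lying in the z=0 plane, origin at (1, 1, 0):
M = pose_from_three_points([1, 1, 0], [2, 1, 0], [1, 2, 0])
```

Note that renderers differ in matrix layout: OpenGL expects column-major storage, so the matrix may need transposing before upload.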