
    FlightGoggles: A Modular Framework for Photorealistic Camera, Exteroceptive Sensor, and Dynamics Simulation

    FlightGoggles is a photorealistic sensor simulator for perception-driven robotic vehicles. The key contributions of FlightGoggles are twofold. First, FlightGoggles provides photorealistic exteroceptive sensor simulation using graphics assets generated with photogrammetry. Second, it provides the ability to combine (i) synthetic exteroceptive measurements generated in silico in real time and (ii) vehicle dynamics and proprioceptive measurements generated in motio by vehicle(s) in a motion-capture facility. FlightGoggles is capable of simulating a virtual-reality environment around autonomous vehicle(s). While a vehicle is in flight in the FlightGoggles virtual-reality environment, exteroceptive sensors are rendered synthetically in real time while all complex extrinsic dynamics are generated organically through the natural interactions of the vehicle. The FlightGoggles framework allows researchers to accelerate development by circumventing the need to estimate complex and hard-to-model interactions such as aerodynamics, motor mechanics, battery electrochemistry, and behavior of other agents. The ability to perform vehicle-in-the-loop experiments with photorealistic exteroceptive sensor simulation facilitates novel research directions involving, e.g., fast and agile autonomous flight in obstacle-rich environments, safe human interaction, and flexible sensor selection. FlightGoggles has been utilized as the main test for selecting nine teams that will advance in the AlphaPilot autonomous drone racing challenge. We survey approaches and results from the top AlphaPilot teams, which may be of independent interest. Comment: Initial version appeared at IROS 2019. Supplementary material can be found at https://flightgoggles.mit.edu. Revision includes description of new FlightGoggles features, such as a photogrammetric model of the MIT Stata Center, new rendering settings, and a Python AP

    Image-Based Flexible Endoscope Steering

    Manually steering the tip of a flexible endoscope to navigate through an endoluminal path relies on the physician’s dexterity and experience. In this paper we present the realization of a robotic flexible endoscope steering system that uses the endoscopic images to control the tip orientation towards the direction of the lumen. Two image-based control algorithms are investigated: one based on optical flow and the other on image intensity. Both are evaluated using simulations in which the endoscope was steered through the lumen. The RMS distance to the lumen center was less than 25% of the lumen width. An experimental setup was built using a standard flexible endoscope, and the image-based control algorithms were used to actuate the wheels of the endoscope for tip steering. Experiments were conducted in an anatomical model to simulate gastroscopy. The image intensity-based algorithm was capable of accurately steering the endoscope tip through an endoluminal path from the mouth to the duodenum. Compared to manual control, the robotically steered endoscope performed 68% better in terms of keeping the lumen centered in the image
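
The intensity-based idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the lumen shows up as the darkest region of the image, and the names `lumen_offset`, `steering_command`, the `threshold` value, and the proportional `gain` are all hypothetical choices made for illustration.

```python
def lumen_offset(image, threshold=50):
    """Offset of the dark-region centroid from the image center.

    `image` is a list of rows of grayscale values (0-255), at least 2x2.
    Returns (dx, dy), each normalized to [-1, 1]; (0.0, 0.0) if no pixel
    is at or below `threshold`.
    Assumption (not from the abstract): the lumen is the darkest region.
    """
    rows, cols = len(image), len(image[0])
    if rows < 2 or cols < 2:
        raise ValueError("image must be at least 2x2")
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value <= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return (0.0, 0.0)
    center_x = (cols - 1) / 2
    center_y = (rows - 1) / 2
    return ((sum_x / count - center_x) / center_x,
            (sum_y / count - center_y) / center_y)


def steering_command(offset, gain=0.5):
    """Hypothetical proportional controller: bend the tip toward the lumen."""
    dx, dy = offset
    return (gain * dx, gain * dy)
```

With a bright 5x5 image and a single dark pixel at the right edge of the middle row, `lumen_offset` returns a pure horizontal offset, and `steering_command` scales it into a tip command. A real system would also need filtering over time and a mapping from this command to the endoscope wheels, neither of which is sketched here.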

    PlaceRaider: Virtual Theft in Physical Spaces with Smartphones

    As smartphones become more pervasive, they are increasingly targeted by malware. At the same time, each new generation of smartphone features increasingly powerful onboard sensor suites. A new strain of sensor malware has been developing that leverages these sensors to steal information from the physical environment (e.g., researchers have recently demonstrated how malware can listen for spoken credit card numbers through the microphone, or feel keystroke vibrations using the accelerometer). Yet the possibilities of what malware can see through a camera have been understudied. This paper introduces a novel visual malware called PlaceRaider, which allows remote attackers to engage in remote reconnaissance and what we call virtual theft. Through completely opportunistic use of the camera on the phone and other sensors, PlaceRaider constructs rich, three-dimensional models of indoor environments. Remote burglars can thus download the physical space, study the environment carefully, and steal virtual objects from the environment (such as financial documents, information on computer monitors, and personally identifiable information). Through two human subject studies we demonstrate the effectiveness of using mobile devices as powerful surveillance and virtual theft platforms, and we suggest several possible defenses against visual malware

    Evaluating methods for controlling depth perception in stereoscopic cinematography.

    Existing stereoscopic imaging algorithms can create static stereoscopic images with a perceived-depth control function to ensure a compelling 3D viewing experience without visual discomfort. However, current algorithms do not normally support standard cinematic storytelling techniques. These techniques, such as object movement, camera motion, and zooming, can result in dynamic scene depth change within and between a series of frames (shots) in stereoscopic cinematography. In this study, we empirically evaluate the following three types of stereoscopic imaging approaches that aim to address this problem. (1) Real-Eye Configuration: set camera separation equal to the nominal human eye interpupillary distance. The perceived depth on the display is identical to the scene depth without any distortion. (2) Mapping Algorithm: map the scene depth to a predefined range on the display to avoid excessive perceived depth. A new method that dynamically adjusts the depth mapping from scene space to display space is presented in addition to an existing fixed depth mapping method. (3) Depth of Field Simulation: apply a Depth of Field (DOF) blur effect to stereoscopic images. Only objects that are inside the DOF are viewed in full sharpness. Objects that are far away from the focus plane are blurred. We performed a human-based trial using the ITU-R BT.500-11 Recommendation to compare the depth quality of stereoscopic video sequences generated by the above-mentioned imaging methods. Our results indicate that viewers' practical 3D viewing volumes are different for individual stereoscopic displays and that viewers can cope with a much larger perceived depth range when viewing stereoscopic cinematography than when viewing static stereoscopic images. Our new dynamic depth mapping method does have an advantage over the fixed depth mapping method in controlling stereo depth perception. The DOF blur effect does not provide the expected improvement for perceived depth quality control in 3D cinematography. We anticipate the results will be of particular interest to 3D filmmaking and real-time computer games
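
To make the difference between fixed and dynamic depth mapping concrete, here is a minimal sketch. It assumes a simple linear mapping and assumes that "dynamic" means re-estimating the scene depth range for each frame; the abstract does not specify either, and the function names `map_depth` and `map_frame_dynamic` are hypothetical.

```python
def map_depth(z, scene_near, scene_far, display_near, display_far):
    """Linearly map scene depth z into the display depth range.

    Depths outside [scene_near, scene_far] are clamped so the perceived
    depth never leaves the predefined display range.
    Assumption: the mapping is linear (the paper may use another form).
    """
    if scene_far == scene_near:
        # Degenerate range: place everything in the middle of the budget.
        return (display_near + display_far) / 2
    t = (z - scene_near) / (scene_far - scene_near)
    t = min(1.0, max(0.0, t))
    return display_near + t * (display_far - display_near)


def map_frame_dynamic(depths, display_near, display_far):
    """Dynamic variant: re-fit the scene range to each frame's depths."""
    near, far = min(depths), max(depths)
    return [map_depth(z, near, far, display_near, display_far)
            for z in depths]
```

A fixed mapping would call `map_depth` with the same `scene_near` and `scene_far` for every frame of a shot, while `map_frame_dynamic` uses the full display range on each frame. A practical version would probably smooth the range over time to avoid visible depth jumps between frames; that is outside this sketch.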

    GART: The Gesture and Activity Recognition Toolkit

    Presented at the 12th International Conference on Human-Computer Interaction, Beijing, China, July 2007. The original publication is available at www.springerlink.com. The Gesture and Activity Recognition Toolkit (GART) is a user interface toolkit designed to enable the development of gesture-based applications. GART provides an abstraction to machine learning algorithms suitable for modeling and recognizing different types of gestures. The toolkit also provides support for the data collection and the training process. In this paper, we present GART and its machine learning abstractions. Furthermore, we detail the components of the toolkit and present two example gesture recognition applications

    Augmented reality usage for prototyping speed up

    The first part of the article describes our approach to solving this problem by means of Augmented Reality. Merging the real-world model with digital objects streamlines work with the model and significantly speeds up the whole production phase. The main advantage of augmented reality is the possibility of manipulating the scene directly using a portable digital camera. Digital objects can also be added to the scene using identification markers placed on the surface of the model. It is therefore not necessary to work with special input devices and lose contact with the real-world model. Adjustments are made directly on the model. The key problem of the outlined solution is identifying an object within the camera picture and replacing it with the digital object. The second part of the article focuses on identifying the exact position and orientation of the marker within the picture. The identification marker is generalized into a triple of points which represents a general plane in space. We discuss the identification of these points in space and the representation of their position and orientation by means of a transformation matrix. This matrix is used for rendering the graphical objects (e.g., in OpenGL and Direct3D). Comment: Keywords: augmented reality, prototyping, pose estimation, transformation matri
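
The step from a triple of marker points to a transformation matrix can be sketched as follows. This is one common construction, not necessarily the article's: it assumes the three points are already known in 3D, takes the first as the marker origin, and the name `pose_from_three_points` is hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if length == 0.0:
        raise ValueError("points are coincident or collinear")
    return (v[0] / length, v[1] / length, v[2] / length)


def pose_from_three_points(p0, p1, p2):
    """Build a 4x4 row-major transform from three non-collinear points.

    Assumed convention: p0 is the marker origin, the x axis points from
    p0 to p1, the z axis is the plane normal, and y completes a
    right-handed frame.
    """
    x_axis = _normalize(_sub(p1, p0))
    z_axis = _normalize(_cross(_sub(p1, p0), _sub(p2, p0)))
    y_axis = _cross(z_axis, x_axis)  # unit length: z and x are orthonormal
    return [
        [x_axis[0], y_axis[0], z_axis[0], p0[0]],
        [x_axis[1], y_axis[1], z_axis[1], p0[1]],
        [x_axis[2], y_axis[2], z_axis[2], p0[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]
```

The columns of the upper-left 3x3 block are the marker's axes and the last column is its position, so multiplying a point in marker coordinates by this matrix gives its position in the camera or world frame. Recovering the 3D points from a single camera picture is the harder part of the problem and is not covered by this sketch.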