
    FlightGoggles: A Modular Framework for Photorealistic Camera, Exteroceptive Sensor, and Dynamics Simulation

    FlightGoggles is a photorealistic sensor simulator for perception-driven robotic vehicles. The key contributions of FlightGoggles are twofold. First, FlightGoggles provides photorealistic exteroceptive sensor simulation using graphics assets generated with photogrammetry. Second, it provides the ability to combine (i) synthetic exteroceptive measurements generated in silico in real time and (ii) vehicle dynamics and proprioceptive measurements generated in vivo by vehicle(s) in a motion-capture facility. FlightGoggles is capable of simulating a virtual-reality environment around autonomous vehicle(s). While a vehicle is in flight in the FlightGoggles virtual-reality environment, exteroceptive sensors are rendered synthetically in real time, while all complex extrinsic dynamics are generated organically through the natural interactions of the vehicle. The FlightGoggles framework allows researchers to accelerate development by circumventing the need to estimate complex and hard-to-model interactions such as aerodynamics, motor mechanics, battery electrochemistry, and the behavior of other agents. The ability to perform vehicle-in-the-loop experiments with photorealistic exteroceptive sensor simulation facilitates novel research directions involving, e.g., fast and agile autonomous flight in obstacle-rich environments, safe human interaction, and flexible sensor selection. FlightGoggles has been utilized as the main test environment for selecting nine teams that will advance in the AlphaPilot autonomous drone racing challenge. We survey approaches and results from the top AlphaPilot teams, which may be of independent interest.
    Comment: Initial version appeared at IROS 2019. Supplementary material can be found at https://flightgoggles.mit.edu. Revision includes a description of new FlightGoggles features, such as a photogrammetric model of the MIT Stata Center, new rendering settings, and a Python API.
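
    To make the vehicle-in-the-loop data flow concrete: the cycle streams the real vehicle's motion-capture pose into the renderer and feeds the synthetic imagery back to the autonomy stack. The Python sketch below is only an illustration of that loop; the mocap, renderer, perceive, control and vehicle interfaces are hypothetical stand-ins, not the actual FlightGoggles API.

        import time

        def vehicle_in_the_loop(mocap, renderer, perceive, control, vehicle, rate_hz=60):
            """Illustrative FlightGoggles-style cycle: dynamics happen on the
            real vehicle in a motion-capture arena; only the exteroceptive
            sensing is synthetic."""
            dt = 1.0 / rate_hz
            while True:
                pose = mocap.get_pose()               # measured pose of the real vehicle
                frame = renderer.render_camera(pose)  # photorealistic image at that pose
                command = control(perceive(frame))    # perception-driven autonomy stack
                vehicle.send(command)                 # actuate the physical vehicle; its
                                                      # aerodynamics, motors and battery
                                                      # behave organically, unmodelled
                time.sleep(dt)

    Because actuation happens on the physical vehicle, none of the hard-to-model effects listed in the abstract ever need to be simulated.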

    Real-time Video Fusion for Surveillance Applications

    Surveillance is becoming an increasingly important element of our daily lives: in urban environments, for crime prevention, detection and resolution, vandalism prevention and road-traffic control; and in more remote environments, such as military applications for identifying and locating enemy forces. As technology develops, surveillance is also becoming more sophisticated: systems are improving in quality, getting safer and cheaper, scaling better, and integrating better with other types of surveillance systems. One of the main types of surveillance is video surveillance. As the name states, this technique consists of the constant capture of images in order to obtain a sequence of the events happening at a given location. One of its main disadvantages, however, is its dependency on the visibility conditions at that location. In the scope of this dissertation, a real-time system was developed that is capable of capturing images containing useful information even in low-visibility conditions such as nighttime, fog or smoke. For this purpose, image fusion was used, in this case between an image in the infrared spectrum and an image in the visible spectrum. Capturing a complementary image of the environment in the infrared spectrum provides extra information, notably about temperature. This extra information is then fused with the visible-spectrum image to generate a single image containing the information of both the visible and infrared images.
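
    As a rough illustration of the fusion step described above (a minimal pixel-level scheme, not necessarily the dissertation's actual pipeline), one can blend the registered infrared frame into the luminance channel of the visible frame with OpenCV. The file names and the alpha weight below are assumptions for the sake of the example.

        import cv2

        def fuse_visible_ir(visible_bgr, ir_gray, alpha=0.6):
            """Blend a registered long-wave IR frame into the luminance of a
            visible-spectrum frame: colour is kept from the visible image,
            thermal contrast is injected from the IR image."""
            ycrcb = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2YCrCb)
            y = ycrcb[:, :, 0]
            ir = cv2.resize(ir_gray, (y.shape[1], y.shape[0]))  # match resolutions
            ycrcb[:, :, 0] = cv2.addWeighted(y, alpha, ir, 1.0 - alpha, 0)
            return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)

        # Example, assuming two already-registered captures of the same scene:
        # visible = cv2.imread("visible.png")
        # ir = cv2.imread("ir.png", cv2.IMREAD_GRAYSCALE)
        # cv2.imwrite("fused.png", fuse_visible_ir(visible, ir))

    A weighted average is the simplest possible fusion rule; real-time systems often substitute multi-scale schemes (e.g., Laplacian-pyramid fusion) for better local contrast at a higher compute cost.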

    Thermal Cameras and Applications: A Survey


    Biometric Spoofing: A JRC Case Study in 3D Face Recognition

    Based on newly available and affordable off-the-shelf 3D sensing, processing and printing technologies, the JRC has conducted a comprehensive study on the feasibility of spoofing 3D and 2.5D face recognition systems with low-cost, self-manufactured models. This report presents a systematic and rigorous evaluation of the real risk posed by such attacks, complemented by a test campaign. The work accomplished and presented in this report covers theories, methodologies, state-of-the-art techniques and evaluation databases, and also aims at providing an outlook into the future of this extremely active field of research. (JRC.G.6, Digital Citizen Security)

    State of the art of audio- and video-based solutions for AAL

    Working Group 3: Audio- and Video-based AAL Applications

    It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, highlighting the need for action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential for enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments.

    Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them into their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one's activities. In addition, a single camera placed in a room can record most of the activities performed in that room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location, and are physically intangible. Moreover, relevant information about individuals' activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals, due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.

    This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with an outline of a new generation of ethics-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects; the open challenges are also highlighted. The report ends with an overview of the challenges, hindrances and opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, it illustrates the current procedural and technological approaches to coping with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential of the silver economy is overviewed.
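
    To give one concrete example of what video-based vital-sign monitoring involves: remote photoplethysmography estimates heart rate from tiny periodic colour changes in facial skin. The sketch below shows only the signal-processing core, assuming a per-frame mean green value has already been extracted from a tracked face region; the function and parameter names are illustrative, not taken from any system surveyed in the report.

        import numpy as np
        from scipy.signal import butter, filtfilt

        def heart_rate_from_green_trace(green_means, fps=30.0):
            """Estimate heart rate (bpm) from per-frame mean green values of a
            face region, via band-pass filtering and a spectral peak."""
            x = np.asarray(green_means, dtype=float)
            x = x - x.mean()
            # Keep 0.7-4.0 Hz (42-240 bpm), the plausible human heart-rate band.
            b, a = butter(3, [0.7, 4.0], btype="bandpass", fs=fps)
            x = filtfilt(b, a, x)
            spectrum = np.abs(np.fft.rfft(x))
            freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
            return 60.0 * freqs[np.argmax(spectrum)]

    With a 30 fps camera, a ten-second window (300 samples) is typically enough for the dominant spectral peak to be a workable estimate under stable lighting.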

    Target classification in multimodal video

    The presented thesis focuses on enhancing scene segmentation and target recognition methodologies via the mobilisation of contextual information. The algorithms developed to achieve this goal utilise multi-modal sensor information collected across varying scenarios, from controlled indoor sequences to challenging rural locations. The sensors are chiefly colour-band and long-wave infrared (LWIR), enabling persistent surveillance capabilities across all environments. In the drive to develop effectual algorithms towards the outlined goals, key obstacles are identified and examined: recovering background scene structure from foreground object 'clutter'; employing contextual foreground knowledge to circumvent training a classifier when labeled data is not readily available; creating a labeled LWIR dataset to train a convolutional neural network (CNN) based object classifier; and the viability of spatial context to address long-range target classification when big-data solutions are not enough. For an environment displaying frequent foreground clutter, such as a busy train station, we propose an algorithm exploiting foreground object presence to segment underlying scene structure that is not often visible. If such a location is outdoors and surveyed by an infrared (IR) and visible-band camera set-up, scene context and contextual knowledge transfer allow reasonable class predictions to be determined for thermal signatures within the scene. Furthermore, a labeled LWIR image corpus is created to train an infrared object classifier using a CNN approach. The trained network achieves an effective classification accuracy of 95% over six object classes. However, this performance is not sustained for IR targets acquired at long range, where low signal quality causes classification accuracy to drop. This is addressed by mobilising spatial context to affect the network's class scores, restoring robust classification capability.
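
    The final step, using spatial context to affect the network's class scores, can be read as a simple Bayesian re-weighting: multiply the CNN's softmax output by a location-dependent class prior and renormalise. The sketch below illustrates that reading only; the thesis's actual mechanism may differ, and the numbers are invented for the example.

        import numpy as np

        def rescore_with_spatial_context(cnn_probs, context_prior):
            """Re-weight CNN class probabilities with a spatial-context prior.

            cnn_probs:     softmax output for one detection, shape (num_classes,)
            context_prior: P(class | image location), same shape, e.g. learned
                           from where each class tends to appear in the scene
            """
            posterior = np.asarray(cnn_probs) * np.asarray(context_prior)
            return posterior / posterior.sum()

        # A weak, ambiguous long-range detection over six classes ...
        probs = np.array([0.30, 0.28, 0.22, 0.10, 0.06, 0.04])
        # ... observed on a road region where class 0 (say, vehicles) dominates.
        prior = np.array([0.55, 0.10, 0.10, 0.10, 0.10, 0.05])
        print(rescore_with_spatial_context(probs, prior))  # class 0 now clearly wins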

    Vehicle classification in intelligent transport systems: an overview, methods and software perspective

    Vehicle Classification (VC) is a key element of Intelligent Transportation Systems (ITS). A diverse range of ITS applications, such as security systems, surveillance frameworks, fleet monitoring, traffic safety, and automated parking, use VC. In current VC methods, vehicles are classified locally, as a vehicle passes through a monitoring area, by fixed sensors or by a compound method. This paper presents a comprehensive study of the state of the art of VC methods. We introduce a detailed VC taxonomy and explore the different kinds of traffic information that can be extracted via each method. Subsequently, traditional and cutting-edge VC systems are investigated from different aspects. Specifically, the strengths and shortcomings of the existing VC methods are discussed, and real-time alternatives such as Vehicular Ad-hoc Networks (VANETs) are investigated for conveying physical as well as kinematic characteristics of the vehicles. Finally, we review a broad range of soft-computing solutions involved in VC in the context of machine learning, neural networks, miscellaneous features, models and other methods.