
    Mobile app with steganography functionalities

    Steganography is the practice of hiding information within other data, such as images, audio, or video. In this research, we apply this technique to create a mobile application that lets users conceal their own secret data inside other media formats, send the encoded data to other users, and even perform analysis on images that may have been subjected to a steganographic attack. For image steganography, lossless compression formats employ Least Significant Bit (LSB) encoding within Red Green Blue (RGB) pixel values. Conversely, lossy compression formats, such as JPEG, conceal data in the frequency domain by altering the quantized matrices of the files. Video steganography follows two similar methods. In lossless video formats, the LSB approach is applied to the RGB pixel values of individual frames. Meanwhile, in lossy High Efficiency Video Coding (HEVC) formats, a displaced-bit modification technique is applied to the YUV components. Final degree project (UDC.FIC). Computer Engineering. Academic year 2022/202
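    The LSB scheme for lossless formats is simple enough to illustrate directly. Below is a minimal sketch, not the thesis code, of embedding and extracting a message in the least significant bits of RGB pixel values; it assumes a lossless container such as PNG, and the function names are hypothetical.

```python
# Minimal LSB steganography sketch (illustrative, not the thesis code).
# Assumes a lossless container (e.g., PNG) so pixel values survive saving.
from PIL import Image
import numpy as np

def embed_lsb(cover_path, message, out_path):
    """Hide a NUL-terminated UTF-8 message in the LSBs of RGB values."""
    img = np.array(Image.open(cover_path).convert("RGB"))
    bits = np.unpackbits(np.frombuffer(message.encode() + b"\x00", dtype=np.uint8))
    flat = img.flatten()
    if len(bits) > len(flat):
        raise ValueError("message too large for this cover image")
    flat[:len(bits)] = (flat[:len(bits)] & 0xFE) | bits  # overwrite only the LSBs
    Image.fromarray(flat.reshape(img.shape)).save(out_path)

def extract_lsb(stego_path):
    """Collect the LSBs, stop at the NUL terminator, decode the message."""
    flat = np.array(Image.open(stego_path).convert("RGB")).flatten()
    return np.packbits(flat & 1).tobytes().split(b"\x00", 1)[0].decode()
```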

    Towards Highly-Integrated Stereovideoscopy for in vivo Surgical Robots

    When compared to traditional surgery, laparoscopic procedures result in better patient outcomes: shorter recovery, reduced post-operative pain, and less trauma to incised tissue. Unfortunately, laparoscopic procedures require specialized training for surgeons, as these minimally-invasive procedures provide an operating environment with limited dexterity and limited vision. Advanced surgical robotics platforms can make minimally-invasive techniques safer and easier for the surgeon to complete successfully. The most common type of surgical robotics platform -- the laparoscopic robot -- accomplishes this with multi-degree-of-freedom manipulators that are capable of a more diversified set of movements than traditional laparoscopic instruments. These laparoscopic robots also allow for advanced kinematic translation techniques that let the surgeon focus on the surgical site while the robot calculates the best possible joint positions to complete any surgical motion. An important component of these systems is the endoscopic system used to transmit a live view of the surgical environment to the surgeon. Coupled with 3D high-definition endoscopic cameras, the platform as a whole, in effect, eliminates the peculiarities associated with laparoscopic procedures, allowing less-skilled surgeons to complete minimally-invasive surgical procedures quickly and accurately. A much newer approach to performing minimally-invasive surgery is the idea of using in-vivo surgical robots -- small robots that are inserted directly into the patient through a single, small incision; once inside, an in-vivo robot can perform surgery at arbitrary positions, with a much wider range of motion. While laparoscopic robots can harness traditional endoscopic video solutions, these in-vivo robots require a fundamentally different video solution that is as flexible as possible and free of bulky cables or fiber optics. This requires a miniaturized videoscopy system that incorporates an image sensor with a transceiver; because of severe size constraints, this system should be deeply embedded into the robotics platform. Here, early results are presented from the integration of a miniature stereoscopic camera into an in-vivo surgical robotics platform. A 26 mm x 24 mm stereo camera was designed and manufactured. The proposed device features USB connectivity and 1280 x 720 resolution at 30 fps. Resolution testing indicates the device performs much better than similarly-priced analog cameras. Suitability of the platform for 3D computer vision tasks -- including stereo reconstruction -- is examined. The platform was also tested in a living porcine model at the University of Nebraska Medical Center. Results from this experiment suggest that while the platform performs well in controlled, static environments, further work is required to obtain usable results in true surgeries. In conclusion, several ideas for improvement are presented, along with a discussion of core challenges associated with the platform. Adviser: Lance C. Pérez
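    The stereo-reconstruction suitability test mentioned above can be pictured with a standard block-matching disparity computation. A minimal OpenCV sketch follows, assuming already-rectified left/right frames from the camera; the file names are placeholders, and this is an illustration rather than the platform's actual pipeline.

```python
# Block-matching disparity sketch with OpenCV (illustrative).
# Assumes rectified left/right frames; the file names are placeholders.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be a multiple of 16; blockSize must be odd.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right)  # fixed-point values, scaled by 16

# With calibrated baseline b and focal length f: depth = f * b / disparity.
vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")
cv2.imwrite("disparity.png", vis)
```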

    Low Complexity Image Recognition Algorithms for Handheld Devices

    Content Based Image Retrieval (CBIR) has gained a lot of interest over the last two decades. The need to search and retrieve images from databases, based on information (“features”) extracted from the image itself, is becoming increasingly important. CBIR can be useful for handheld image recognition devices in which the image to be recognized is acquired with a camera, so no additional metadata is associated with it. However, most CBIR systems require heavy computation, preventing their use in handheld devices. In this PhD work, we have developed low-complexity algorithms for content based image retrieval of camera-acquired images on handheld devices. Two novel algorithms, ‘Color Density Circular Crop’ (CDCC) and ‘DCT-Phase Match’ (DCTPM), are presented, along with a two-stage image retrieval algorithm that combines CDCC and DCTPM to achieve the low complexity required in handheld devices. The image recognition algorithms run on a handheld device over a large database with fast retrieval time, besides having high accuracy, precision, and robustness to environment variations. Three algorithms for Rotation, Scale, and Translation (RST) compensation of images were also developed in this PhD work, to be used in conjunction with the two-stage image retrieval algorithm. The developed algorithms are implemented, using a commercial fixed-point Digital Signal Processor (DSP), in a device called ‘PictoBar’, in the domain of Alternative and Augmentative Communication (AAC). The PictoBar is intended to be used in the field of electronic aids for disabled people, in areas like speech rehabilitation therapy and education. The PictoBar is able to recognize pictograms and pictures contained in a database. Once an image is found in the database, a corresponding associated speech message is played. A methodology for optimal implementation and systematic testing of the developed image retrieval algorithms on a fixed-point DSP is also established as part of this PhD work
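    The two-stage cascade can be pictured as a cheap color-based filter that prunes the database, followed by a more selective frequency-domain match over the survivors. The sketch below shows only that structure; the actual CDCC and DCTPM algorithms are not specified in the abstract, so both stages here are hypothetical stand-ins, and the database is assumed to hold precomputed signatures.

```python
# Two-stage retrieval sketch (illustrative; CDCC and DCTPM internals are not
# given in the abstract, so both stages below are hypothetical stand-ins).
import numpy as np
from scipy.fft import dctn

def color_signature(img_rgb):
    # Stage-1 stand-in: a coarse color histogram used to prune candidates,
    # in the spirit of Color Density Circular Crop (CDCC).
    hist, _ = np.histogramdd(img_rgb.reshape(-1, 3), bins=(4, 4, 4),
                             range=((0, 256),) * 3)
    return hist.ravel() / hist.sum()

def dct_phase(img_gray, size=32):
    # Stage-2 stand-in: signs of low-frequency DCT coefficients,
    # in the spirit of DCT-Phase Match (DCTPM).
    return np.sign(dctn(img_gray[:size, :size].astype(float)))

def retrieve(query_rgb, query_gray, db, shortlist=10):
    # db: list of {"sig": ..., "phase": ..., "label": ...} built offline.
    q_sig = color_signature(query_rgb)
    ranked = sorted(db, key=lambda e: np.abs(e["sig"] - q_sig).sum())[:shortlist]
    # Re-rank the shortlist by DCT-phase agreement (higher is better).
    q_phase = dct_phase(query_gray)
    return max(ranked, key=lambda e: (e["phase"] == q_phase).mean())
```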

    Audio module integration into high quality video transmission software

    The project involves adding new functionality to the free-software HD transmission system UltraGrid. It consists of integrating an audio module that can capture audio and video either from the same source or, in a synchronized way, from an independent source, for each user. This will yield a complete, adaptable videoconferencing system. In a second phase of the project, the student will adapt the audio module to multi-user environments (multiconference), in the N:1 and N:N modes, incorporating the audio improvements already achieved. Finally, the feasibility of developing a software-based echo canceller and a voice-activity detection module that gives a preferred position to the user who is speaking in the multiconference will be studied

    TinySurveillance: A Low-power Event-based Surveillance Method for Unmanned Aerial Vehicles

    Unmanned Aerial Vehicles (UAVs) have always faced power management challenges. Managing power consumption becomes critical, especially in surveillance applications, where longer flight time results in wider coverage and a cheaper solution. While most current studies focus on utilizing new models for improving event detection without considering power constraints, our design's first priority is the platform's power efficiency. Implementing an algorithm on a portable device with minimal access to power supply sources requires special hardware and software considerations. An improved algorithm may need more powerful hardware, which can surge power consumption. Therefore, we aim to propose a method suitable for devices with power consumption constraints. In this work, we propose an event-driven surveillance method with an efficient video transmission algorithm that reduces power consumption while preserving image quality. Surveillance starts automatically once the low-power AI-based onboard processor detects the desired event. The drone repeatedly solves a classification problem by employing a lightweight deep learning algorithm. When the UAV detects the defined event, a sample image is sent to the server for validation. Afterwards, if the server validates the drone's decision, the drone starts sending a colored image accompanied by a group of N grayscale images. Then, on the server, the grayscale images are colorized using a convolutional neural network trained on the colored images. By adopting this method, the transmitted data rate decreases and the server's computation load increases; the former results in a drop in the UAV's power consumption, which is our aim. In this work, an application of wildfire detection and surveillance has been implemented as a proof of concept of the TinySurveillance method. Using four videos of similar scenarios, with various spatial and temporal characteristics that a UAV may face, we show the effectiveness of our method. Our results show that the power consumption of the onboard processing unit in detection mode is reduced by at least 4 times, reaching a detection accuracy of 85%, while in surveillance mode we can decrease the data transmission rate by almost 66% while achieving competent image quality with PSNR_Avg of 41.35 dB, PSNR of 30.94 dB, and an output frame rate of 5.2 fps. Also, the reproduced images show the outstanding performance of the algorithm by generating colorized images identical to the original scenes. The main factors that affect our method's power consumption and output quality, such as the number of grayscale images, the transmitted video bitrate, the learning rate, and the video characteristics, are discussed comprehensively
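    The event-driven pipeline described above, detect on-board, validate on the server, then interleave one color frame with N grayscale frames, can be sketched as a simple control loop. The camera, classifier, and server interfaces below are assumptions for illustration, not the paper's implementation.

```python
# Event-driven surveillance loop sketch (illustrative; the camera, classifier,
# and server interfaces are assumptions, not the paper's implementation).
N_GRAYSCALE = 10  # grayscale frames sent per color frame (assumed value)

def to_grayscale(rgb):
    # Luma approximation (ITU-R BT.601 weights) over an HxWx3 array.
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def surveillance_loop(camera, classifier, server):
    while True:
        frame = camera.read()
        # Detection mode: the lightweight on-board classifier runs continuously.
        if classifier.predict(frame) != "event":
            continue
        # One sample image goes to the server, which validates the detection.
        if not server.validate(frame):
            continue
        # Surveillance mode: one color frame, then N cheap grayscale frames.
        # The server colorizes the grayscale frames with a CNN trained on the
        # color frames, which is what cuts the transmitted data rate.
        server.send_color(frame)
        for _ in range(N_GRAYSCALE):
            server.send_grayscale(to_grayscale(camera.read()))
```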

    Development of an integrated interface between SAGE and Ultragrid

    This document presents the Master thesis “Development of an integrated interface between SAGE and Ultragrid”. It sets out new user and company needs, ranging from knowledge sharing to productivity improvements, in the scope of advanced videoconferencing tools. From these needs, and after an analysis of the state of the art in high-definition videoconferencing, a new technological challenge emerges. The thesis sets out a novel design for a new kind of High Definition (uncompressed HD-SDI) videoconferencing system that is fully adaptable and scalable. By joining technologies for distributed visualization with technologies for advanced streaming of high-definition audiovisual content over IP networks, a new prototype has been deployed that meets the new technological requirements. The deployed system can display several HD-SDI streams simultaneously in a single application. The new transmission/visualization module also allows the HD-SDI stream to be divided into self-contained sub-streams, so that the receiving user can choose, according to their capabilities, the number of sub-streams to receive and process; this lets each user always work with the best quality they can handle. The result of the thesis is a low-latency, high-definition multi-videoconference system able to work point-to-multipoint, where each user receives a different resolution without transcoding. Finally, the obtained results have been analyzed, opening new research lines, and possible system improvements have been raised

    End-to-End Multiview Gesture Recognition for Autonomous Car Parking System

    The use of hand gestures can be the most intuitive human-machine interaction medium. Early approaches to hand gesture recognition used device-based methods, relying on mechanical or optical sensors attached to a glove or markers, which hinders natural human-machine communication. Vision-based methods, on the other hand, are not restrictive and allow for more spontaneous communication without the need for an intermediary between human and machine. Therefore, vision-based gesture recognition has been a popular area of research for the past thirty years. Hand gesture recognition finds application in many areas, particularly the automotive industry, where advanced automotive human-machine interface (HMI) designers are using gesture recognition to improve driver and vehicle safety. However, technology advances go beyond active/passive safety and into convenience and comfort. In this context, one of America's big three automakers has partnered with the Centre for Pattern Analysis and Machine Intelligence (CPAMI) at the University of Waterloo to investigate expanding their product segment through machine learning, providing increased driver convenience and comfort with the particular application of hand gesture recognition for autonomous car parking. In this thesis, we leverage state-of-the-art deep learning and optimization techniques to develop a vision-based multiview dynamic hand gesture recognizer for a self-parking system. We propose a 3DCNN gesture model architecture that we train on a publicly available hand gesture database. We apply transfer learning methods to fine-tune the pre-trained gesture model on custom-made data, which significantly improved the proposed system's performance in real-world environments. We adapt the architecture of the end-to-end solution to expand the state-of-the-art video classifier from a single-image input (fed by a monocular camera) to a multiview 360-degree feed offered by a six-camera module. Finally, we optimize the proposed solution to work on a resource-limited embedded platform (Nvidia Jetson TX2) used by automakers for vehicle-based features, without sacrificing the accuracy, robustness, or real-time functionality of the system
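    A 3DCNN video classifier of the kind described can be sketched in a few lines of PyTorch. The layer sizes, clip length, and class count below are assumptions rather than the thesis model, and the frozen-backbone fine-tuning mirrors the transfer-learning step in spirit only.

```python
# Minimal 3D-CNN video classifier sketch in PyTorch (illustrative; layer
# sizes, clip length, and class count are assumptions, not the thesis model).
import torch
import torch.nn as nn

class Gesture3DCNN(nn.Module):
    def __init__(self, num_classes=10, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            # 3D convolutions slide over (time, height, width), so the
            # network learns motion as well as appearance.
            nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, clips):  # clips: (batch, channels, frames, H, W)
        return self.classifier(self.features(clips).flatten(1))

# Transfer learning: load pre-trained weights, then fine-tune only the head.
model = Gesture3DCNN(num_classes=10)
# model.load_state_dict(torch.load("pretrained_gestures.pt"))  # hypothetical file
for p in model.features.parameters():
    p.requires_grad = False  # freeze the backbone during fine-tuning
logits = model(torch.randn(2, 3, 16, 112, 112))  # two 16-frame clips
```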

    Mapping of a partially known area using collaboration of two NAO robots

    Localization and mapping of unknown areas is a key topic in today's humanoid robotics and in all the applications and services that this field of engineering may offer in the future. Furthermore, collaboration always makes tasks easier and improves the productivity of processes. Taking this into account, the aim of this master thesis is to develop an algorithm that makes two humanoid robots work together to map a partially known area. For that purpose, NAO humanoid robots have been used, along with some landmarks that those robots are able to recognize. The location of the robots is acquired from the relative position of some common landmarks whose situation in the environment is previously known and which all the robots can detect. After localizing themselves, the robots locate other objects in the map, also represented by landmarks, using their relative position to those landmarks and their own position in the map. Thus, through information sharing, all robots have the locations of all the robots and landmarks without needing to see them. The code has been implemented in the programming language C++ using the NAOqi API. After the implementation of the program, experiments were carried out with two NAO robots in an indoor environment to measure the accuracy of the location of the robots as well as of the unknown landmarks in the map. The effect of factors such as the number of known landmarks, the distance between them, and the speed of the head movement on the accuracy and reliability of the mapping task has also been studied
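    The core geometric step, mapping a landmark measured in a robot's own frame into the shared map frame so the other robot can use it, reduces to a rotation plus a translation. A minimal 2D sketch, with made-up coordinates, follows.

```python
# Landmark position sharing sketch (illustrative; a 2D simplification of the
# thesis setup, with made-up coordinates).
import numpy as np

def object_world_position(robot_pose, rel_measurement):
    """Map a landmark measured relative to the robot into world coordinates.

    robot_pose: (x, y, theta) of the robot in the shared map frame.
    rel_measurement: (dx, dy) of the landmark in the robot's own frame.
    """
    x, y, theta = robot_pose
    c, s = np.cos(theta), np.sin(theta)
    dx, dy = rel_measurement
    return np.array([x + c * dx - s * dy,
                     y + s * dx + c * dy])

# Robot A localizes itself from known landmarks, observes a new object, and
# publishes its world position; robot B can use it without ever seeing it.
pose_a = (1.0, 2.0, np.pi / 2)              # assumed pose from known landmarks
new_object = object_world_position(pose_a, (0.5, 0.0))
print(new_object)                            # -> [1.0, 2.5] in the map frame
```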

    Rate-control for conversational H.264 video communication in heterogeneous networks

    The transmission bit rate available along a communication path in a heterogeneous network is highly variable. The wireless link quality may vary due to interference and fading phenomena and, paired with radio layer reconfiguration and link layer protection mechanisms, leads to varying error rates, latencies, and, most importantly, changes in the available bit rate. In both fixed and wireless networks, varying amounts of cross traffic from other nodes (i.e., the total offered load on the individual links of a network path) may lead to fluctuations in queue size (reflected again in the path latency) and to congestion (reflected in packet drops from router queues). Senders have to adapt dynamically to these network conditions and adjust their sending rate and possibly other transmission parameters (such as encoding or redundancy) to match the available bit rate while maximizing the media quality perceived at the receiver. We investigate congestion indicators and their characteristics in different multimedia environments. Taking these characteristics into account, we propose a rate-adaptation algorithm that works in the following environments: a) Mobile-Mobile, b) Internet-Internet, and c) heterogeneous Mobile-Internet scenarios. Using metrics such as Peak Signal-to-Noise Ratio (PSNR), loss rate, bandwidth utilization, and fairness, we compare the algorithm with other rate-control algorithms for conversational video communication
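    The thesis's rate-adaptation algorithm is not detailed in this abstract, but the general sender-side idea, reacting to congestion indicators by decreasing the rate multiplicatively and probing for bandwidth additively, can be sketched as follows; the thresholds and bounds are illustrative assumptions.

```python
# Generic sender-side rate adaptation sketch (illustrative; an AIMD-style
# loop, not the specific algorithm proposed in the thesis).
MIN_RATE, MAX_RATE = 64_000, 4_000_000    # bits per second (assumed bounds)

def adapt_rate(rate, loss_fraction, rtt_ms, rtt_baseline_ms):
    # Congestion indicators: packet loss and growing queueing delay.
    if loss_fraction > 0.02 or rtt_ms > 1.5 * rtt_baseline_ms:
        rate *= 0.75                       # multiplicative decrease on congestion
    else:
        rate += 16_000                     # additive increase while the path is clear
    return max(MIN_RATE, min(MAX_RATE, rate))

# Each receiver report drives one adaptation step; the encoder's target
# bitrate is then updated to the returned value.
rate = 512_000
rate = adapt_rate(rate, loss_fraction=0.0, rtt_ms=80, rtt_baseline_ms=75)
```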