1,558 research outputs found

    Fotofacesua: sistema de gestão fotográfica da Universidade de Aveiro

    Automation is now present in nearly every computational system. With the rise of machine learning algorithms over the years, the need for human intervention in such systems has dropped considerably. Nevertheless, universities, companies, and even governmental institutions still have some systems that have not been automated. One such case is profile photo management, which still requires human intervention to check whether an image meets the institution's mandatory criteria for submitting a new photo. FotoFaces is a system for updating the profile photos of collaborators at the University of Aveiro; it allows a collaborator to submit a new photo and automatically decides, through a set of image processing algorithms, whether the photo meets a set of predefined criteria. One of the main advantages of this system is that it can be used in any institution and adapted to different needs simply by changing the algorithms or criteria considered. This dissertation describes some improvements implemented in the existing system, as well as some new features in terms of the available algorithms. The main contributions to the system are sunglasses detection, hat detection, and background analysis. For the first two, it was necessary to create and label a new database to train, validate, and test a deep transfer learning network used to detect sunglasses and hats. In addition, several tests were performed by varying the parameters of the network and applying machine learning and pre-processing techniques to the input images. Finally, the background analysis consists of the implementation and testing of two existing algorithms from the literature, one low-level and the other based on deep learning.
Overall, the results obtained in improving the existing algorithms, together with the performance of the new image processing modules, allowed the creation of a more robust (improved production-version algorithms) and more versatile (new algorithms added to the system) profile photo update system.
    Master's degree in Electronics and Telecommunications Engineering (Mestrado em Engenharia Eletrónica e Telecomunicações)

    Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence

    Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with physical environments for mobile devices. MAR systems enable users to interact with MAR devices, such as smartphones and head-worn wearables, and to move seamlessly from the physical world to a mixed world with digital entities. These MAR systems support user experiences on MAR devices to provide universal access to digital content. Over the past 20 years, several MAR systems have been developed; however, the study and design of MAR frameworks have not yet been systematically reviewed from the perspective of user-centric design. This article presents the first effort to survey existing MAR frameworks (count: 37) and further discusses the latest studies on MAR through a top-down approach: (1) MAR applications; (2) MAR visualisation techniques adaptive to user mobility and contexts; (3) systematic evaluation of MAR frameworks, including supported platforms and corresponding features such as tracking, feature extraction, and sensing capabilities; and (4) underlying machine learning approaches supporting intelligent operations within MAR systems. Finally, we summarise the development of emerging research fields and the current state of the art, and discuss the important open challenges and possible theoretical and technical directions. This survey aims to benefit both researchers and MAR system developers.
    Peer reviewed

    Large-Scale Mapping of Human Activity using Geo-Tagged Videos

    This paper is the first work to perform spatio-temporal mapping of human activity using the visual content of geo-tagged videos. We utilize a recent deep-learning-based video analysis framework, termed hidden two-stream networks, to recognize a range of activities in YouTube videos. This framework is efficient and can run in real time or faster, which is important for recognizing events as they occur in streaming video or for reducing latency in analyzing already captured video. This is, in turn, important for using video in smart-city applications. We perform a series of experiments to show that our approach is able to accurately map activities both spatially and temporally. We also demonstrate the advantages of using the visual content over the tags/titles.
    Comment: Accepted at ACM SIGSPATIAL 201

    Robust pedestrian detection and path prediction using improved YOLOv5

    In vision-based surveillance systems, pedestrian recognition and path prediction are critical concerns. Advanced computer vision applications, however, face numerous challenges due to variations in pedestrian posture and scale, backgrounds, and occlusion. To tackle these challenges, we present a YOLOv5-based deep learning method for pedestrian recognition and path prediction. The improved YOLOv5 model is first used to detect pedestrians of various sizes and proportions. The proposed path prediction method then estimates the pedestrian's path from motion data. The method handles partial occlusion to reduce the loss caused by object occlusion, and links recognition results with motion attributes. The path prediction algorithm then uses motion and directional data to estimate the direction of pedestrian movement. According to the experimental results, the proposed method outperforms existing methods. Finally, we draw conclusions and look into future study

    Translating Video Recordings of Mobile App Usages into Replayable Scenarios

    Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly evolving platforms, automated techniques for analyzing all types of rich software artifacts provide benefit to mobile developers. Unfortunately, automatically analyzing screen recordings presents serious challenges, due to their graphical nature, compared to other types of (textual) artifacts. To address these challenges, this paper introduces V2S, a lightweight, automated approach for translating video recordings of Android app usages into replayable scenarios. V2S is based primarily on computer vision techniques and adapts recent solutions for object detection and image classification to detect and classify user actions captured in a video, and convert these into a replayable test scenario. We performed an extensive evaluation of V2S involving 175 videos depicting 3,534 GUI-based actions collected from users exercising features and reproducing bugs from over 80 popular Android apps. Our results illustrate that V2S can accurately replay scenarios from screen recordings, and is capable of reproducing approximately 89% of our collected videos with minimal overhead. A case study with three industrial partners illustrates the potential usefulness of V2S from the viewpoint of developers.
    Comment: In proceedings of the 42nd International Conference on Software Engineering (ICSE'20), 13 pages

    Scene understanding for autonomous robots operating in indoor environments

    International Mention in the doctoral degree.
    The idea of having robots among us is not new. Great efforts are continually made to replicate human intelligence, with the vision of having robots perform different activities, including hazardous, repetitive, and tedious tasks. Research has demonstrated that robots are good at many tasks that are hard for us, mainly in terms of precision, efficiency, and speed. However, some tasks that humans do without much effort are challenging for robots. Robots in domestic environments in particular are far from satisfactorily fulfilling some tasks, mainly because these environments are unstructured, cluttered, and subject to a variety of environmental conditions. This thesis addresses the problem of scene understanding in the context of autonomous robots operating in everyday human environments. Furthermore, this thesis is developed under the HEROITEA research project, which aims to develop a robot system that assists elderly people in domestic environments. Our main objective is to develop different methods that allow robots to acquire more information from the environment and progressively build knowledge that improves their performance on high-level robotic tasks. Scene understanding is a broad research topic, and it is considered a complex task because of the multiple sub-tasks involved. In this thesis, we focus on three sub-tasks: object detection, scene recognition, and semantic segmentation of the environment. Firstly, we implement methods to recognize objects in real indoor environments. We apply machine learning techniques that incorporate uncertainties, as well as more modern techniques based on deep learning. Apart from detecting objects, it is essential to comprehend the scene in which they occur. For this reason, we propose an approach for scene recognition that considers the influence of the detected objects in the prediction process. We demonstrate that the existing objects and their relationships can improve the inference about the scene class. We also consider that a scene recognition model can benefit from the advantages of other models, and we propose a multi-classifier model for scene recognition based on weighted voting schemes. The experiments carried out in real-world indoor environments demonstrate that an adequate combination of independent classifiers yields a more robust and precise model for scene recognition. Moreover, to increase a robot's understanding of its surroundings, we propose a new division of the environment based on regions to build a useful representation of the environment. Object and scene information is integrated in a probabilistic fashion, generating a semantic map of the environment containing meaningful regions within each room. The proposed system has been assessed in simulated and real-world domestic scenarios, demonstrating its ability to generate consistent environment representations. Lastly, full knowledge of the environment can enhance more complex robotic tasks; for that reason, in this thesis we study how complete knowledge of the environment influences the robot's performance in high-level tasks. To do so, we select an essential task: searching for objects. This mundane task can be considered a precondition for many complex robotic tasks such as fetching and carrying, manipulation, and attending to user requirements, among others. The execution of these activities by service robots needs full knowledge of the environment to perform each task efficiently. In this thesis, we propose two searching strategies that consider prior information, the semantic representation of the environment, and the relationships between known objects and the type of scene.
All our developments are evaluated in simulated and real-world environments, integrated with other systems, and operated on real platforms, demonstrating that they are feasible to implement in real scenarios and, in some cases, outperform other approaches. We also demonstrate how our representation of the environment can boost the performance of more complex robotic tasks compared to more standard environmental representations.
    Doctoral Programme in Electrical, Electronic and Automation Engineering, Universidad Carlos III de Madrid. President: Carlos Balaguer Bernaldo de Quirós. Secretary: Fernando Matía Espada. Member: Klaus Strob

    Desarrollo de técnicas avanzadas de seguimiento de posturas para reconocimiento de comportamientos de C. elegans

    Thesis by compendium.
    The main objective of this thesis is the development of advanced posture-tracking techniques for behavioural recognition of Caenorhabditis elegans, or C. elegans. C. elegans is a kind of nematode used as a model organism for the study and treatment of different pathological and neurodegenerative diseases. Its behaviour provides valuable information for research into new drugs (or healthy food and cosmetic products) in the study of lifespan and healthspan. Today, many of the tests on C. elegans are performed manually, i.e. using microscopes to track the nematodes and observe their behaviour, or, in more modern laboratories, using specific software. These programmes are not fully automatic and require parameter adjustment. In other cases, they are programmes for image visualisation where the operator must label the behaviour of each C. elegans manually.
All this translates into many hours of work, which can be automated using computer vision techniques. Computer vision can also estimate mobility indicators more accurately than a human operator. The main problem in tracking C. elegans postures in Petri dishes is aggregation between nematodes or with noise from the environment. Loss or changes of identity are very common, whether tracking is done manually or with automatic/semi-automatic programs, and the problem becomes even more complicated in low-resolution images. Programs that automate these posture-tracking tasks rely on computer vision, using either traditional image processing techniques or deep learning techniques. Both have shown excellent results in the detection and tracking of C. elegans postures. Traditional techniques use algorithms/optimizers to obtain the best solution, while deep learning techniques automatically learn features from the training dataset. The problem with deep learning techniques is that they need a large, dedicated dataset to train the models. The methodology used for the development of this thesis (advanced posture-tracking techniques) falls within the research area of computer vision. It has been approached by exploring both branches of computer vision to solve the posture-tracking problems of C. elegans in low-resolution images. The first part, i.e. sections 1 and 2 of chapter 2, used traditional image processing techniques to perform posture detection and tracking of C. elegans. For this purpose, a new skeletonization technique and two new evaluation criteria were proposed to obtain better posture-tracking, detection, and segmentation results. The next sections of chapter 2 use deep learning techniques and synthetic image simulation to train models and improve posture detection and prediction results. The results proved to be faster and more accurate compared to traditional techniques. Deep learning methods were also shown to be more robust in the presence of plate noise.
    This research was supported by Ministerio de Ciencia, Innovación y Universidades [RTI2018-094312-B-I00 (European FEDER funds); FPI PRE2019-088214], and was also supported by Universitat Politècnica de València ["Funding for open access charge: Universitat Politècnica de València"]. The author received a scholarship from the grant: Ayudas para contratos predoctorales para la formación de doctores 2019.
    Layana Castro, PE. (2023). Desarrollo de técnicas avanzadas de seguimiento de posturas para reconocimiento de comportamientos de C. elegans [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/198879

    Unobtrusive Assessment Of Student Engagement Levels In Online Classroom Environment Using Emotion Analysis

    Measuring student engagement has emerged as a significant factor in the process of learning and a good indicator of the knowledge retention capacity of the student. As synchronous online classes have become more prevalent in recent years, gauging a student's attention level has become more critical in validating the progress of every student in an online classroom environment. This paper details a study on profiling student attentiveness into different gradients of engagement level using multiple machine learning models. Results from the high-accuracy model and the confidence scores obtained from the cloud-based computer vision platform Amazon Rekognition were then used to statistically test for correlation between student attentiveness and emotions. This statistical analysis helps to identify the significant emotions that are essential in gauging various engagement levels. The study found that emotions such as calm, happy, surprise, and fear are critical in gauging a student's attention level. These findings help in the earlier detection of students with lower attention levels, consequently helping instructors focus their support and guidance on the students in need, leading to a better online learning environment.