7 research outputs found

    Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data

    The paucity of large, curated, hand-labeled training datasets for every domain of interest is a major bottleneck in deploying machine learning models in computer vision and other fields. Recent work (Data Programming) has shown how distant supervision signals in the form of labeling functions can be used to obtain labels for given data in near-constant time. In this work, we present Adversarial Data Programming (ADP), an adversarial methodology that generates data as well as a curated aggregated label, given a set of weak labeling functions. We validated our method on the MNIST, Fashion MNIST, CIFAR 10 and SVHN datasets, and it outperformed many state-of-the-art models. We conducted extensive experiments to study its usefulness, and showed how the proposed ADP framework can be used for transfer learning as well as multi-task learning, where data from two domains are generated simultaneously using the framework along with the label information. Our future work will involve understanding the theoretical implications of this new framework from a game-theoretic perspective, as well as exploring the performance of the method on more complex datasets. Comment: CVPR 2018 main conference paper
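    To make the idea of weak labeling functions concrete, here is a minimal sketch in Python. It is not the ADP method (which is adversarial) and not the Data Programming aggregation either; it swaps in a plain majority vote as the simplest possible stand-in. The three labeling functions, their thresholds, the "bright"/"dark" labels and the name `aggregate_label` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion


def lf_mean_brightness(img):
    """Hypothetical heuristic: vote on the mean pixel intensity."""
    pixels = [v for row in img for v in row]
    mean = sum(pixels) / len(pixels)
    if mean > 128:
        return "bright"
    if mean < 64:
        return "dark"
    return ABSTAIN


def lf_center_pixel(img):
    """Hypothetical heuristic: vote on the centre pixel only."""
    center = img[len(img) // 2][len(img[0]) // 2]
    if center >= 200:
        return "bright"
    if center <= 50:
        return "dark"
    return ABSTAIN


def lf_max_pixel(img):
    """Hypothetical heuristic: an image with no strong pixel is 'dark'."""
    return "dark" if max(max(row) for row in img) < 100 else ABSTAIN


def aggregate_label(img, labeling_functions):
    """Majority vote over non-abstaining functions (an assumed stand-in,
    not the learned aggregation used by Data Programming or ADP).
    Returns ABSTAIN when nobody votes or the top two labels tie."""
    votes = [lf(img) for lf in labeling_functions]
    counts = Counter(v for v in votes if v is not ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]
```

    Calling `aggregate_label(img, [lf_mean_brightness, lf_center_pixel, lf_max_pixel])` on a small grayscale image given as a list of rows returns one aggregated weak label, or `None` when the heuristics abstain or disagree evenly.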

    Apreensão e discretização de ambientes tangíveis em sistemas de realidade aumentada [Capture and discretization of tangible environments in augmented reality systems]

    Integrated master's dissertation in Informatics Engineering. Augmented Reality (AR) is the interactive, real-time mixing of virtual elements into the real world. The concept raises many questions about the visual coherence between real and virtual objects in an environment. To improve the inclusion of virtual elements in the physical environment, a number of computer vision techniques and algorithms have been created that, through spatial mapping, feature extraction, fiducial markers on objects, verification, detection, identification and classification, among others, make it possible to analyse and structure the content of a scene. The greatest challenge of this dissertation proposal lies in how the information obtained from the sensors that complement today's AR devices is extracted and processed, so as to best represent and understand our surroundings and to prepare a space suited to introducing and presenting virtual content as harmoniously as possible. This document presents the state of the art on these topics in order to explore, improve and develop new techniques and paradigms for extracting, in real time, various features of the scene and surrounding objects from the most generic sensors found in many current mobile devices and augmented reality glasses. The final goal of processing this information is to recognise and understand the scene and the objects in the space surrounding these sensors. In parallel with this dissertation proposal, a framework called "Tangible Environments in Augmented Reality Systems (TEARS)" was developed to demonstrate what is discussed in this document, not only for scientific research purposes but also for use and support in a project and prototype carried out within the 5th-year curricular unit Informatics Engineering Project (IEP) of the Integrated Master's in Informatics Engineering (IMIE), titled "Remote Assistance with Mixed Reality (RAMR)".

    Revealing the Invisible: On the Extraction of Latent Information from Generalized Image Data

    The desire to reveal the invisible in order to explain the world around us has been a source of impetus for technological and scientific progress throughout human history. Many of the phenomena that directly affect us cannot be sufficiently explained by observations made with our primary senses alone. Often this is because their originating cause is either too small, too far away, or otherwise obstructed. In other words, it is invisible to us. Without careful observation and experimentation, our models of the world remain inaccurate, and research has to be conducted to improve our understanding of even the most basic effects. In this thesis, we present our solutions to three challenging problems in visual computing, where a surprising amount of information is hidden in generalized image data and cannot easily be extracted by human observation or existing methods. We are able to extract the latent information using non-linear and discrete optimization methods based on physically motivated models and computer graphics methodology, such as ray tracing, real-time transient rendering, and image-based rendering.

    The PatchMatch Randomized Matching Algorithm for Image Manipulation

    This paper presents a new randomized algorithm for quickly finding approximate nearest neighbor matches between image patches. Our algorithm offers substantial performance improvements over the previous state of the art (20–100×), enabling its use in new interactive image editing tools, computer vision, and video applications. Previously, the cost of computing such matches for an entire image had eluded efforts to provide interactive performance. The key insight driving our algorithm is that the elements of our search domain (patches of image pixels) are correlated, and thus the search strategy takes advantage of these statistics. Our algorithm uses two principles: first, that good patch matches can be found via random sampling, and second, that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. Our simple algorithm allows finding a single nearest neighbor match across translations only, whereas our general algorithm additionally allows matching of k-nearest neighbors, across all rotations and scales, and matching arbitrary descriptors. This one simple algorithm forms the basis for a variety of applications including image retargeting, completion, reshuffling, object detection, digital forgery detection, and video summarization.
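    The two principles named in the abstract (random sampling and propagation to neighbouring patches) can be sketched for the simple, translation-only case as follows. This is an illustrative pure-Python sketch under stated assumptions, not the authors' implementation: images are small grayscale lists of rows, patch distance is the sum of squared differences, and the names `patch_distance` and `patchmatch`, the default patch size and the iteration count are all hypothetical choices.

```python
import random


def patch_distance(a, b, ay, ax, by, bx, p):
    """Sum of squared differences between the p x p patch of `a` whose
    top-left corner is (ay, ax) and the patch of `b` at (by, bx)."""
    total = 0
    for dy in range(p):
        for dx in range(p):
            diff = a[ay + dy][ax + dx] - b[by + dy][bx + dx]
            total += diff * diff
    return total


def patchmatch(a, b, p=3, iterations=4, seed=0):
    """Approximate nearest-neighbour field from patches of `a` to `b`.
    Returns (nnf, cost): nnf[y][x] is the matched patch corner in `b`,
    cost[y][x] is the distance of that match."""
    rng = random.Random(seed)
    ah, aw = len(a) - p + 1, len(a[0]) - p + 1  # valid patch corners in a
    bh, bw = len(b) - p + 1, len(b[0]) - p + 1  # valid patch corners in b

    # Random initialisation of the field.
    nnf = [[(rng.randrange(bh), rng.randrange(bw)) for _ in range(aw)]
           for _ in range(ah)]
    cost = [[patch_distance(a, b, y, x, nnf[y][x][0], nnf[y][x][1], p)
             for x in range(aw)] for y in range(ah)]

    def try_candidate(y, x, by, bx):
        # Keep the candidate only if it is in bounds and strictly better.
        if 0 <= by < bh and 0 <= bx < bw:
            d = patch_distance(a, b, y, x, by, bx, p)
            if d < cost[y][x]:
                cost[y][x] = d
                nnf[y][x] = (by, bx)

    for it in range(iterations):
        step = 1 if it % 2 == 0 else -1  # alternate the scan direction
        ys = range(ah) if step == 1 else range(ah - 1, -1, -1)
        for y in ys:
            xs = range(aw) if step == 1 else range(aw - 1, -1, -1)
            for x in xs:
                # Propagation: reuse the already-visited neighbours'
                # matches, shifted by one pixel.
                if 0 <= x - step < aw:
                    ny, nx = nnf[y][x - step]
                    try_candidate(y, x, ny, nx + step)
                if 0 <= y - step < ah:
                    ny, nx = nnf[y - step][x]
                    try_candidate(y, x, ny + step, nx)
                # Random search: sample around the current best match in
                # a window that halves each round.
                radius = max(bh, bw)
                while radius >= 1:
                    cy, cx = nnf[y][x]
                    try_candidate(y, x,
                                  cy + rng.randint(-radius, radius),
                                  cx + rng.randint(-radius, radius))
                    radius //= 2
    return nnf, cost
```

    Because a candidate replaces the current match only when it is strictly better, the per-patch cost never increases from one iteration to the next; the result is approximate, so individual matches are not guaranteed to be the true nearest neighbours.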

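    Deep learning y big data en cartografía digital. Creación de inteligencias artificiales para el tratamiento de ortofotografías y sistemas de información geográfica tridimensionales [Deep learning and big data in digital cartography. Creating artificial intelligences for processing orthophotographs and three-dimensional geographic information systems]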

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Filosofía y Letras, Departamento de Geografía. Date of defense: 16-07-202