157 research outputs found

    Video Processing Techniques for Traffic Information Acquisition Using Uncontrolled Video Streams

    This paper reports on the first steps towards a solution that uses public video streams available on the Internet to meet the growing need to monitor transportation networks. The intent is to return added value to the community, either by enabling a better understanding of the network and its needs or by feeding applications with real-time information for purposes such as simulation, decision-making support and updated route guidance. After introducing the field, we present a survey of related work and explain the algorithms that can be adopted to extract relevant information from video streams. We then analyse the issues that may arise and the best ways to address them. Finally, we report the results achieved so far, draw conclusions and suggest the next steps of our research.

    Detecting 3D geometric boundaries of indoor scenes under varying lighting

    The goal of this research is to identify 3D geometric boundaries in a set of 2D photographs of a static indoor scene under unknown, changing lighting conditions. A 3D geometric boundary is a contour located at a 3D depth discontinuity or a discontinuity in the surface normal. These boundaries can be used effectively for reasoning about the 3D layout of a scene. To distinguish 3D geometric boundaries from 2D texture edges, we analyze the illumination subspace of local appearance at each image location. In indoor time-lapse photography and surveillance video, we frequently see images that are lit by unknown combinations of uncalibrated light sources. We introduce an algorithm for semi-binary non-negative matrix factorization (SBNMF) to decompose such images into a set of lighting basis images, each of which shows the scene lit by a single light source. These basis images provide a natural, succinct representation of the scene, enabling tasks such as scene editing (e.g., relighting) and shadow edge identification.
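The factorization model described in this abstract can be illustrated with a minimal sketch. It assumes each image is a sum of non-negative basis images (one per light source) weighted by binary on/off indicators. The function names `best_binary_codes` and `update_basis` are hypothetical, and the two steps below (brute-force code search and clipped least squares) are a generic illustration of the semi-binary model, not the SBNMF algorithm from the paper.

```python
import itertools

import numpy as np


def best_binary_codes(images, basis):
    """Pick, for each image, the on/off light pattern with the lowest squared error.

    images: (pixels, n_images) array, one flattened image per column.
    basis:  (pixels, n_lights) non-negative array, one basis image per column.
    Returns a (n_lights, n_images) array of 0/1 indicators.
    Brute force over all 2**n_lights patterns, so only practical for few lights.
    """
    n_lights = basis.shape[1]
    # Every possible on/off pattern, one per column: shape (n_lights, 2**n_lights).
    patterns = np.array(list(itertools.product([0.0, 1.0], repeat=n_lights))).T
    # Image predicted by each pattern: shape (pixels, 2**n_lights).
    candidates = basis @ patterns
    # Squared error between every image and every candidate: (n_images, 2**n_lights).
    errors = ((images[:, :, None] - candidates[:, None, :]) ** 2).sum(axis=0)
    return patterns[:, errors.argmin(axis=1)]


def update_basis(images, on_off):
    """Estimate basis images from known on/off indicators.

    Solves images ~= basis @ on_off in the least-squares sense, then clips
    negative values to zero to keep the basis non-negative (a simple
    projection, assumed here for illustration).
    """
    solution, *_ = np.linalg.lstsq(on_off.T, images.T, rcond=None)
    return np.clip(solution.T, 0.0, None)
```

Alternating the two functions from a random non-negative starting basis gives a simple heuristic for this kind of decomposition; convergence to the true lighting basis is not guaranteed by this sketch.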

    Face Recognition Under Varying Illumination

    This study is the result of a successful joint venture with my adviser, Prof. Dr. Muhittin Gökmen. I am thankful to him for his continuous assistance in preparing this project. Special thanks to the assistants of the Computer Vision Laboratory for their steady support and help with many topics related to the project.

    Physics-based vision meets deep learning

    Physics-based vision explores computer vision and graphics problems by applying methods based upon physical models. Deep learning, on the other hand, is a learning-based technique in which a substantial number of observations are used to train an expressive yet unexplainable neural network model. In this thesis, we propose the concept of a model-based decoder, a non-learnable, differentiable neural layer designed according to a physics-based model. Constructing neural networks with such model-based decoders affords the model strong learning capability as well as the potential to respect the underlying physics. We start the study by developing a toolbox of differentiable photometric layers ported from classical photometric techniques. This enables us to perform the image formation process given geometry, illumination and a reflectance function. Applying these differentiable photometric layers to the training of a bidirectional reflectance distribution function (BRDF) estimation network, we show that the network can be trained in a self-supervised manner without knowledge of ground truth BRDFs. Next, in a more general setting, we attempt to solve inverse rendering problems in a self-supervised fashion by making use of model-based decoders. Here, an inverse rendering network decomposes a single image into a normal map, a diffuse albedo map and illumination. To achieve self-supervised training, we draw inspiration from multiview stereo (MVS) and employ a Lambertian model and a cross-projection MVS model to generate model-based supervisory signals. Finally, we seek potential hybrids of a neural decoder and a model-based decoder on a pair of practical problems: image relighting, and fine-scale depth prediction with novel view synthesis. In contrast to using model-based decoders only to supervise the training, the model-based decoder in our hybrid model serves to disentangle the intricate problem into a set of physically connected, solvable ones. 
In practice, we develop a hybrid model that can estimate a fine-scale depth map and synthesize novel views from a single image by using a physical subnet to combine results from an inverse rendering network with a monodepth prediction network. For neural image relighting, we propose another hybrid model that uses a Lambertian renderer to generate initial estimates of the relighting results, followed by a neural renderer that corrects deficits in the initial renderings. We demonstrate that the model-based decoder can significantly improve the quality of results and relax the demand for labelled data.
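As an illustration of the kind of image formation a Lambertian model-based decoder performs, the following minimal sketch renders an image from a normal map, a diffuse albedo map and a single directional light. The function name `lambertian_render` and its parameters are assumptions made for this example; the thesis's actual layers, lighting representation and framework are not given in the abstract.

```python
import numpy as np


def lambertian_render(normals, albedo, light_dir, light_intensity=1.0):
    """Render a single-channel image under the Lambertian reflectance model.

    normals:   (H, W, 3) array of unit surface normals.
    albedo:    (H, W) array of diffuse albedo values.
    light_dir: length-3 vector pointing from the surface towards the light
               (normalized inside the function).
    Each pixel is intensity * albedo * max(0, n . l); surfaces facing away
    from the light receive no illumination.
    """
    light = np.asarray(light_dir, dtype=float)
    light = light / np.linalg.norm(light)
    # Cosine of the angle between normal and light, clamped at zero.
    shading = np.clip(normals @ light, 0.0, None)
    return light_intensity * albedo * shading
```

The layer has no trainable parameters, which matches the abstract's description of a decoder built from a physical model. Written with an automatic differentiation library instead of NumPy, the same expression would let gradients flow back to the predicted normals, albedo and lighting.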

    PERSON RE-IDENTIFICATION USING RGB-DEPTH CAMERAS

    The presence of surveillance systems in our lives has increased drastically in recent years. Camera networks can be seen in almost every crowded public and private place, and they generate a huge amount of data containing valuable information. The automatic analysis of these data plays an important role in extracting relevant information from the scene. In particular, person re-identification is a prominent topic that has attracted great interest, especially in the fields of security and marketing. However, factors such as changes in illumination conditions, variations in the person's pose, occlusions and the presence of outliers make this topic really challenging. Fortunately, the recent introduction of new technologies such as depth cameras opens new paradigms in the image processing field and brings new possibilities. This thesis proposes a new complete framework to tackle the problem of person re-identification using commercial RGB-depth cameras. The work includes the analysis and evaluation of new approaches for the modules of segmentation, tracking, description and matching. To evaluate our contributions, a public dataset for person re-identification using RGB-depth cameras has been created. RGB-depth cameras provide accurate 3D point clouds with color information. Based on the analysis of the depth information, a novel algorithm for person segmentation is proposed and evaluated. This method accurately segments any person in the scene and naturally copes with occlusions and connected people. The segmentation mask of a person generates a 3D person cloud, which can be easily tracked over time based on proximity. The accumulation of all the person point clouds over time generates a set of high-dimensional color features, named raw features, that provides useful information about the person's appearance. In this thesis, we propose a family of methods to extract relevant information from the raw features in different ways. 
The first approach compacts the raw features into a single color vector, named Bodyprint, that provides a good generalisation of the person's appearance over time. Second, we introduce the concept of the 3D Bodyprint, an extension of the Bodyprint descriptor that includes the angular distribution of the color features. Third, we characterise the person's appearance as a bag of color features that are independently generated over time. This descriptor is named Bag of Appearances because of its similarity to the concept of Bag of Words. Finally, we use different probabilistic latent variable models to reduce the feature vectors from a statistical perspective. The evaluation of the methods demonstrates that our proposals outperform the state of the art.
    Oliver Moll, J. (2015). PERSON RE-IDENTIFICATION USING RGB-DEPTH CAMERAS [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/59227