9 research outputs found

    Large-Scale Image Segmentation with Convolutional Networks

    Object recognition is one of the most important problems in computer vision. However, visual recognition poses many challenges when artificial systems attempt to reproduce it. A main challenge is variability: objects appear under huge variations in pose, appearance, illumination and occlusion, and a visual system needs to be robust to all these changes. In this thesis, we are interested in pixel-level recognition problems, i.e., problems in which the objective is to partition a given image into multiple regions (overlapping or not) that are considered meaningful according to some criterion. Our interest is in algorithms that require the least amount of feature engineering and are easy to scale. Deep learning methods fit this objective very well: these models alleviate the need for engineered features by discriminatively training a system from raw data (pixels). More precisely, we propose different convolutional neural network (CNN) based algorithms to deal with three important segmentation problems: semantic segmentation, object proposal generation and object detection with segments. The objective of semantic segmentation is to assign a categorical label to each pixel in a scene. We first study fully supervised semantic segmentation. We propose a recurrent CNN that is able to consider a large input context (while limiting its capacity), which is essential to model long-range pixel label dependencies. This approach achieves state-of-the-art performance without relying on any post-processing smoothing step. However, densely labeled training images are expensive to obtain and require a lot of human labor. We therefore also propose a CNN-based model that is able to infer object semantic segmentation by leveraging only the object category information of images. This is achieved by casting the problem into a multiple instance learning framework. This approach beats the previous state of the art in weakly supervised semantic segmentation by a large margin. Object proposal algorithms generate a set of regions (segments) that are likely to contain objects, independently of their semantic category. Contrary to most approaches, which rely on low-level vision cues, we propose a CNN-based discriminative approach that learns segmentation proposals from raw pixels. This approach proves quite effective in this setting, achieving substantially higher recall with fewer proposals than other methods. The state of the art is pushed further with the introduction of a new top-down network augmentation. The resulting bottom-up/top-down network combines rich low-level spatial information with high-level object semantic information to improve segmentation, while remaining fast at test time. Finally, we show that the proposals generated by our approach, when coupled with a standard state-of-the-art object detection pipeline, achieve considerably better performance than previous proposal methods.
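
    The weakly supervised component described above can be illustrated with a small, hedged sketch: a fully convolutional network produces per-pixel class scores, and a smooth maximum (log-sum-exp) pooling aggregates them into image-level scores so that only image-level labels are needed for training. This is a minimal illustration of the multiple instance learning idea, not the thesis code; the layer sizes, the pooling sharpness r and all names are assumptions made for the example.

import math

import torch
import torch.nn as nn


class WeaklySupervisedSegNet(nn.Module):
    """Tiny fully convolutional net trained from image-level labels only."""

    def __init__(self, num_classes: int, r: float = 5.0):
        super().__init__()
        # Backbone producing one score map per class (sizes are illustrative).
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1),
        )
        self.r = r  # larger r -> pooling behaves more like a hard max over pixels

    def forward(self, images: torch.Tensor):
        score_maps = self.features(images)          # (B, C, H, W) per-pixel scores
        b, c, h, w = score_maps.shape
        flat = score_maps.view(b, c, h * w)
        # Log-sum-exp ("smooth max") over all pixels yields one score per class,
        # so a plain image-level classification loss can train the pixel scores.
        image_scores = (torch.logsumexp(self.r * flat, dim=2) - math.log(h * w)) / self.r
        return score_maps, image_scores


# Train with nn.BCEWithLogitsLoss on image-level class labels; at test time,
# an argmax over score_maps gives the per-pixel segmentation.
model = WeaklySupervisedSegNet(num_classes=21)
maps, scores = model(torch.randn(2, 3, 64, 64))
print(maps.shape, scores.shape)  # torch.Size([2, 21, 64, 64]) torch.Size([2, 21])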

    Doctor of Philosophy

    A broad range of applications capture dynamic data at an unprecedented scale. Independent of the application area, finding intuitive ways to understand the dynamic aspects of these increasingly large data sets remains an interesting and, to some extent, unsolved research problem. Generically, dynamic data sets can be described by some, often hierarchical, notion of features of interest that exist at each moment in time and evolve across time. Consequently, exploring the evolution of these features is one natural way of studying such data sets. Usually, this process entails the ability to: 1) define and extract features from each time step in the data set; 2) find their correspondences over time; and 3) analyze their evolution across time. However, due to the large data sizes, visualizing the evolution of features in a comprehensible manner and performing interactive changes are challenging. Furthermore, feature evolution details are often unmanageably large and complex, making it difficult to identify the temporal trends in the underlying data. Additionally, many existing approaches develop these components in a specialized and standalone manner, thus failing to address the general task of understanding feature evolution across time. This dissertation demonstrates that interactive exploration of feature evolution can be achieved in a non-domain-specific manner so that it can be applied across a wide variety of application domains. In particular, it introduces a novel generic visualization and analysis environment that couples a multiresolution unified spatiotemporal representation of features with progressive layout and visualization strategies for studying feature evolution across time. This flexible framework enables on-the-fly changes to feature definitions, their correspondences, and other arbitrary attributes while providing an interactive view of the resulting feature evolution details. Furthermore, to reduce the visual complexity within the feature evolution details, several subselection-based and localized, per-feature parameter-value-based strategies are also enabled. The utility and generality of this framework are demonstrated on several large-scale dynamic data sets.
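
    As a hedged illustration of step 2 in the pipeline above (finding feature correspondences over time), and not the dissertation's actual framework, the following sketch links features between consecutive time steps by spatial overlap; the representation of features as sets of grid-cell indices and the overlap threshold are assumptions made for the example.

from typing import Dict, List, Set, Tuple

Feature = Set[int]  # indices of the grid cells that belong to one feature


def correspondences(prev: Dict[str, Feature],
                    curr: Dict[str, Feature],
                    min_overlap: float = 0.5) -> List[Tuple[str, str]]:
    """Return (previous_id, current_id) pairs whose Jaccard overlap is large enough."""
    links = []
    for pid, pcells in prev.items():
        for cid, ccells in curr.items():
            inter = len(pcells & ccells)
            union = len(pcells | ccells)
            if union and inter / union >= min_overlap:
                links.append((pid, cid))
    return links


# Toy example: feature "a" persists, while feature "b" splits into "c" and "d".
t0 = {"a": {1, 2, 3}, "b": {10, 11, 12, 13}}
t1 = {"a": {2, 3, 4}, "c": {10, 11}, "d": {12, 13}}
print(correspondences(t0, t1, min_overlap=0.4))  # [('a', 'a'), ('b', 'c'), ('b', 'd')]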

    Spatial-Aware Object-Level Saliency Prediction by Learning Graphlet Hierarchies

    To fill the semantic gap between the predictive power of computational saliency models and human behavior, this paper proposes to predict where people look using spatial-aware object-level cues. While object-level saliency has recently been suggested by psychophysics experiments and shown effective in a few computational models, the spatial relationships between objects have not yet been explored in this context. In this work we explicitly model these spatial relationships for the first time, and also leverage the semantic information of an image to enhance object-level saliency modeling. The core computational module is a graphlet-based deep architecture (graphlets are moderate-sized connected subgraphs), which hierarchically learns a saliency map from raw image pixels to object-level graphlets (oGLs) and further to spatial-level graphlets (sGLs). Eye tracking data are also used to incorporate human experience into saliency prediction. Experimental results demonstrate that the proposed oGLs and sGLs capture object-level and spatial-level cues relating to saliency well, and that the resulting saliency model performs competitively compared with the state of the art.
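
    To make the graphlet notion concrete, the following hedged sketch (not the paper's pipeline) enumerates moderate-sized connected subgraphs of a region-adjacency graph built over segmented image regions; the toy graph and the graphlet size are assumptions made for the example.

from itertools import combinations

import networkx as nx


def graphlets(region_graph: nx.Graph, size: int = 3):
    """Yield the node sets of all connected induced subgraphs with `size` nodes."""
    for nodes in combinations(region_graph.nodes, size):
        if nx.is_connected(region_graph.subgraph(nodes)):
            yield set(nodes)


# Toy region-adjacency graph: regions 0..3, with an edge for each shared boundary.
g = nx.Graph([(0, 1), (1, 2), (2, 3), (1, 3)])
print(list(graphlets(g, size=3)))  # [{0, 1, 2}, {0, 1, 3}, {1, 2, 3}]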

    AVATAR - Machine Learning Pipeline Evaluation Using Surrogate Model

    © 2020, The Author(s). The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. Previous methods, such as the Bayesian-based and genetic-based optimisation implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. As a result, pipeline composition and optimisation with these methods requires a tremendous amount of time, which prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we conducted experiments showing that many of the generated pipelines are invalid, and that it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method, AVATAR, that evaluates the validity of ML pipelines using a surrogate model. AVATAR accelerates automatic ML pipeline composition and optimisation by quickly discarding invalid pipelines. Our experiments show that AVATAR is more efficient at evaluating complex pipelines than traditional evaluation approaches that require their execution.
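
    A much simplified sketch of the surrogate idea, not the actual AVATAR implementation: each pipeline component declares which data properties it requires and how it transforms them, so an invalid pipeline can be rejected symbolically, without execution. The property names and the component table below are illustrative assumptions.

from typing import Dict, List

State = Dict[str, bool]  # which properties currently hold for the data set

COMPONENTS = {
    # name: (required data properties, properties set after the step has run)
    "Imputer":        ({"has_missing": True},      {"has_missing": False}),
    "OneHotEncoder":  ({"has_categorical": True},  {"has_categorical": False}),
    "StandardScaler": ({"has_categorical": False}, {}),
    "LinearSVC":      ({"has_missing": False, "has_categorical": False}, {}),
}


def pipeline_is_valid(steps: List[str], data: State) -> bool:
    """Check a pipeline symbolically by propagating data properties through it."""
    state = dict(data)
    for name in steps:
        required, effects = COMPONENTS[name]
        if any(state.get(prop) != value for prop, value in required.items()):
            return False       # a precondition is violated: the pipeline is invalid
        state.update(effects)  # apply the component's effect on the data properties
    return True


raw = {"has_missing": True, "has_categorical": True}
print(pipeline_is_valid(["Imputer", "OneHotEncoder", "LinearSVC"], raw))  # True
print(pipeline_is_valid(["StandardScaler", "LinearSVC"], raw))            # False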

    Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations

    The network structure (or topology) of a dynamical network is often unavailable or uncertain. Hence, we consider the problem of network reconstruction, which aims at inferring the topology of a dynamical network from measurements obtained from the network. In this technical note we define the notion of solvability of the network reconstruction problem. Subsequently, we provide necessary and sufficient conditions under which the network reconstruction problem is solvable. Finally, using constrained Lyapunov equations, we establish novel network reconstruction algorithms that are applicable to general dynamical networks. We also provide specialized algorithms for specific network dynamics, such as the well-known consensus and adjacency dynamics.
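
    As a hedged illustration of the computational primitive involved, not of the paper's reconstruction algorithm, the following sketch solves a continuous Lyapunov equation A X + X A^T = -B B^T for a toy graph Laplacian using SciPy; the graph, the stability shift and the noise model are assumptions made for the example.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Laplacian of a 4-node path graph (the kind of topology reconstruction would target).
L = np.array([[ 1, -1,  0,  0],
              [-1,  2, -1,  0],
              [ 0, -1,  2, -1],
              [ 0,  0, -1,  1]], dtype=float)

A = -(L + 0.1 * np.eye(4))  # small shift makes the consensus-like dynamics strictly stable
B = np.eye(4)               # in this toy setup, noise enters at every node

# X is the steady-state state covariance of dx/dt = A x + B w with white noise w.
X = solve_continuous_lyapunov(A, -B @ B.T)         # solves A X + X A^T = -B B^T
print(np.allclose(A @ X + X @ A.T + B @ B.T, 0))   # True: X satisfies the equation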

    Copyright Policies of Scientific Publications in Institutional Repositories: The Case of INESC TEC

    The progressive transformation of scientific practices, driven by the development of new Information and Communication Technologies (ICT), has made it possible to increase access to information, gradually moving towards an opening of the research cycle. In the long term, this will help resolve an adversity that researchers have faced: the existence of barriers, whether geographical or financial, that limit the conditions of access. Although scientific production is dominated mostly by large commercial publishers and is subject to the rules they impose, the Open Access movement, whose first public declaration, the Budapest Declaration (BOAI), dates from 2002, proposes significant changes that benefit both authors and readers. This movement has gained importance in Portugal since 2003, with the creation of the first institutional repository at the national level. Institutional repositories emerged as a tool for disseminating the scientific production of an institution, with the aim of opening up research results both before publication and peer review (preprint) and after (postprint), and, consequently, of increasing the visibility of the work carried out by a researcher and their institution. The present study, based on an analysis of the copyright policies of the most relevant scientific publications of INESC TEC, showed not only that publishers increasingly adopt policies that allow the self-archiving of publications in institutional repositories, but also that a great deal of awareness-raising work remains to be done, not only with researchers but also with the institution and society as a whole. The production of a set of recommendations, including the implementation of an institutional policy that encourages the self-archiving of publications produced in the institutional context in the repository, serves as a starting point for greater recognition of the scientific production of INESC TEC.

    Contributions in the field of position and speed detection for brushed DC and brushless DC motors using sensorless techniques

    - INTRODUCTION: Among direct-current electric motors there are brushed motors (brushed DC, BDC), which use brushes to commutate the current, and brushless motors (brushless DC, BLDC), which use an electronic inverter to perform phase commutation. A review of the literature on BLDC motors (Article 1 of the compendium) and BDC motors suggests that, when these motors are controlled using position sensors such as digital encoders or Hall-effect probes, cost can be reduced and reliability increased by replacing those sensors with sensorless techniques. - OBJECTIVES: The general objective of the thesis is the analysis, development and validation of several sensorless techniques for detecting the position and speed of BDC and BLDC motors. To achieve this objective, four techniques have been proposed. The first is based on the analysis of the current ripple in BDC motors (patent ES 2334551 A1). The second is also based on the ripple component of the current in BDC motors, but uses pattern recognition with classifiers (Article 2 of the compendium). The third is based on the derivative of the phase voltages in BLDC motors (Article 3 of the compendium). The fourth applies artificial neural networks to BLDC motors (Article 4 of the compendium). - METHODS: The first technique determines the position and speed of a BDC motor by detecting the ripple pulses that appear in the motor current, based on comparisons between current samples. In the second technique, the position and speed of BDC motors are estimated using pattern recognition with Support Vector Machine (SVM) classifiers. In the third technique, the position and speed information of a BLDC motor is obtained from the derivative of the phase voltages with respect to a virtual neutral point, using versatile hardware based on a field-programmable gate array (FPGA). In the fourth technique, the position and speed of a BLDC motor are estimated by means of two Multilayer Perceptron (MLP) artificial neural networks (ANNs). - RESULTS: With the first technique, mean absolute position and speed errors below 17.75 rad and 4.64 rpm, respectively, were obtained in a range between 5,000 rpm and 7,000 rpm under constant or slowly varying speed conditions. With the second technique, mean absolute position and speed errors below 19 rad and 18 rpm, respectively, were obtained in a range between 500 rpm and 11,000 rpm, under conditions such as constant acceleration and abrupt speed steps. With the third technique, mean square position errors between 10° and 30° and speed errors below 3 rpm were obtained with the BLDC motor unloaded, and position errors between 10° and 15° and speed errors below 1 rpm at full load, in a range between 5 rpm and 1,500 rpm with constant acceleration and abrupt speed steps. With the fourth technique, a mean absolute position error of 6.47° and a mean relative speed error of 4.87% were obtained in a range between 125 rpm and 1,500 rpm with constant acceleration at full load.
    - CONCLUSIONS: The results show that the four proposed techniques can detect position and speed, in both BDC and BLDC motors, with acceptable accuracy, noise immunity and computational cost over a wide range of speeds. On this basis, the developed techniques can be considered a reliable alternative to sensor-based detection techniques and to basic sensorless techniques.
    Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemática
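
    As a hedged illustration of the ripple-counting principle behind the first technique, and not of the patented algorithm, the following sketch estimates the position and speed of a brushed DC motor by detecting pulses in the armature-current ripple; the sampling rate, number of commutator segments, peak-detection settings and the synthetic test signal are all assumptions made for the example.

import numpy as np
from scipy.signal import find_peaks

FS = 50_000        # sampling frequency of the current measurement, Hz (assumed)
SEGMENTS = 8       # commutator segments per mechanical revolution (assumed)


def position_and_speed(current: np.ndarray):
    """Estimate accumulated shaft angle (rad) and speed (rpm) from the current ripple."""
    ripple = current - np.mean(current)                 # keep only the ripple component
    peaks, _ = find_peaks(ripple, distance=FS // 2000)  # roughly one peak per commutation
    angle = 2 * np.pi * len(peaks) / SEGMENTS           # accumulated angle in radians
    if len(peaks) > 1:
        pulses_per_s = (len(peaks) - 1) * FS / (peaks[-1] - peaks[0])
        speed_rpm = 60 * pulses_per_s / SEGMENTS
    else:
        speed_rpm = 0.0
    return angle, speed_rpm


# Synthetic test: 0.1 s of current whose ripple corresponds to 3000 rpm.
t = np.arange(0, 0.1, 1 / FS)
ripple_hz = 3000 / 60 * SEGMENTS                        # 400 ripple pulses per second
i_motor = 2.0 + 0.1 * np.sin(2 * np.pi * ripple_hz * t)
print(position_and_speed(i_motor))                      # ~(31.4 rad, 3000 rpm)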