    Self-growing neural network architecture using crisp and fuzzy entropy

    The paper briefly describes the self-growing neural network algorithm CID2, which builds decision trees that are equivalent to the hidden layers of a neural network. The algorithm generates a feedforward architecture using crisp and fuzzy entropy measures. Results on a real-life recognition problem, distinguishing defects in a glass ribbon, and on a benchmark problem, differentiating two spirals, are shown and discussed.
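
    The abstract does not spell out the two entropy measures, so the sketch below is a rough illustration only: it assumes Shannon entropy as the crisp measure and the De Luca-Termini entropy as the fuzzy one; the paper's own definitions may differ.

        import numpy as np

        def crisp_entropy(class_counts):
            """Shannon entropy (bits) of a crisp class distribution."""
            p = np.asarray(class_counts, dtype=float)
            p = p[p > 0] / p.sum()
            return float(-(p * np.log2(p)).sum())

        def fuzzy_entropy(memberships):
            """De Luca-Termini fuzzy entropy of a set of membership degrees."""
            mu = np.clip(np.asarray(memberships, dtype=float), 1e-12, 1 - 1e-12)
            return float(-(mu * np.log2(mu) + (1 - mu) * np.log2(1 - mu)).sum())

        # Crisp: a pure node has zero entropy; a 50/50 class split has 1 bit.
        print(crisp_entropy([10, 0]))       # 0.0
        print(crisp_entropy([5, 5]))        # 1.0
        # Fuzzy: maximal when every membership degree is 0.5 (most ambiguous).
        print(fuzzy_entropy([0.5, 0.5]))    # 2.0
        print(fuzzy_entropy([0.99, 0.01]))  # ~0.16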

    Natural image processing and synthesis using deep learning

    In the present thesis, we study how deep neural networks can be applied to various tasks in computer vision. Computer vision is an interdisciplinary field that deals with the understanding of digital images and video. Traditionally, the problems arising in this domain were tackled with heavily hand-engineered ad hoc methods. Until recently, a typical computer vision system consisted of a sequence of independent modules that barely talked to each other. Such an approach is quite reasonable in the case of limited data, as it takes full advantage of the researcher's domain expertise. This strength turns into a weakness, however, if some input scenarios are overlooked during algorithm design. With rapidly increasing volumes and varieties of data and the advent of cheaper and faster computational resources, end-to-end deep neural networks have become an appealing alternative to traditional computer vision pipelines. We demonstrate this in a series of research articles, each of which considers a particular task of either image analysis or synthesis and presents a solution based on a "deep" backbone.

    In the first article, we deal with the classic low-level vision problem of edge detection. Inspired by a top-performing non-neural approach, we take a step towards building an end-to-end system by combining feature extraction and description in a single convolutional network. The resulting fully data-driven method matches or surpasses the detection quality of existing conventional approaches in the settings for which they were designed, while being significantly more usable in out-of-domain situations.

    In the second article, we introduce a custom architecture for image manipulation based on the idea that most of the pixels in the output image can be copied directly from the input. This technique bears several significant advantages over the naive black-box neural approach: it retains the level of detail of the original images, does not introduce artifacts due to insufficient capacity of the underlying network, and simplifies training, to name a few. We demonstrate the efficiency of the proposed architecture on the challenging gaze correction task, where our system achieves excellent results.

    In the third article, we diverge slightly from pure computer vision and study the more general problem of domain adaptation. We introduce a novel training-time algorithm (i.e., adaptation is attained by using an auxiliary objective in addition to the main one). We seek to extract features that maximally confuse a dedicated network called the domain classifier while remaining useful for the task at hand. The domain classifier is learned simultaneously with the features and attempts to tell whether those features come from the source or the target domain. The proposed technique is easy to implement, yet results in superior performance on all the standard benchmarks.
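
    The adversarial scheme of the third article is commonly implemented with a gradient reversal layer: identity on the forward pass, negated gradient on the backward pass. Below is a minimal PyTorch sketch of that mechanism, assuming a standard DANN-style setup; the network sizes and module names are invented for illustration.

        import torch
        from torch import nn

        class GradReverse(torch.autograd.Function):
            """Identity on the forward pass; scaled, negated gradient on the backward pass."""

            @staticmethod
            def forward(ctx, x, lambd):
                ctx.lambd = lambd
                return x.view_as(x)

            @staticmethod
            def backward(ctx, grad_output):
                # The reversed gradient drives the features to confuse the domain classifier.
                return -ctx.lambd * grad_output, None

        def grad_reverse(x, lambd=1.0):
            return GradReverse.apply(x, lambd)

        # Invented sizes, for illustration only.
        features = nn.Sequential(nn.Linear(784, 256), nn.ReLU())
        task_head = nn.Linear(256, 10)    # main objective: 10-way classification
        domain_head = nn.Linear(256, 2)   # auxiliary objective: source vs. target domain

        x = torch.randn(32, 784)
        f = features(x)
        task_logits = task_head(f)                    # learns to solve the task
        domain_logits = domain_head(grad_reverse(f))  # features learn to hide the domain

    During training the domain head minimizes its classification loss while, through the reversed gradient, the feature extractor maximizes it.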
    Finally, the fourth article presents a new kind of generative model for image data. Unlike conventional neural-network-based approaches, our system, dubbed SPIRAL, describes images in terms of concise low-level programs executed by off-the-shelf rendering software of the kind humans use to create visual content. Among other things, this allows SPIRAL not to waste its capacity on the minutiae of datasets and to focus instead on global structure. The latent space of our model is easily interpretable by design and provides means for predictable image manipulation. We test our approach on several popular datasets and demonstrate its power and flexibility.
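
    SPIRAL's actual action space and renderer are specific to the paper. Purely to convey the idea of an image represented as a short program of drawing commands executed by ordinary graphics software, here is a toy Pillow sketch with an invented command vocabulary.

        from PIL import Image, ImageDraw

        # A toy "program": a short list of drawing commands. The vocabulary is
        # invented for illustration; SPIRAL's commands are defined by its renderer.
        program = [
            ("line", (8, 8), (56, 56), 3),
            ("line", (56, 8), (8, 56), 3),
            ("ellipse", (20, 20), (44, 44), 2),
        ]

        def render(program, size=(64, 64)):
            """Execute the program with an off-the-shelf 2D rasterizer."""
            img = Image.new("L", size, color=255)
            draw = ImageDraw.Draw(img)
            for op, p0, p1, width in program:
                if op == "line":
                    draw.line([p0, p1], fill=0, width=width)
                elif op == "ellipse":
                    draw.ellipse([p0, p1], outline=0, width=width)
            return img

        render(program).save("toy_render.png")

    Editing the program, for instance moving one line endpoint, changes the rendered image predictably, which is the sense in which such a latent space is interpretable.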

    Finding strong lenses in CFHTLS using convolutional neural networks

    We train and apply convolutional neural networks, a machine learning technique developed to learn from and classify image data, to Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) imaging for the identification of potential strong lensing systems. An ensemble of four convolutional neural networks was trained on images of simulated galaxy-galaxy lenses. The training sets consisted of a total of 62,406 simulated lenses and 64,673 non-lens negative examples generated with two different methodologies. The networks were able to learn the features of simulated lenses with an accuracy of up to 99.8% and a purity and completeness of 94-100% on a test set of 2000 simulations. An ensemble of trained networks was applied to all 171 square degrees of the CFHTLS wide-field image data, identifying 18,861 candidates including 63 known and 139 other potential lens candidates. A second search of 1.4 million early-type galaxies, selected from the survey catalog as potential deflectors, identified 2,465 candidates including 117 previously known lens candidates, 29 confirmed lenses/high-quality lens candidates, 266 novel probable or potential lenses, and 2,097 candidates we classify as false positives. For the catalog-based search we estimate a completeness of 21-28% with respect to detectable lenses and a purity of 15%, with a false-positive rate of 1 in 671 images tested. We predict that a human astronomer reviewing candidates produced by the system would identify ~20 probable lenses and 100 possible lenses per hour in a sample selected by the robot. Convolutional neural networks are therefore a promising tool for use in the search for lenses in current and forthcoming surveys such as the Dark Energy Survey and the Large Synoptic Survey Telescope.
    Comment: 16 pages, 8 figures. Accepted by MNRAS.
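
    Purity and completeness here correspond to precision and recall. A minimal sketch of how such figures are computed from a labeled test set (the counts below are invented):

        def purity_completeness(true_labels, predicted_labels):
            """Purity = precision = TP/(TP+FP); completeness = recall = TP/(TP+FN)."""
            tp = sum(1 for t, p in zip(true_labels, predicted_labels) if t and p)
            fp = sum(1 for t, p in zip(true_labels, predicted_labels) if not t and p)
            fn = sum(1 for t, p in zip(true_labels, predicted_labels) if t and not p)
            return tp / (tp + fp), tp / (tp + fn)

        # Invented example: 6 true lenses, 2 missed, 1 false positive.
        truth      = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
        prediction = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
        purity, completeness = purity_completeness(truth, prediction)
        print(f"purity={purity:.2f}, completeness={completeness:.2f}")  # 0.80, 0.67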

    Exploration of spatial-temporal dynamic phenomena in a 32×32-cell stored program two-layer CNN universal machine chip prototype

    This paper describes a full-custom mixed-signal chip that embeds digitally programmable analog parallel processing and distributed image memory on a common silicon substrate. The chip was designed and fabricated in a standard 0.5 Όm CMOS technology and contains approximately 500,000 transistors. It consists of 1024 processing units arranged into a 32 × 32 grid. Each processing element contains two coupled CNN cores, thus constituting two parallel layers of 32 × 32 nodes. The functional features of the chip are in accordance with the 2nd Order Complex Cell CNN-UM architecture: it is composed of two CNN layers with programmable inter- and intra-layer connections between cells. Other features include a cellular, spatially invariant array architecture; a randomly selectable memory of instructions; and random storage and retrieval of intermediate images. The chip is capable of completing algorithmic image processing tasks controlled by user-selected stored instructions. The internal analog circuitry is designed to operate with 7-bit equivalent accuracy. The physical implementation of a CNN containing second-order cells allows real-time experiments with complex dynamics and active wave phenomena. Well-known phenomena of this kind from reaction-diffusion equations are traveling waves, autowaves, and spiral waves; all of these active waves are demonstrated on-chip. Moreover, this chip was specifically designed to be suitable for the computation of biologically inspired retina models. These computational experiments have been carried out in a developmental environment designed for testing and programming the analogic (analog-and-logic) programmable array processors.
    Funding: Hungarian Academy of Sciences SIVA-2; ComisiĂłn Interministerial de Ciencia y TecnologĂ­a TICC99-0826; Office of Naval Research (USA) N00014-00-1-042.
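
    Each CNN cell follows the standard Chua-Yang state equation, dx/dt = -x + A*y + B*u + z, with output y = 0.5(|x+1| - |x-1|). Below is a minimal single-layer NumPy sketch of these dynamics; the templates and step size are a textbook-style choice, not the parameters programmed on this particular chip.

        import numpy as np
        from scipy.signal import convolve2d

        def cnn_step(x, u, A, B, z, dt=0.05):
            """One Euler step of dx/dt = -x + A*y + B*u + z."""
            y = 0.5 * (np.abs(x + 1) - np.abs(x - 1))  # piecewise-linear output
            dx = (-x
                  + convolve2d(y, A, mode="same", boundary="fill")
                  + convolve2d(u, B, mode="same", boundary="fill")
                  + z)
            return x + dt * dx

        # Edge-detection-style templates (a common textbook choice, for illustration).
        A = np.array([[0, 0, 0], [0, 2, 0], [0, 0, 0]], float)
        B = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], float)
        z = -0.5

        u = np.zeros((32, 32)); u[10:22, 10:22] = 1.0  # input image: a bright square
        x = np.zeros_like(u)                           # initial state
        for _ in range(200):
            x = cnn_step(x, u, A, B, z)                # settles toward the edge map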
    • 

    corecore