510 research outputs found

kLog: A Language for Logical and Relational Learning with Kernels

Author: Fabrizio Costa
Kurt De Grave
Luc De Raedt
Paolo Frasconi
Publication venue: 'Elsevier BV'
Publication date: 28/07/2014

We introduce kLog, a novel approach to statistical relational learning. Unlike standard approaches, kLog does not represent a probability distribution directly. Rather, it is a language for kernel-based learning on expressive logical and relational representations. kLog allows users to specify learning problems declaratively. It builds on simple but powerful concepts: learning from interpretations, entity/relationship data modeling, logic programming, and deductive databases. Access by the kernel to the rich representation is mediated by a technique we call graphicalization: the relational representation is first transformed into a graph, specifically a grounded entity/relationship diagram. Subsequently, a choice of graph kernel defines the feature space. kLog supports mixed numerical and symbolic data, as well as background knowledge in the form of Prolog or Datalog programs, as in inductive logic programming systems. The kLog framework can be applied to the same range of tasks that has made statistical relational learning so popular, including classification, regression, multitask learning, and collective classification. We also report empirical comparisons showing that kLog can be either more accurate, or much faster at the same level of accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at http://klog.dinfo.unifi.it along with tutorials.

arXiv.org e-Print Archive

Lirias

Crossref

Image Database Management System : Design Considerations, Algorithms and Architecture

Author: Nes N.J. (Niels)
Publication date: 14/12/2000

CWI's Institutional Repository

Study of object recognition and identification based on shape and texture analysis

Author: Wang Guanqi
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/03/2012

The objective of object recognition is to enable computers to recognize image patterns without human intervention. According to its applications, it is mainly divided into two parts: recognition of object categories and detection/identification of objects. My thesis studies techniques for object feature analysis and identification strategies that solve the object recognition problem by employing effective and perceptually important object features. Shape information is of particular interest, and a review of shape representation and description is presented, together with the latest research on object recognition. In the second chapter of the thesis, a novel content-based approach is proposed for efficient shape classification and retrieval of 2D objects. Two object detection approaches, designed around the characteristics of the shape context and SIFT descriptors, respectively, are analyzed and compared. It is found that an identification strategy built on a single type of object feature can only recognize the target object under the specific conditions to which the identifier is adapted. Such identifiers are usually designed to detect target objects that are rich in the feature type captured by the identifier. In addition, this type of feature often distinguishes the target object from the complex scene. To overcome this constraint, a novel prototype-based object identification method is presented that detects the target object in a complex scene by employing different types of descriptors to capture heterogeneous features. All types of descriptors are modified to meet the requirements of the detection strategy’s framework. The new method is thus able to describe and identify various kinds of objects whose dominant features are quite different. The identification system uses cosine similarity to evaluate the resemblance between the prototype image and image windows in the complex scene. A ‘resemblance map’ is then established, with the value on each patch representing the likelihood of the target object’s presence. Simulations confirmed that this novel object detection strategy is efficient, robust, and invariant to scale and rotation.
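The resemblance map described in this abstract can be illustrated with a minimal sketch. It assumes each image window has already been reduced to a descriptor vector; the function names, the grid layout, and the toy data are hypothetical and are not taken from the thesis.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def resemblance_map(prototype, scene_descriptors):
    """Score every scene window against the prototype descriptor.

    scene_descriptors is a 2D grid (list of rows) of descriptor vectors,
    one vector per image window.
    """
    return [[cosine_similarity(prototype, d) for d in row] for row in scene_descriptors]

def best_window(score_map):
    """Return the (row, col) index of the highest-scoring window."""
    return max(
        ((r, c) for r, row in enumerate(score_map) for c in range(len(row))),
        key=lambda rc: score_map[rc[0]][rc[1]],
    )

# Hypothetical toy data: 3-dimensional descriptors on a 2x2 grid of windows.
prototype = [1.0, 2.0, 0.0]
scene = [
    [[0.0, 0.0, 1.0], [2.0, 4.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]
scores = resemblance_map(prototype, scene)
```

Because cosine similarity ignores vector magnitude, the window whose descriptor is a scaled copy of the prototype scores highest, which is one reason the measure suits comparing descriptors of differing overall strength.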

Spiral - Imperial College Digital Repository

NiftyNet: a deep-learning platform for medical imaging

Author: Barratt Dean C.
Cardoso M. Jorge
Doel Tom
Eaton-Rosen Zach
Fidon Lucas
Gibson Eli
Gray Robert
Hu Yipeng
Li Wenqi
Modat Marc
Nachev Parashkev
Ourselin Sébastien
Shakir Dzhoshkun I.
Sudre Carole
Vercauteren Tom
Wang Guotai
Whyntie Tom
Publication venue: 'Elsevier BV'
Publication date: 16/10/2017

Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions. Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis, and adapting them for this application requires substantial implementation effort. Thus, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups. This work presents the open-source NiftyNet platform for deep learning in medical imaging. The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon. NiftyNet provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning applications. Components of the NiftyNet pipeline including data loading, data augmentation, network architectures, loss functions and evaluation metrics are tailored to, and take advantage of, the idiosyncrasies of medical image analysis and computer-assisted intervention. NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D and 3D images and computational graphs by default. We present 3 illustrative medical image analysis applications built using NiftyNet: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses. NiftyNet enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or extend the platform to new applications.

Comment: Wenqi Li and Eli Gibson contributed equally to this work. M. Jorge Cardoso and Tom Vercauteren contributed equally to this work. 26 pages, 6 figures; Update includes additional applications, updated author list and formatting for journal submission.

arXiv.org e-Print Archive

Crossref

UCL Discovery

King's Research Portal

Feature extraction using MPEG-CDVS and Deep Learning with application to robotic navigation and image classification

Author: PORTO BUARQUE DE GUSMAO PEDRO
Publication venue: country:Italy
Publication date: 01/01/2017

The main contributions of this thesis are the evaluation of the MPEG Compact Descriptor for Visual Search (CDVS) in the context of indoor robotic navigation and the introduction of a new method for training Convolutional Neural Networks with applications to object classification. The choice of image descriptor in a visual navigation system is not straightforward. Visual descriptors must be distinctive enough to allow for correct localisation while still offering low matching complexity and short descriptor size for real-time applications. MPEG CDVS is a low-complexity image descriptor that offers several levels of compromise between descriptor distinctiveness and size. In this work, we describe how these trade-offs can be used for efficient loop detection in a typical indoor environment. We first describe a probabilistic approach to loop detection based on the standard’s suggested similarity metric. We then evaluate the performance of CDVS compression modes in terms of matching speed, feature extraction, and storage requirements, and compare them with the state-of-the-art SIFT descriptor for five different types of indoor floors. In the second part of this thesis we focus on the new paradigm in machine learning and computer vision called Deep Learning. Under this paradigm, visual features are no longer extracted using fine-grained, highly engineered feature extractors, but rather using a Convolutional Neural Network (CNN) that extracts hierarchical features learned directly from data, at the cost of long training periods. In this context, we propose a method for speeding up the training of CNNs by exploiting the spatial scaling property of convolutions. This is done by first pre-training a CNN with smaller kernel resolutions for a few epochs, then properly rescaling its kernels to the target’s original dimensions and continuing training at full resolution. We show that the overall training time of a target CNN architecture can be reduced by exploiting the spatial scaling property of convolutions during the early stages of learning. Moreover, by rescaling the kernels at different epochs, we identify a trade-off between total training time and maximum obtainable accuracy. Finally, we propose a method for choosing when to rescale kernels and evaluate our approach on recent architectures, showing savings in training time of nearly 20% while test set accuracy is preserved.
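The kernel-rescaling step mentioned in this abstract can be pictured with a small sketch. The abstract does not say how kernels are resized, so bilinear interpolation with aligned corners is an assumption made here purely for illustration; the function name `rescale_kernel` is hypothetical, and any amplitude normalisation the thesis may apply is not modelled.

```python
import math

def rescale_kernel(kernel, new_size):
    """Resize a square 2D kernel to new_size x new_size by bilinear interpolation.

    Assumption: corner weights are kept fixed (aligned corners). This is an
    illustrative choice, not the method stated in the abstract.
    """
    old_size = len(kernel)
    # Map output index i to a fractional source coordinate i * step.
    step = (old_size - 1) / (new_size - 1) if new_size > 1 else 0.0
    resized = []
    for i in range(new_size):
        y = i * step
        y0 = int(math.floor(y))
        y1 = min(y0 + 1, old_size - 1)
        wy = y - y0
        row = []
        for j in range(new_size):
            x = j * step
            x0 = int(math.floor(x))
            x1 = min(x0 + 1, old_size - 1)
            wx = x - x0
            # Weighted average of the four nearest source weights.
            top = (1.0 - wx) * kernel[y0][x0] + wx * kernel[y0][x1]
            bottom = (1.0 - wx) * kernel[y1][x0] + wx * kernel[y1][x1]
            row.append((1.0 - wy) * top + wy * bottom)
        resized.append(row)
    return resized

# Hypothetical example: a 2x2 kernel learned at low resolution, grown to 3x3.
small_kernel = [[0.0, 2.0], [4.0, 6.0]]
large_kernel = rescale_kernel(small_kernel, 3)
```

In this sketch the enlarged kernel keeps the spatial pattern of the small one, so training at full resolution could resume from it rather than from random weights.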

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Deep Representation-aligned Graph Multi-view Clustering for Limited Labeled Multi-modal Health Data

Author: Grimstad Erland
Publication venue: 'UiT The Arctic University of Norway'
Publication date: 01/06/2022

Today, many fields are characterised by extensive quantities of data from a wide range of dissimilar sources and domains. One such field is medicine, in which data contain exhaustive combinations of spatial, temporal, linear, and relational data. Because it often lacks expert-assessed labels, much of this data requires analysis with unsupervised or semi-supervised learning. Motivated by the notion that higher view counts provide more ways to recognise commonality across views, contrastive multi-view clustering may be used to train a model to suppress redundant and otherwise medically irrelevant information. Yet standard multi-view clustering approaches do not account for relational graph data. Recent developments aim to solve this by using various graph operations, including graph-based attention. Within deep-learning graph-based multi-view clustering on a sole view-invariant affinity graph, representation alignment remains unexplored. We introduce Deep Representation-Aligned Graph Multi-View Clustering (DRAGMVC), a novel attention-based graph multi-view clustering model. Comparing maximal performance, our model surpassed the state of the art in eleven out of twelve metrics on Cora, CiteSeer, and PubMed. The model considers view alignment at the sample level by employing a contrastive loss, and relational data through a novel take on graph attention embeddings in which we use a Markov chain prior to increase the receptive field of each layer. For clustering, a graph-induced DDC module is used. GraphSAINT sampling is implemented to control our mini-batch space and capitalise on our Markov prior. Additionally, we present the MIMIC pleural effusion graph multi-modal dataset, consisting of two modalities: 3520 chest X-ray images along with two static views registered within a one-day time frame, vital signs and lab tests. Together these make up the three views of the dataset. We note a significant improvement in separability, view mixing, and clustering performance when comparing DRAGMVC to preceding non-graph multi-view clustering models, suggesting a possible, largely unexplored use case for unsupervised graph multi-view clustering on graph-induced, multi-modal, and complex medical data.

Munin - Open Research Archive

Neural function approximation on graphs: shape modelling, graph discrimination & compression

Author: Bouritsas Giorgos
Publication venue: Computing, Imperial College London
Publication date: 01/06/2023

Graphs serve as a versatile mathematical abstraction of real-world phenomena in numerous scientific disciplines. This thesis is part of the Geometric Deep Learning subject area, a family of learning paradigms that capitalise on the increasing volume of non-Euclidean data so as to solve real-world tasks in a data-driven manner. In particular, we focus on the topic of graph function approximation using neural networks, which lies at the heart of many relevant methods. In the first part of the thesis, we contribute to the understanding and design of Graph Neural Networks (GNNs). Initially, we investigate the problem of learning on signals supported on a fixed graph. We show that treating graph signals as general graph spaces is restrictive and that conventional GNNs have limited expressivity. Instead, we expose a more enlightening perspective by drawing parallels between graph signals and signals on Euclidean grids, such as images and audio. Accordingly, we propose a permutation-sensitive GNN based on an operator analogous to shifts in grids and instantiate it on 3D meshes for shape modelling (Spiral Convolutions). Next, we focus on learning on general graph spaces, and in particular on functions that are invariant to graph isomorphism. We identify a fundamental trade-off between invariance, expressivity and computational complexity, which we address with a symmetry-breaking mechanism based on substructure encodings (Graph Substructure Networks). Substructures are shown to be a powerful tool that provably improves expressivity while controlling computational complexity, and a useful inductive bias in network science and chemistry. In the second part of the thesis, we discuss the problem of graph compression, where we analyse the information-theoretic principles and the connections with graph generative models. We show that another inevitable trade-off surfaces, now between computational complexity and compression quality, due to graph isomorphism. We propose a substructure-based dictionary coder, Partition and Code (PnC), with theoretical guarantees that can be adapted to different graph distributions by estimating its parameters from observations. Additionally, contrary to the majority of neural compressors, PnC is parameter and sample efficient and is therefore of wide practical relevance. Finally, within this framework, substructures are further illustrated as a decisive archetype for learning problems on graph spaces.

Spiral - Imperial College Digital Repository

Automatic Object Detection and Categorisation in Deep Astronomical Imaging Surveys Using Unsupervised Machine Learning

Author: Hocking Alexander
Publication date: 05/09/2018

I present an unsupervised machine learning technique that automatically segments and labels galaxies in astronomical imaging surveys using only pixel data. Unlike previous unsupervised machine learning approaches used in astronomy, the technique uses no pre-selection or pre-filtering of target galaxy type to identify galaxies that are similar. I demonstrate the technique on the Hubble Space Telescope (HST) Frontier Fields. By training the algorithm using galaxies from one field (Abell 2744) and applying the result to another (MACS0416.1-2403), I show how the algorithm can cleanly separate early and late type galaxies without any form of pre-directed training for what an ‘early’ or ‘late’ type galaxy is. I present the results of testing the technique for generalisation and to identify its optimal configuration. I then apply the technique to the HST Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) fields, creating a catalogue of 60000 labelled galaxies, grouped by their similarity. I show how the automatically identified groups contain galaxies with similar morphological (and photometric) type. I compare the catalogue to human classifications from the Galaxy Zoo: CANDELS project. Although there is not a direct mapping, I demonstrate a good level of concordance between them. I publicly release the catalogue and a corresponding visual catalogue and galaxy similarity search facility at www.galaxyml.uk. I show how the technique can be used to identify rarer objects and present lensed galaxy candidates from the CANDELS imaging. Finally, I consider how the technique can be improved and applied to future surveys to identify transient objects.

University of Hertfordshire Research Archive

Query-Driven Global Graph Attention Model for Visual Parsing: Recognizing Handwritten and Typeset Math Formulas

Author: Mahdavi Mahshad
Publication venue: RIT Scholar Works
Publication date: 07/08/2020

We present a new visual parsing method based on standard Convolutional Neural Networks (CNNs) for handwritten and typeset mathematical formulas. The Query-Driven Global Graph Attention (QD-GGA) parser employs multi-task learning, using a single feature representation for locating, classifying, and relating symbols. QD-GGA parses formulas by first constructing a Line-Of-Sight (LOS) graph over the input primitives (e.g., handwritten strokes or connected components in images). Second, class distributions for LOS nodes and edges are obtained using query-specific feature filters (i.e., attention) in a single feed-forward pass. This allows end-to-end structure learning using a joint loss over primitive node and edge class distributions. Finally, a Maximum Spanning Tree (MST) is extracted from the weighted graph using Edmonds' Arborescence Algorithm. The model may be run recurrently over the input graph, updating attention to focus on symbols detected in the previous iteration. QD-GGA does not require additional grammar rules, and the language model is learned from the sets of symbols/relationships and the statistics over them in the training set. We benchmark our system against both handwritten and typeset state-of-the-art math recognition systems. Our preliminary results show that this is a promising new approach for visual parsing of math formulas. Using recurrent execution, symbol detection is near perfect for both handwritten and typeset formulas: we obtain a symbol f-measure of over 99.4% for both the CROHME (handwritten) and INFTYMCCDB-2 (typeset formula image) datasets. Our method is also much faster in both training and execution than state-of-the-art RNN-based formula parsers. The unlabeled structure detection of QD-GGA is competitive with encoder-decoder models, but QD-GGA symbol and relationship classification is weaker. We believe this may be addressed through increased use of spatial features and global context.

RIT Scholar Works