
    Automatic Object Detection and Categorisation in Deep Astronomical Imaging Surveys Using Unsupervised Machine Learning

    I present an unsupervised machine learning technique that automatically segments and labels galaxies in astronomical imaging surveys using only pixel data. Unlike previous unsupervised machine learning approaches used in astronomy, the technique uses no pre-selection or pre-filtering of target galaxy type to identify galaxies that are similar. I demonstrate the technique on the Hubble Space Telescope (HST) Frontier Fields. By training the algorithm using galaxies from one field (Abell 2744) and applying the result to another (MACS0416.1-2403), I show how the algorithm can cleanly separate early and late type galaxies without any form of pre-directed training for what an ‘early’ or ‘late’ type galaxy is. I present the results of testing the technique for generalisation and to identify its optimal configuration. I then apply the technique to the HST Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) fields, creating a catalogue of 60,000 labelled galaxies, grouped by their similarity. I show how the automatically identified groups contain galaxies with similar morphological (and photometric) type. I compare the catalogue to human classifications from the Galaxy Zoo: CANDELS project. Although there is not a direct mapping, I demonstrate a good level of concordance between them. I publicly release the catalogue and a corresponding visual catalogue and galaxy similarity search facility at www.galaxyml.uk. I show how the technique can be used to identify rarer objects and present lensed galaxy candidates from the CANDELS imaging. Finally, I consider how the technique can be improved and applied to future surveys to identify transient objects.

    Structural Generative Descriptions for Temporal Data

    In data mining problems the representation or description of data plays a fundamental role, since it defines the set of essential properties for the extraction and characterisation of patterns. However, for the case of temporal data, such as time series and data streams, one outstanding issue when developing mining algorithms is finding an appropriate data description or representation. In this thesis two novel domain-independent representation frameworks for temporal data suitable for off-line and online mining tasks are formulated. First, a domain-independent temporal data representation framework based on a novel data description strategy which combines structural and statistical pattern recognition approaches is developed. The key idea here is to move the structural pattern recognition problem to the probability domain. This framework is composed of three general tasks: a) decomposing input temporal patterns into subpatterns in time or any other transformed domain (for instance, wavelet domain); b) mapping these subpatterns into the probability domain to find attributes of elemental probability subpatterns called primitives; and c) mining input temporal patterns according to the attributes of their corresponding probability domain subpatterns. This framework is referred to as Structural Generative Descriptions (SGDs). Two off-line and two online algorithmic instantiations of the proposed SGDs framework are then formulated: i) For the off-line case, the first instantiation is based on the use of Discrete Wavelet Transform (DWT) and Wavelet Density Estimators (WDE), while the second algorithm includes DWT and Finite Gaussian Mixtures. ii) For the online case, the first instantiation relies on an online implementation of DWT and a recursive version of WDE (RWDE), whereas the second algorithm is based on a multi-resolution exponentially weighted moving average filter and RWDE. 
The empirical evaluation of the proposed SGD-based algorithms is performed in the context of time series classification, for the off-line algorithms, and in the context of change detection and clustering, for the online algorithms. For this purpose, synthetic and publicly available real-world data are used. Additionally, a novel framework for multidimensional data stream evolution diagnosis, which incorporates RWDE into the context of Velocity Density Estimation (VDE), is formulated. Changes in streaming data and changes in their correlation structure are characterised by means of local and global evolution coefficients as well as recursive correlation coefficients. The proposed VDE framework is evaluated using temperature data from the UK and air pollution data from Hong Kong.
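
The abstract above mentions a multi-resolution exponentially weighted moving average (EWMA) filter but does not specify it. As a rough illustration only, the sketch below runs a standard recursive EWMA at several smoothing factors; treating "multi-resolution" as "one filter per smoothing factor" is an assumption, and the names `ewma`, `multi_resolution_ewma` and the toy `stream` are hypothetical rather than taken from the thesis.

```python
def ewma(samples, alpha):
    """Yield the exponentially weighted moving average after each sample.

    alpha lies in (0, 1]; larger values weight recent samples more heavily.
    The state is a single number, so the filter suits online processing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    average = None
    for x in samples:
        # The first sample initialises the average; later samples update it.
        average = x if average is None else alpha * x + (1.0 - alpha) * average
        yield average


def multi_resolution_ewma(samples, alphas):
    """Assumed stand-in for a multi-resolution filter: one EWMA per alpha."""
    return {alpha: list(ewma(samples, alpha)) for alpha in alphas}


# Hypothetical stream with a step change at the fourth sample.
stream = [0.0, 0.0, 0.0, 4.0, 4.0, 4.0]
smoothed = multi_resolution_ewma(stream, alphas=(0.5, 1.0))
print(smoothed[0.5])
print(smoothed[1.0])
```

A small smoothing factor reacts slowly to the step and a factor of 1.0 simply tracks the input, which is why comparing several factors can help expose changes at different time scales.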

    Digital photo album management techniques: from one dimension to multi-dimension.

    Lu Yang. Thesis submitted in November 2004. Thesis (M.Phil.), Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 96-103). Abstracts in English and Chinese.

    Abstract
    Acknowledgement
    Chapter 1: Introduction
        1.1 Motivation
        1.2 Our Contributions
        1.3 Thesis Outline
    Chapter 2: Background Study
        2.1 MPEG-7 Introduction
        2.2 Image Analysis in CBIR Systems
            2.2.1 Color Information
            2.2.2 Color Layout
            2.2.3 Texture Information
            2.2.4 Shape Information
            2.2.5 CBIR Systems
        2.3 Image Processing in JPEG Frequency Domain
        2.4 Photo Album Clustering
    Chapter 3: Feature Extraction and Similarity Analysis
        3.1 Feature Set in Frequency Domain
            3.1.1 JPEG Frequency Data
            3.1.2 Our Feature Set
        3.2 Digital Photo Similarity Analysis
            3.2.1 Energy Histogram
            3.2.2 Photo Distance
    Chapter 4: 1-Dimensional Photo Album Management Techniques
        4.1 Photo Album Sorting
        4.2 Photo Album Clustering
        4.3 Photo Album Compression
            4.3.1 Variable IBP frames
            4.3.2 Adaptive Search Window
            4.3.3 Compression Flow
        4.4 Experiments and Performance Evaluations
    Chapter 5: High Dimensional Photo Clustering
        5.1 Traditional Clustering Techniques
            5.1.1 Hierarchical Clustering
            5.1.2 Traditional K-means
        5.2 Multidimensional Scaling
            5.2.1 Introduction
            5.2.2 Classical Scaling
        5.3 Our Interactive MDS-based Clustering
            5.3.1 Principal Coordinates from MDS
            5.3.2 Clustering Scheme
            5.3.3 Layout Scheme
        5.4 Experiments and Results
    Chapter 6: Conclusions
    Bibliography

    An analysis of the relevance of temporal information in time series classification

    In recent years, interest in time series has increased considerably due to the vast amount of such data collected in a variety of fields. Time series are a particular type of data: they are sets of ordered observations. The analysis of databases composed of time series requires consideration of the nature of the instances, in which there exists a temporal correlation among the observations. A considerable variety of algorithms that take this characteristic of the instances into account have been developed to represent, index, cluster and classify time series. This work focuses on the classification task. First, the specific time series classifiers are reviewed to give an insight into their workflow and into how they capture the temporal information. The different procedures of those classifiers endow them with different abilities to capture the intrinsic temporal information of the instances for classification. Moreover, this work carries out an experiment based on empirical distributions to estimate the sensitivity of specific time series classifiers to the temporal order of the observations. Although specific classifiers have generally been used to classify time series, in some time series classification problems non-specific classification algorithms have been shown to be competitive with the specific ones. Thus the relevance of the temporal order for classification varies across time series classification problems. The present work aims to develop an analysis based on empirical distributions for estimating the relevance of the temporal ordering in a given time series classification problem, as well as to study the sensitivity of the specific time series classifiers to the temporal correlation of the observations.

    High Performance Video Stream Analytics System for Object Detection and Classification

    Due to recent advances in cameras, cell phones and camcorders, particularly in the resolution at which they can record images and video, large amounts of data are generated daily. This video data is often so large that manually inspecting it for object detection and classification is time consuming and error prone, so automated analysis is required to extract useful information and metadata. Automated analysis of video streams also comes with numerous challenges, such as blurred content and variation in illumination conditions and poses. In this thesis we investigate an automated video analytics system which takes into account characteristics from both the shallow and deep learning domains. We propose the fusion of features from the spatial frequency domain to perform highly accurate blur- and illumination-invariant object classification using deep learning networks. We also propose tuning the hyper-parameters associated with the deep learning network through a mathematical model. The mathematical model used to support hyper-parameter tuning improved the performance of the proposed system during training. The effects of various hyper-parameters on the system's performance are compared, and the parameters that yield the best performance are selected for video object classification. The proposed video analytics system has been demonstrated to process a large number of video streams, and the underlying infrastructure is able to scale based on the number and size of the video streams being processed. Extensive experimentation on publicly available image and video datasets reveals that the proposed system is significantly more accurate and scalable and can be used as a general-purpose video analytics system.

    Digital watermarking in medical images

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 05/12/2005. This thesis addresses the authenticity and integrity of medical images using watermarking. Hospital Information Systems (HIS), Radiology Information Systems (RIS) and Picture Archiving and Communication Systems (PACS) now form the information infrastructure for today's healthcare, as these provide new ways to store, access and distribute medical data that also involve some security risk. Watermarking can be seen as an additional tool for security measures. As the medical tradition is very strict about the quality of biomedical images, the watermarking method must be reversible or, if not, a Region of Interest (ROI) needs to be defined and left intact. Watermarking should also serve as an integrity control and should be able to authenticate the medical image. Three watermarking techniques were proposed. First, Strict Authentication Watermarking (SAW) embeds the digital signature of the image in the ROI, and the image can be reverted to its original value bit by bit if required. Second, Strict Authentication Watermarking with JPEG Compression (SAW-JPEG) uses the same principle as SAW, but is able to survive some degree of JPEG compression. Third, Authentication Watermarking with Tamper Detection and Recovery (AW-TDR) is able to localise tampering whilst simultaneously reconstructing the original image.

    Exploration of Higher-Order Quantum Interference Landscapes

    Earth, Moon and Sun unite when they star together in the three-body problem, whose intricate plot still baffles us today. For some reason, the factorization of the two-body problem into two one-body problems does not, in general, cross the N=2 border. Is computational irreducibility responsible for this emergence of complexity, as Stephen Wolfram likes to think? We don't know. The introduction to this thesis in Chapter 1, however, makes it clear that the history of science is marked by intermittent encounters with sudden complexity when the number 2 is left behind. In Chapter 2, I present an experiment that is quite similar in spirit, for my colleagues and I observe three-photon interference without two-photon and single-photon interference. We had to overcome significant experimental challenges that are typical for most quantum interference experiments involving more than two photons. Next in line is the three-slit interference experiment. Although it is a deceptively simple extension of the famous double-slit experiment, it confronts us with questions that are difficult to access experimentally: the existence of genuine three-slit interference was first denied and then affirmed, though no experiment has decided yet. My contribution to the study of this problem is outlined in Chapter 3, where I use symmetry of measurement settings in such interference experiments to theoretically derive higher-order interference terms. In Chapter 4, I take a step back in one sense, for we study a two-photon phenomenon, but we also leap forward and discover entirely new interference landscapes. Theoretically and experimentally, I demonstrate how to use polarization-modulated lasers to go beyond the standard Hong-Ou-Mandel (HOM) dip, and generate both triangular and square wave HOM interference patterns. Two-photon interference is also the subject of Chapter 5, but with an interesting twist.
While laser HOM interference relies on two independent photons, here we endow the pair with the strongest known correlations, namely entanglement. More specifically, we entangle a polarization and a time-bin qubit and use this hybrid to assess the viability of a rather special interferometer for quantum communication purposes.

    Classification-Based Adaptive Search Algorithm for Video Motion Estimation

    A video sequence consists of a series of frames. In order to compress the video for efficient storage and transmission, the temporal redundancy among adjacent frames must be exploited. A frame is selected as the reference frame and subsequent frames are predicted from it using a technique known as motion estimation. Real videos contain a mixture of slow and fast motion. Among block matching motion estimation algorithms, the full search algorithm is known for its superior performance over other matching techniques. However, this method is computationally very expensive. Several fast block matching algorithms (FBMAs) have been proposed in the literature with the aim of reducing computational cost while maintaining the desired quality, but all these methods are considered to be sub-optimal. No fixed fast block matching algorithm can efficiently remove the temporal redundancy of video sequences with a wide range of motion content. An adaptive fast block matching algorithm, called classification-based adaptive search (CBAS), is proposed. A Bayes classifier is applied to classify the motions into slow and fast categories, and an appropriate search strategy is then applied for each class. The algorithm switches between different search patterns according to the motion content within video frames. The proposed technique outperforms conventional stand-alone fast block matching methods in terms of both peak signal-to-noise ratio (PSNR) and computational complexity. In addition, a new hierarchical method for detecting and classifying shot boundaries in video sequences is proposed, based on information theoretic classification (ITC). ITC relies on the likelihood of class label transmission from a data point to the data points in its vicinity. ITC focuses on maximizing the global transmission of true class labels and classifies the frames into classes of cuts and non-cuts.
Applying the same rule, the non-cut frames are also classified into two categories: arbitrary shot frames and gradual transition frames. CBAS is applied in the proposed shot detection method to handle camera or object motion. Experimental evidence demonstrates that our method can detect shot breaks with high accuracy.
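
To make the cost of the full search baseline mentioned above concrete, here is a minimal sketch of exhaustive block matching. It is not the CBAS algorithm: the sum of absolute differences (SAD) as matching cost, the function names and the toy frames are all assumptions chosen for illustration.

```python
def sad(ref, cur, ry, rx, cy, cx, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cy, cx) and the n x n block of `ref` at (ry, rx)."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(ref, cur, cy, cx, n, radius):
    """Test every displacement within +/- radius pixels and return
    (dy, dx, cost) for the best-matching block in the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidates that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + n > height or rx + n > width:
                continue
            cost = sad(ref, cur, ry, rx, cy, cx, n)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# Hypothetical 8 x 8 frames: a 2 x 2 patch sits at row 2, column 3 in the
# reference frame and at row 3, column 5 in the current frame.
patch = [[200, 201], [210, 211]]
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for i in range(2):
    for j in range(2):
        ref[2 + i][3 + j] = patch[i][j]
        cur[3 + i][5 + j] = patch[i][j]

motion = full_search(ref, cur, cy=3, cx=5, n=2, radius=2)
print(motion)
```

With a search radius r, up to (2r + 1)^2 candidate blocks are compared for every block in the frame, which is the cost that fast block matching algorithms try to reduce by visiting only a subset of candidates.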

    Combined Industry, Space and Earth Science Data Compression Workshop

    The sixth annual Space and Earth Science Data Compression Workshop and the third annual Data Compression Industry Workshop were held as a single combined workshop. The workshop was held April 4, 1996 in Snowbird, Utah in conjunction with the 1996 IEEE Data Compression Conference, which was held at the same location March 31 - April 3, 1996. The Space and Earth Science Data Compression sessions seek to explore opportunities for data compression to enhance the collection, analysis, and retrieval of space and earth science data. Of particular interest is data compression research that is integrated into, or has the potential to be integrated into, a particular space or earth science data information system. Preference is given to data compression research that takes into account the scientist's data requirements and the constraints imposed by the data collection, transmission, distribution and archival systems.