
    Searching for Exoplanets Using Artificial Intelligence

    In the last decade, over a million stars were monitored to detect transiting planets. Manual interpretation of potential exoplanet candidates is labor intensive and subject to human error, the results of which are difficult to quantify. Here we present a new method of detecting exoplanet candidates in large planetary search projects which, unlike current methods, uses a neural network. Neural networks, also called "deep learning" or "deep nets", are designed to give a computer perception of a specific problem by training it to recognize patterns. Unlike past transit detection algorithms, deep nets learn to recognize planet features instead of relying on hand-coded metrics that humans perceive as the most representative. Our convolutional neural network is capable of detecting Earth-like exoplanets in noisy time-series data with greater accuracy than a least-squares method. Deep nets are highly generalizable, allowing data from different time series to be evaluated after interpolation without compromising performance. As validated by our deep net analysis of Kepler light curves, we detect periodic transits consistent with the true period without any model fitting. Our study indicates that machine learning will facilitate the characterization of exoplanets in future analysis of large astronomy data sets.
    Comment: Accepted, 16 pages, 14 figures, https://github.com/pearsonkyle/Exoplanet-Artificial-Intelligenc
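
The abstract above compares a neural network against a least-squares baseline for finding transit dips in noisy photometry. As a toy illustration of the baseline idea (not the paper's network), the sketch below locates a box-shaped dip in a synthetic light curve by matched filtering; all names and parameters here are invented for the example.

```python
import random

def make_light_curve(n=200, dip_start=80, dip_len=10, depth=0.02, noise=0.005, seed=0):
    """Synthetic normalized flux with one box-shaped transit dip plus Gaussian noise."""
    rng = random.Random(seed)
    flux = [1.0 + rng.gauss(0, noise) for _ in range(n)]
    for i in range(dip_start, dip_start + dip_len):
        flux[i] -= depth
    return flux

def matched_filter(flux, dip_len=10):
    """Slide a negative box kernel over the mean-subtracted flux;
    the window with the largest response marks the transit start."""
    mean = sum(flux) / len(flux)
    resid = [f - mean for f in flux]
    best_i, best_score = 0, float("-inf")
    for i in range(len(flux) - dip_len):
        score = -sum(resid[i:i + dip_len])  # deeper dip -> larger score
        if score > best_score:
            best_i, best_score = i, score
    return best_i

flux = make_light_curve()
best = matched_filter(flux)  # close to the true dip start (80)
```

A learned detector replaces the hand-chosen box kernel with filters fit from labeled examples, which is the distinction the abstract draws.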

    Feature-Learning Networks Are Consistent Across Widths At Realistic Scales

    We study the effect of width on the dynamics of feature-learning neural networks across a variety of architectures and datasets. Early in training, wide neural networks trained on online data not only have identical loss curves but also agree in their pointwise test predictions throughout training. For simple tasks such as CIFAR-5m this holds throughout training for networks of realistic widths. We also show that structural properties of the models, including internal representations, preactivation distributions, edge-of-stability phenomena, and large-learning-rate effects, are consistent across large widths. This motivates the hypothesis that phenomena seen in realistic models can be captured by infinite-width, feature-learning limits. For harder tasks (such as ImageNet and language modeling) and at later training times, finite-width deviations grow systematically. Two distinct effects cause these deviations across widths. First, the network output has initialization-dependent variance scaling inversely with width, which can be removed by ensembling networks. We observe, however, that ensembles of narrower networks perform worse than a single wide network. We call this the bias of narrower width. We conclude with a spectral perspective on the origin of this finite-width bias.
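
The ensembling point above can be illustrated with a toy model: if each network's output carries initialization noise with variance scaling as 1/width, averaging n independently initialized networks shrinks that variance by roughly 1/n. The sketch below simulates this with an invented stand-in for a trained network, not the paper's models.

```python
import random

def model_output(width, seed):
    """Toy stand-in for a trained network's prediction: the true value
    plus initialization-dependent noise with variance ~ 1/width."""
    rng = random.Random(seed)
    true_value = 0.5
    return true_value + rng.gauss(0, width ** -0.5)

def ensemble_output(width, n_members, base_seed):
    """Average n independently 'initialized' members at the same width."""
    outs = [model_output(width, base_seed + k) for k in range(n_members)]
    return sum(outs) / n_members

def variance(samples):
    m = sum(samples) / len(samples)
    return sum((s - m) ** 2 for s in samples) / len(samples)

# Variance across initializations: single width-64 model vs 16-member ensemble
singles = [model_output(64, s) for s in range(200)]
ensembles = [ensemble_output(64, 16, base_seed=1000 + 16 * s) for s in range(200)]
```

In this toy, `variance(ensembles)` is roughly 16x smaller than `variance(singles)`; the paper's point is that this variance term is removable by ensembling, while the remaining gap to a wide network (the "bias of narrower width") is not.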

    Pan-chromatic photometric classification of supernovae from multiple surveys and transfer learning for future surveys

    Time-domain astronomy is entering a new era as wide-field surveys with higher cadences allow for more discoveries than ever before. The field has seen increased use of machine learning and deep learning for automated classification of transients into established taxonomies. Training such classifiers requires a sufficiently large and representative training set, which is not guaranteed for future surveys such as the Vera Rubin Observatory, especially at the beginning of operations. We present the use of Gaussian processes to create a uniform representation of supernova light curves from multiple surveys, obtained through the Open Supernova Catalog, for supervised classification with convolutional neural networks. We also investigate the use of transfer learning to classify light curves from the Photometric LSST Astronomical Time Series Classification Challenge (PLAsTiCC) dataset. Using convolutional neural networks to classify the Gaussian-process-generated representation of supernova light curves from multiple surveys, we achieve an AUC score of 0.859 for classification into Types Ia, Ibc, and II. We find that transfer learning improves the classification accuracy for the most under-represented classes by up to 18% when classifying PLAsTiCC light curves, and achieves an AUC score of 0.945 when photometric redshifts are included for classification into six classes (Ia, Iax, Ia-91bg, Ibc, II, SLSN-I). We also investigate the usefulness of transfer learning when only a limited labelled training set is available, to see how this approach can be used for training classifiers in future surveys at the beginning of operations.
    Comment: 15 pages, 14 figures
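
The key preprocessing idea above is resampling irregularly observed light curves onto a uniform grid. The paper uses full Gaussian process regression; as a much simpler stand-in, the sketch below uses an RBF-kernel weighted average (a crude approximation of a GP posterior mean), with invented example data.

```python
import math

def rbf_interpolate(times, fluxes, grid, length_scale=2.0):
    """Kernel-weighted average of observed fluxes at each grid time;
    a simplified stand-in for a Gaussian process posterior mean."""
    out = []
    for t in grid:
        w = [math.exp(-0.5 * ((t - ti) / length_scale) ** 2) for ti in times]
        total = sum(w)
        out.append(sum(wi * fi for wi, fi in zip(w, fluxes)) / total)
    return out

# Irregularly sampled light curve resampled onto a uniform grid over [0, 10]
times = [0.0, 1.3, 2.1, 4.8, 7.5, 9.9]
fluxes = [1.0, 0.9, 0.8, 1.0, 1.1, 1.0]
grid = [i * 0.5 for i in range(21)]
uniform = rbf_interpolate(times, fluxes, grid)
```

Once every survey's light curves live on the same fixed-length grid, a single convolutional classifier can consume all of them, which is what makes cross-survey training and transfer learning possible.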

    Applications of Machine Learning to Estimating the Sizes and Market Impact of Hidden Orders in the BRICS Financial Markets

    The research aims to investigate the role of hidden orders in the structure of the average market impact curves in the five BRICS financial markets. The concept of market impact is central to the implementation of cost-effective trading strategies during financial order executions. The study of Lillo et al. (2003) is replicated using data on visible orders from the five BRICS financial markets. We then repeat the implementation of Lillo et al. (2003) to investigate the effect of hidden orders, and subsequently study the dynamics of hidden orders. The research applies machine learning to estimate the sizes of hidden orders. We revisit the methodology of Lillo et al. (2003) to compare the average market impact curves in which true hidden orders are added to visible orders with the average market impact curves in which hidden order sizes are estimated via machine learning. The study finds that: (1) hidden order sizes can be uncovered via machine learning techniques such as Generalized Linear Models (GLM), Artificial Neural Networks (ANN), Support Vector Machines (SVM), and Random Forests (RF); and (2) there exists no set of market features that is consistently predictive of the sizes of hidden orders across different stocks. Artificial Neural Networks produce large R^2 and small MSE when predicting the hidden orders of individual stocks across the five studied markets. Random Forests produce the most appropriate average price impact curves of visible and estimated hidden orders, closest to the average market impact curves of visible and true hidden orders. In some markets, hidden orders produce a convex power-law far-right tail, in contrast to visible orders, which produce a concave power-law far-right tail. In some markets hidden orders may affect the average price impact curves for orders of size less than the average order size, while in other markets hidden orders may not affect the structure of the average price impact curves. The research therefore recommends ANN and RF as tools to uncover hidden orders.
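
Lillo et al. (2003)-style impact curves model average price impact as a power law of order size, impact = c * size^beta, with beta < 1 giving the concave shape mentioned above. As a minimal, self-contained illustration (synthetic data, not the study's BRICS data), the exponent can be recovered by least squares in log-log space:

```python
import math

def fit_power_law(sizes, impacts):
    """Least-squares fit of impact = c * size**beta, done linearly in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in impacts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    c = math.exp(my - beta * mx)
    return c, beta

sizes = [1, 2, 4, 8, 16]
impacts = [0.010 * s ** 0.3 for s in sizes]  # synthetic concave impact curve
c, beta = fit_power_law(sizes, impacts)
```

On this noiseless synthetic curve the fit recovers c = 0.010 and beta = 0.3; on real visible-plus-hidden order flow the shape of this fitted tail (concave vs convex) is what the study compares across markets.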

    Quantum Optical Convolutional Neural Network: A Novel Image Recognition Framework for Quantum Computing

    Large machine learning models based on Convolutional Neural Networks (CNNs), with rapidly increasing numbers of parameters and trained with massive amounts of data, are being deployed in a wide array of computer vision tasks, from self-driving cars to medical imaging. The insatiable demand for computing resources required to train these models is fast outpacing the advancement of classical computing hardware, and new frameworks, including Optical Neural Networks (ONNs) and quantum computing, are being explored as future alternatives. In this work, we report a novel quantum-computing-based deep learning model, the Quantum Optical Convolutional Neural Network (QOCNN), to alleviate the computational bottleneck in future computer vision applications. Using the popular MNIST dataset, we have benchmarked this new architecture against a traditional CNN based on the seminal LeNet model. We have also compared its performance with previously reported ONNs, namely GridNet and ComplexNet, as well as with a Quantum Optical Neural Network (QONN) that we built by combining ComplexNet with quantum-based sinusoidal nonlinearities. In essence, our work extends the prior research on QONNs by adding quantum convolution and pooling layers preceding them. We have evaluated all the models by determining their accuracies, confusion matrices, Receiver Operating Characteristic (ROC) curves, and Matthews Correlation Coefficients. The performance of the models was similar overall, and the ROC curves indicate that the new QOCNN model is robust. Finally, we estimated the gains in computational efficiency from executing this novel framework on a quantum computer. We conclude that switching to a quantum-computing-based approach to deep learning may yield accuracies comparable to classical models, while achieving unprecedented boosts in computational performance and drastic reductions in power consumption.
    Comment: 9 pages, 6 figures
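
Of the evaluation metrics listed above, the Matthews Correlation Coefficient is compact enough to state directly. A minimal binary-label implementation (illustrative only; the paper's evaluation code is not reproduced here):

```python
import math

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary 0/1 labels.
    Returns +1 for perfect prediction, -1 for total disagreement,
    and 0 when a confusion-matrix margin is empty."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

MCC uses all four cells of the confusion matrix, which makes it a more balanced single-number summary than accuracy when classes are imbalanced, one reason it appears alongside ROC curves in evaluations like the one above.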

    One-shot learning with pretrained convolutional neural network

    2019 Summer. Includes bibliographical references. Recent progress in convolutional neural networks and deep learning has revolutionized the image classification field, and computers can now classify images with very high accuracy. However, unlike the human vision system, which efficiently recognizes a new object after seeing a similar one, recognizing new classes of images requires a time- and resource-consuming process of retraining a neural network due to several restrictions. Since a pretrained neural network has seen a large amount of training data, it may generalize to recognize new classes effectively and efficiently, given that it can extract patterns from training images. This has inspired research in one-shot learning, the process of learning to classify a novel class from a single training image of that class. One-shot learning can help expand the use of a trained convolutional neural network without costly model retraining. In addition to the practical application of one-shot learning, it is also important to understand how a convolutional neural network supports one-shot learning. More specifically, how is the feature space structured to support one-shot learning? Answering this can potentially help us better understand the mechanisms of convolutional neural networks. This thesis proposes an approximate nearest-neighbor-based method for one-shot learning. The method uses the features produced by a pretrained convolutional neural network and builds a proximity forest to classify new classes. The algorithm is tested on two datasets of different scales and achieves reasonably high classification accuracy on both. Furthermore, this thesis tries to understand the feature space in order to explain the success of the proposed method. A novel tool, generalized curvature analysis, is used to probe the feature space structure of the convolutional neural network. The results show that the feature space curves around samples of both known classes and unknown in-domain classes, but not around transition samples between classes or out-of-domain samples. In addition, the low curvature of out-of-domain samples is correlated with the inability of a pretrained convolutional neural network to classify out-of-domain classes, indicating that a pretrained model cannot generate useful feature representations for out-of-domain samples. In summary, this thesis proposes a new method for one-shot learning and provides insight into the feature space of convolutional neural networks.
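
The core classification step above is: embed a query image with a pretrained network, then assign it the class of the nearest single exemplar in feature space. The thesis does this with a proximity forest (approximate search); the sketch below substitutes exact nearest-neighbor search over invented 2-D feature vectors to show the idea.

```python
def one_shot_classify(query, support):
    """Nearest-neighbor one-shot classifier over precomputed feature vectors.
    `support` maps class name -> the single exemplar's feature vector;
    in practice both query and exemplars would come from a pretrained CNN."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(support, key=lambda cls: dist2(query, support[cls]))

# Hypothetical 2-D features; real CNN embeddings have hundreds of dimensions
support = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
label = one_shot_classify([0.9, 0.2], support)  # nearest exemplar is "cat"
```

An approximate structure such as a proximity forest replaces the linear scan in `min(...)` when the number of classes is large, trading a small accuracy loss for sublinear lookup time.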