Searching for Exoplanets Using Artificial Intelligence
In the last decade, over a million stars were monitored to detect transiting
planets. Manual interpretation of potential exoplanet candidates is
labor-intensive and subject to human error, the results of which are difficult to
quantify. Here we present a new method of detecting exoplanet candidates in
large planetary search projects which, unlike current methods, uses a neural
network. Neural networks, also called "deep learning" or "deep nets", are
designed to give a computer perception of a specific problem by training it
to recognize patterns. Unlike past transit detection algorithms, deep nets learn
to recognize planet features instead of relying on hand-coded metrics that
humans perceive as the most representative. Our convolutional neural network is
capable of detecting Earth-like exoplanets in noisy time-series data with a
greater accuracy than a least-squares method. Deep nets are highly
generalizable, allowing data from different time series to be evaluated after
interpolation without compromising performance. As validated by our deep net
analysis of Kepler light curves, we detect periodic transits consistent with
the true period without any model fitting. Our study indicates that machine
learning will facilitate the characterization of exoplanets in future analysis
of large astronomy data sets.
Comment: Accepted, 16 pages, 14 figures, https://github.com/pearsonkyle/Exoplanet-Artificial-Intelligenc
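The pattern-recognition idea behind the transit search above can be sketched with a toy matched-filter style detection; the hand-set dip kernel below stands in for the features a CNN would learn, and the light curve and numbers are invented for illustration:

```python
import numpy as np

# Toy illustration: find a box-shaped transit dip in a noisy light curve
# by sliding a dip-shaped kernel along the detrended flux. A CNN would
# learn such kernels from data instead of having them hand-coded.
rng = np.random.default_rng(0)
n = 500
flux = 1.0 + 0.002 * rng.standard_normal(n)   # flat star + photometric noise
flux[200:220] -= 0.01                          # injected 1% transit, 20 samples wide

kernel = -np.ones(20)                          # responds strongly to a dip
detrended = flux - flux.mean()
response = np.correlate(detrended, kernel, mode="valid")
best = int(np.argmax(response))                # strongest response near the dip
print(best)
```

The peak of `response` recovers the injected transit location; a trained convolutional network generalizes this by learning many such filters and stacking nonlinear layers on top.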
Feature-Learning Networks Are Consistent Across Widths At Realistic Scales
We study the effect of width on the dynamics of feature-learning neural
networks across a variety of architectures and datasets. Early in training,
wide neural networks trained on online data not only have identical loss curves
but also agree in their pointwise test predictions throughout training. For
simple tasks such as CIFAR-5m this holds throughout training for networks of
realistic widths. We also show that structural properties of the models,
including internal representations, preactivation distributions, edge of
stability phenomena, and large learning rate effects are consistent across
large widths. This motivates the hypothesis that phenomena seen in realistic
models can be captured by infinite-width, feature-learning limits. For harder
tasks (such as ImageNet and language modeling), and later training times,
finite-width deviations grow systematically. Two distinct effects cause these
deviations across widths. First, the network output has
initialization-dependent variance scaling inversely with width, which can be
removed by ensembling networks. We observe, however, that ensembles of narrower
networks perform worse than a single wide network. We call this the bias of
narrower width. We conclude with a spectral perspective on the origin of this
finite-width bias.
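The first finite-width effect described above, initialization-dependent output variance that shrinks under ensembling, can be illustrated with a toy random-feature model (not the paper's architectures; the mean-field 1/width output scaling is an assumption chosen so that variance scales inversely with width, and the widths and sample counts are arbitrary):

```python
import numpy as np

# Toy one-layer random-feature model with mean-field (1/width) output
# scaling: at a fixed input, the output's variance over random
# initializations scales like 1/width, and averaging an ensemble of K
# independent narrow models divides that variance by K.
rng = np.random.default_rng(0)

def model_output(width, x, rng):
    w = rng.standard_normal(width)          # first-layer weights
    a = rng.standard_normal(width)          # readout weights
    return (a * np.maximum(w * x, 0.0)).sum() / width

x = 1.3
narrow = np.array([model_output(64, x, rng) for _ in range(2000)])
wide = np.array([model_output(512, x, rng) for _ in range(2000)])
# Ensembles of 8 narrow models: initialization variance drops ~8x,
# matching a single 8x wider model.
ensembled = np.array([
    np.mean([model_output(64, x, rng) for _ in range(8)])
    for _ in range(2000)
])
print(narrow.var(), wide.var(), ensembled.var())
```

This removes the variance term, but, as the abstract notes, the remaining gap (the bias of narrower width) is not removed by ensembling.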
Pan-chromatic photometric classification of supernovae from multiple surveys and transfer learning for future surveys
Time-domain astronomy is entering a new era as wide-field surveys with higher
cadences allow for more discoveries than ever before. The field has seen an
increased use of machine learning and deep learning for automated
classification of transients into established taxonomies. Training such
classifiers requires a sufficiently large and representative training set, which is
not guaranteed for new surveys such as the Vera C. Rubin Observatory,
especially at the beginning of operations. We present the use of Gaussian
processes to create a uniform representation of supernova light curves from
multiple surveys, obtained through the Open Supernova Catalog for supervised
classification with convolutional neural networks. We also investigate the use
of transfer learning to classify light curves from the Photometric LSST
Astronomical Time Series Classification Challenge (PLAsTiCC) dataset. Using
convolutional neural networks to classify the Gaussian process generated
representation of supernova light curves from multiple surveys, we achieve an
AUC score of 0.859 for classification into Type Ia, Ibc, and II. We find that
transfer learning improves the classification accuracy for the most
under-represented classes by up to 18% when classifying PLAsTiCC light curves,
and is able to achieve an AUC score of 0.945 when including photometric
redshifts for classification into six classes (Ia, Iax, Ia-91bg, Ibc, II,
SLSN-I). We also investigate the usefulness of transfer learning when there is
a limited labelled training set to see how this approach can be used for
training classifiers in future surveys at the beginning of operations.
Comment: 15 pages, 14 figures
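The Gaussian-process resampling step described above can be sketched as follows; the RBF kernel, length scale, and noise level are assumptions for illustration rather than the paper's exact choices, and the light curve is a synthetic pulse:

```python
import numpy as np

# Sketch: GP regression maps an irregularly sampled light curve onto a
# uniform time grid, so curves from different surveys share one shape
# that a CNN can consume. RBF kernel; all hyperparameters assumed.
def rbf(t1, t2, length=5.0, amp=1.0):
    d = t1[:, None] - t2[None, :]
    return amp * np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(1)
t_obs = np.sort(rng.uniform(0, 50, 30))            # irregular epochs
y_obs = np.exp(-0.5 * ((t_obs - 25) / 8) ** 2)     # toy supernova-like pulse
y_obs += 0.02 * rng.standard_normal(t_obs.size)    # photometric noise

t_grid = np.linspace(0, 50, 100)                   # uniform grid for the CNN
K = rbf(t_obs, t_obs) + 0.02 ** 2 * np.eye(t_obs.size)
K_star = rbf(t_grid, t_obs)
mean = K_star @ np.linalg.solve(K, y_obs)          # GP posterior mean
print(mean.shape)
```

Each band of each survey can be resampled this way, and the resulting fixed-size arrays stacked as channels for the convolutional classifier.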
Applications of Machine Learning to Estimating the Sizes and Market Impact of Hidden Orders in the BRICS Financial Markets
The research aims to investigate the role of hidden orders in the structure of the average market impact curves in the five BRICS financial markets. The concept of market impact is central to the implementation of cost-effective trading strategies during financial order executions. The study of Lillo et al. (2003) is replicated using data on visible orders from the five BRICS financial markets, and the same methodology is then applied to investigate the effect of hidden orders and their dynamics. The research applies machine learning to estimate the sizes of hidden orders. We revisit the methodology of Lillo et al. (2003) to compare the average market impact curves in which true hidden orders are added to visible orders against the average market impact curves in which hidden order sizes are estimated via machine learning. The study discovers that: (1) hidden order sizes can be uncovered via machine learning techniques such as Generalized Linear Models (GLM), Artificial Neural Networks (ANN), Support Vector Machines (SVM), and Random Forests (RF); and (2) there exists no set of market features that is consistently predictive of hidden order sizes across different stocks. Artificial Neural Networks produce large R^2 and small MSE when predicting the hidden orders of individual stocks across the five studied markets. Random Forests produce the
most appropriate average price impact curves of visible and estimated hidden orders, i.e. those closest to the average market impact curves of visible and true hidden orders. In some markets, hidden orders produce a convex power-law far-right tail, in contrast to visible orders, which produce a concave power-law far-right tail. In some markets hidden orders affect the average price impact curves for orders of size less than the average order size, while in other markets hidden orders do not affect the structure of the average price impact curves. The research recommends ANNs and RFs as tools to uncover hidden orders.
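As a toy illustration of the regression setup (hypothetical features and synthetic data, not BRICS market data), a plain least-squares GLM predicting hidden order sizes, reported with the R^2 and MSE metrics the study uses:

```python
import numpy as np

# Hypothetical setup: predict hidden order size from a few visible-book
# features (e.g. spread, depth, visible volume). Data and coefficients
# are synthetic, chosen only to exercise the R^2 / MSE evaluation.
rng = np.random.default_rng(2)
n = 1000
X = rng.standard_normal((n, 3))                  # stand-in market features
true_w = np.array([0.8, -0.3, 0.5])
y = X @ true_w + 0.1 * rng.standard_normal(n)    # toy hidden order sizes

Xb = np.c_[np.ones(n), X]                        # add intercept column
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
pred = Xb @ w
mse = np.mean((y - pred) ** 2)
r2 = 1 - mse / y.var()
print(round(r2, 3), round(mse, 4))
```

The same train/evaluate loop applies unchanged when the GLM is swapped for an ANN, SVM, or Random Forest, which is how the different model families are compared per stock.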
Quantum Optical Convolutional Neural Network: A Novel Image Recognition Framework for Quantum Computing
Large machine learning models based on Convolutional Neural Networks (CNNs)
with rapidly increasing number of parameters, trained with massive amounts of
data, are being deployed in a wide array of computer vision tasks from
self-driving cars to medical imaging. The insatiable demand for computing
resources required to train these models is fast outpacing the advancement of
classical computing hardware, and new frameworks including Optical Neural
Networks (ONNs) and quantum computing are being explored as future
alternatives.
In this work, we report a novel quantum computing based deep learning model,
the Quantum Optical Convolutional Neural Network (QOCNN), to alleviate the
computational bottleneck in future computer vision applications. Using the
popular MNIST dataset, we have benchmarked this new architecture against a
traditional CNN based on the seminal LeNet model. We have also compared the
performance with previously reported ONNs, namely the GridNet and ComplexNet,
as well as a Quantum Optical Neural Network (QONN) that we built by combining
the ComplexNet with quantum based sinusoidal nonlinearities. In essence, our
work extends the prior research on QONN by adding quantum convolution and
pooling layers preceding it.
We have evaluated all the models by determining their accuracies, confusion
matrices, Receiver Operating Characteristic (ROC) curves, and Matthews
Correlation Coefficients. The performance of the models was similar overall,
and the ROC curves indicated that the new QOCNN model is robust. Finally, we
estimated the gains in computational efficiencies from executing this novel
framework on a quantum computer. We conclude that switching to a quantum
computing based approach to deep learning may result in comparable accuracies
to classical models, while achieving unprecedented boosts in computational
performance and a drastic reduction in power consumption.
Comment: 9 pages, 6 figures
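One of the reported metrics, the Matthews Correlation Coefficient, can be computed directly from confusion-matrix counts; a minimal binary-case sketch with made-up labels:

```python
import numpy as np

# Matthews Correlation Coefficient for binary labels, computed from the
# four confusion-matrix counts; returns 0 when the denominator vanishes.
def mcc(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])   # made-up ground truth
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])   # made-up predictions
print(round(mcc(y_true, y_pred), 3))           # -> 0.5 for these labels
```

Unlike accuracy, MCC stays informative under class imbalance, which is why it is often reported alongside ROC curves and confusion matrices.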
One-shot learning with pretrained convolutional neural network
2019 Summer. Includes bibliographical references.
Recent progress in convolutional neural networks and deep learning has revolutionized the image classification field, and computers can now classify images with very high accuracy. However, unlike the human vision system, which efficiently recognizes a new object after seeing a similar one, recognizing new classes of images requires a time- and resource-consuming process of retraining a neural network due to several restrictions. Since a pretrained neural network has seen a large amount of training data, it may generalize to effectively and efficiently recognize new classes, considering that it can extract patterns from training images. This inspires research in one-shot learning, the process of learning to classify a novel class from a single training image of that class. One-shot learning can help expand the use of a trained convolutional neural network without costly model retraining. Beyond its practical applications, it is also important to understand how a convolutional neural network supports one-shot learning. More specifically, how is the feature space structured to support one-shot learning? Answering this can potentially help us better understand the mechanisms of convolutional neural networks. This thesis proposes an approximate nearest neighbor-based method for one-shot learning. The method uses the features produced by a pretrained convolutional neural network and builds a proximity forest to classify new classes. The algorithm is tested on two datasets of different scales and achieves reasonably high classification accuracy on both. Furthermore, this thesis seeks to understand the feature space in order to explain the success of the proposed method. A novel tool, generalized curvature analysis, is used to probe the feature space structure of the convolutional neural network.
The results show that the feature space curves around samples from both known classes and unknown in-domain classes, but not around transition samples between classes or out-of-domain samples. In addition, the low curvature around out-of-domain samples is correlated with the inability of a pretrained convolutional neural network to classify out-of-domain classes, indicating that a pretrained model cannot generate useful feature representations for out-of-domain samples. In summary, this thesis proposes a new method for one-shot learning and provides insight into the feature space of convolutional neural networks.
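The one-shot classification step can be sketched with exact nearest-neighbor search standing in for the proximity forest; the feature vectors below are random stand-ins for the embeddings a pretrained CNN would produce:

```python
import numpy as np

# Sketch of one-shot classification in feature space: each novel class
# contributes a single exemplar embedding, and a query is assigned the
# label of its nearest exemplar. The thesis uses an approximate nearest
# neighbor structure (a proximity forest); exact search is used here
# for brevity. Feature vectors are random stand-ins for CNN features.
rng = np.random.default_rng(3)
dim = 128
exemplars = {c: rng.standard_normal(dim) for c in ["cat", "dog", "fox"]}

def one_shot_classify(query, exemplars):
    dists = {c: np.linalg.norm(query - f) for c, f in exemplars.items()}
    return min(dists, key=dists.get)

# A query close to the "dog" exemplar in feature space.
query = exemplars["dog"] + 0.1 * rng.standard_normal(dim)
print(one_shot_classify(query, exemplars))
```

Because no weights are updated, adding a new class costs only one forward pass to embed its single exemplar, which is the retraining-free property the abstract highlights.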