Automated novelty detection in the WISE survey with one-class support vector machines

Author: Bilicki M.
Durkalec A.
Gromadzki M.
Pollo A.
Solarz A.
Wypych M.
Publication venue: 'EDP Sciences'
Publication date: 01/01/2017

Wide-angle photometric surveys of previously uncharted sky areas or wavelength regimes will always bring in unexpected sources whose existence and properties cannot be easily predicted from earlier observations: novelties or even anomalies. Such objects can be efficiently sought for with novelty detection algorithms. Here we present an application of such a method, called one-class support vector machines (OCSVM), to search for anomalous patterns among sources preselected from the mid-infrared AllWISE catalogue covering the whole sky. To create a model of expected data we train the algorithm on a set of objects with spectroscopic identifications from the SDSS DR13 database, present also in AllWISE. OCSVM detects as anomalous those sources whose patterns - WISE photometric measurements in this case - are inconsistent with the model. Among the detected anomalies we find artefacts, such as objects with spurious photometry due to blending, but most importantly also real sources of genuine astrophysical interest. Among the latter, OCSVM has identified a sample of heavily reddened AGN/quasar candidates distributed uniformly over the sky and in a large part absent from other WISE-based AGN catalogues. It also allowed us to find a specific group of sources of mixed types, mostly stars and compact galaxies. By combining the semi-supervised OCSVM algorithm with standard classification methods it will be possible to improve the latter by accounting for sources which are not present in the training sample but are otherwise well-represented in the target set. Anomaly detection adds flexibility to automated source separation procedures and helps verify the reliability and representativeness of the training samples. It should be thus considered as an essential step in supervised classification schemes to ensure completeness and purity of produced catalogues.

Comment: 14 pages, 15 figures

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Leiden University Scholarly Publications

Jagiellonian University Repository

Finding rare objects and building pure samples: Probabilistic quasar classification from low resolution Gaia spectra

Author: A. Vallenari
Bailer-Jones
Ball
Burges
C. A. L. Bailer-Jones
C. Tiede
Claeskens
Cortes
Gao
Gustafsson
Hastie
K. W. Smith
Lejeune
Luri
O'Mullane
Platt
R. Sordo
Richards
Richards
Suchkov
Tsalmantza
Tsalmantza
Vanden Berk
Weiss
Wu
Publication venue: 'Wiley'
Publication date: 19/09/2008

We develop and demonstrate a probabilistic method for classifying rare objects in surveys with the particular goal of building very pure samples. It works by modifying the output probabilities from a classifier so as to accommodate our expectation (priors) concerning the relative frequencies of different classes of objects. We demonstrate our method using the Discrete Source Classifier, a supervised classifier currently based on Support Vector Machines, which we are developing in preparation for the Gaia data analysis. DSC classifies objects using their very low resolution optical spectra. We look in detail at the problem of quasar classification, because identification of a pure quasar sample is necessary to define the Gaia astrometric reference frame. By varying a posterior probability threshold in DSC we can trade off sample completeness and contamination. We show, using our simulated data, that it is possible to achieve a pure sample of quasars (upper limit on contamination of 1 in 40,000) with a completeness of 65% at magnitudes of G=18.5, and 50% at G=20.0, even when quasars have a frequency of only 1 in every 2000 objects. The star sample completeness is simultaneously 99% with a contamination of 0.7%. Including parallax and proper motion in the classifier barely changes the results. We further show that not accounting for class priors in the target population leads to serious misclassifications and poor predictions for sample completeness and contamination. (Truncated)

Comment: MNRAS accepted
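The idea of modifying classifier output probabilities to reflect class priors can be illustrated with a minimal sketch. It assumes the standard Bayesian reweighting (divide out the class fractions implied by the training set, multiply by the expected fractions in the target population, renormalise), which may differ in detail from the formulation used in the paper. The function name `adjust_for_priors`, the balanced training fractions and the example probabilities are hypothetical; only the quasar frequency of 1 in 2000 is taken from the abstract.

```python
def adjust_for_priors(posteriors, train_priors, target_priors):
    """Reweight class probabilities from training-set class fractions
    to the class fractions expected in the target population."""
    weighted = {
        cls: p * target_priors[cls] / train_priors[cls]
        for cls, p in posteriors.items()
    }
    total = sum(weighted.values())
    # Renormalise so the adjusted probabilities sum to one.
    return {cls: w / total for cls, w in weighted.items()}


# Hypothetical classifier output for one object (illustrative values only).
posteriors = {"quasar": 0.9, "star": 0.1}
# Assumption: the classifier was trained on a balanced sample.
train_priors = {"quasar": 0.5, "star": 0.5}
# Quasar frequency of 1 in every 2000 objects, as quoted in the abstract.
target_priors = {"quasar": 1 / 2000, "star": 1999 / 2000}

adjusted = adjust_for_priors(posteriors, train_priors, target_priors)
print(adjusted)
```

With these illustrative numbers, an object that the balanced classifier rates as a 90% quasar ends up with an adjusted quasar probability below 1%, which is why a threshold applied to unadjusted probabilities would give a heavily contaminated sample when the class is rare.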

arXiv.org e-Print Archive

Crossref

Data Mining and Machine Learning in Astronomy

Author: Aha D. W.
Aizerman M. A.
Benjamini Y.
Bertin E.
Borne K.
Breiman L.
de Vaucouleurs G.
Dempster A.
Drake A. J.
Ebisuzaki T.
Faundez-Abans M.
Goebel J.
Karhunen K.
Levy S.
Li L.-L.
Maddox S. J.
Molinari E.
Moore G. E.
Naim A.
NICHOLAS M. BALL
P. A.
Patterson F. S.
ROBERT J. BRUNNER
Salzberg S. L.
Scaringi S.
Serra-Ricart M.
Steinhaus H.
Urunkar N.
Wells D. C.
Won E.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 10/08/2010

We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.

Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the text

arXiv.org e-Print Archive

Crossref

The Extremely Luminous Quasar Survey (ELQS) in the SDSS footprint I.: Infrared Based Candidate Selection

Author: Fan Xiaohui
Green Richard
Jiang Linhua
McGreer Ian. D.
Schindler Jan-Torge
Wu Jin
Yang Qian
Publication venue: 'American Astronomical Society'
Publication date: 04/12/2017

Studies of the most luminous quasars at high redshift directly probe the evolution of the most massive black holes in the early Universe and their connection to massive galaxy formation. However, extremely luminous quasars at high redshift are very rare objects. Only wide area surveys have a chance to constrain their population. The Sloan Digital Sky Survey (SDSS) has so far provided the most widely adopted measurements of the quasar luminosity function (QLF) at $z>3$. However, a careful re-examination of the SDSS quasar sample revealed that the SDSS quasar selection is in fact missing a significant fraction of $z\gtrsim3$ quasars at the brightest end. We have identified the purely optical color selection of SDSS, where quasars at these redshifts are strongly contaminated by late-type dwarfs, and the spectroscopic incompleteness of the SDSS footprint as the main reasons. Therefore we have designed the Extremely Luminous Quasar Survey (ELQS), based on a novel near-infrared JKW2 color cut using WISE AllWISE and 2MASS all-sky photometry, to yield high completeness for very bright ($m_{\rm{i}} < 18.0$) quasars in the redshift range of $3.0\leq z\leq5.0$. It effectively uses random forest machine-learning algorithms on SDSS and WISE photometry for quasar-star classification and photometric redshift estimation. The ELQS will spectroscopically follow-up $\sim 230$ new quasar candidates in an area of $\sim12000\,\rm{deg}^2$ in the SDSS footprint, to obtain a well-defined and complete quasars sample for an accurate measurement of the bright-end quasar luminosity function at $3.0\leq z\leq5.0$. In this paper we present the quasar selection algorithm and the quasar candidate catalog.

Comment: 16 pages, 8 figures, 9 tables; ApJ in press

arXiv.org e-Print Archive

The University of Arizona

Estimating Photometric Redshifts of Quasars via K-nearest Neighbor Approach Based on Large Survey Databases

Author: He Ma
Nanbo Peng
Xue-bing Wu
Yanxia Zhang
Yongheng Zhao
Publication venue: 'IOP Publishing'
Publication date: 01/01/2013

We apply a lazy learning method, the k-nearest neighbor algorithm (kNN), to estimate the photometric redshifts of quasars, based on various datasets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS) and the Wide-field Infrared Survey Explorer (WISE): the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample and the SDSS-UKIDSS-WISE sample. The influence of the k value and of different input patterns on the performance of kNN is discussed. The k value that gives the best performance differs with the input pattern and the dataset. The best result is obtained for the SDSS-UKIDSS-WISE sample. The experimental results show that, in general, the more information is available from more bands, the better the photometric redshift estimation with kNN performs. The results also demonstrate that kNN using multiband data can effectively solve the problem of catastrophic failure in photometric redshift estimation, which affects many machine learning methods. Comparing the performance of various methods for photometric redshift estimation of quasars, kNN based on a KD-tree gives the best accuracy in our case.

Comment: 28 pages, 4 figures, 3 tables, accepted for publication in A
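As a rough illustration of kNN regression for photometric redshifts, the sketch below averages the known redshifts of the k training objects closest to a query object in feature space. It is a brute-force search rather than the KD-tree mentioned in the abstract, and the function name `knn_redshift`, the two-colour feature vectors and the redshift values are all hypothetical, chosen only to make the example self-contained.

```python
import math


def knn_redshift(train_features, train_redshifts, query, k=3):
    """Estimate a redshift as the mean redshift of the k training
    objects nearest to `query` (Euclidean distance, brute force)."""
    if not 1 <= k <= len(train_features):
        raise ValueError("k must be between 1 and the training-set size")
    # Pair each training object's distance to the query with its redshift.
    neighbours = sorted(
        (math.dist(features, query), z)
        for features, z in zip(train_features, train_redshifts)
    )
    nearest = neighbours[:k]
    return sum(z for _, z in nearest) / k


# Hypothetical two-colour feature vectors with known spectroscopic redshifts.
train_features = [(0.1, 0.2), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9), (1.1, 1.1)]
train_redshifts = [0.5, 0.6, 2.0, 2.2, 2.4]

estimate = knn_redshift(train_features, train_redshifts, query=(1.0, 1.0), k=3)
print(estimate)
```

Adding bands from further surveys corresponds to lengthening the feature tuples; the choice of k and of the input pattern would be tuned on held-out data, as the abstract describes.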

arXiv.org e-Print Archive

Machine Learning in Astronomy: A Case Study in Quasar-Star Classification

Author: AA Miller
D Gao
I Landesa-Vázquez
JK Adelman-McCarthy
KN Abazajian
N Peng
NC Hambly
Publication date: 13/04/2018

We present the results of various automated classification methods, based on machine learning (ML), of objects from data releases 6 and 7 (DR6 and DR7) of the Sloan Digital Sky Survey (SDSS), primarily distinguishing stars from quasars. We carefully scrutinize the approaches available in the literature and highlight the pitfalls in those approaches that arise from the nature of the data used in each study. The aim is to investigate the appropriateness of the application of certain ML methods. The manuscript argues in favor of the efficacy of asymmetric AdaBoost for classifying photometric data. The paper presents a critical review of existing studies and, as an outcome of that exercise, puts forward an application of asymmetric AdaBoost.

Comment: 10 pages, 8 figures

arXiv.org e-Print Archive

Crossref

Support Vector Machine classification of strong gravitational lenses

Author: Flamary R.
Hartley P.
Jackson N.
Metcalf R. B.
Tagore A. S.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

The imminent advent of very large-scale optical sky surveys, such as Euclid and LSST, makes it important to find efficient ways of discovering rare objects such as strong gravitational lens systems, where a background object is multiply gravitationally imaged by a foreground mass. As well as finding the lens systems, it is important to reject false positives due to intrinsic structure in galaxies, and much work is in progress with machine learning algorithms such as neural networks in order to achieve both these aims. We present and discuss a Support Vector Machine (SVM) algorithm which makes use of a Gabor filterbank in order to provide learning criteria for separation of lenses and non-lenses, and demonstrate using blind challenges that under certain circumstances it is a particularly efficient algorithm for rejecting false positives. We compare the SVM engine with a large-scale human examination of 100000 simulated lenses in a challenge dataset, and also apply the SVM method to survey images from the Kilo-Degree Survey.

Comment: Accepted by MNRAS
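To give a sense of how a Gabor filterbank can turn an image into features for a classifier such as an SVM, the sketch below builds real-valued Gabor kernels at several orientations and computes one response per kernel for a small image patch. It is a generic illustration, not the paper's pipeline: the kernel size, wavelength, envelope width, the four orientations and the synthetic striped patch are all assumptions.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.
    `size` should be odd so the kernel has a central pixel."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr**2 + (gamma * yr) ** 2) / (2 * sigma**2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(patch, kernel):
    """Sum of element-wise products of a patch and a same-sized kernel."""
    return sum(
        p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow)
    )


# Assumed bank: four orientations at a single scale.
orientations = [i * math.pi / 4 for i in range(4)]
bank = [gabor_kernel(7, wavelength=4.0, theta=t, sigma=2.0) for t in orientations]

# Synthetic 7x7 patch whose intensity varies only along x (vertical stripes).
patch = [[math.cos(2 * math.pi * x / 4.0) for x in range(-3, 4)] for _ in range(7)]

# One feature per filter; such vectors could be passed to an SVM.
features = [filter_response(patch, kernel) for kernel in bank]
print(features)
```

For this patch the filter whose carrier runs along x (`theta = 0`) gives the strongest response, which is the sense in which the bank encodes orientation-dependent structure.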

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

The University of Manchester - Institutional Repository