
    Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case

    Astronomy has entered the big data era, and machine-learning-based methods have found widespread use in a large variety of astronomical applications. This is demonstrated by the recent huge increase in the number of publications making use of this new approach. The use of machine learning methods, however, is still far from trivial, and many problems still need to be solved. Using the evaluation of photometric redshifts as a case study, we outline the main problems and some ongoing efforts to solve them. Comment: 13 pages, 3 figures, Springer's Communications in Computer and Information Science (CCIS), Vol. 82

    Sky Surveys

    Sky surveys represent a fundamental data basis for astronomy. We use them to map the universe and its constituents in a systematic way, and to discover new types of objects or phenomena. We review the subject, with an emphasis on the wide-field imaging surveys, placing them in a broader scientific and historical context. Surveys are the largest data generators in astronomy, propelled by the advances in information and computation technology, and have transformed the ways in which astronomy is done. We describe the variety and the general properties of surveys, the ways in which they may be quantified and compared, and offer some figures of merit that can be used to compare their scientific discovery potential. Surveys enable a very wide range of science; that is perhaps their key unifying characteristic. As new domains of the observable parameter space open up thanks to the advances in technology, surveys are often the initial step in their exploration. Science can be done with the survey data alone or a combination of different surveys, or with a targeted follow-up of potentially interesting selected sources. Surveys can be used to generate large, statistical samples of objects that can be studied as populations, or as tracers of larger structures. They can also be used to discover or generate samples of rare or unusual objects, and may lead to discoveries of some previously unknown types. We discuss a general framework of parameter spaces that can be used for an assessment and comparison of different surveys, and the strategies for their scientific exploration.
As we move into the Petascale regime, the effective processing and scientific exploitation of such large data sets and data streams pose many challenges, some of which may be addressed in the framework of Virtual Observatory and Astroinformatics, with a broader application of data mining and knowledge discovery technologies. Comment: An invited chapter, to appear in Astronomical Techniques, Software, and Data (ed. H. Bond), Vol. 2 of Planets, Stars, and Stellar Systems (ser. ed. T. Oswalt), Springer Verlag, in press (2012). 62 pages, incl. 2 tables and 3 figures

    Astroinformatics

    As President of the Commission on Astroinformatics and Astrostatistics of the International Astronomical Union, I welcome you to the first IAU Symposium on astroinformatics. This is not the first meeting in the field: the 26th meeting on ADASS (Astronomical Data Analysis Software and Systems) was held last week in Trieste (and members of that group are here today), and this symposium has a strong heritage in workshops held in recent years at Caltech, Seattle, and Sydney. But this is the first time that the broader community of astronomers, through the IAU in collaboration with the giant IEEE organization, has recognized this new field of study devoted to the challenges of Big Data and advanced methodology in astronomical research. This is the first time experts from around the world have gathered to share experiences and plan for the future. I have a comment to make. The typical IAU Symposium treats some well-established field of stars or galaxies or cosmology where the leading groups know each other well. But astroinformatics is such a young field that we do not know each other and we do not know what ideas will emerge from this meeting. So I encourage each of us to take a creative approach to this meeting, work hard to talk to strangers, and help generate a community of scholars who can lead this field into the future.

    Increasing the Discovery Space in Astrophysics - A Collation of Six Submitted White Papers

    We write in response to the call from the 2020 Decadal Survey to submit white papers illustrating the most pressing scientific questions in astrophysics for the coming decade. We propose exploration as the central question for the Decadal Committee's discussions. The history of astronomy shows that paradigm-changing discoveries are not driven by well-formulated scientific questions based on the knowledge of the time. They were instead the result of the increase in discovery space fostered by new telescopes and instruments. An additional tool for increasing the discovery space is provided by the analysis and mining of the ever larger amount of archival data available to astronomers. Revolutionary observing facilities, and the state-of-the-art astronomy archives needed to support these facilities, will open up the universe to new discovery. Here we focus on exploration for compact objects and multi-messenger science. This white paper includes science examples of the power of the discovery approach, encompassing all the areas of astrophysics covered by the 2020 Decadal Survey.

    Research Data: Who will share what, with whom, when, and why?

    The deluge of scientific research data has excited the general public, as well as the scientific community, with the possibilities for better understanding of scientific problems, from climate to culture. For data to be available, researchers must be willing and able to share them. The policies of governments, funding agencies, journals, and university tenure and promotion committees also influence how, when, and whether research data are shared. Data are complex objects. Their purposes and the methods by which they are produced vary widely across scientific fields, as do the criteria for sharing them. To address these challenges, it is necessary to examine the arguments for sharing data and how those arguments match the motivations and interests of the scientific community and the public. Four arguments are examined: to make the results of publicly funded data available to the public, to enable others to ask new questions of extant data, to advance the state of science, and to reproduce research. Libraries need to consider their role in the face of each of these arguments, and what expertise and systems they require for data curation.

    Learning to Identify Extragalactic Radio Sources

    Radio observations of actively accreting supermassive black holes outside of the galaxy can provide insight into the history of galaxies and their evolution. With the construction of fast new radio telescopes and the undertaking of large new radio surveys in the lead-up to the Square Kilometre Array (SKA), radio astronomy faces a 'data deluge' where traditional methods of data analysis cannot keep up with the scale of the data. Astronomers are increasingly looking to machine learning to provide ways of handling large-scale data like these. This thesis introduces machine learning methods for use in wide-area radio surveys and demonstrates their application to radio astronomy data. To help understand the issues facing large-scale wide-area radio surveys, and contribute toward their solutions, we consider the problems of automated radio-infrared cross-identification and Faraday complexity classification. We developed an automated machine learning method for cross-identifying radio objects with their infrared counterparts, training the algorithm with data from the citizen science project Radio Galaxy Zoo. The trained result performed comparably to an algorithm trained on expert cross-identifications, demonstrating the benefit of non-expert labelling in radio astronomy. By examining the theoretical maximum accuracy of this algorithm we showed that existing pilot studies for future surveys were not sufficiently large to train machine learning methods. We showed the utility of our cross-identification algorithm by applying it instead to a large survey, Faint Images of the Radio Sky at Twenty Centimeters (FIRST), producing the largest catalogue of cross-identified extended sources available at the time of writing.
From this catalogue, we calculated a mid-infrared-divided fractional radio luminosity function as well as an estimate of energy injected into the intergalactic medium by active galactic nuclei jets, one of the first applications of machine learning to radio astronomy to obtain a physics result. A key result from this work was that the limitation in our sample size was due not to the number of radio objects cross-identified but rather to the number of available redshift measurements. Finally, we developed interpretable features for spectropolarimetric measurements of radio sources and used these features to design a machine learning algorithm that can identify Faraday complexity, while the features themselves may be used for other tasks. The methods in this thesis will be applicable to future radio surveys such as the Evolutionary Map of the Universe (EMU) continuum survey and the Polarised Sky Survey of the Universe's Magnetism (POSSUM), as well as surveys produced with the SKA, allowing the development of higher resolution radio luminosity functions, better estimates of the impact of radio galaxies on their environments, faster analysis of polarised surveys, and better quality rotation measure grids.
