Virtual Astronomy, Information Technology, and the New Scientific Methodology
All sciences, including astronomy, are now entering the era of information abundance. The exponentially increasing volume and complexity of modern data sets promise to transform scientific practice, but also pose a number of common technological challenges. The Virtual Observatory concept is the astronomical community's response to these challenges: it aims to harness the progress in information technology in the service of astronomy, and at the same time provide a valuable testbed for information technology and applied computer science. Challenges broadly fall into two categories: data handling (or "data farming"), including issues such as archives, intelligent storage, databases, interoperability, and fast networks; and data mining, data understanding, and knowledge discovery, which include issues such as automated clustering and classification, multivariate correlation searches, pattern recognition, and visualization in high-dimensional parameter spaces, as well as various applications of machine learning in these contexts. Such techniques are forming a methodological foundation for science with massive and complex data sets in general, and are likely to have a much broader impact on modern society, commerce, the information economy, security, etc. There is a powerful emerging synergy between computationally enabled science and science-driven computing, which will drive progress in science, scholarship, and many other venues in the 21st century.
Data Driven Discovery in Astrophysics
We review some aspects of the current state of data-intensive astronomy, its
methods, and some outstanding data analysis challenges. Astronomy is at the
forefront of "big data" science, with exponentially growing data volumes and
data rates, and an ever-increasing complexity, now entering the Petascale
regime. Telescopes and observatories from both ground and space, covering a
full range of wavelengths, feed the data via processing pipelines into
dedicated archives, where they can be accessed for scientific analysis. Most of
the large archives are connected through the Virtual Observatory framework,
which provides interoperability standards and services, and effectively
constitutes a global data grid of astronomy. Making discoveries in this
overabundance of data requires applications of novel machine learning tools.
We describe some of the recent examples of such applications.
Comment: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data
from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures
Mining Knowledge in Astrophysical Massive Data Sets
Modern scientific data mainly consist of huge datasets gathered by a very
large number of techniques and stored in very diversified and often
incompatible data repositories. More generally, in the e-science environment,
it is considered a critical and urgent requirement to integrate services
across distributed, heterogeneous, dynamic "virtual organizations" formed by
different resources within a single enterprise. In the last decade, Astronomy
has become an immensely data rich field due to the evolution of detectors
(plates to digital to mosaics), telescopes and space instruments. The Virtual
Observatory approach consists of the federation, under common standards, of all
astronomical archives available worldwide, as well as data analysis, data
mining and data exploration applications. The main driver behind this effort
is that, once the infrastructure is completed, it will allow a new type
of multi-wavelength, multi-epoch science which can barely be imagined.
Data Mining, or Knowledge Discovery in Databases, while being the main
methodology to extract the scientific information contained in such MDS
(Massive Data Sets), poses crucial challenges, since it has to address complex
problems posed by transparent access to different computing environments,
scalability of algorithms, reusability of resources, etc. In the present paper
we summarize the present status of the MDS in the Virtual Observatory and what
is currently being done and planned to bring advanced Data Mining methodologies
in the case of the DAME (DAta Mining & Exploration) project.
Comment: Pages 845-849, 1st International Conference on Frontiers in
Diagnostics Technologies
Some Pattern Recognition Challenges in Data-Intensive Astronomy
We review some of the recent developments and challenges posed by the data
analysis in modern digital sky surveys, which are representative of the
information-rich astronomy in the context of Virtual Observatory. Illustrative
examples include the problems of an automated star-galaxy classification in
complex and heterogeneous panoramic imaging data sets, and an automated,
iterative, dynamical classification of transient events detected in synoptic
sky surveys. These problems offer good opportunities for productive
collaborations between astronomers and applied computer scientists and
statisticians, and are representative of the kind of challenges now present in
all data-intensive fields. We discuss briefly some emergent types of scalable
scientific data analysis systems with a broad applicability.
Comment: 8 pages, compressed pdf file, figures downgraded in quality in order
to match the arXiv size limit
Exploration of Parameter Spaces in a Virtual Observatory
Like every other field of intellectual endeavor, astronomy is being
revolutionised by the advances in information technology. There is an ongoing
exponential growth in the volume, quality, and complexity of astronomical data
sets, mainly through large digital sky surveys and archives. The Virtual
Observatory (VO) concept represents a scientific and technological framework
needed to cope with this data flood. Systematic exploration of the observable
parameter spaces, covered by large digital sky surveys spanning a range of
wavelengths, will be one of the primary modes of research with a VO. This is
where the truly new discoveries will be made, and new insights be gained about
the already known astronomical objects and phenomena. We review some of the
methodological challenges posed by the analysis of large and complex data sets
expected in the VO-based research. The challenges are driven both by the size
and the complexity of the data sets (billions of data vectors in parameter
spaces of tens or hundreds of dimensions), by the heterogeneity of the data and
measurement errors, including differences in basic survey parameters for the
federated data sets (e.g., in the positional accuracy and resolution,
wavelength coverage, time baseline, etc.), various selection effects, as well
as the intrinsic clustering properties (functional form, topology) of the data
distributions in the parameter spaces of observed attributes. Answering these
challenges will require substantial collaborative efforts and partnerships
between astronomers, computer scientists, and statisticians.
Comment: Invited review, 10 pages, Latex file with 4 eps figures, style files
included. To appear in Proc. SPIE, v. 4477 (2001)
Some statistical and computational challenges, and opportunities in astronomy
The data complexity and volume of astronomical findings have increased in recent decades due to major technological improvements in instrumentation and data collection methods. The contemporary astronomer is flooded with terabytes of raw data that produce enormous multidimensional catalogs of objects (stars, galaxies, quasars, etc.) numbering in the billions, with hundreds of measured numbers for each object. The astronomical community thus faces a key task: to enable efficient and objective scientific exploitation of enormous multifaceted data sets and the complex links between data and astrophysical theory. In recognition of this task, the National Virtual Observatory (NVO) initiative recently emerged to federate numerous large digital sky archives, and to develop tools to explore and understand these vast volumes of data. The effective use of such integrated massive data sets presents a variety of new challenging statistical and algorithmic problems that require methodological advances. An interdisciplinary team of statisticians, astronomers and computer scientists from The Pennsylvania State University, California Institute of Technology and Carnegie Mellon University is developing statistical methodology for the NVO. A brief glimpse into the Virtual Observatory and the work of the Penn State-led team is provided here.
Exploring the Use of Virtual Worlds as a Scientific Research Platform: The Meta-Institute for Computational Astrophysics (MICA)
We describe the Meta-Institute for Computational Astrophysics (MICA), the
first professional scientific organization based exclusively in virtual worlds
(VWs). The goals of MICA are to explore the utility of the emerging VR and VWs
technologies for scientific and scholarly work in general, and to facilitate
and accelerate their adoption by the scientific research community. MICA itself
is an experiment in academic and scientific practices enabled by the immersive
VR technologies. We describe the current and planned activities and research
directions of MICA, and offer some thoughts as to what the future developments
in this arena may be.
Comment: 15 pages, to appear in the refereed proceedings of "Facets of Virtual
Environments" (FaVE 2009), eds. F. Lehmann-Grube, J. Sablating, et al., ICST
Lecture Notes Ser., Berlin: Springer Verlag (2009); version with full
resolution color figures is available at
http://www.mica-vw.org/wiki/index.php/Publication
Exploration of Large Digital Sky Surveys
We review some of the scientific opportunities and technical challenges posed
by the exploration of the large digital sky surveys, in the context of a
Virtual Observatory (VO). The VO paradigm will profoundly change the way
observational astronomy is done. Clustering analysis techniques can be used to
discover samples of rare, unusual, or even previously unknown types of
astronomical objects and phenomena. Exploration of the previously poorly probed
portions of the observable parameter space is especially promising. We
illustrate some of the possible types of studies with examples drawn from
DPOSS; much more complex and interesting applications are forthcoming.
Development of the new tools needed for an efficient exploration of these vast
data sets requires a synergy between astronomy and information sciences, with
great potential returns for both fields.
Comment: To appear in: Mining the Sky, eds. A. Banday et al., ESO Astrophysics
Symposia, Berlin: Springer Verlag, in press (2001). Latex file, 18 pages, 6
encapsulated postscript figures, style files included