    Data Management and Mining in Astrophysical Databases

    We analyse the issues involved in the management and mining of astrophysical data. The traditional approach to data management in astrophysics cannot keep up with the growing volume of data gathered by modern detectors. Automatic tools for extracting information from large datasets, i.e. data mining techniques such as clustering and classification algorithms, will play an essential role in astrophysical research. This calls for an approach to data management based on data warehousing that emphasizes the efficiency and simplicity of data access: efficiency is obtained using multidimensional access methods, and simplicity is achieved by properly handling metadata. Clustering and classification techniques on large datasets pose additional requirements: computational and memory scalability with respect to the data size, and interpretability and objectivity of the clustering or classification results. In this study we address some possible solutions. Comment: 10 pages, Late

    The non-linear evolution of bispectrum from the scale-free N-body simulation

    We have accurately measured the bispectrum for four scale-free models of structure formation with spectral index $n = 1$, 0, -1, and -2. The measurement is based on a new method that can effectively eliminate aliasing and numerical artifacts, and reliably extend the analysis into the strongly non-linear regime. The work makes use of a set of state-of-the-art N-body simulations that have significantly increased the resolution range compared with previous studies on the subject. With these measurements, we demonstrate that the bispectrum depends on the shape and size of the $k$-triangle even in the strongly non-linear regime. It increases with wavenumber and decreases with the spectral index. These results are in contrast with the hypothesis that the reduced bispectrum is a constant in the strongly non-linear regime. We also show that the fitting formula of Scoccimarro & Frieman (1999) does not describe our simulation results well (with a typical error of about 40 percent). Finally, we present a new fitting formula for the reduced bispectrum that is valid for $-2 \leq n \leq 0$ with a typical error of only 10 percent. Comment: 33 pages, including 1 table, 14 figures, accepted by Ap
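
    For orientation, the reduced bispectrum mentioned in this abstract is conventionally defined as below. This is the standard textbook form, not a formula quoted from the paper, and the Fourier normalization (the factor of $(2\pi)^3$) is an assumed convention that varies between authors.

```latex
% Bispectrum B of the density contrast \delta (assumed Fourier convention):
\langle \delta(\mathbf{k}_1)\,\delta(\mathbf{k}_2)\,\delta(\mathbf{k}_3) \rangle
  = (2\pi)^3\, \delta_D(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3)\, B(k_1, k_2, k_3)

% Reduced bispectrum Q, normalized by the power spectrum P(k):
Q(k_1, k_2, k_3)
  = \frac{B(k_1, k_2, k_3)}
         {P(k_1)P(k_2) + P(k_2)P(k_3) + P(k_3)P(k_1)}
```

    The hypothesis that the abstract argues against is that $Q$ becomes independent of the triangle $(k_1, k_2, k_3)$ in the strongly non-linear regime.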

    ASPECT: A spectra clustering tool for exploration of large spectral surveys

    We present the novel, semi-automated clustering tool ASPECT for analysing voluminous archives of spectra. The heart of the program is a neural network in the form of Kohonen's self-organizing map. The resulting map is designed as an icon map suitable for inspection by eye. The visual analysis is supported by the option to blend in individual object properties such as redshift, apparent magnitude, or signal-to-noise ratio. In addition, the package provides several tools for the selection of special spectral types, e.g. local difference maps which reflect the deviations of all spectra from one given input spectrum (real or artificial). ASPECT is able to produce a two-dimensional topological map of a huge number of spectra. The software package enables users to browse and navigate through a huge data pool, helps them gain insight into underlying relationships between the spectra and other physical properties, and gives them the big picture of the entire data set. We demonstrate the capability of ASPECT by clustering the entire data pool of 0.6 million spectra from Data Release 4 of the Sloan Digital Sky Survey (SDSS). To illustrate the results regarding quality and completeness, we track objects from existing catalogues of quasars and carbon stars, respectively, and connect the SDSS spectra with morphological information from the GalaxyZoo project. Comment: 15 pages, 14 figures; accepted for publication in Astronomy and Astrophysic
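
    As a rough illustration of the technique named in this abstract, the following is a minimal sketch of a Kohonen self-organizing map in plain Python. It is not ASPECT's implementation: the function names (`train_som`, `best_matching_unit`), the linear decay schedules, the grid size, and the three-bin toy "spectra" are all assumptions made for this example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=30, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    # One weight vector per map node, initialised from randomly chosen samples.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy data: two families of hypothetical three-bin "spectra" with small noise.
noise = random.Random(1)
blue = [[1.0 + noise.gauss(0, 0.05), 0.5 + noise.gauss(0, 0.05),
         0.1 + noise.gauss(0, 0.05)] for _ in range(30)]
red = [[0.1 + noise.gauss(0, 0.05), 0.5 + noise.gauss(0, 0.05),
        1.0 + noise.gauss(0, 0.05)] for _ in range(30)]

som = train_som(blue + red, rows=4, cols=4)
blue_node = best_matching_unit(som, [1.0, 0.5, 0.1])
red_node = best_matching_unit(som, [0.1, 0.5, 1.0])
```

    Each input vector is assigned to its best-matching node, so similar spectra end up in the same or neighbouring cells of the grid. That neighbourhood structure is what makes a map of this kind suitable for inspection by eye.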

    Astrophysics in S.Co.P.E

    S.Co.P.E. is one of the four projects funded by the Italian Government to provide Southern Italy with a distributed computing infrastructure for fundamental science. Besides building the infrastructure, S.Co.P.E. is also actively pursuing research in several areas, including astrophysics and observational cosmology. We briefly summarize the most significant results obtained in the first two years of the project that relate to the development of middleware and data mining tools for the Virtual Observatory.

    The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

    Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found only in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments, some "old" and some new. These are generally unknown to most of the astronomical community, but are vital to the analysis and visualization of complex datasets and images. In order for astronomers to take advantage of the richness and complexity of the new era of data, and to be able to identify, adopt, and apply new solutions, the astronomical community needs a certain degree of awareness and understanding of the new concepts. One of the goals of this paper is to help bridge the gap between applied mathematics, artificial intelligence and computer science on the one side and astronomy on the other. Comment: 24 pages, 8 Figures, 1 Table. Accepted for publication: "Advances in Astronomy, special issue "Robotic Astronomy

    Some Pattern Recognition Challenges in Data-Intensive Astronomy

    We review some of the recent developments and challenges posed by data analysis in modern digital sky surveys, which are representative of information-rich astronomy in the context of the Virtual Observatory. Illustrative examples include the problems of automated star-galaxy classification in complex and heterogeneous panoramic imaging data sets, and automated, iterative, dynamical classification of transient events detected in synoptic sky surveys. These problems offer good opportunities for productive collaborations between astronomers and applied computer scientists and statisticians, and are representative of the kind of challenges now present in all data-intensive fields. We briefly discuss some emergent types of scalable scientific data analysis systems with broad applicability. Comment: 8 pages, compressed pdf file, figures downgraded in quality in order to match the arXiv size limi