Astroinformatics of galaxies and quasars: a new general method for photometric redshifts estimation
With the huge amounts of data produced by current and future large multi-band photometric surveys, photometric redshifts have become a crucial tool for extragalactic astronomy and cosmology. In this paper we present a novel method, called Weak Gated Experts (WGE), which derives photometric redshifts through a combination of data mining techniques.
The WGE, like many other machine learning techniques, is based on the exploitation of a spectroscopic knowledge base, composed of sources for which a spectroscopic measurement of the redshift is available. The method achieves a variance \sigma^2(\Delta z) = 2.3 \times 10^{-4} for the reconstruction of the photometric redshifts of optical galaxies from the SDSS, and \sigma^2(\Delta z) = 0.08 for optical quasars, where \Delta z = z_{phot} - z_{spec}; the Root Mean Square (RMS) of the \Delta z distributions for the two experiments is 0.021 and 0.35, respectively.
The WGE also provides a mechanism for estimating the accuracy of each photometric redshift. We also present and discuss the catalogs obtained for the optical SDSS galaxies, for the optical candidate quasars extracted from the DR7 SDSS photometric dataset (the sample of SDSS sources on which the accuracy of the reconstruction has been assessed is composed of bright sources, for a subset of which spectroscopic redshifts have been measured), and for the optical SDSS candidate quasars observed by GALEX in the UV range. The WGE method exploits the new technological paradigm provided by the Virtual Observatory and the emerging field of Astroinformatics.

Comment: 36 pages, 22 figures and 8 tables
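The two accuracy figures quoted above can be reproduced from any matched photometric/spectroscopic catalog with a few lines of code. The following sketch (using made-up sample redshifts, not the actual SDSS data) computes the variance and RMS of \Delta z:

```python
import numpy as np

def delta_z_stats(z_phot, z_spec):
    """Return the variance and RMS of Delta z = z_phot - z_spec."""
    dz = np.asarray(z_phot) - np.asarray(z_spec)
    return dz.var(), np.sqrt(np.mean(dz ** 2))

# Toy example with illustrative redshifts (not actual SDSS values):
variance, rms = delta_z_stats([0.10, 0.21, 0.33], [0.11, 0.20, 0.30])
```

Note that the variance is taken about the mean of \Delta z while the RMS is taken about zero, so the two statistics differ whenever the photometric redshifts carry a systematic bias.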
DAME: A distributed data mining and exploration framework within the virtual observatory
Nowadays, many scientific areas share the same broad requirement of being able to deal with massive and distributed datasets while, when possible, being integrated with services and applications. In order to close the growing gap between the incremental generation of data and our understanding of it, we need to know how to access, retrieve, analyze, mine and integrate data from disparate sources. A fundamental aspect of any new-generation data mining software tool or package that aims to become a service for the community is the possibility of using it within complex workflows which each user can fine-tune to match the specific demands of their scientific goal. These workflows often need to access different resources (data, providers, computing facilities and packages) and require strict interoperability on (at least) the client side. The DAME (DAta Mining & Exploration) project arises from these requirements by providing a distributed web-based data mining infrastructure specialized in the exploration of massive data sets with soft computing methods. Originally designed to deal with astrophysical use cases, where the first scientific application examples have demonstrated its effectiveness, the DAME Suite has evolved into a multi-disciplinary, platform-independent tool fully compliant with modern KDD (Knowledge Discovery in Databases) requirements and Information & Communication Technology trends.
Detecting and tracking bacteria with quantum light
The field of quantum sensing aims at improving the detection and estimation of classical parameters that are encoded in physical systems by resorting to quantum sources of light and quantum detection strategies. The same approach can be used to improve the current classical measurements that are performed on biological systems. Here we consider the scenario of two bacteria (E. coli and Salmonella) growing in a Luria-Bertani broth and monitored by classical spectrophotometers. Their concentration can be related to the optical transmissivity via the Beer-Lambert-Bouguer law, and their growth curves can be described by means of Gompertz functions. Starting from experimental data points, we extrapolate the growth curves of the two bacteria and we study the theoretical performance that would be achieved with a quantum setup. In particular, we discuss how the bacterial growth can in principle be tracked by irradiating the samples with orders of magnitude fewer photons, identifying the clear superiority of quantum light in the early stages of growth. We then show the superiority and the limits of quantum resources in two basic tasks: (i) the early detection of bacterial growth and (ii) the early discrimination between two bacterial species.
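The classical measurement model underlying this abstract can be sketched numerically. In the illustrative snippet below (all parameter values are made up, not taken from the paper), a Gompertz function models bacterial concentration over time and the Beer-Lambert-Bouguer law converts concentration into optical transmissivity:

```python
import math

def gompertz(t, n_max=1.0, b=2.0, k=0.5):
    """Gompertz growth curve n(t) = n_max * exp(-b * exp(-k t)).
    n_max: asymptotic concentration, b: displacement, k: growth rate.
    Illustrative parameters, not fitted to real data."""
    return n_max * math.exp(-b * math.exp(-k * t))

def transmissivity(concentration, epsilon=0.8, path_length=1.0):
    """Beer-Lambert-Bouguer law: T = exp(-epsilon * c * l)."""
    return math.exp(-epsilon * concentration * path_length)

# Transmissivity decreases monotonically as the culture grows:
curve = [transmissivity(gompertz(t)) for t in range(10)]
```

The small early-time change in transmissivity is exactly the regime where, according to the abstract, quantum light offers its clearest advantage over classical probes.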
Iris: an Extensible Application for Building and Analyzing Spectral Energy Distributions
Iris is an extensible application that provides astronomers with a
user-friendly interface capable of ingesting broad-band data from many
different sources in order to build, explore, and model spectral energy
distributions (SEDs). Iris takes advantage of the standards defined by the
International Virtual Observatory Alliance, but hides the technicalities of
such standards by implementing different layers of abstraction on top of them.
Such intermediate layers provide hooks that users and developers can exploit in
order to extend the capabilities provided by Iris. For instance, custom Python
models can be combined in arbitrary ways with the Iris built-in models or with
other custom functions. As such, Iris offers a platform for the development and
integration of SED data, services, and applications, either from the user's
system or from the web. In this paper we describe the built-in features
provided by Iris for building and analyzing SEDs. We also explore in some
detail the Iris framework and software development kit, showing how astronomers
and software developers can plug their code into an integrated SED analysis
environment.

Comment: 18 pages, 8 figures, accepted for publication in Astronomy & Computing
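The abstract states that custom Python models can be combined in arbitrary ways with the Iris built-in models. The exact Iris plug-in interface is not shown here, but the idea can be pictured with a hypothetical user-defined model: plain Python callables mapping spectral coordinates to flux, composed freely before being handed to a fitting framework (every name and parameter below is an illustrative assumption, not the actual Iris API):

```python
import numpy as np

def powerlaw(wavelength, amplitude=1.0, index=-1.5):
    """Hypothetical custom continuum: F = A * (lambda / 5000)^index."""
    return amplitude * (np.asarray(wavelength) / 5000.0) ** index

def gaussian_line(wavelength, center=6563.0, width=10.0, strength=0.5):
    """Hypothetical emission-line component (Gaussian profile)."""
    wl = np.asarray(wavelength)
    return strength * np.exp(-0.5 * ((wl - center) / width) ** 2)

def combined_model(wavelength):
    """Custom components combined arbitrarily, as the abstract describes."""
    return powerlaw(wavelength) + gaussian_line(wavelength)

flux = combined_model([5000.0, 6563.0, 8000.0])
```

The point of such an abstraction layer is that the framework only needs a callable with a known signature; whether the component is built-in or user-supplied is irrelevant to the fit.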
Managing Distributed Software Development in the Virtual Astronomical Observatory
The U.S. Virtual Astronomical Observatory (VAO) is a product-driven
organization that provides new scientific research capabilities to the
astronomical community. Software development for the VAO follows a lightweight
framework that guides development of science applications and infrastructure.
Challenges to be overcome include distributed development teams, part-time
efforts, and highly constrained schedules. We describe the process we followed
to conquer these challenges while developing Iris, the VAO application for
analysis of 1-D astronomical spectral energy distributions (SEDs). Iris was
successfully built and released in less than a year with a team distributed
across four institutions. The project followed existing International Virtual
Observatory Alliance inter-operability standards for spectral data and
contributed a SED library as a by-product of the project. We emphasize lessons
learned that will be folded into future development efforts. In our experience,
a well-defined process that provides guidelines to ensure the project is
cohesive and stays on track is key to success. Internal product deliveries with
a planned test and feedback loop are critical. Release candidates are measured
against use cases established early in the process, and provide the opportunity
to assess priorities and make course corrections during development. Also key
is the participation of a stakeholder such as a lead scientist who manages the
technical questions, advises on priorities, and is actively involved as a lead
tester. Finally, frequent scheduled communications (for example a bi-weekly tele-conference) ensure that issues are resolved quickly and that the team is working toward a common vision.

Comment: 7 pages, 2 figures, SPIE 2012 conference
Nextcast : A software suite to analyse and model toxicogenomics data
The recent advancements in toxicogenomics have led to the availability of large omics data sets, representing the starting point for studying the exposure mechanism of action and identifying candidate biomarkers for toxicity prediction. The current lack of standard methods in data generation and analysis hampers the full exploitation of toxicogenomics-based evidence in regulatory risk assessment. Moreover, the pipelines for the preprocessing and downstream analyses of toxicogenomic data sets can be quite challenging to implement. Over the years, we have developed a number of software packages to address specific questions related to multiple steps of toxicogenomics data analysis and modelling. In this review we present the Nextcast software collection and discuss how its individual tools can be combined into efficient pipelines to answer specific biological questions. Nextcast components are of great support to the scientific community for analysing and interpreting large data sets for the toxicity evaluation of compounds in an unbiased, straightforward, and reliable manner. The Nextcast software suite is available at: https://github.com/fhaive/nextcast
IVOA Recommendation: Spectrum Data Model 1.1
We present a data model describing the structure of spectrophotometric
datasets with spectral and temporal coordinates and associated metadata. This
data model may be used to represent spectra, time series data, segments of SED
(Spectral Energy Distributions) and other spectral or temporal associations.

Comment: http://www.ivoa.ne
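The structure the abstract describes, data points carrying spectral and temporal coordinates together with associated metadata, can be pictured with a minimal sketch. The field names below are illustrative simplifications for exposition, not the normative terms of the IVOA Spectrum Data Model:

```python
from dataclasses import dataclass, field

@dataclass
class SpectralPoint:
    """One sample: a spectral coordinate, a time stamp, and a flux value.
    Names are illustrative, not the normative IVOA model terms."""
    spectral_coord: float  # e.g. wavelength in Angstrom
    time: float            # e.g. observation epoch in MJD
    flux: float

@dataclass
class Segment:
    """A spectrum or time-series segment: data points plus metadata."""
    points: list[SpectralPoint] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)

# A segment can equally represent a spectrum (varying spectral_coord)
# or a time series (varying time), as the abstract notes:
seg = Segment(
    points=[SpectralPoint(5000.0, 55000.0, 1.2e-14)],
    metadata={"target": "example source", "flux_unit": "erg/s/cm^2/Angstrom"},
)
```

Keeping both coordinates on every point is what lets the same container describe spectra, time series, and SED segments uniformly.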