
    A New Approach to Tagging Data in the Astronomical Literature

    Data Tags are strings used in journals to indicate the origin of archival data and to enable the reader to recover the data. The NASA/IPAC Infrared Science Archive (IRSA) has recently introduced a new approach to producing Data Tags and recovering data from them. Many of IRSA's data access services return filtered data sets (such as subsets of source catalogs) and dynamically created products (such as image cutouts); these dynamically created products are not saved permanently at the archive. Rather than tag the data sets from which the query result sets are drawn, the archive tags the query that generates the results. A single tag can thus encode a complex, dynamic data set, which simplifies the embedding of tags in manuscripts and journals. By logging user queries and all the parameters of those queries as Data Tags, IRSA can re-create a query and rerun the corresponding IRSA service with the same search parameters used when the Data Tag was created. At the same time, the logs give a simple count of the actual number of queries made to the archive, a powerful metric of archive usage unobtainable from the Apache web server logs. Currently, IRSA creates tags for queries to more than 20 data sets, including the Infrared Astronomical Satellite (IRAS), Cosmic Evolution Survey (COSMOS) and Spitzer Space Telescope Legacy data sets. These tags are returned by the spatial query engine, Atlas. IRSA plans to create tags for queries to the rest of its services in late Spring 2007. The archive provides a simple web interface that recovers the data set corresponding to an input Data Tag. Archived data sets may evolve in time due to improved calibrations or augmentations to the data set; IRSA's query-based approach guarantees that users always receive the best available data sets.
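    The tag-the-query scheme lends itself to a compact illustration. Below is a minimal Python sketch of the idea — log the service name and parameters, derive a tag, and replay the query on demand. The tag format, function names and in-memory log are hypothetical, not IRSA's actual implementation.

        import hashlib
        import json
        import time

        # Hypothetical tag log; a production archive would persist this.
        TAG_LOG = {}

        def create_data_tag(service, params):
            """Log the query (service + parameters) and return a short tag."""
            record = {"service": service, "params": params,
                      "created": time.time()}
            payload = json.dumps(record, sort_keys=True)
            tag = "IRSA-" + hashlib.sha1(payload.encode()).hexdigest()[:12]
            TAG_LOG[tag] = record      # the query is tagged, not the results
            return tag

        def resolve_data_tag(tag, run_service):
            """Rerun the logged query, returning the best current data."""
            record = TAG_LOG[tag]
            return run_service(record["service"], **record["params"])

        tag = create_data_tag("Atlas", {"mission": "COSMOS",
                                        "ra": 150.1, "dec": 2.2,
                                        "radius_arcmin": 5.0})

    Because only the query is stored, a later resolution of the same tag automatically picks up any recalibrated or augmented data, exactly the guarantee described above.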

    Spitzer data at the NASA/IPAC Infrared Science Archive (IRSA)

    The NASA/IPAC Infrared Science Archive (IRSA) curates and serves science data sets from NASA's infrared and submillimeter projects and missions, including IRAS, 2MASS, MSX, SWAS, ISO and IRTS, as well as the Spitzer Space Telescope. All Spitzer data can be accessed from IRSA's Spitzer mission page at http://irsa.ipac.caltech.edu/Missions/spitzer.html. Spitzer Legacy Enhanced Products, along with ancillary data, are delivered at six-month intervals from Fall 2004 until Fall 2006. IRSA continually ingests the Spitzer data and the ancillary data, and these data are made accessible through IRSA's query engines. Legacy products for the C2D, FEPS, GLIMPSE, GOODS, SINGS and SWIRE projects are accessible through a common interface, http://irsa.ipac.caltech.edu/applications/Atlas. This engine returns the spatial footprints of observations and provides access to all flavors of released data sets, including, where appropriate, previews of image mosaics, 3-color image mosaics and spectra.
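    For readers who want programmatic access to IRSA catalogs today, the astroquery package offers a cone-search interface. This is a modern convenience layer, not the interface described in the abstract, and the table name below ('fp_psc', the 2MASS Point Source Catalog) is only a well-known placeholder — substitute the Spitzer Legacy table of interest, discoverable via Irsa.list_catalogs().

        from astroquery.ipac.irsa import Irsa
        from astropy.coordinates import SkyCoord
        import astropy.units as u

        # Cone search against an IRSA catalog by position and radius.
        pos = SkyCoord(ra=202.48 * u.deg, dec=47.23 * u.deg, frame="icrs")
        table = Irsa.query_region(pos, catalog="fp_psc",
                                  spatial="Cone", radius=2 * u.arcmin)
        print(len(table), "sources returned")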

    Chapter 11: Web-based Tools—VO Region Inventory Service

    As the size and number of datasets available through the VO grows, it becomes increasingly critical to have services that aid in locating and characterizing data pertinent to a particular scientific problem. At the same time, this same increase makes that goal more and more difficult to achieve. With a small number of datasets, it is feasible to simply retrieve the data itself (as the NVO DataScope service does). At intermediate scales, “count” DBMS searches (searches of the actual datasets which return record counts rather than full data subsets) sent to each data provider will work. However, neither of these approaches scales as the number of datasets expands into the hundreds or thousands. Dealing with the same problem internally, IRSA developed a compact and extremely fast scheme for determining source counts for positional catalogs (and in some cases image metadata) over arbitrarily large regions for multiple catalogs in a fraction of a second. To show applicability to the VO in general, this service has been extended with indices for all 4000+ catalogs in CDS VizieR (essentially all published catalogs and source tables). In this chapter, we briefly describe the architecture of this service, and then describe how it can be used in a distributed system to retrieve rapid inventories of all VO holdings in a way that places an insignificant load on any data supplier. Further, we show how this tool can be used in conjunction with VO Registries and catalog services to zero in on those datasets that are appropriate to the user's needs. The initial implementation of this service consolidates custom binary index file structures (external to any DBMS and therefore portable) at a single site to minimize search times and implements the search interface as a simple CGI program. However, the architecture is amenable to distribution. The next phase of development will focus on metadata harvesting from data archives through a standard program interface and distribution of the search processing across multiple service providers for redundancy and parallelization.
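    A minimal sketch of the core idea, assuming a HEALPix-style pixelization (the actual NVO index format is a custom binary structure): precompute per-cell source counts into a flat array that can be memory-mapped from a file outside any DBMS, then answer a region inventory by summing the cells a cone touches. Counts at the cone boundary are approximate, which is acceptable for an inventory.

        import numpy as np
        import healpy as hp

        NSIDE = 256                  # ~14 arcmin cells; one array per catalog

        def build_count_index(ra_deg, dec_deg):
            """Precompute source counts per HEALPix cell from positions."""
            pix = hp.ang2pix(NSIDE, ra_deg, dec_deg, lonlat=True)
            return np.bincount(pix, minlength=hp.nside2npix(NSIDE))

        def region_count(index, ra0, dec0, radius_deg):
            """Sum precomputed cell counts over a cone; no catalog scan."""
            vec = hp.ang2vec(ra0, dec0, lonlat=True)
            cells = hp.query_disc(NSIDE, vec, np.radians(radius_deg),
                                  inclusive=True)
            return int(index[cells].sum())

    Because each inventory is an array lookup plus a sum, one site can answer for thousands of catalogs without forwarding any query to the data providers.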

    Sustaining the Montage Image Mosaic Engine Since 2002

    This paper describes how we have sustained the Montage image mosaic engine (http://montage.ipac.caltech.edu), first released in 2002, to support the ever-growing scale and complexity of modern data sets. The key to its longevity has been its design as a toolkit written in ANSI-C, with each tool performing one distinct task, for easy integration into scripts, pipelines and workflows. The same code base now supports Windows, JavaScript and Python by taking advantage of recent advances in compilers. The design has led to applicability of Montage far beyond what was anticipated when Montage was first built, such as supporting observation planning for JWST. Moreover, Montage is highly scalable and is in wide use within the IT community to develop advanced, fault-tolerant cyber-infrastructure, such as job schedulers for grids, workflow orchestration, and restructuring techniques for processing complex workflows and pipelines.
    Comment: 12 pages, 8 figures. Software and Cyberinfrastructure for Astronomy V (Conference 10707), SPIE Astronomical Telescopes + Instrumentation, Austin, TX, June 10-15, 2018
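    The toolkit design means a basic mosaic is just a chain of single-purpose tools. The Python sketch below drives four of the standard Montage executables through subprocess, assuming the binaries are on the PATH and that a FITS header template (template.hdr) already exists; exact flags vary between Montage versions, and the background-matching steps are omitted for brevity.

        import subprocess

        def run(cmd):
            """Run one Montage tool; one task per tool, so errors localize."""
            print(" ".join(cmd))
            subprocess.run(cmd, check=True)

        run(["mImgtbl", "raw", "raw/images.tbl"])        # scan input images
        run(["mProjExec", "-p", "raw", "raw/images.tbl", # reproject each one
             "template.hdr", "projected", "stats.tbl"])
        run(["mImgtbl", "projected", "projected/pimages.tbl"])
        run(["mAdd", "-p", "projected", "projected/pimages.tbl",
             "template.hdr", "mosaic.fits"])             # co-add the mosaic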

    ROME (Request Object Management Environment)

    Most current astronomical archive services are based on an HTML/CGI architecture where users submit HTML forms via a browser and CGI programs operating under a web server process the requests. Most services return an HTML result page with URL links to the result files or, for longer jobs, return a message indicating that email will be sent when the job is done. This paradigm has a few serious shortcomings. First, it is all too common for something to go wrong and for the user to never hear about the job again. Second, for long and complicated jobs there is often important intermediate information that would allow the user to adjust the processing. Finally, unless some sort of custom queueing mechanism is used, background jobs are started immediately upon receiving the CGI request. When there are many such requests, the server machine can easily be overloaded and either slow to a crawl or crash. The Request Object Management Environment (ROME) is a collection of middleware components being developed under the National Virtual Observatory project to provide mechanisms for managing long jobs such as computationally intensive statistical analysis requests or the generation of large-scale mosaic images. Written as EJB objects within the open-source JBoss application server, ROME receives processing requests via a servlet interface, stores them in a DBMS using JDBC, distributes the processing (via queueing mechanisms) across multiple machines and environments (including Grid resources), manages real-time messages from the processing modules, and ensures proper user notification. The request processing modules are identical in structure to standard CGI programs, though they can optionally implement status messaging, and can be written in any language. ROME will persist these jobs across failures of processing modules, network outages, and even downtime of ROME and the DBMS, restarting them as necessary.
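    ROME itself is built on Java EJBs, but the persistence pattern it describes — durably record a request before any processing starts, then restart whatever never finished — is compact enough to sketch. The Python/SQLite illustration below shows the pattern only; it is not ROME's code, and the schema and function names are invented for the example.

        import sqlite3

        db = sqlite3.connect("requests.db")
        db.execute("""CREATE TABLE IF NOT EXISTS requests (
                          id INTEGER PRIMARY KEY,
                          params TEXT,
                          status TEXT DEFAULT 'queued')""")

        def submit(params):
            """Persist the request before any processing begins."""
            with db:
                cur = db.execute(
                    "INSERT INTO requests (params) VALUES (?)", (params,))
            return cur.lastrowid

        def recover_and_run(process):
            """On startup, restart anything that never reached 'done'."""
            rows = db.execute("SELECT id, params FROM requests "
                              "WHERE status != 'done'").fetchall()
            for req_id, params in rows:
                with db:
                    db.execute("UPDATE requests SET status = 'running' "
                               "WHERE id = ?", (req_id,))
                process(params)               # the long-running job itself
                with db:
                    db.execute("UPDATE requests SET status = 'done' "
                               "WHERE id = ?", (req_id,))

    Because the request record survives a crash of the worker or of the queue manager itself, a job is never silently lost — the failure mode the HTML/CGI paradigm cannot avoid.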

    TESS as a Low-surface-brightness Observatory: Cutouts from Wide-area Coadded Images

    We present a mosaic of the co-added Full Frame Images acquired by the TESS satellite that had been released in 2020 April. The mosaic shows substantial stray light over the sky; yet over spatial scales of a few degrees, the background appears uniform. This result indicates that TESS has considerable potential as a low-surface-brightness observatory. The co-added images are freely available as a High Level Science Product (HLSP) at MAST and are accessible through a Jupyter Notebook.
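    Cutouts from TESS image products at MAST can also be requested programmatically. The sketch below uses astroquery's TESScut interface against the standard FFI cutout service as a stand-in for the general cutout workflow; accessing this particular co-added HLSP may instead go through the Jupyter Notebook the authors provide.

        from astroquery.mast import Tesscut
        from astropy.coordinates import SkyCoord
        import astropy.units as u

        # Request a 15x15-pixel cutout at a sky position; one HDUList is
        # returned per TESS sector that observed the target.
        pos = SkyCoord(ra=210.8 * u.deg, dec=54.3 * u.deg, frame="icrs")
        hdulists = Tesscut.get_cutouts(coordinates=pos, size=15)
        print(len(hdulists), "sector cutouts returned")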

    A Study of the Efficiency of Spatial Indexing Methods Applied to Large Astronomical Databases

    We report the results of a study comparing the performance of two common database indexing methods, HTM and HEALPix, on Solaris and Windows database servers installed with PostgreSQL, and on a Windows server installed with MS SQL Server. The indexing was applied to the 2MASS All-Sky Catalog and to the Hubble Source Catalog, which together approximate the diversity of catalogs common in astronomy. On each server, the study compared indexing performance by submitting 1 million queries at each index level with random sky positions and random cone-search radii, drawn on a logarithmic scale between 1 arcsec and 1 degree, and measuring the time to complete each query and write the output. These simulated queries, intended to model realistic use patterns, were run in a uniform way on many combinations of indexing method and indexing depth. The query times in all simulations are strongly I/O-bound and are linear with the number of records returned for large numbers of sources. There are, however, considerable differences between simulations, which reveal that hardware I/O throughput is a more important factor in managing the performance of a DBMS than the choice of indexing scheme. The choice of index itself is relatively unimportant: for comparable index levels, the performance is consistent within the scatter of the timings. At small index levels (large cells; e.g. level 4, cell size 3.7 deg), there is large scatter in the timings because of wide variations in the number of sources found in the cells. At larger index levels, performance improves and scatter decreases, but the improvement at level 8 (14 arcmin) and higher is masked to some extent by the timing scatter caused by the range of query sizes. At very high levels (20; 0.0004 arcsec), the granularity of the cells becomes so high that a large number of extraneous empty cells begins to degrade performance.
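    The quoted cell sizes at the low and intermediate levels can be checked directly with healpy, assuming the usual mapping from index level to HEALPix resolution, nside = 2**level (HTM is a different tessellation, but its trixel sizes also shrink by a factor of two per level).

        import healpy as hp

        for level in (4, 8):
            nside = 2 ** level         # HEALPix resolution parameter
            res_arcmin = hp.nside2resol(nside, arcmin=True)
            print(f"level {level}: nside={nside}, "
                  f"cell ~ {res_arcmin:.1f} arcmin")
        # level 4 -> ~220 arcmin (~3.7 deg); level 8 -> ~14 arcmin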

    Connecting the time domain community with the Virtual Astronomical Observatory

    The time domain has been identified as one of the most important areas of astronomical research for the next decade. The Virtual Observatory is in the vanguard, with dedicated tools and services that enable and facilitate the discovery, dissemination and analysis of time domain data. These range in scope from rapid notifications of time-critical astronomical transients to annotating long-term variables with the latest modeling results. In this paper, we review the prior art in these areas and focus on the capabilities that the VAO is bringing to bear in support of time domain science. In particular, we focus on the issues involved with the heterogeneous collections of (ancillary) data associated with astronomical transients, and the time series characterization and classification tools required by the next generation of sky surveys, such as LSST and SKA.
    Comment: Submitted to Proceedings of SPIE Observatory Operations: Strategies, Processes and Systems IV, Amsterdam, 2012 July 2-