Data Provenance and Management in Radio Astronomy: A Stream Computing Approach
New approaches to data provenance and data management (DPDM) are required for mega-science projects like the Square Kilometre Array, which are characterized by extremely large data volumes and intense data rates and therefore demand innovative and highly efficient computational paradigms. In this context, we explore a stream-computing approach with an emphasis on the use of accelerators. In particular, we make use of a new generation of high-performance stream-based parallelization middleware known as InfoSphere Streams, and demonstrate its viability for managing and ensuring the interoperability and integrity of signal-processing data pipelines in radio astronomy. IBM InfoSphere Streams embraces the stream-computing paradigm: a shift from conventional data mining techniques (which analyse existing data held in databases) towards real-time analytic processing. We discuss using InfoSphere Streams for effective DPDM in radio astronomy and propose a way in which InfoSphere Streams can be utilized for large antenna arrays. We present a case study, an InfoSphere Streams implementation of an autocorrelating spectrometer, and use this example to discuss the advantages of the stream-computing approach and the utilization of hardware accelerators.
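The case study's core computation, an autocorrelating spectrometer, estimates a power spectrum as the Fourier transform of the signal's autocorrelation (the Wiener-Khinchin relation). The abstract's implementation runs on InfoSphere Streams with hardware accelerators; the sketch below is only a minimal NumPy illustration of that underlying computation, with the function name and lag count chosen for the example.

```python
import numpy as np

def autocorr_spectrum(samples: np.ndarray, n_lags: int) -> np.ndarray:
    """Estimate a one-sided power spectrum via the Wiener-Khinchin
    relation: FFT of the autocorrelation of the input samples."""
    n = len(samples)
    # Autocorrelation at lags 0 .. n_lags-1 (biased estimator)
    acf = np.array([np.dot(samples[: n - k], samples[k:]) for k in range(n_lags)]) / n
    # Real FFT of the autocorrelation yields the power spectrum estimate
    return np.abs(np.fft.rfft(acf))
```

In a streaming deployment, the lag products would be accumulated incrementally per input sample rather than over a stored buffer, which is what makes the approach suited to intense data rates.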
Model Configuration And Data Management In The Short-Term Water Information Forecasting Tools
The Short-term Water Information and Forecasting Tools (SWIFT) is a suite of tools for flood and short-term streamflow forecasting, consisting of a collection of hydrologic model components and utilities. Catchments are modeled using conceptual subareas and a node-link structure for channel routing. The tools comprise modules for calibration, model state updating, output error correction, ensemble runs and data assimilation. Given the combinatorial nature of the modelling experiments and the sub-daily time steps typically used for simulations, the volume of model configurations and time series data is substantial, and managing it is not trivial. SWIFT is currently used mostly for research purposes but has also been used operationally, with intersecting but significantly different requirements. Early versions of SWIFT used mostly ad hoc text files handled via Fortran code, with limited use of netCDF for time series data. The configuration and data handling modules have since been redesigned. The model configuration now follows a design in which the data model is decoupled from the on-disk persistence mechanism. For research purposes the preferred on-disk format is JSON, to leverage numerous software libraries in a variety of languages, while retaining the legacy option of custom tab-separated text formats where researchers prefer that access arrangement. By decoupling the data model from data persistence, it becomes much easier to interchangeably use, for instance, relational databases to provide stricter provenance and audit trail capabilities in an operational flood forecasting context. For the time series data, given the volume and required throughput, text-based formats are usually inadequate. A schema derived from the CF conventions has been designed to efficiently handle time series for SWIFT.
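The decoupling the abstract describes, one in-memory data model behind interchangeable persistence backends (JSON, legacy tab-separated text, or a relational database), can be sketched as a small interface. This is a hypothetical illustration, not SWIFT's actual API; the class names and the key/value config shape are assumptions for the example.

```python
import json
from abc import ABC, abstractmethod

class ConfigStore(ABC):
    """On-disk persistence interface, decoupled from the in-memory model."""
    @abstractmethod
    def save(self, config: dict, path: str) -> None: ...
    @abstractmethod
    def load(self, path: str) -> dict: ...

class JsonStore(ConfigStore):
    """Preferred research format: plain JSON, readable by many libraries."""
    def save(self, config: dict, path: str) -> None:
        with open(path, "w") as f:
            json.dump(config, f, indent=2)
    def load(self, path: str) -> dict:
        with open(path) as f:
            return json.load(f)

class TsvStore(ConfigStore):
    """Legacy-style tab-separated format: one key<TAB>value pair per line."""
    def save(self, config: dict, path: str) -> None:
        with open(path, "w") as f:
            for key, value in config.items():
                f.write(f"{key}\t{value}\n")
    def load(self, path: str) -> dict:
        with open(path) as f:
            return dict(line.rstrip("\n").split("\t", 1) for line in f)
```

Because calling code depends only on `ConfigStore`, a relational-database backend with audit-trail support could be substituted in an operational context without touching the model configuration logic.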
Prospects for large-scale financial systems simulation
As the 21st century unfolds, we find ourselves having to control, support, manage or otherwise cope with large-scale complex adaptive systems to an extent that is unprecedented in human history. Whether we are concerned with issues of food security, infrastructural resilience, climate change, health care, web science, security, or financial stability, we face problems that combine scale, connectivity, adaptive dynamics, and criticality. Complex systems simulation is emerging as the key scientific tool for dealing with such complex adaptive systems. Although a relatively new paradigm, it has already established a track record in fields as varied as ecology (Grimm and Railsback, 2005), transport (Nagel et al., 1999), neuroscience (Markram, 2006), and ICT (Bullock and Cliff, 2004). In this report, we consider the application of simulation methodologies to financial systems, assessing the prospects for continued progress in this line of research.
Collecting and utilising crowdsourced data for numerical weather prediction: propositions from the meeting held in Copenhagen, December 4–5, 2018
In December 2018, the Danish Meteorological Institute organised an international meeting on the subject of crowdsourced data in numerical weather prediction (NWP) and weather forecasting. The meeting, spanning 2 days, gathered experts on crowdsourced data from meteorological institutes and universities in Europe and the United States. Scientific presentations highlighted a vast array of possibilities and progress being made globally; subjects included data from vehicles, smartphones, and private weather stations. Two groups were created to discuss open questions regarding the collection and use of crowdsourced data from different observing platforms. Common challenges were identified and potential solutions were discussed. While most of the work presented was preliminary, the results shared suggested that crowdsourced observations have the potential to enhance NWP. A common platform for sharing expertise, data, and results would help crowdsourced data realise this potential.