
    How Will Astronomy Archives Survive The Data Tsunami?

    The field of astronomy is starting to generate more data than can be managed, served and processed by current techniques. This paper outlines practices for developing next-generation tools and techniques for surviving this data tsunami, including rigorous evaluation of new technologies, partnerships between astronomers and computer scientists, and training of scientists in high-end software engineering skills. Comment: 8 pages, 3 figures; ACM Queue, Vol. 9, Number 10, October 2011 (http://queue.acm.org/detail.cfm?id=2047483)

    GPU Accelerated Particle Visualization with Splotch

    Splotch is a rendering algorithm for exploration and visual discovery in particle-based datasets coming from astronomical observations or numerical simulations. The strengths of the approach are production of high-quality imagery and support for very large-scale datasets through an effective mix of the OpenMP and MPI parallel programming paradigms. This article reports our experiences in redesigning Splotch to exploit emerging HPC architectures, which are nowadays increasingly populated with GPUs. A performance model for data transfers, computations and memory access is introduced to guide our refactoring of Splotch. A number of parallelization issues are discussed, in particular relating to race conditions and workload balancing, towards achieving optimal performance. Our implementation was accomplished using the CUDA programming paradigm. Our strategy is founded on novel schemes achieving optimized data organisation and classification of particles. We deploy a reference simulation to present performance results on acceleration gains and scalability. We finally outline our vision for future work, including possibilities for further optimisations and exploitation of emerging technologies. Comment: 25 pages, 9 figures. Astronomy and Computing (2014)
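
    The race-condition issue mentioned above arises because many particles can deposit light into the same pixel of the output image when processed concurrently. The following minimal Python sketch (not the Splotch/CUDA implementation itself; the function name, image size and data are hypothetical) shows that accumulation step, with an unbuffered scatter-add standing in for the atomic operations a GPU kernel would need.

    # Minimal sketch of the splatting step at the heart of particle rendering.
    # On a GPU, many threads may target the same pixel; np.add.at plays the
    # role of the atomic add a CUDA kernel would use to avoid the race.
    import numpy as np

    def splat(particles_xy, weights, image_shape=(512, 512)):
        """Accumulate particle contributions into a 2D image buffer."""
        image = np.zeros(image_shape, dtype=np.float64)
        # Map normalised particle coordinates in [0, 1) to pixel indices.
        ix = np.clip((particles_xy[:, 0] * image_shape[1]).astype(int), 0, image_shape[1] - 1)
        iy = np.clip((particles_xy[:, 1] * image_shape[0]).astype(int), 0, image_shape[0] - 1)
        # Unbuffered scatter-add: correct even when several particles hit one pixel.
        np.add.at(image, (iy, ix), weights)
        return image

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        xy = rng.random((100_000, 2))   # hypothetical particle positions
        w = rng.random(100_000)         # hypothetical particle intensities
        img = splat(xy, w)
        print(img.sum(), w.sum())       # totals agree: no contributions lost

    A naive per-particle loop writing directly into shared pixels is the GPU race the paper guards against; one plausible reading of the "classification of particles" mentioned in the abstract is grouping particles by the image region they touch to reduce such contention and balance the workload.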

    Data Science as a New Frontier for Design

    The purpose of this paper is to contribute to the challenge of transferring know-how, theories and methods from design research to the design processes in information science and technologies. More specifically, we consider a domain, namely data science, that is rapidly becoming a globally invested research and development axis, with strong imperatives for innovation given the data deluge we are currently facing. We argue that, in order to rise to the data-related challenges that society is facing, data-science initiatives should ensure a renewal of traditional research methodologies, which are still largely based on trial-and-error processes depending on the talent and insights of a single researcher (or a restricted group of researchers). It is our claim that design theories and methods can provide, at least to some extent, the much-needed framework. We use a worldwide data-science challenge organized to study a technical problem in physics, namely the detection of the Higgs boson, as a use case to demonstrate some of the ways in which design theory and methods can help in analyzing and shaping the innovation dynamics in such projects. Comment: International Conference on Engineering Design, Jul 2015, Milan, Italy

    Research Cloud Data Communities

    Big Data, big science, the data deluge: these are topics we are hearing about more and more in our research pursuits. Then, through media hype, comes cloud computing, the saviour that is going to resolve our Big Data issues. However, it is difficult to pinpoint exactly what researchers can actually do with data and with clouds, how they can actually solve their Big Data problems, and how they can get help in using these relatively new tools and infrastructure. Since the beginning of 2012, the NeCTAR Research Cloud has been running at the University of Melbourne, attracting over 1,650 users from around the country. This has not only provided an unprecedented opportunity for researchers to employ clouds in their research, but it has also given us an opportunity to understand clearly how researchers can more easily solve their Big Data problems. The cloud is now used daily, from running web servers and blog sites through to hosting virtual laboratories that can automatically create hundreds of servers depending on research demand. Of course, it has also helped us understand that infrastructure isn't everything: there are many other skill sets needed to help researchers from the multitude of disciplines use the cloud effectively.

    How can we solve Big Data problems on cloud infrastructure? One of the key aspects is communities based on research platforms: research is built on collaboration, connection and community, and researchers employ platforms daily, whether bio-imaging platforms, computational platforms or cloud platforms (like Dropbox). There are some important features which enabled this to work. Firstly, the barriers to collaboration are eased, allowing communities to access infrastructure that can be instantly built to be anywhere from completely open to completely closed, all managed securely through (nationally) standardised interfaces. Secondly, it is free and easy to build servers and infrastructure, but it is also cheap to fail, allowing for experimentation not only at the code level, but at the server or infrastructure level as well. Thirdly, this (virtual) infrastructure can be shared with collaborators, moving the practice of collaboration from sharing papers and code to sharing servers, pre-configured and ready to go. And finally, the underlying infrastructure is built with Big Data in mind: co-located with major data storage infrastructure and high-performance computers, and interconnected nationally with high-speed networks to research instruments.

    The research cloud is fundamentally new in that it easily allows communities of researchers, often connected by common geography (research precincts), discipline or long-term established collaborations, to build open, collaborative platforms. These open, sharable and repeatable platforms encourage coordinated use and development, evolving towards common, community-oriented methods for Big Data access and data manipulation. In this paper we discuss in detail the critical ingredients in successfully establishing these communities, as well as some outcomes resulting from these communities and their collaboration-enabling platforms. We consider astronomy as an exemplar of a research field that has already looked to the cloud as a solution to the ensuing data tsunami.
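
    As a concrete illustration of the "free and easy to build servers" point, the sketch below launches a virtual machine programmatically on an OpenStack-based research cloud (NeCTAR is built on OpenStack), using the openstacksdk Python client. The cloud, server, image, flavour and network names are hypothetical placeholders, not actual NeCTAR identifiers.

    # Minimal sketch: provisioning a server on an OpenStack-style cloud.
    # Assumes credentials are configured in clouds.yaml for openstack.connect.
    import openstack  # pip install openstacksdk

    def launch_server(cloud_name="my-research-cloud",
                      name="bigdata-worker-01",
                      image_name="Ubuntu 22.04",
                      flavor_name="m1.small",
                      network_name="project-network"):
        conn = openstack.connect(cloud=cloud_name)
        image = conn.compute.find_image(image_name)
        flavor = conn.compute.find_flavor(flavor_name)
        network = conn.network.find_network(network_name)
        server = conn.compute.create_server(
            name=name,
            image_id=image.id,
            flavor_id=flavor.id,
            networks=[{"uuid": network.id}],
        )
        # Block until the server reaches ACTIVE, then return it.
        return conn.compute.wait_for_server(server)

    if __name__ == "__main__":
        server = launch_server()
        print(server.name, server.status)

    Because servers created this way can be torn down and rebuilt in minutes, failed experiments cost little, which is the "cheap to fail" property the abstract highlights.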

    Long-term digital preservation: a digital humanities topic?

    "We argue that the so-called Digital Humanities fail to meet conventional criteria to be an accredited field of study on a par with Literature, Chemistry, Computer Science, and Civil Engineering, or even a specialized professorial emphasis such as Ancient History or Nuclear Physics. The argument uses long-term digital preservation as an example to argue that Digital Humanities proponents' case for their research agenda does not merit financial support, emphasizing practical aspects over subjective theory." (author's abstract

    The role of the Virtual Astronomical Observatory in the era of massive data sets


    Data-Intensive architecture for scientific knowledge discovery

    This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and to support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge-discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, image processing and seismology.
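
    The central idea described above, expressing a knowledge-discovery workflow as a chain of processing elements that stream data from preparation to analysis to evaluation, can be illustrated with the Python sketch below. It is not DISPEL syntax; the element names and data are hypothetical, and it only shows the chaining pattern the abstract describes.

    # Illustrative sketch (not DISPEL) of a streaming knowledge-discovery
    # workflow built from composable processing elements.
    from typing import Callable, Iterable, Iterator

    ProcessingElement = Callable[[Iterator], Iterator]

    def prepare(records: Iterator) -> Iterator:
        """Data preparation: drop incomplete records, normalise values."""
        for r in records:
            if r is not None:
                yield float(r)

    def analyse(values: Iterator) -> Iterator:
        """Analysis: a placeholder transformation on the cleaned stream."""
        for v in values:
            yield v * v

    def evaluate(values: Iterator) -> Iterator:
        """Evaluation: summarise results so the workflow can be reiterated."""
        values = list(values)
        yield {"count": len(values),
               "mean": sum(values) / len(values) if values else 0.0}

    def run_pipeline(source: Iterable, elements: list[ProcessingElement]):
        stream: Iterator = iter(source)
        for element in elements:   # connect each element's output to the next input
            stream = element(stream)
        return list(stream)

    if __name__ == "__main__":
        raw = [1, None, 2, 3, None, 4]
        print(run_pipeline(raw, [prepare, analyse, evaluate]))
        # -> [{'count': 4, 'mean': 7.5}]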

    Towards the prediction of molecular parameters from astronomical emission lines using Neural Networks

    Molecular astronomy is a field that is blooming in the era of large observatories such as the Atacama Large Millimeter/Submillimeter Array (ALMA). With modern, sensitive, and high spectral resolution radio telescopes like ALMA and the Square Kilometer Array, the size of the data cubes is rapidly escalating, generating a need for powerful automatic analysis tools. This work introduces MolPred, a pilot study to perform predictions of molecular parameters such as excitation temperature (Tex) and column density (log(N)) from input spectra using neural networks. We used as test cases the spectra of CO, HCO+, SiO and CH3CN between 80 and 400 GHz. Training spectra were generated with MADCUBA, a state-of-the-art spectral analysis tool. Our algorithm was designed to allow the generation of predictions for multiple molecules in parallel. Using neural networks, we can predict the column density and excitation temperature of these molecules with a mean absolute error of 8.5% for CO, 4.1% for HCO+, 1.5% for SiO and 1.6% for CH3CN. The prediction accuracy depends on the noise level, line saturation, and number of transitions. We also performed predictions on real ALMA data; the values predicted by our neural network for these data differ from the MADCUBA values by 13% on average. Current limitations of our tool include not considering linewidth, source size, multiple velocity components, and line blending.
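
    A minimal sketch of the kind of model the abstract describes is given below: a fully connected Keras regressor mapping an input spectrum to the two parameters Tex and log(N). This is not the MolPred architecture; the channel count, layer sizes and randomly generated training data are placeholders standing in for spectra synthesised with a tool such as MADCUBA.

    # Sketch: regress two molecular parameters, [Tex, log(N)], from a spectrum.
    import numpy as np
    import tensorflow as tf

    N_CHANNELS = 1024   # hypothetical number of spectral channels

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(N_CHANNELS,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(2),          # outputs: [Tex, log(N)]
    ])
    model.compile(optimizer="adam", loss="mae")   # mean absolute error, as reported

    # Stand-in random data in place of synthetic training spectra.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(5000, N_CHANNELS)).astype("float32")
    y_train = rng.uniform([5.0, 12.0], [300.0, 18.0], size=(5000, 2)).astype("float32")

    model.fit(X_train, y_train, epochs=5, batch_size=64, verbose=0)
    tex_pred, logn_pred = model.predict(X_train[:1], verbose=0)[0]
    print(f"Tex = {tex_pred:.1f} K, log(N) = {logn_pred:.2f}")

    Training one such network per molecule is one straightforward way to realise the "predictions for multiple molecules in parallel" that the abstract mentions.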