
    Adaptive clustering procedure for continuous gravitational wave searches

    In hierarchical searches for continuous gravitational waves, clustering of candidates is an important postprocessing step because it reduces the number of noise candidates that are followed up at successive stages [1][7][12]. Previous clustering procedures bundled together nearby candidates, ascribing them to the same root cause (be it a signal or a disturbance) based on a predefined cluster volume. In this paper, we present a procedure that adapts the cluster volume to the data itself and checks this volume for consistency with what is expected from a signal. This significantly improves the noise-rejection capability at fixed detection threshold and, at fixed computing resources for the follow-up stages, results in an overall more sensitive search. This new procedure was employed in the first Einstein@Home search on data from the first science run of the advanced LIGO detectors (O1) [11].
    Comment: 11 pages, 9 figures, 2 tables; v1: initial submission; v2: journal review, copyedited version; v3: fixed typo in Fig
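    The sketch below is only an illustrative stand-in for the procedure the abstract describes: candidates (frequency, spindown, detection statistic) are linked friends-of-friends style so the cluster volume follows the data, and each cluster is then vetoed if its extent exceeds what a signal could produce. The linking lengths, thresholds, and `signal_extent` values are hypothetical placeholders, not the search's actual settings.

```python
import numpy as np

def adaptive_clusters(cands, link=(1e-5, 1e-11), signal_extent=(1e-4, 1e-10),
                      stat_fraction=0.1):
    """Seed clusters at the loudest candidates and let the data set the volume.

    cands: array of shape (N, 3) with columns (f [Hz], fdot [Hz/s], statistic).
    """
    order = np.argsort(cands[:, 2])[::-1]          # loudest candidates first
    free = np.ones(len(cands), dtype=bool)
    clusters = []
    for seed in order:
        if not free[seed]:
            continue
        members = {int(seed)}
        grew = True
        while grew:                                # friends-of-friends growth:
            grew = False                           # the cluster volume adapts
            for j in map(int, np.flatnonzero(free)):   # to the data, not a box
                if j in members:
                    continue
                if cands[j, 2] < stat_fraction * cands[seed, 2]:
                    continue
                if any(np.all(np.abs(cands[j, :2] - cands[m, :2]) <= link)
                       for m in members):
                    members.add(j)
                    grew = True
        idx = sorted(members)
        extent = np.ptp(cands[idx, :2], axis=0)    # cluster size in (f, fdot)
        clusters.append({
            "seed": cands[seed],
            "n_members": len(idx),
            # consistency check: a signal cannot scatter its candidates over
            # a region larger than its expected parameter-space response
            "signal_like": bool(np.all(extent <= np.asarray(signal_extent))),
        })
        free[idx] = False
    return clusters
```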

    Distributed Semantic Web Data Management in HBase and MySQL Cluster

    Various computing and data resources on the Web are being enhanced with machine-interpretable semantic descriptions to facilitate better search, discovery, and integration. This interconnected metadata constitutes the Semantic Web, whose volume can potentially grow to the scale of the Web. Efficient management of Semantic Web data, expressed using the W3C's Resource Description Framework (RDF), is crucial for supporting new data-intensive, semantics-enabled applications. In this work, we study and compare two approaches to distributed RDF data management based on emerging cloud computing technologies and traditional relational database clustering technologies. In particular, we design distributed RDF data storage and querying schemes for HBase and MySQL Cluster and conduct an empirical comparison of these approaches on a cluster of commodity machines, using datasets and queries from the Third Provenance Challenge and the Lehigh University Benchmark. Our study reveals interesting patterns in query evaluation, shows that our algorithms are promising, and suggests that cloud computing has great potential for scalable Semantic Web data management.
    Comment: In Proc. of the 4th IEEE International Conference on Cloud Computing (CLOUD'11)
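    As a sketch of the storage idea such systems rely on: each (s, p, o) triple is written under several orderings so that any triple pattern whose bound terms form a key prefix becomes a range scan. Sorted in-memory lists stand in for HBase tables here, and all names are illustrative; the paper's actual row-key design may differ.

```python
from bisect import bisect_left

class TripleStore:
    """Toy triple-pattern index: store (s, p, o) under three orderings so a
    pattern whose bound terms form a key prefix is answered by a range scan."""

    ORDERS = {"spo": (0, 1, 2), "pos": (1, 2, 0), "osp": (2, 0, 1)}

    def __init__(self):
        self.tables = {name: [] for name in self.ORDERS}

    def add(self, s, p, o):
        for name, perm in self.ORDERS.items():
            row = tuple((s, p, o)[i] for i in perm)
            table = self.tables[name]
            pos = bisect_left(table, row)
            if pos == len(table) or table[pos] != row:   # de-duplicate
                table.insert(pos, row)

    def match(self, s=None, p=None, o=None):
        """Answer a triple pattern; None marks a variable."""
        if s is not None:            # pick the table whose key starts bound
            name, key = "spo", (s,) + ((p,) if p is not None else ())
        elif p is not None:
            name, key = "pos", (p,) + ((o,) if o is not None else ())
        elif o is not None:
            name, key = "osp", (o,)
        else:
            name, key = "spo", ()
        table, perm = self.tables[name], self.ORDERS[name]
        out, i = [], bisect_left(table, key)
        while i < len(table) and table[i][:len(key)] == key:   # prefix scan
            triple = [None] * 3
            for col, orig in enumerate(perm):    # undo the key permutation
                triple[orig] = table[i][col]
            if (p is None or triple[1] == p) and (o is None or triple[2] == o):
                out.append(tuple(triple))        # residual filter for patterns
            i += 1                               # not fully covered by a prefix
        return out

store = TripleStore()
store.add("alice", "knows", "bob")
store.add("alice", "age", "30")
print(store.match(s="alice"))        # prefix scan on the spo table
print(store.match(p="knows"))        # prefix scan on the pos table
```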

    A Deep Halpha Survey of Galaxies in the Two Nearby Clusters Abell1367 and Coma: The Halpha Luminosity Functions

    We present a deep, wide-field Halpha imaging survey of the central regions of the two nearby galaxy clusters Coma and Abell1367, taken with the WFC at the INT 2.5m telescope. We determine for the first time the Schechter parameters of the Halpha luminosity function (LF) of cluster galaxies. The Halpha LFs of Abell1367 and Coma are compared with each other and with that of Virgo, estimated using the B-band LF of Sandage et al. (1985) and an L(Halpha) vs. M_B relation. Typical parameters of phi^* ~ 10^{0.00+-0.07} Mpc^-3, L^* ~ 10^{41.25+-0.05} erg sec^-1, and alpha ~ -0.70+-0.10 are found for the three clusters. The best-fitting parameters of the cluster LFs differ from those found for field galaxies, showing flatter slopes and lower scaling luminosities L^*. Since, however, our Halpha survey is significantly deeper than those of field galaxies, this result must be confirmed with similarly deep measurements of field galaxies. By computing the total SFR per unit volume of cluster galaxies, and taking into account the cluster density in the local Universe, we estimate that the contribution of clusters like Coma and Abell1367 is approximately 0.25% of the SFR per unit volume of the local Universe.
    Comment: 19 pages, 11 figures, accepted for publication in A&A
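    For reference, the Schechter form and the quantities quoted above can be written down directly. The per-unit-luminosity normalization and the Kennicutt (1998) SFR conversion used below are standard conventions, assumed here rather than taken from the paper.

```python
import numpy as np
from math import gamma

def schechter(L, phi_star=10**0.00, L_star=10**41.25, alpha=-0.70):
    """phi(L) dL = phi_star (L/L_star)^alpha exp(-L/L_star) d(L/L_star),
    returned here per unit luminosity [Mpc^-3 (erg/s)^-1]."""
    x = np.asarray(L, dtype=float) / L_star
    return (phi_star / L_star) * x**alpha * np.exp(-x)

def luminosity_density(phi_star=10**0.00, L_star=10**41.25, alpha=-0.70):
    """Integral of L phi(L) dL = phi_star L_star Gamma(alpha + 2),
    in erg s^-1 Mpc^-3 (converges for alpha > -2)."""
    return phi_star * L_star * gamma(alpha + 2.0)

# Standard Kennicutt (1998) conversion, SFR [Msun/yr] ~ 7.9e-42 L(Halpha)
# [erg/s], links the fitted LF to the SFR per unit volume discussed above:
sfr_density = 7.9e-42 * luminosity_density()   # Msun yr^-1 Mpc^-3
```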

    GPU-Accelerated BWT Construction for Large Collection of Short Reads

    Advances in DNA sequencing technology have stimulated the development of algorithms and tools for processing very large collections of short strings (reads). Short-read alignment and assembly are among the most well-studied problems. Many state-of-the-art aligners, at their core, use the Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome (a typical example being the NCBI human genome). Recently, the BWT has also found use in string-graph assembly, for indexing the reads themselves (i.e., the raw data from DNA sequencers). In a typical data set, the volume of reads is tens of times that of the sequenced genome and can be up to 100 Gigabases. A reference genome is relatively stable, so computing its index is not a frequent task; for reads, the index has to be computed from scratch for each input, making efficient BWT construction a much bigger concern than before. In this paper, we present a practical method called CX1 for constructing the BWT of very large string collections. CX1 is the first tool that can exploit the parallelism of a graphics processing unit (GPU, a relatively cheap device providing a thousand or more primitive cores) simultaneously with the parallelism of a multi-core CPU and, more interestingly, of a cluster of GPU-enabled nodes. Using CX1, the BWT of a short-read collection of up to 100 Gigabases can be constructed in less than 2 hours on a machine equipped with a quad-core CPU and a GPU, or in about 43 minutes using a cluster of 4 such machines (the speedup is almost linear after excluding the first 16 minutes for loading the reads from the hard disk). The previously fastest tool, BCR, is measured to take 12 hours to process 100 Gigabases on one machine; it is non-trivial to parallelize BCR across a cluster of machines, let alone GPUs.
    Comment: 11 pages
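    To pin down what is being computed, here is a tiny quadratic-time reference for the BWT of a read collection; the unique-sentinel encoding is one common convention, assumed here. CX1's contribution is computing this same transform at 100-Gigabase scale on GPUs and clusters, which this sketch makes no attempt at.

```python
def bwt_of_collection(reads):
    """Burrows-Wheeler transform of a string collection (reference version)."""
    # terminate read k with a unique sentinel (0, k); real bases become
    # (1, byte), so every sentinel sorts before every base and no two
    # suffixes compare equal
    text = []
    for k, r in enumerate(reads):
        text.extend((1, b) for b in r.encode())
        text.append((0, k))
    n = len(text)
    # suffix array by direct sorting: O(n^2 log n) time and heavy on memory,
    # fine as a reference but hopeless at 100 Gigabases
    sa = sorted(range(n), key=lambda i: text[i:])
    # BWT[i] is the symbol that precedes the i-th smallest suffix
    prev = [text[(i - 1) % n] for i in sa]
    return "".join("$" if tag == 0 else chr(b) for tag, b in prev)

print(bwt_of_collection(["ACGT", "ACGA"]))
```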

    Realfast: Real-Time, Commensal Fast Transient Surveys with the Very Large Array

    Radio interferometers have the ability to precisely localize and better characterize the properties of sources. This ability is having a powerful impact on the study of fast radio transients, where a few milliseconds of data is enough to pinpoint a source at cosmological distances. However, recording interferometric data at millisecond cadence produces a terabyte-per-hour data stream that strains networks, computing systems, and archives. This challenge mirrors that of other domains of science, where the science scope is limited as much by the computational architecture as by the physical processes at play. Here, we present a solution to this problem in the context of radio transients: realfast, a commensal fast-transient search system at the Jansky Very Large Array. Realfast uses a novel architecture to distribute fast-sampled interferometric data to a 32-node, 64-GPU cluster for real-time imaging and transient detection. By detecting transients in situ, we can trigger the recording of data for those rare, brief instants when an event occurs, reducing the recorded data volume by a factor of 1000. This makes it possible to commensally search a data stream that would otherwise be impossible to record. The system will search for millisecond transients in more than 1000 hours of data per year, potentially localizing several Fast Radio Bursts, pulsars, and other sources of impulsive radio emission. We describe the science scope for realfast, the system design, expected outcomes, and ways real-time analysis can help in other fields of astrophysics.
    Comment: Accepted to ApJS Special Issue on Data; 11 pages, 4 figures
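    The "detect in situ, then trigger recording" pattern can be sketched as a buffered stream search; `search_block` below is a hypothetical stand-in for realfast's GPU imaging and dedispersion, not the project's actual API.

```python
from collections import deque
import numpy as np

def search_block(block, threshold=8.0):
    """Placeholder detection: peak signal-to-noise of one data block."""
    snr = (block.max() - block.mean()) / (block.std() + 1e-12)
    return snr if snr >= threshold else None

def commensal_search(stream, buffer_blocks=16):
    """Search each block in real time; persist raw data only on a trigger."""
    ring = deque(maxlen=buffer_blocks)    # the terabyte/hour stream lives
    for block in stream:                  # only in this short ring buffer
        ring.append(block)
        snr = search_block(block)
        if snr is not None:
            # trigger: keep only the buffered instants around the event,
            # which is where the large reduction in recorded volume comes from
            yield snr, np.concatenate(list(ring))

# toy usage: noise blocks with one injected impulse
rng = np.random.default_rng(0)
blocks = [rng.normal(size=4096) for _ in range(100)]
blocks[42][2048] += 20.0                  # a "transient"
for snr, raw in commensal_search(blocks):
    print(f"trigger at S/N {snr:.1f}, saved {raw.size} samples")
```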