4,326 research outputs found

    Fast Structural Search in Phylogenetic Databases

    As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The "closeness" is a measure of the topological relationships in P that are found to be the same or similar in a tree D in D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.

    RADAR: a web server for RNA data analysis and research

    RADAR is a web server that provides a multitude of functionality for RNA data analysis and research. It can align structure-annotated RNA sequences so that both sequence and structure information are taken into consideration during the alignment process. This server is capable of performing pairwise structure alignment, multiple structure alignment, database search and clustering. In addition, RADAR provides two salient features: (i) constrained alignment of RNA secondary structures, and (ii) prediction of the consensus structure for a set of RNA sequences. RADAR will be able to assist scientists in performing many important RNA mining operations, including the understanding of the functionality of RNA sequences, the detection of RNA structural motifs and the clustering of RNA molecules, among others. The web server together with a software package for download is freely accessible at http://datalab.njit.edu/biodata/rna/RSmatch/server.htm and http://www.ccrnp.ncifcrf.gov/~bshapiro

    Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks

    Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is prepared by a Milne-Eddington (ME) inversion code used by BBSO. We quantitatively assess SDNN, comparing its inversion results with those obtained by the ME inversion code and related machine learning (ML) algorithms such as multiple support vector regression, multilayer perceptrons and a pixel-level convolutional neural network. Major findings from our experimental study are summarized as follows. First, the SDNN-inferred LOS velocities are highly correlated to the ME-calculated ones with the Pearson product-moment correlation coefficient being close to 0.9 on average. Second, SDNN is faster, while producing smoother and cleaner LOS velocity and Doppler width maps, than the ME inversion code. Third, the maps produced by SDNN are closer to ME's maps than those from the related ML algorithms, demonstrating the better learning capability of SDNN than the ML algorithms. Finally, comparison between the inversion results of ME and SDNN based on GST/NIRIS and those from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in flare-prolific active region NOAA 12673 is presented. We also discuss extensions of SDNN for inferring vector magnetic fields with empirical evaluation.
    Comment: 16 pages, 8 figures
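
    The Pearson product-moment correlation coefficient cited in the abstract above can be computed as in the following minimal sketch. The function name `pearson_r` and the sample values are illustrative assumptions and are not taken from the SDNN work; only the formula is the standard definition.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences.

    Hypothetical helper for illustration; not code from the SDNN paper.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up velocities standing in for ME-calculated and model-inferred values.
    me_values = [0.1, -0.4, 0.8, 1.2, -0.9]
    inferred = [0.2, -0.3, 0.7, 1.0, -1.1]
    print(round(pearson_r(me_values, inferred), 3))
```

    A value near 1 indicates that the two maps vary together almost linearly; it does not by itself show that they agree in absolute scale.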

    MapReduce Algorithms for Inferring Gene Regulatory Networks from Time-Series Microarray Data Using an Information-Theoretic Approach

    Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections, are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
    Comment: 19 pages, 5 figures

    Prediction of the SYM-H Index Using a Bayesian Deep Learning Method with Uncertainty Quantification

    We propose a novel deep learning framework, named SYMHnet, which employs a graph neural network and a bidirectional long short-term memory network to cooperatively learn patterns from solar wind and interplanetary magnetic field parameters for short-term forecasts of the SYM-H index based on 1-minute and 5-minute resolution data. SYMHnet takes, as input, the time series of the parameters' values provided by NASA's Space Science Data Coordinated Archive and predicts, as output, the SYM-H index value at time point t + w hours for a given time point t where w is 1 or 2. By incorporating Bayesian inference into the learning framework, SYMHnet can quantify both aleatoric (data) uncertainty and epistemic (model) uncertainty when predicting future SYM-H indices. Experimental results show that SYMHnet works well at quiet time and storm time, for both 1-minute and 5-minute resolution data. The results also show that SYMHnet generally performs better than related machine learning methods. For example, SYMHnet achieves a forecast skill score (FSS) of 0.343 compared to the FSS of 0.074 of a recent gradient boosting machine (GBM) method when predicting SYM-H indices (1 hour in advance) in a large storm (SYM-H = -393 nT) using 5-minute resolution data. When predicting the SYM-H indices (2 hours in advance) in the large storm, SYMHnet achieves an FSS of 0.553 compared to the FSS of 0.087 of the GBM method. In addition, SYMHnet can provide results for both data and model uncertainty quantification, whereas the related methods cannot.
    Comment: 28 pages, 8 figures
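
    The abstract above reports forecast skill scores but does not state how the score is defined. As a hedged illustration only, the sketch below uses one common convention, an MSE-based skill score relative to a reference forecast; this is an assumption and may differ from the definition used for SYMHnet. The function name `mse_skill_score` and all sample numbers are hypothetical.

```python
def mse_skill_score(observed, predicted, reference):
    """MSE-based skill score: 1 - MSE(model) / MSE(reference).

    Assumed convention for illustration; the abstract does not define its FSS.
    A score of 1 is a perfect forecast, 0 matches the reference forecast,
    and negative values are worse than the reference.
    """
    if not (len(observed) == len(predicted) == len(reference)) or not observed:
        raise ValueError("need three non-empty sequences of equal length")
    n = len(observed)
    mse_model = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mse_ref = sum((o - r) ** 2 for o, r in zip(observed, reference)) / n
    if mse_ref == 0:
        raise ValueError("reference forecast is perfect; score is undefined")
    return 1.0 - mse_model / mse_ref


if __name__ == "__main__":
    # Made-up index values (nT), not data from the paper.
    observed = [-20.0, -35.0, -60.0, -90.0]
    model = [-22.0, -30.0, -55.0, -80.0]
    # Persistence-style reference: repeat the previous observed value.
    reference = [-15.0, -20.0, -35.0, -60.0]
    print(mse_skill_score(observed, model, reference))
```

    Under this convention, a higher score means a larger reduction in squared error relative to the chosen reference, so reported scores are only comparable when the reference forecast is the same.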

    A Closer Look at Small-Scale Magnetic Flux Ropes in the Solar Wind at 1 AU: Results from Improved Automated Detection

    Small-scale interplanetary magnetic flux ropes (SMFRs) are similar to ICMEs in magnetic structure, but are smaller and do not exhibit ICME plasma signatures. We present a computationally efficient and GPU-powered version of the single-spacecraft automated SMFR detection algorithm based on the Grad-Shafranov (GS) technique. Our algorithm is capable of processing higher resolution data, eliminates selection bias caused by a fixed $\langle B \rangle$ threshold, has improved detection criteria demonstrated to have better results on an MHD simulation, and recovers full 2.5D cross sections using GS reconstruction. We used it to detect 512,152 SMFRs from 27 years (1996 to 2022) of 3-second cadence Wind measurements. Our novel findings are: (1) the radial density of SMFRs at 1 au ($\sim 1$ per $10^6$ km) and filling factor ($\sim 35\%$) are independent of solar activity, distance to the heliospheric current sheet (HCS), and solar wind plasma type, although the minority of SMFRs with diameters greater than $\sim 0.01$ au have a strong solar activity dependence; (2) SMFR diameters follow a log-normal distribution that peaks below the resolved range ($\gtrsim 10^4$ km), although the filling factor is dominated by SMFRs between $10^5$ and $10^6$ km; (3) most SMFRs at 1 au have strong field-aligned flows like those from PSP measurements; (4) in terms of diameter $d$, SMFR poloidal flux $\propto d^{1.2}$, axial flux $\propto d^{2.0}$, average twist number $\propto d^{-0.8}$, current density $\propto d^{-0.8}$, and helicity $\propto d^{3.2}$. Implications for the origin of SMFRs and switchbacks are briefly discussed. The new algorithm and SMFR dataset are made freely available.

    A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data

    Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle in which consistent time-sequence vector magnetograms have been available through the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look into another major instrument, namely the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data archive of SOHO/MDI covers more active solar cycle 23 with many large flares. However, SOHO/MDI data only has line-of-sight (LOS) magnetograms. We propose a new deep learning method, named MagNet, to learn from combined LOS magnetograms, Bx and By taken by SDO/HMI along with H-alpha observations collected by the Big Bear Solar Observatory (BBSO), and to generate vector components Bx' and By', which would form vector magnetograms with observed LOS data. In this way, we can expand the availability of vector magnetograms to the period from 1996 to present. Experimental results demonstrate the good performance of the proposed method. To our knowledge, this is the first time that deep learning has been used to generate photospheric vector magnetograms of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data.
    Comment: 15 pages, 6 figures