4,326 research outputs found
Fast Structural Search in Phylogenetic Databases
As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. We propose structural search techniques that, given a query or pattern tree P and a database of phylogenies D, find trees in D that are sufficiently close to P. The "closeness" is a measure of the topological relationships in P that are found to be the same or similar in a tree drawn from D. We develop a filtering technique that accelerates searches and present algorithms for rooted and unrooted trees where the trees can be weighted or unweighted. Experimental results on comparing the similarity measure with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate that the proposed approach is promising.
RADAR: a web server for RNA data analysis and research
RADAR is a web server that provides a multitude of functionality for RNA data analysis and research. It can align structure-annotated RNA sequences so that both sequence and structure information are taken into consideration during the alignment process. This server is capable of performing pairwise structure alignment, multiple structure alignment, database search and clustering. In addition, RADAR provides two salient features: (i) constrained alignment of RNA secondary structures, and (ii) prediction of the consensus structure for a set of RNA sequences. RADAR will be able to assist scientists in performing many important RNA mining operations, including the understanding of the functionality of RNA sequences, the detection of RNA structural motifs and the clustering of RNA molecules, among others. The web server together with a software package for download is freely accessible at http://datalab.njit.edu/biodata/rna/RSmatch/server.htm and http://www.ccrnp.ncifcrf.gov/~bshapiro
Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks
Obtaining high-quality magnetic and velocity fields through Stokes inversion
is crucial in solar physics. In this paper, we present a new deep learning
method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight
(LOS) velocities and Doppler widths from Stokes profiles collected by the Near
InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope
(GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is
prepared by a Milne-Eddington (ME) inversion code used by BBSO. We
quantitatively assess SDNN, comparing its inversion results with those obtained
by the ME inversion code and related machine learning (ML) algorithms such as
multiple support vector regression, multilayer perceptrons and a pixel-level
convolutional neural network. Major findings from our experimental study are
summarized as follows. First, the SDNN-inferred LOS velocities are highly
correlated to the ME-calculated ones with the Pearson product-moment
correlation coefficient being close to 0.9 on average. Second, SDNN is faster,
while producing smoother and cleaner LOS velocity and Doppler width maps, than
the ME inversion code. Third, the maps produced by SDNN are closer to ME's maps
than those from the related ML algorithms, demonstrating the better learning
capability of SDNN than the ML algorithms. Finally, comparison between the
inversion results of ME and SDNN based on GST/NIRIS and those from the
Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in
flare-prolific active region NOAA 12673 is presented. We also discuss
extensions of SDNN for inferring vector magnetic fields with empirical
evaluation. Comment: 16 pages, 8 figures
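The agreement measure quoted in the abstract above, the Pearson product-moment correlation coefficient, can be illustrated with a minimal sketch. The function name `pearson_r` and the sample velocity values are hypothetical and are not taken from the paper; the sketch only shows how two sets of per-pixel LOS velocities could be compared.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-pixel LOS velocities (km/s); illustrative values only.
me_velocity = [0.10, -0.35, 0.52, 0.80, -0.15]
sdnn_velocity = [0.12, -0.30, 0.47, 0.85, -0.20]
r = pearson_r(me_velocity, sdnn_velocity)
```

A value of `r` near 1 means the two maps vary together almost linearly; it does not by itself say which map is closer to the true velocity field.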
MapReduce Algorithms for Inferring Gene Regulatory Networks from Time-Series Microarray Data Using an Information-Theoretic Approach
Gene regulation is a series of processes that control gene expression and its
extent. The connections among genes and their regulatory molecules, usually
transcription factors, and a descriptive model of such connections, are known
as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand
the inner workings of the cell and the complexity of gene interactions. To
date, numerous algorithms have been developed to infer gene regulatory
networks. However, as the number of identified genes increases and the
complexity of their interactions is uncovered, networks and their regulatory
mechanisms become cumbersome to test. Furthermore, prodding through
experimental results requires an enormous amount of computation, resulting in
slow data processing. Therefore, new approaches are needed to expeditiously
analyze copious amounts of experimental data resulting from cellular GRNs. To
meet this need, cloud computing is promising as reported in the literature.
Here we propose new MapReduce algorithms for inferring gene regulatory networks
on a Hadoop cluster in a cloud environment. These algorithms employ an
information-theoretic approach to infer GRNs using time-series microarray data.
Experimental results show that our MapReduce program is much faster than an
existing tool while achieving slightly better prediction accuracy than the
existing tool. Comment: 19 pages, 5 figures
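As a rough illustration of what an information-theoretic approach to time-series expression data can look like, the sketch below scores a candidate regulator-target pair by the mutual information between the regulator's profile and the target's profile shifted by a time lag. This is a single-machine sketch under stated assumptions, not the MapReduce algorithms of the paper: the equal-width binning, the lag parameter, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def discretize(values, bins=3):
    """Map real values to equal-width bin indices 0..bins-1 (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # The maximum value falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def lagged_mi(regulator, target, lag=1, bins=3):
    """MI between regulator at time t and target at time t + lag."""
    x = discretize(regulator[: len(regulator) - lag], bins)
    y = discretize(target[lag:], bins)
    return mutual_information(x, y)
```

Because each gene pair is scored independently, a computation of this shape is the kind that could be distributed across workers, with one task emitting a score per pair and a later step collecting the highest-scoring edges.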
Prediction of the SYM-H Index Using a Bayesian Deep Learning Method with Uncertainty Quantification
We propose a novel deep learning framework, named SYMHnet, which employs a
graph neural network and a bidirectional long short-term memory network to
cooperatively learn patterns from solar wind and interplanetary magnetic field
parameters for short-term forecasts of the SYM-H index based on 1-minute and
5-minute resolution data. SYMHnet takes, as input, the time series of the
parameters' values provided by NASA's Space Science Data Coordinated Archive
and predicts, as output, the SYM-H index value at time point t + w hours for a
given time point t where w is 1 or 2. By incorporating Bayesian inference into
the learning framework, SYMHnet can quantify both aleatoric (data) uncertainty
and epistemic (model) uncertainty when predicting future SYM-H indices.
Experimental results show that SYMHnet works well at quiet time and storm time,
for both 1-minute and 5-minute resolution data. The results also show that
SYMHnet generally performs better than related machine learning methods. For
example, SYMHnet achieves a forecast skill score (FSS) of 0.343 compared to the
FSS of 0.074 of a recent gradient boosting machine (GBM) method when predicting
SYM-H indices (1 hour in advance) in a large storm (SYM-H = -393 nT) using
5-minute resolution data. When predicting the SYM-H indices (2 hours in
advance) in the large storm, SYMHnet achieves an FSS of 0.553 compared to the
FSS of 0.087 of the GBM method. In addition, SYMHnet can provide results for
both data and model uncertainty quantification, whereas the related methods
cannot. Comment: 28 pages, 8 figures
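The abstract above reports forecast skill scores but does not define the metric. One commonly used form compares the mean squared error of a forecast with that of a reference forecast; the sketch below assumes that form, which may differ from the definition used in the paper. The function names and the choice of reference forecast are hypothetical.

```python
def mean_squared_error(observed, predicted):
    """Average squared difference between observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def forecast_skill_score(observed, predicted, reference):
    """Assumed definition: 1 - MSE(model) / MSE(reference forecast).

    1.0 is a perfect forecast, 0.0 matches the reference, and negative
    values are worse than the reference.
    """
    mse_model = mean_squared_error(observed, predicted)
    mse_reference = mean_squared_error(observed, reference)
    if mse_reference == 0:
        raise ValueError("reference forecast is perfect; skill score undefined")
    return 1.0 - mse_model / mse_reference
```

Under this assumed definition, a score between 0 and 1 means the model reduces the squared error relative to the reference forecast by that fraction.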
A Closer Look at Small-Scale Magnetic Flux Ropes in the Solar Wind at 1 AU: Results from Improved Automated Detection
Small-scale interplanetary magnetic flux ropes (SMFRs) are similar to ICMEs
in magnetic structure, but are smaller and do not exhibit ICME plasma
signatures. We present a computationally efficient and GPU-powered version of
the single-spacecraft automated SMFR detection algorithm based on the
Grad-Shafranov (GS) technique. Our algorithm is capable of processing higher
resolution data, eliminates selection bias caused by a fixed ⟨B⟩
threshold, has improved detection criteria demonstrated to have better results
on an MHD simulation, and recovers full 2.5D cross sections using GS
reconstruction. We used it to detect 512,152 SMFRs from 27 years (1996 to 2022)
of 3-second cadence Wind measurements. Our novel findings are: (1) the
radial density of SMFRs at 1 au ( per 10^6 km) and
filling factor (35%) are independent of solar activity, distance to
the heliospheric current sheet (HCS), and solar wind plasma type, although the
minority of SMFRs with diameters greater than 0.01 au have a strong
solar activity dependence; (2) SMFR diameters follow a log-normal distribution
that peaks below the resolved range ( km), although the filling
factor is dominated by SMFRs between to km; (3) most SMFRs at 1
au have strong field-aligned flows like those from PSP measurements; (4) in
terms of diameter , SMFR poloidal flux , axial flux
, average twist number , current density
, and helicity . Implications for the origin
of SMFRs and switchbacks are briefly discussed. The new algorithm and SMFR
dataset are made freely available.
A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data
Solar activity is usually caused by the evolution of solar magnetic fields.
Magnetic field parameters derived from photospheric vector magnetograms of
solar active regions have been used to analyze and forecast eruptive events
such as solar flares and coronal mass ejections. Unfortunately, the most recent
solar cycle 24 was relatively weak with few large flares, though it is the only
solar cycle in which consistent time-sequence vector magnetograms have been
available through the Helioseismic and Magnetic Imager (HMI) on board the Solar
Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look
into another major instrument, namely the Michelson Doppler Imager (MDI) on
board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data
archive of SOHO/MDI covers more active solar cycle 23 with many large flares.
However, SOHO/MDI data only has line-of-sight (LOS) magnetograms. We propose a
new deep learning method, named MagNet, to learn from combined LOS
magnetograms, Bx and By taken by SDO/HMI along with H-alpha observations
collected by the Big Bear Solar Observatory (BBSO), and to generate vector
components Bx' and By', which would form vector magnetograms with observed LOS
data. In this way, we can expand the availability of vector magnetograms to the
period from 1996 to present. Experimental results demonstrate the good
performance of the proposed method. To our knowledge, this is the first time
that deep learning has been used to generate photospheric vector magnetograms
of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data. Comment: 15 pages, 6 figures