IsoTree: A New Framework for De novo Transcriptome Assembly from RNA-seq Reads
High-throughput sequencing of mRNA has made deep and efficient probing of the transcriptome more affordable. However, the vast number of short RNA-seq reads makes de novo transcriptome assembly an algorithmic challenge. In this work, we present IsoTree, a novel framework for transcript reconstruction in the absence of a reference genome. Unlike most de novo assembly methods, which build a de Bruijn graph or splicing graph by connecting k-mers (sets of overlapping substrings generated from reads), IsoTree constructs a splicing graph by connecting reads directly. For each splicing graph, IsoTree applies an iterative mixed integer linear programming scheme to build a prefix tree, called an isoform tree. Each path from the root of the isoform tree to a leaf represents a plausible transcript candidate, which is then pruned based on paired-end read information. Experiments showed that in most cases IsoTree performs better than other leading transcriptome assembly programs. IsoTree is available at https://github.com/Jane110111107/IsoTree
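To make the isoform-tree idea concrete, here is a minimal, hypothetical sketch of a prefix tree whose root-to-leaf paths are read off as candidate transcripts; the class and function names are illustrative, and this is not IsoTree's actual implementation (which builds the tree via an iterative mixed integer linear program and prunes candidates using paired-end reads).

```python
# Illustrative sketch (not IsoTree's implementation): a tiny prefix tree whose
# root-to-leaf paths are treated as candidate transcripts. Names are hypothetical.

class IsoformNode:
    def __init__(self, fragment):
        self.fragment = fragment      # sequence fragment contributed by this node
        self.children = []            # alternative extensions (splicing choices)

def enumerate_candidates(node, prefix=""):
    """Yield one candidate transcript per root-to-leaf path."""
    seq = prefix + node.fragment
    if not node.children:             # leaf: a complete candidate transcript
        yield seq
        return
    for child in node.children:
        yield from enumerate_candidates(child, seq)

# Toy example: a shared prefix followed by two alternative 3' ends.
root = IsoformNode("ATGGCT")
root.children = [IsoformNode("GAAATTTAG"), IsoformNode("CCGGTGA")]

for transcript in enumerate_candidates(root):
    print(transcript)   # two plausible isoforms sharing the same prefix
```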
Super-Resolution of SOHO/MDI Magnetograms of Solar Active Regions Using SDO/HMI Data and an Attention-Aided Convolutional Neural Network
Image super-resolution has been an important subject in image processing and
recognition. Here, we present an attention-aided convolutional neural network
(CNN) for solar image super-resolution. Our method, named SolarCNN, aims to
enhance the quality of line-of-sight (LOS) magnetograms of solar active regions
(ARs) collected by the Michelson Doppler Imager (MDI) on board the Solar and
Heliospheric Observatory (SOHO). The ground-truth labels used for training
SolarCNN are the LOS magnetograms collected by the Helioseismic and Magnetic
Imager (HMI) on board the Solar Dynamics Observatory (SDO). Solar ARs consist
of strong magnetic fields in which magnetic energy can suddenly be released to
produce extreme space weather events, such as solar flares, coronal mass
ejections, and solar energetic particles. SOHO/MDI covers Solar Cycle 23, which
was stronger, with more eruptive events, than Cycle 24. Enhanced SOHO/MDI
magnetograms therefore allow for better understanding and forecasting of violent
space weather events. Experimental results show that SolarCNN improves the quality
of SOHO/MDI magnetograms in terms of the structural similarity index measure
(SSIM), Pearson's correlation coefficient (PCC), and the peak signal-to-noise
ratio (PSNR).
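For reference, the three evaluation metrics named in the abstract (SSIM, PCC, PSNR) can be computed with standard Python libraries; the sketch below uses toy arrays and an assumed data range, not the paper's data or evaluation code.

```python
# Minimal sketch of the three metrics above, with placeholder arrays standing
# in for magnetograms and an assumed data range (not the paper's settings).
import numpy as np
from scipy.stats import pearsonr
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate_magnetogram(enhanced, ground_truth, data_range=6000.0):
    """Compare an enhanced MDI magnetogram against its HMI ground truth (Gauss)."""
    ssim = structural_similarity(ground_truth, enhanced, data_range=data_range)
    pcc, _ = pearsonr(ground_truth.ravel(), enhanced.ravel())
    psnr = peak_signal_noise_ratio(ground_truth, enhanced, data_range=data_range)
    return {"SSIM": ssim, "PCC": pcc, "PSNR": psnr}

# Toy usage with random arrays standing in for magnetograms.
rng = np.random.default_rng(0)
truth = rng.normal(0.0, 200.0, size=(128, 128))
pred = truth + rng.normal(0.0, 20.0, size=(128, 128))
print(evaluate_magnetogram(pred, truth))
```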
Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks
Obtaining high-quality magnetic and velocity fields through Stokes inversion
is crucial in solar physics. In this paper, we present a new deep learning
method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight
(LOS) velocities and Doppler widths from Stokes profiles collected by the Near
InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope
(GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is
prepared by a Milne-Eddington (ME) inversion code used by BBSO. We
quantitatively assess SDNN, comparing its inversion results with those obtained
by the ME inversion code and related machine learning (ML) algorithms such as
multiple support vector regression, multilayer perceptrons and a pixel-level
convolutional neural network. Major findings from our experimental study are
summarized as follows. First, the SDNN-inferred LOS velocities are highly
correlated with the ME-calculated ones, with the Pearson product-moment
correlation coefficient being close to 0.9 on average. Second, SDNN is faster
than the ME inversion code while producing smoother and cleaner LOS velocity
and Doppler width maps. Third, the maps produced by SDNN are closer to ME's maps
than those from the related ML algorithms, demonstrating SDNN's better learning
capability. Finally, a comparison between the
inversion results of ME and SDNN based on GST/NIRIS and those from the
Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in
flare-prolific active region NOAA 12673 is presented. We also discuss
extensions of SDNN for inferring vector magnetic fields with empirical
evaluation.
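As a rough illustration of what a stacked, per-pixel Stokes-inversion regressor can look like, here is a minimal PyTorch sketch; the layer sizes, the simple two-stage stacking, and the wavelength count are assumptions for illustration, not the SDNN architecture reported in the paper.

```python
# Illustrative per-pixel regressor from a Stokes profile (I, Q, U, V sampled at
# n_lambda wavelengths) to LOS velocity and Doppler width. Architecture details
# are assumptions, not the paper's SDNN.
import torch
import torch.nn as nn

n_lambda = 60                      # assumed number of spectral samples per Stokes parameter

def mlp(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, 256), nn.ReLU(),
        nn.Linear(256, 128), nn.ReLU(),
        nn.Linear(128, out_dim),
    )

class StackedRegressor(nn.Module):
    """Stage 1 produces an intermediate estimate; stage 2 refines it
    using both the raw profile and the stage-1 output."""
    def __init__(self):
        super().__init__()
        self.stage1 = mlp(4 * n_lambda, 2)
        self.stage2 = mlp(4 * n_lambda + 2, 2)

    def forward(self, stokes):        # stokes: (batch, 4 * n_lambda)
        first = self.stage1(stokes)
        return self.stage2(torch.cat([stokes, first], dim=1))

model = StackedRegressor()
profiles = torch.randn(8, 4 * n_lambda)   # toy batch of Stokes profiles
velocity_width = model(profiles)          # (8, 2): LOS velocity, Doppler width
```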
A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data
Solar activity is usually caused by the evolution of solar magnetic fields.
Magnetic field parameters derived from photospheric vector magnetograms of
solar active regions have been used to analyze and forecast eruptive events
such as solar flares and coronal mass ejections. Unfortunately, the most recent
Solar Cycle 24 was relatively weak, with few large flares, though it is the only
solar cycle in which consistent time-sequence vector magnetograms have been
available through the Helioseismic and Magnetic Imager (HMI) on board the Solar
Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look
into another major instrument, namely the Michelson Doppler Imager (MDI) on
board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data
archive of SOHO/MDI covers the more active Solar Cycle 23 with many large flares.
However, SOHO/MDI provides only line-of-sight (LOS) magnetograms. We propose a
new deep learning method, named MagNet, to learn from combined LOS
magnetograms, Bx, and By taken by SDO/HMI, along with H-alpha observations
collected by the Big Bear Solar Observatory (BBSO), and to generate vector
components Bx' and By', which together with the observed LOS data form vector
magnetograms. In this way, we can expand the availability of vector magnetograms to the
period from 1996 to present. Experimental results demonstrate the good
performance of the proposed method. To our knowledge, this is the first time
that deep learning has been used to generate photospheric vector magnetograms
of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data.
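A minimal sketch of the image-to-image idea, assuming a 2-channel input (LOS magnetogram plus H-alpha image) and a 2-channel output (Bx', By'); the small convolutional network below is an illustrative stand-in, not the MagNet architecture from the paper.

```python
# Illustrative image-to-image mapping: 2-channel input (LOS, H-alpha) ->
# 2-channel output (generated Bx', By'). Not the MagNet architecture.
import torch
import torch.nn as nn

class VectorComponentGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, kernel_size=3, padding=1),   # Bx', By'
        )

    def forward(self, x):          # x: (batch, 2, H, W) = [LOS, H-alpha]
        return self.net(x)

model = VectorComponentGenerator()
sample = torch.randn(1, 2, 256, 256)      # toy LOS + H-alpha pair
bx_by = model(sample)                     # (1, 2, 256, 256)
```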
Comparing Phylogenetic and Deep Learning Methods to Predict Seed Dispersal Mode
Increasing tree cover is a promising natural climate solution for reducing atmospheric carbon under pressing global warming. Seed dispersal is a key process in natural forest regrowth, in which seeds are moved away from parent plants to establish new growth. Dispersal modes include biotic and abiotic mechanisms and vary depending on traits such as seed shape, size, and color. However, globally, data on the seed dispersal modes of plant species are limited, hindering our understanding of the importance of wild animals in increasing tree cover and their role in carbon sequestration. The goal of this study is to find a method to predict unknown seed dispersal modes with high accuracy by comparing a novel deep learning method with a typical phylogenetic imputation method. Here we show that the phylogenetic imputation method performed better than the deep learning methods in predicting biotic seed dispersal mode. However, we also found that the deep learning methods demonstrate great potential in learning from community science photographs, despite their underperformance in this study. Furthermore, the study shows that incorporating a feature-extraction model could improve the predictions of a single CNN model, highlighting the potential for future studies to include more models for better predictions of seed dispersal modes. We anticipate that the problems and potential improvements identified in this study relating to the deep learning method will serve as a starting point for further model development to predict the seed dispersal mode of unknown species with greater accuracy. This could involve applying multiple models, incorporating phylogenetic information into deep learning models, and including additional features. Accurately understanding how different plant species are dispersed can help scientists better predict future forest dynamics and carbon storage capacity, which is critical for studying future climate change and developing effective mitigation strategies.
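The feature-extraction approach mentioned above can be sketched as a frozen pretrained CNN backbone feeding a small classifier; the backbone choice, class labels, and shapes below are illustrative assumptions, not the models used in the study (requires torchvision >= 0.13 for the weights API; pretrained weights are downloaded on first use).

```python
# Illustrative transfer-learning setup: a frozen pretrained backbone extracts
# image features from photographs, and a small linear head predicts the
# dispersal mode (biotic vs. abiotic). Details are assumptions, not the study's models.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()          # keep the 512-dim feature vector
for p in backbone.parameters():
    p.requires_grad = False          # freeze the feature extractor

classifier = nn.Linear(512, 2)       # 2 classes: biotic, abiotic

images = torch.randn(4, 3, 224, 224)      # toy batch of photographs
with torch.no_grad():
    features = backbone(images)           # (4, 512)
logits = classifier(features)             # (4, 2) dispersal-mode scores
```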
A critical review on BDE-209: Source, distribution, influencing factors, toxicity, and degradation
As the most widely used polybrominated diphenyl ether, BDE-209 is commonly found in polymer-based commercial and household products. Due to its unique physicochemical properties, BDE-209 is ubiquitous in a variety of environmental compartments, and organisms can be exposed to it in various ways, causing toxic effects. The present review outlines the current state of knowledge on the occurrence of BDE-209 in the environment, its influencing factors, toxicity, and degradation. BDE-209 has been detected in various environmental matrices including air, soil, water, and sediment. Additionally, environmental factors affecting BDE-209, such as organic matter, total suspended particulates, hydrodynamics, wind, and temperature, are specifically discussed. Toxicity studies suggest that BDE-209 may cause systemic toxic effects in living organisms, including reproductive toxicity, embryo-fetal toxicity, genetic toxicity, endocrine toxicity, neurotoxicity, immunotoxicity, and developmental toxicity, and may even be carcinogenic. BDE-209 exerts its toxic effects mainly through epigenetic regulation and the induction of oxidative stress. Evidence regarding the degradation of BDE-209, including biodegradation, photodegradation, Fenton degradation, zero-valent iron degradation, chemical oxidative degradation, and microwave radiation degradation, is summarized. This review may contribute to assessing the environmental risks of BDE-209 and help develop rational management plans.
Nonlinear Landau-Zener tunneling in Majorana’s stellar representation
By representing the evolution of a quantum state with the trajectories of stars on a
Bloch sphere, the Majorana stellar representation provides an intuitive way to
understand quantum motion in a high-dimensional projective Hilbert space. In this work we
show that the Majorana representation offers a very interesting and intuitive way to
understand nonlinear Landau-Zener tunneling. In particular, the breakdown of
adiabaticity in this tunneling phenomenon can be understood as some of the stars never
reaching the south pole. We also establish a connection between the Majorana stars in the
second quantized model and the single star in the mean-field model by using the reduced
density matrix.
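For readers unfamiliar with the stellar representation itself, the sketch below computes Majorana stars numerically under one common convention (a spin-j state maps to the 2j roots of its Majorana polynomial, stereographically projected onto the Bloch sphere, with roots at infinity placed at the south pole); sign and projection conventions vary in the literature, so this is illustrative rather than the paper's definition.

```python
# Numerical sketch of Majorana stars under one common convention (illustrative).
import numpy as np
from math import comb

def majorana_stars(c):
    """c: amplitudes (c_{-j}, ..., c_{j}) of a spin-j state; returns (theta, phi) per star."""
    n = len(c) - 1                                   # n = 2j
    # Coefficients, highest degree first, of
    # P(z) = sum_k (-1)^k sqrt(C(n, k)) c_{j-k} z^{n-k}.
    coeffs = [(-1) ** k * np.sqrt(comb(n, k)) * c[n - k] for k in range(n + 1)]
    roots = np.roots(coeffs)                         # numpy drops roots at infinity
    theta = 2.0 * np.arctan(np.abs(roots))           # stereographic projection
    phi = np.angle(roots)
    # Degree deficit -> stars at the south pole (theta = pi).
    missing = n - len(roots)
    theta = np.concatenate([theta, np.full(missing, np.pi)])
    phi = np.concatenate([phi, np.zeros(missing)])
    return theta, phi

# Toy check: the spin-1 state |1, 1> (c = [0, 0, 1]) gives both stars at the
# north pole (theta = 0) in this convention.
print(majorana_stars(np.array([0.0, 0.0, 1.0])))
```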
Time trends and future prediction of coal worker’s pneumoconiosis in opencast coal mine in China based on the APC model
Background: Opencast coal mines differ from underground mines in how the ore body is accessed, how production is organized, transport technology, and other aspects. This study aimed to describe the prevalence of coal worker's pneumoconiosis (CWP) among ex-dust miners in opencast coal mines and to estimate the future incidence trend of CWP with an APC model. Methods: All opencast miners who had been exposed to dust for at least 1 year in opencast mines were enrolled in this study. The database included demographic details, occupational history records with the date of dust exposure, physical examination records, and pneumoconiosis diagnosis records. An age-period-cohort (APC) model was applied to explore the effects of age, period, and cohort on the prevalence of CWP among ex-dust opencast miners. Results: 8191 opencast miners were enrolled in the study, including 259 miners with CWP and 7932 miners without CWP. The incidence density of CWP is projected to increase in opencast mines from 2005 to 2024. The number of possible CWP patients predicted in this period is approximately 492: 275 miners could have developed CWP in 2005–2014 and 217 miners are expected to develop CWP in 2015–2024 among the ex-dust opencast miners. Conclusions: The APC model showed a good fit for predicting the incidence trend of CWP in opencast coal mines. With this model, we predicted that 492 opencast miners could be diagnosed with CWP from 2005 to 2024. Therefore, ex-dust opencast miners cannot be ignored, and they should receive regular physical examinations and screening for CWP.
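A hedged sketch of fitting an age-period-cohort Poisson model is given below; the toy data, grouping, and column names are invented, not the study's dataset or code. Note the classic APC identification problem (cohort = period − age): individual effects are only identified up to a linear trend, although fitted rates remain well defined.

```python
# Illustrative APC Poisson model of CWP counts with person-years as exposure.
# Toy data only; column names and groupings are assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
ages = [30, 40, 50, 60]
periods = [1995, 2005, 2015]
rows = []
for a in ages:
    for p in periods:
        py = rng.integers(500, 2000)              # person-years at risk (toy)
        rate = 0.001 * np.exp(0.04 * (a - 30))    # toy baseline rate increasing with age
        rows.append({"age": a, "period": p, "cohort": p - a,
                     "cases": rng.poisson(rate * py), "person_years": py})
df = pd.DataFrame(rows)

model = smf.glm("cases ~ C(age) + C(period) + C(cohort)",
                data=df,
                family=sm.families.Poisson(),
                exposure=df["person_years"]).fit()

df["fitted_rate"] = model.fittedvalues / df["person_years"]
print(df[["age", "period", "cohort", "cases", "fitted_rate"]])
```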
Quality Assessment of Sea Surface Salinity from Multiple Ocean Reanalysis Products
Sea surface salinity (SSS) is one of the Essential Climate Variables (ECVs) defined by the Global Climate Observing System (GCOS). Acquiring high-quality SSS datasets with high spatial-temporal resolution is crucial for research on the hydrological cycle and the Earth's climate. This study assessed the quality of SSS data provided by five high-resolution ocean reanalysis products: the Hybrid Coordinate Ocean Model (HYCOM) 1/12° global reanalysis, the Copernicus Global 1/12° Oceanic and Sea Ice GLORYS12 Reanalysis, the Simple Ocean Data Assimilation (SODA) reanalysis, the ECMWF Oceanic Reanalysis System 5 (ORAS5) product, and the Estimating the Circulation and Climate of the Ocean Phase II (ECCO2) reanalysis. A regional comparison in the Mediterranean Sea shows that the reanalysis products largely depict the accurate spatial SSS structure away from river mouths and coastal areas but slightly underestimate the mean SSS values. Better SSS reanalysis performance is found in the Levantine Sea, while larger SSS uncertainties are found in the Adriatic Sea and the Aegean Sea. The global comparison with CMEMS level-4 (L4) SSS shows generally consistent large-scale structures. The mean ΔSSS between monthly gridded reanalysis data and in situ analyzed data is −0.1 PSU in the open seas between 40° S and 40° N, with the mean Root Mean Square Deviation (RMSD) generally smaller than 0.3 PSU and the majority of correlation coefficients higher than 0.5. A comparison with collocated buoy salinity shows that the reanalysis products capture the SSS variations at the locations of the tropical moored buoy arrays well at the weekly scale. Among the five products, HYCOM reanalysis SSS has the highest data quality in marginal seas, while GLORYS12 performs best in the global ocean, especially in tropical regions. Comparatively, ECCO2 shows the overall worst performance in reproducing SSS states and variations, with the largest discrepancies from CMEMS L4 SSS.
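The point-wise comparison statistics used above (mean ΔSSS, RMSD, and correlation) can be sketched as follows; the arrays are placeholders on a common grid, not the actual reanalysis or CMEMS data.

```python
# Mean bias, RMSD, and Pearson correlation between a reanalysis SSS field and a
# reference L4 analysis on the same grid. Placeholder data only.
import numpy as np
from scipy.stats import pearsonr

def sss_stats(reanalysis, reference):
    """Both inputs: (time, lat, lon) arrays in PSU on a common grid; NaNs over land."""
    valid = ~np.isnan(reanalysis) & ~np.isnan(reference)
    diff = reanalysis[valid] - reference[valid]
    bias = diff.mean()                              # mean delta-SSS
    rmsd = np.sqrt((diff ** 2).mean())              # root mean square deviation
    corr, _ = pearsonr(reanalysis[valid], reference[valid])
    return bias, rmsd, corr

rng = np.random.default_rng(2)
ref = 35.0 + rng.normal(0.0, 0.5, size=(12, 30, 60))      # toy monthly SSS
rea = ref - 0.1 + rng.normal(0.0, 0.2, size=ref.shape)    # slightly fresh-biased copy
print(sss_stats(rea, ref))
```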
Enabling High-Quality Machine Learning Model Trading on Blockchain-Based Marketplace
Machine learning model sharing markets have emerged as popular platforms for individuals and companies to share and access machine learning models. These markets enable more people to benefit from the field of artificial intelligence and to leverage its advantages on a broader scale. However, these markets face challenges in designing effective incentives for model owners to share their models and for model users to provide honest feedback on model quality. This paper proposes a novel game-theoretic framework for machine learning model sharing markets that addresses these challenges. Our framework includes two main components: a mechanism for incentivizing model owners to share their models, and a mechanism for encouraging honest evaluation of model quality by model users. To evaluate the effectiveness of our framework, we conducted experiments; the results demonstrate that our mechanism for incentivizing model owners is effective at encouraging high-quality model sharing, and that our reputation system encourages honest evaluation of model quality.
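As a purely illustrative sketch (not the paper's mechanism), one simple way a reputation system can reward honest evaluation is to update a reviewer's reputation according to how closely each rating agrees with the consensus rating of the same model; all names and parameters below are invented.

```python
# Toy reputation update: move reputation toward 1 when a rating matches the
# consensus, toward 0 otherwise. Invented for illustration only.
def update_reputation(reputation, rating, consensus, learning_rate=0.1):
    """Ratings and consensus are assumed to lie in [0, 1]."""
    agreement = 1.0 - abs(rating - consensus)
    return (1.0 - learning_rate) * reputation + learning_rate * agreement

rep = 0.5                                           # neutral starting reputation
for rating, consensus in [(0.9, 0.85), (0.2, 0.8), (0.8, 0.75)]:
    rep = update_reputation(rep, rating, consensus)
    print(round(rep, 3))
```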