24 research outputs found

    Illustration of GPU parallel computing in FamSeq.

The program can be divided into two parts: a serial part and a parallel part. The serial part is processed on the CPU and the parallel part on the GPU. The program proceeds as follows: 1. prepare the data for parallel computing on the CPU; 2. copy the data from CPU memory to GPU memory; 3. compute the 3^n jobs in parallel on the GPU, where n is the pedigree size; 4. copy the results from GPU memory back to CPU memory; and 5. summarize the results on the CPU.
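
A minimal sketch of this five-step host/device pattern, using CuPy rather than FamSeq's actual CUDA code; the pedigree size and the per-configuration computation below are placeholders.

```python
# Sketch of the serial/parallel split described above (not FamSeq's CUDA code).
import numpy as np
import cupy as cp

n = 8                                   # pedigree size (assumed for illustration)
num_configs = 3 ** n                    # one job per joint genotype configuration

# 1. Prepare the data for parallel computing on the CPU (placeholder values).
host_data = np.random.rand(num_configs).astype(np.float32)

# 2. Copy the data from CPU memory to GPU memory.
device_data = cp.asarray(host_data)

# 3. Run the 3^n jobs in parallel on the GPU (placeholder per-job computation).
device_scores = cp.log(device_data + 1e-12)

# 4. Copy the results from GPU memory back to CPU memory.
host_scores = cp.asnumpy(device_scores)

# 5. Summarize the results on the CPU.
print("best-scoring configuration:", int(host_scores.argmax()))
```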

    FamSeq: A Variant Calling Program for Family-Based Sequencing Data Using Graphics Processing Units

Various algorithms have been developed for variant calling using next-generation sequencing data, and various methods have been applied to reduce the associated false positive and false negative rates. Few variant calling programs, however, utilize the pedigree information when family-based sequencing data are available. Here, we present a program, FamSeq, which reduces both false positive and false negative rates by incorporating the pedigree information from the Mendelian genetic model into variant calling. To accommodate variations in data complexity, FamSeq consists of four distinct implementations of the Mendelian genetic model: the Bayesian network algorithm, a graphics processing unit (GPU) version of the Bayesian network algorithm, the Elston-Stewart algorithm, and the Markov chain Monte Carlo algorithm. To make the software efficient and applicable to large families, we parallelized the Bayesian network algorithm, which handles pedigrees with inbreeding loops without losing calculation precision, on an NVIDIA graphics processing unit. To compare the four methods, we applied FamSeq to pedigree sequencing data with family sizes that varied from 7 to 12. When there is no inbreeding loop in the pedigree, the Elston-Stewart algorithm gives analytical results in a short time. If there are inbreeding loops in the pedigree, we recommend the Bayesian network method, which provides exact answers. To improve the computing speed of the Bayesian network method, we parallelized the computation on a graphics processing unit. This allowed the Bayesian network method to process whole-genome sequencing data of a family of 12 individuals within two days, a 10-fold time reduction compared to the same computation on a central processing unit.
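
To make the role of the pedigree concrete, the sketch below computes a child's posterior genotype probabilities for a single parent-offspring trio by combining per-individual genotype likelihoods with Mendelian transmission probabilities. It illustrates the underlying Bayesian idea only, not FamSeq's implementation; the allele frequency and likelihood values are assumed.

```python
# Trio-level illustration of pedigree-aware genotype calling (not FamSeq itself).
# Genotypes are coded as the number of alternate alleles (0, 1, 2).
import itertools

ALT_FREQ = 0.01                                 # assumed population alt-allele frequency

def genotype_prior(g, p=ALT_FREQ):
    """Hardy-Weinberg prior for a founder genotype."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2][g]

def transmit_alt_prob(g):
    """Probability that a parent with genotype g transmits the alternate allele."""
    return g / 2.0

def mendelian_prob(child, mother, father):
    """P(child genotype | parental genotypes) under Mendelian inheritance."""
    pm, pf = transmit_alt_prob(mother), transmit_alt_prob(father)
    return {0: (1 - pm) * (1 - pf),
            1: pm * (1 - pf) + (1 - pm) * pf,
            2: pm * pf}[child]

# Assumed per-individual genotype likelihoods P(data_i | genotype_i = 0, 1, 2).
lik = {"mother": [0.70, 0.25, 0.05],
       "father": [0.60, 0.30, 0.10],
       "child":  [0.10, 0.45, 0.45]}

# Joint posterior over all 3^3 configurations, then the child's marginal posterior.
post = {}
for gm, gf, gc in itertools.product(range(3), repeat=3):
    post[(gm, gf, gc)] = (genotype_prior(gm) * genotype_prior(gf)
                          * mendelian_prob(gc, gm, gf)
                          * lik["mother"][gm] * lik["father"][gf] * lik["child"][gc])
total = sum(post.values())
child_marginal = [sum(v for (gm, gf, gc), v in post.items() if gc == g) / total
                  for g in range(3)]
print("P(child genotype | family data):", child_marginal)
```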

    The total time (in seconds) needed for computation using FamSeq at one million positions.

PU: processing unit; E-S: Elston-Stewart algorithm; MCMC: Markov chain Monte Carlo algorithm; BN: Bayesian network algorithm; N: no, inbreeding loops are not considered; Y: yes, inbreeding loops are considered. (a) Because of the excessive running time of the MCMC algorithm, we called only 100,000 variants; the time shown is 10× the time required to call 100,000 variants. (b) The time in parentheses is the GPU computing time.

    Illustration of input files.

A) Pedigree structure. B) Pedigree structure file storing the pedigree structure shown in Fig. 2A. From the left-most column to the right-most column, the fields are ID, mID (mother ID), fID (father ID), gender, and sample name. C) Part of a VCF file. The VCF file shows that the genome of the grandfather (G-Father) was not sequenced. We nevertheless add his information to the pedigree structure file to avoid ambiguity: for example, if we included only one parent of two siblings in the pedigree structure file, it would be unclear whether they are full or half siblings. The sample name in the pedigree structure file should be the same as the sample name in the VCF file. When an individual's genome was not sequenced, we set the corresponding sample name to NA in the pedigree structure file.
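
For illustration, a minimal reader for a pedigree structure file laid out as described above might look like the sketch below; it assumes whitespace-delimited columns and is not FamSeq's actual parser.

```python
# Sketch of reading a pedigree structure file with columns:
# ID, mID (mother ID), fID (father ID), gender, sample name.
from dataclasses import dataclass

@dataclass
class Individual:
    iid: str        # individual ID
    mother: str     # mID
    father: str     # fID
    gender: str
    sample: str     # sample name; "NA" if the genome was not sequenced

def read_pedigree(path):
    pedigree = {}
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) != 5:
                continue                # skip blank or malformed lines
            ind = Individual(*fields)
            pedigree[ind.iid] = ind
    return pedigree

# Unsequenced individuals (e.g. the grandfather) stay in the pedigree so that
# relationships remain unambiguous:
# ped = read_pedigree("pedigree.txt")
# unsequenced = [i.iid for i in ped.values() if i.sample == "NA"]
```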

Conserved Noncoding Sequences Regulate lhx5 Expression in the Zebrafish Forebrain

The LIM homeobox family protein Lhx5 plays important roles in forebrain development in vertebrates. The lhx5 gene exhibits complex temporal and spatial expression patterns during early development, but its transcriptional regulation mechanisms are not well understood. Here, we used transgenesis in zebrafish to define regulatory elements that drive lhx5 expression in the forebrain. Through comparative genomic analysis we identified 10 non-coding sequences conserved in five teleost species. We next examined the enhancer activities of these conserved non-coding sequences with Tol2 transposon-mediated transgenesis. We found that a proximally located enhancer gave rise to robust reporter EGFP expression in the forebrain regions. In addition, we identified an enhancer located approximately 50 kb upstream of the lhx5 coding region that is responsible for reporter gene expression in the hypothalamus, and an enhancer located approximately 40 kb upstream of the lhx5 coding region that is required for expression in the prethalamus (ventral thalamus). Together, our results suggest that discrete enhancer elements control lhx5 expression in different regions of the forebrain.

    Workflow of FamSeq.

We use a pedigree file and a file that includes the likelihood of the observed data given each genotype as the input to estimate, for each variant, the posterior probability of each genotype given the data. (E-S: Elston-Stewart algorithm; BN: Bayesian network method; BN-GPU: GPU version of the Bayesian network method, which requires a GPU card installed in the computer; MCMC: Markov chain Monte Carlo method; VCF: variant call format.)
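
As a small worked example of how a likelihood becomes a posterior for a single individual, before any pedigree information is used: phred-scaled PL values, such as those stored in a VCF record, can be converted to likelihoods and normalized. The PL values and flat prior below are assumptions for illustration only.

```python
# Single-sample posterior from phred-scaled genotype likelihoods (illustrative values).
pl = [0, 30, 200]                             # assumed PL for genotypes RR, RA, AA
lik = [10 ** (-x / 10.0) for x in pl]         # likelihood of the data given each genotype
prior = [1 / 3.0] * 3                         # flat prior, for illustration only
unnorm = [l * p for l, p in zip(lik, prior)]
posterior = [u / sum(unnorm) for u in unnorm]
print("posterior over RR, RA, AA:", posterior)
```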

CNS2 contains hypothalamic enhancer activity and responds to FGF signaling.

(A-B) Double in situ hybridization results indicate that CNS2 contains hypothalamic enhancer activity. The hypothalamic markers nkx2.1a and nkx2.2b are stained in dark blue, and the reporter egfp is stained in red. (C-D) SU5402 treatment severely reduces CNS2 activity. Vehicle (DMSO)-treated embryos show restricted hypothalamic EGFP reporter expression (arrow in C). Embryos treated with the FGF signaling inhibitor SU5402 during the segmentation stage (10-24 hpf) show minimal EGFP signal in the hypothalamic region (arrow in D, n = 48/55). (E-F) SU5402 treatment down-regulates endogenous lhx5 expression in the hypothalamic region. Endogenous lhx5 shows robust expression in the hypothalamic region (arrow in E). SU5402 treatment during the segmentation stage down-regulates lhx5 expression in the hypothalamic region (arrow in F, n = 25/28). (G) Multiple sequence alignment of the CNS2 region in the five teleost species. The binding site for the FGF downstream factor Pea3 is highlighted in blue. Lateral views of the forebrain regions of embryos at 24 hpf (A-F), anterior to the left. Scale bars: 40 μm in A-B; 50 μm in C-D.

    Region specific enhancer activity of the identified CNSs.

(A-B) CNS8 and CNS9, located in the vicinity of the lhx5 promoter region, give rise to broad reporter EGFP expression in the forebrain regions. (C) CNS2, located approximately 50 kb upstream of the lhx5 coding region, gives rise to restricted EGFP signal in the anterior ventral forebrain. (D) CNS4, located approximately 40 kb upstream of the lhx5 coding region, gives rise to restricted EGFP expression in the diencephalic region. (E) The vector construct gives rise to basal, non-tissue-specific EGFP expression in the transient expression assay. Lateral views of the forebrain regions of embryos at 24 hpf, anterior to the left. Scale bar: 50 μm.

    A self-supervised learning-based 6-DOF grasp planning method for manipulator

To realize a robust robotic grasping system for unknown objects in an unstructured environment, large amounts of grasp data and 3D model data for the objects are required, and the sizes of these data sets directly affect the rate of successful grasps. To reduce the time cost of data acquisition and labeling and to increase the rate of successful grasps, we developed a self-supervised learning mechanism to control grasp tasks performed by manipulators. First, a manipulator automatically collects point clouds of the objects from multiple perspectives to increase the efficiency of data acquisition. The complete point cloud of the objects is obtained using the manipulator's hand-eye vision and the truncated signed distance function (TSDF) algorithm. Then, the point cloud data for the objects are used to generate a series of six-degrees-of-freedom grasp poses, and a force-closure decision algorithm is used to add a grasp-quality label to each grasp pose, realizing automatic labeling of the grasp data. Finally, the point cloud in the gripper closing area corresponding to each grasp pose is obtained and used to train the grasp-quality classification model for the manipulator. Data acquisition experiments demonstrate that the proposed method yields high-quality data, simulation results confirm the effectiveness of the proposed grasp-data acquisition method, and actual grasping experiments demonstrate that the proposed self-supervised learning method can increase the rate of successful grasps for the manipulator.
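
To make the grasp-quality labeling step concrete, the sketch below shows a simplified two-contact (antipodal) force-closure test for a parallel-jaw gripper. It is a generic check, not the paper's exact force-closure decision algorithm; the contact points, normals, and friction coefficient are assumed.

```python
# Simplified antipodal force-closure test for a parallel-jaw gripper.
import numpy as np

def antipodal_force_closure(p1, n1, p2, n2, mu=0.5):
    """True if two contacts (points p*, inward surface normals n*) are in
    force closure under Coulomb friction with coefficient mu, i.e. the line
    joining the contacts lies inside both friction cones."""
    axis = p2 - p1
    axis = axis / np.linalg.norm(axis)
    half_angle = np.arctan(mu)                          # friction cone half-angle
    a1 = np.arccos(np.clip(np.dot(axis, n1 / np.linalg.norm(n1)), -1.0, 1.0))
    a2 = np.arccos(np.clip(np.dot(-axis, n2 / np.linalg.norm(n2)), -1.0, 1.0))
    return a1 <= half_angle and a2 <= half_angle

# Assumed contacts on opposite faces of a small box (2 cm apart):
p1, n1 = np.array([0.0, -0.01, 0.0]), np.array([0.0, 1.0, 0.0])
p2, n2 = np.array([0.0, 0.01, 0.0]), np.array([0.0, -1.0, 0.0])
print("force closure:", antipodal_force_closure(p1, n1, p2, n2, mu=0.5))
```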