38 research outputs found

DockerBIO: web application for efficient use of bioinformatics Docker images

Author: ChangHyuk Kwon
Jaegyoon Ahn
Jason Kim
Publication venue: 'PeerJ'
Publication date: 01/11/2018

Background and Objective Docker is a light containerization program that shows almost the same performance as a local environment. Recently, many bioinformatics tools have been distributed as Docker images that include complex settings such as libraries, configurations, and data if needed, as well as the actual tools. Users can simply download and run them without making the effort to compile and configure them, and can obtain reproducible results. In spite of these advantages, several problems remain. First, there is a lack of clear standards for distribution of Docker images, and the Docker Hub often provides multiple images with the same objective but different uses. For these reasons, it can be difficult for users to learn how to select and use them. Second, Docker images are often not suitable as a component of a pipeline, because many of them include big data. Moreover, a group of users can have difficulties when sharing a pipeline composed of Docker images. Users of a group may modify scripts or use different versions of the data, which causes inconsistent results. Methods and Results To handle the problems described above, we developed a Java web application, DockerBIO, which provides reliable, verified, light-weight Docker images for various bioinformatics tools and for various kinds of reference data. With DockerBIO, users can easily build a pipeline with tools and data registered at DockerBIO, and if necessary, users can easily register new tools or data. Built pipelines are registered in DockerBIO, which provides an efficient running environment for the pipelines registered at DockerBIO. This enables user groups to run their pipelines without expending much effort to copy and modify them

Directory of Open Access Journals

Abundance and Diversity of Plankton in the Lagoon Waters of Tolongano Village, Banawa Selatan District

Author: Chihyun Park (199344)
Giup Jang (3800029)
Jaegyoon Ahn (199346)
Min Oh (651889)
Taekeon Lee (3800026)
Youngmi Yoon (199347)
Publication venue: Province of Central Sulawesi Government
Publication date: 01/01/2010

The study aimed to determine the abundance and diversity of plankton in the lagoon waters of Tolongano Village, Banawa Selatan District. It was carried out in June and July 2009. Plankton samples were collected in the lagoon waters of Tolongano Village, Banawa Selatan District, Donggala Regency. The samples were identified at the Aquaculture Laboratory, Faculty of Agriculture, Universitas Tadulako. The study used purposive sampling (deliberate placement of sampling points). Sampling covered 5 stations and was carried out three times, at 07:00, 12:00, and 17:00 WITA. The results showed that the abundance of phytoplankton of the class Bacillariophyceae ranged from 8,925 to 16,135 ind/l and the abundance of zooplankton of the class Crustacea ranged from 35 to 70 ind/l; the diversity index of phytoplankton of the class Bacillariophyceae ranged from 2.010 to 2.504 and the diversity index of zooplankton of the class Crustacea ranged from 0 to 0.6931; and the dominance index of the class Bacillariophyceae ranged from 1.1995 to 1.2326, indicating that one plankton taxon was dominant, namely Nitzschia sp.
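The abstract does not say which diversity formula was used. As a minimal sketch, assuming it is the Shannon index with natural logarithms, H' = -sum(p_i ln p_i), the reported zooplankton range is easy to interpret: 0 corresponds to a sample containing a single taxon, and 0.6931 equals ln 2, the value for two equally abundant taxa. The function name and the counts below are illustrative only and are not taken from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total  # proportion of individuals belonging to this taxon
            h -= p * math.log(p)
    return h


# Hypothetical counts (individuals per litre), chosen only to illustrate the bounds.
print(round(shannon_index([35, 35]), 4))  # two equally abundant taxa -> ln 2
print(shannon_index([70]))                # a single taxon -> 0.0
```

Under this assumption, any sample with two taxa scores between the two printed values, with the maximum reached only when both taxa are equally abundant.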

Neliti

FigShare

A Multi-Sample Based Method for Identifying Common CNVs in Normal Human Genomic Structure Using High-Resolution aCGH Data

Author: AB Olshen
AJ Iafrate
BE Stranger
C Alkan
C Xie
Chihyun Park
DF Conrad
DP Locke
E Ben-Yaacov
F Hormozdiari
F Picard
GH Perry
H Lee
H Park
H Willenbrock
J Huang
JA Berger
Jaegyoon Ahn
JC Marioni
JI Kim
K Bleakley
K Wang
KK Wong
M Wigler
NP Carter
NR Zhang
Olivier Lespinet
OM Rueda
P Hupe
PHC Eilers
QY Zhang
R Pique-Regi
R Redon
R Tibshirani
RM Durbin
S Levy
Sanghyun Park
SJ Diskin
SP Shah
T LaFramboise
TS Price
TW Yu
WR Lai
Youngmi Yoon
Publication venue: Public Library of Science

BACKGROUND: It is difficult to identify copy number variations (CNV) in normal human genomic data due to noise and non-linear relationships between different genomic regions and signal intensity. A high-resolution array comparative genomic hybridization (aCGH) containing 42 million probes, which is very large compared to previous arrays, was recently published. Most existing CNV detection algorithms do not work well because of noise associated with the large amount of input data and because most of the current methods were not designed to analyze normal human samples. Normal human genome analysis often requires a joint approach across multiple samples. However, the majority of existing methods can only identify CNVs from a single sample. METHODOLOGY AND PRINCIPAL FINDINGS: We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR). CONCLUSIONS AND SIGNIFICANCE: We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate for a simulated data set, and outperformed most current methods when real, high-resolution HapMap datasets were analyzed. MGVD also had the fastest runtime compared to the other algorithms evaluated when actual, high-resolution aCGH data were analyzed. The CNVZs identified by MGVD can be used in association studies for revealing relationships between phenotypes and genomic aberrations. Our algorithm was developed in standard C++ using the STL library and is available for Linux and MS Windows. It is freely available at: http://embio.yonsei.ac.kr/~Park/mgvd.php
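The abstract only states that MGVD combines segmentation with a k-means-based clustering strategy; it does not give the details. The sketch below is therefore not MGVD. It illustrates the general idea of k-means on copy-number data under two assumptions: segmentation has already produced one mean log2 ratio per segment, and three clusters (loss, neutral, gain) are wanted. The function name `kmeans_1d` and all values are hypothetical.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Plain Lloyd's k-means on scalars; returns (centroids, labels)."""
    if k < 2 or len(values) < k:
        raise ValueError("need k >= 2 and at least k values")
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical per-segment mean log2 ratios (assumed output of a segmentation step).
segment_means = [-0.62, -0.55, 0.01, -0.03, 0.04, 0.48, 0.52, 0.02]
centroids, labels = kmeans_1d(segment_means, k=3)
names = ["loss", "neutral", "gain"]  # valid here because the centroids come out in ascending order
print([names[lab] for lab in labels])
```

In a multi-sample setting the same clustering could be applied to segments pooled across samples, but how MGVD actually does this is not described in the abstract.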

Crossref

Directory of Open Access Journals

PubMed Central

Drug voyager: a computational platform for exploring unintended drug action

Author: A Arif
A Chatr-Aryamontri
A Frolkis
A Gottlieb
A Hamosh
A Skrbo
AP Davis
B Bolgár
BV Zlokovic
CC Huang
CF Schaefer
CG Lee
Chihyun Park
CJ Sherr
CS Nicolas
DS Wishart
DW Huang
EM Bublil
EN Pearce
G Jin
GB Anker
Giup Jang
GW Yardy
I Vastrik
J Lamb
J Li
Jaegyoon Ahn
KL Pierce
L Licata
L Salwinski
LR Chang
M Campillos
M Kuhn
M Silver
M Vidal
M Whirl-Carrillo
ME Mycielska
Min Oh
MR Hurle
MV Relling
N Wang
NP Tatonetti
P Gareri
P Kamineni
R Alam
RE Langley
RM Kypta
RT Dorsam
S Chakraborty
S Özen
SG Arbuck
SJ Nelson
Taekeon Lee
WC Hung
WP McGuire
XZ Zhang
Y Silberberg
YC Lee
Youngmi Yoon
YS Prakash
Publication venue: 'Springer Science and Business Media LLC'

Crossref

RASER: reads aligner for SNPs and editing sites of RNA.

Author: Ahn Jaegyoon
Publication date: 20/04/2023

Ezid

RASER: reads aligner for SNPs and editing sites of RNA

Author: Jaegyoon Ahn
Xinshu Xiao
Publication venue: 'Oxford University Press (OUP)'
Publication date: 30/08/2015

Motivation: Accurate identification of genetic variants such as single-nucleotide polymorphisms (SNPs) or RNA editing sites from RNA-Seq reads is important, yet challenging, because it necessitates a very low false-positive rate in read mapping. Although many read aligners are available, no single aligner was specifically developed or tested as an effective tool for SNP and RNA editing prediction. Results: We present RASER, an accurate read aligner with novel mapping schemes and index tree structure that aims to reduce false-positive mappings due to existence of highly similar regions. We demonstrate that RASER shows the best mapping accuracy compared with other popular algorithms and highest sensitivity in identifying multiply mapped reads. As a result, RASER displays superb efficacy in unbiased mapping of the alternative alleles of SNPs and in identification of RNA editing sites. Availability and implementation: RASER is written in C++ and freely available for download at https://github.com/jaegyoonahn/RASER. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online

Crossref

PubMed Central

eScholarship - University of California

An Improved Method for Prediction of Cancer Prognosis by Network Learning

Author: Ilhwan Oh
Jaegyoon Ahn
Minseon Kim
Publication venue: 'MDPI AG'
Publication date: 01/10/2018

Accurate identification of prognostic biomarkers is an important yet challenging goal in bioinformatics. Many bioinformatics approaches have been proposed for this purpose, but there is still room for improvement. In this paper, we propose a novel machine learning-based method for more accurate identification of prognostic biomarker genes and use them for prediction of cancer prognosis. The proposed method specifies the candidate prognostic gene module by graph learning using the generative adversarial networks (GANs) model, and scores genes using a PageRank algorithm. We applied the proposed method to multiple-omics data that included copy number, gene expression, DNA methylation, and somatic mutation data for five cancer types. The proposed method showed better prediction accuracy than did existing methods. We identified many prognostic genes and their roles in their biological pathways. We also showed that the genes identified from different omics data were complementary, which led to improved accuracy in prediction using multi-omics data

Directory of Open Access Journals

Accurate Prediction of Cancer Prognosis by Exploiting Patient-Specific Cancer Driver Genes

Author: Heewon Jung
Jaegyoon Ahn
Jiwoo Park
Suyeon Lee
Publication venue: 'MDPI AG'
Publication date: 01/03/2023

Accurate prediction of the prognoses of cancer patients and identification of prognostic biomarkers are both important for the improved treatment of cancer patients, in addition to enhanced anticancer drugs. Many previous bioinformatic studies have been carried out to achieve this goal; however, there remains room for improvement in terms of accuracy. In this study, we demonstrated that patient-specific cancer driver genes could be used to predict cancer prognoses more accurately. To identify patient-specific cancer driver genes, we first generated patient-specific gene networks before using modified PageRank to generate feature vectors that represented the impacts genes had on the patient-specific gene network. Subsequently, the feature vectors of the good and poor prognosis groups were used to train the deep feedforward network. For the 11 cancer types in the TCGA data, the proposed method showed a significantly better prediction performance than the existing state-of-the-art methods for three cancer types (BRCA, CESC and PAAD), better performance for five cancer types (COAD, ESCA, HNSC, KIRC and STAD), and a similar or slightly worse performance for the remaining three cancer types (BLCA, LIHC and LUAD). Furthermore, the case study for the identified breast cancer and cervical squamous cell carcinoma prognostic genes and their subnetworks included several pathways associated with the progression of breast cancer and cervical squamous cell carcinoma. These results suggested that heterogeneous cancer driver information may be associated with cancer prognosis
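The abstract mentions a modified PageRank over patient-specific gene networks but does not describe the modification. As a rough illustration of the underlying idea only, here is standard PageRank by power iteration on a toy network. The gene labels, the edges, and the damping factor of 0.85 are assumptions made for this sketch and do not come from the paper.

```python
def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=200):
    """Standard PageRank; adjacency maps each node to a list of nodes it points to."""
    nodes = sorted(set(adjacency) | {v for nbrs in adjacency.values() for v in nbrs})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not adjacency.get(u))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for u in nodes:
            out = adjacency.get(u, [])
            for v in out:
                new_rank[v] += damping * rank[u] / len(out)
        delta = sum(abs(new_rank[x] - rank[x]) for x in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical gene network: geneA interacts with three other genes (edges in both directions).
network = {
    "geneA": ["geneB", "geneC", "geneD"],
    "geneB": ["geneA"],
    "geneC": ["geneA"],
    "geneD": ["geneA"],
}
scores = pagerank(network)
# The per-gene scores, in a fixed gene order, form one feature vector for this network.
feature_vector = [scores[g] for g in sorted(scores)]
print(feature_vector)
```

A vector like this, computed once per patient-specific network, is the kind of input the abstract describes feeding to a feedforward classifier; the hub gene receives the largest score in this toy example.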

Directory of Open Access Journals

System overview.

Author: Jaegyoon Ahn (199346)
Min Oh (651889)
Youngmi Yoon (199347)

(a) “Adjacency-Based Inference” measures the drug-drug (disease-disease) adjacency among known drug-disease associations, and infers new drug-disease association. “Module-Distance-Based Inference” derives drug-drug (disease-disease) gene module among known drug-disease associations, measures the distance between the gene module and disease (drug), and infers new drug-disease association. (b) Drug-disease relationship represented by score becomes features. Various machine learning based classifiers are built with those features, and predict unknown drug-disease relationship.

FigShare